
[Sticky] SRE Duties in your Organization

 
 Devops-admin
(@devops-admin)
Member Admin
Joined: 2 years ago
Posts: 32
Topic starter 22/11/2023 2:29 pm  

Site Reliability Engineering (SRE) is a discipline that incorporates aspects of software engineering and applies them to infrastructure and operations problems. The primary goal of SRE is to create scalable and highly reliable software systems. The roles and responsibilities of SREs can vary across organizations, but generally, they involve the following:

  1. Service Reliability:

    • Ensure the reliability and availability of critical services and systems.
  2. Service Level Objectives (SLOs) and Service Level Indicators (SLIs):

    • Define and measure SLOs and SLIs to quantify the reliability of services.
  3. Automation:

    • Develop and maintain automation tools for deployment, monitoring, and incident response to minimize manual intervention.
  4. Capacity Planning:

    • Conduct capacity planning to ensure that systems can handle current and future loads.
  5. Incident Response:

    • Participate in on-call rotations and respond to incidents to minimize downtime and resolve issues promptly.
  6. Monitoring and Alerting:

    • Implement effective monitoring and alerting systems to detect and respond to anomalies and issues.
  7. Post-Incident Review:

    • Collaborate with development teams to conduct post-incident reviews and implement improvements to prevent future incidents.
  8. Performance Optimization:

    • Identify and address performance bottlenecks in systems to improve overall efficiency.
  9. Risk Management:

    • Assess risks to system reliability and implement mitigations to reduce the impact of potential issues.
  10. Infrastructure as Code (IaC):

    • Use IaC principles to manage and configure infrastructure, making it more scalable, version-controlled, and reproducible.
  11. Release Engineering:

    • Collaborate with development teams on the release process, ensuring smooth and reliable deployments.
  12. Security:

    • Work on security-related tasks, such as implementing secure coding practices, participating in security reviews, and ensuring compliance with security policies.
  13. Documentation:

    • Create and maintain documentation for operational procedures, configurations, and incident response playbooks.
  14. Cross-Functional Collaboration:

    • Collaborate with development, product, and other cross-functional teams to align reliability goals with overall business objectives.
  15. On-Call Responsibilities:

    • Share on-call responsibilities to respond to incidents and ensure 24/7 system reliability.

SREs focus on the intersection of software engineering and systems administration, applying software engineering principles to infrastructure and operations challenges. Their ultimate goal is to create scalable and reliable systems that meet or exceed service level objectives.
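To make the SLI and SLO items above concrete, here is a minimal sketch of an availability check in Python. It assumes a request-based SLI (good requests divided by total requests); the service name, the 99.9% target, and the request counts are hypothetical values chosen for illustration, not figures from this thread.

```python
from dataclasses import dataclass


@dataclass
class Slo:
    """A service level objective: the target value an SLI should meet."""
    name: str      # hypothetical label for the objective
    target: float  # e.g. 0.999 means 99.9% of requests must succeed


def availability_sli(good_events: int, total_events: int) -> float:
    """Request-based availability SLI: fraction of events that were good."""
    if total_events == 0:
        # No traffic means nothing failed; treat the SLI as fully met.
        return 1.0
    return good_events / total_events


def meets_slo(sli: float, slo: Slo) -> bool:
    """True when the measured SLI is at or above the SLO target."""
    return sli >= slo.target


# Hypothetical numbers for one measurement window.
slo = Slo(name="checkout-availability", target=0.999)
sli = availability_sli(good_events=999_500, total_events=1_000_000)
print(f"{slo.name}: SLI={sli:.4%}, meets SLO: {meets_slo(sli, slo)}")
```

In practice the good and total counts would come from your monitoring platform rather than being hard-coded, and the measurement window (for example 28 or 30 days) is a choice each team makes for itself.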

This topic was modified 2 years ago 3 times by Devops-admin

   
Topic Tags
#sre #security #oncall #i
 Devops-admin
(@devops-admin)
Member Admin
Joined: 2 years ago
Posts: 32
Topic starter 22/11/2023 2:32 pm  
  • Solving business and technical problems to maintain highly available and reliable applications and infrastructure
  • Implementing a monitoring solution, or using an existing monitoring platform, to detect issues, and creating automated scripts that act to resolve them
  • Keeping error budget consumption to a minimum
  • Sharing workloads with the DevOps team to resolve technical debt
  • Working in operations on an on-call basis.
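
As a rough illustration of the error budget item above, the sketch below computes how much of a budget is left in a measurement window. It assumes the common request-based definition, where the budget is the number of failed requests the SLO permits; the function names and the sample numbers are hypothetical.

```python
def error_budget(slo_target: float, total_events: int) -> float:
    """Number of bad events the SLO tolerates in the window."""
    return (1.0 - slo_target) * total_events


def budget_remaining(slo_target: float, total_events: int, bad_events: int) -> float:
    """Fraction of the error budget still unspent (negative if overspent)."""
    budget = error_budget(slo_target, total_events)
    if budget <= 0:
        # A 100% target (or no traffic) leaves no budget to spend.
        return 0.0
    return 1.0 - bad_events / budget


# Hypothetical window: 99.9% target, one million requests, 250 failures.
remaining = budget_remaining(slo_target=0.999, total_events=1_000_000, bad_events=250)
print(f"Error budget remaining: {remaining:.1%}")
```

A team might use a value like this to decide whether to keep shipping features or to pause releases and spend time on reliability work when the remaining budget gets low.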

Technology Skills:

  • Infrastructure as Code and scripting: ARM templates, Bicep, Terraform, shell scripts
  • CI/CD tools: GitHub Actions
  • OS: Linux (RHEL), Windows
  • Containers and orchestration: Docker, Kubernetes
  • Package and artifact management: Helm, Nexus, Azure Artifacts
  • Observability: Azure Monitor, Prometheus, ELK, Grafana
  • Service management tool: ServiceNow
  • Source code version control system: GitHub
  • Cloud knowledge: Azure IaaS, PaaS, and SaaS solutions

   

© 2023 — Devops. All Rights Reserved.


At Devopsjobs360.com, we are your trusted experts in DevOps, here to empower your organization with the tools and practices needed to drive innovation, accelerate deployment, and achieve operational excellence.

Contacts



3853472827

 Immigration_US@devopsjobs360.com

15517 Leadenhall street, Frisco, Texas-75036