Job Description
About Ascendion
Ascendion is a full-service digital engineering solutions company. We make and manage software platforms and products that power growth and deliver captivating experiences to consumers and employees. Our engineering, cloud, data, experience design, and talent solution capabilities accelerate transformation and impact for enterprise clients. Headquartered in New Jersey, our workforce of 6,000+ Ascenders delivers solutions from around the globe. Ascendion is built differently to engineer the next.
Ascendion | Engineering to elevate life
We have a culture built on opportunity, inclusion, and a spirit of partnership. Come, change the world with us:
- Build the coolest tech for the world’s leading brands
- Solve complex problems – and learn new skills
- Experience the power of transforming digital engineering for Fortune 500 clients
- Master your craft with leading training programs and hands-on experience
Experience a community of change makers!
Join a culture of high-performing innovators with endless ideas and a passion for tech. Our culture is the fabric of our company, and it is what makes us unique and diverse. The way we share ideas, learning, experiences, successes, and joy allows everyone to be their best at Ascendion.
About the Role:
Job Title: Site Reliability Engineer (SRE)
- We are looking for a highly skilled and motivated Site Reliability Engineer (SRE) to join our team.
- In this role, you will be responsible for building and maintaining reliable, scalable, and efficient systems that ensure the high availability and performance of our applications.
- You will work closely with development and operations teams to implement SRE practices, including dashboard building, monitoring, and performance optimization.
Responsibilities:
- Design, build, and maintain SRE dashboards to provide real-time visibility into the health and performance of our applications.
- Implement and maintain SLAs, SLOs, and SLIs to ensure service reliability and alignment with business requirements.
- Leverage DevOps principles to improve CI/CD pipelines, enabling faster and more reliable deployment cycles.
- Support and optimize microservices development to ensure scalability, reliability, and performance across distributed systems.
- Build and manage AWS infrastructure for efficient resource provisioning, scaling, and monitoring.
- Collaborate with cross-functional teams to identify and resolve production issues in a timely manner.
- Automate monitoring, alerting, and remediation processes to reduce manual intervention and increase uptime.
- Participate in on-call rotations to ensure prompt resolution of incidents and service disruptions.
- Conduct post-mortems on incidents, identify root causes, and implement preventive measures to avoid recurrence.
- Foster a culture of continuous improvement, reliability, and resilience in the software development lifecycle.
Qualifications:
- Proven experience in SRE practices, including dashboard building, monitoring, and alerting.
- In-depth understanding of SLA/SLO/SLI concepts and how they apply to service reliability.
- Strong experience with DevOps, including CI/CD pipelines, version control systems, and automated testing.
- Solid background in microservices development, containerization (Docker, Kubernetes), and distributed systems.
- Proficiency in cloud infrastructure management, particularly AWS services (EC2, S3, Lambda, CloudWatch, etc.).
- Expertise in scripting and automation tools (e.g., Python, Bash, Terraform).
- Strong troubleshooting and incident response skills, with a focus on improving system reliability.
- Experience with monitoring tools such as Prometheus, Grafana, and Datadog.
- Strong collaboration and communication skills to work across teams and support business goals.
Location: McLean, VA or Richmond, VA - Hybrid Onsite role
Salary Range: The salary for this position is between $150,000 and $170,000 annually. Factors that may affect pay within this range include geography/market, skills, education, experience, and other qualifications of the successful candidate.
Benefits: The Company offers the following benefits for this position, subject to applicable eligibility requirements:
- Medical insurance
- Dental insurance
- Vision insurance
- 401(k) retirement plan
- Long-term disability insurance
- Short-term disability insurance
- 5 personal days accrued each calendar year; paid time off benefits meet the paid sick and safe time laws that pertain to the city/state
- 10-15 days of paid vacation time
- 6 paid holidays and 1 floating holiday per calendar year
- Ascendion Learning Management System
Want to change the world? Let us know.
Tell us about your experiences, education, and ambitions. Bring your knowledge, unique viewpoint, and creativity to the table. Let’s talk!
Preferred Skills
- Familiarity with Kubernetes
- Background in high-availability systems
Job details
Job ID: 330889
Job Requirements: Site Reliability Engineer
Location: McLean, Virginia, US
Recruiter: Aayush (aayush.shah@ascendion.com)