Opening for SRE (Site Reliability Engineer) Lead

Hi,

Hope you are doing well.

 

My name is Abhishek Kumar, and I am a Senior Recruiter with Saibber. We are looking for a suitable candidate for the position below. I came across your resume and found it a strong fit for this role. I would appreciate it if you could share the best time and number to reach you so we can discuss this further.

Role :- SRE (Site Reliability Engineer) Lead

Location :- Houston, TX (Onsite)

 

 

Primary skillset – New Relic developer, Automation/Programming (Python or Java), Linux

Job Summary:

We seek an experienced SRE Lead to guide our team in ensuring system reliability, performance, and scalability. The candidate will drive infrastructure automation, optimize performance, and lead incident management while fostering a culture of continuous improvement.

Key Responsibilities:

Technical Leadership: Build and mentor a team of SREs; set goals, conduct reviews, and drive SRE best practices.

System Reliability: Oversee the design and maintenance of high-availability systems; lead performance monitoring and issue resolution.

Automation & CI/CD: Lead development of automation scripts and enhance CI/CD pipelines using tools such as Terraform and Ansible.

Observability: Deploy and manage tools (e.g., New Relic) for system monitoring; develop dashboards and alerts.

Incident Management: Lead Root Cause Analysis (RCA) and refine incident response processes.

Performance Optimization: Provide strategic insights to enhance application and database performance (Java, Kafka, SQL).

Qualifications:

Proven experience managing SRE or related teams in an eCommerce or highly distributed systems environment.

Strong skills in automation tools (Terraform, Ansible) and observability solutions (New Relic), with an emphasis on managing large-scale distributed systems.

Experience working with SAP modules in conjunction with custom applications or microservices architectures.

Good understanding of storage technologies (SAN/NAS), network infrastructure (load balancers, firewalls), and their impact on system performance in high-throughput environments.

Background in optimizing performance for Java-based applications, Spring Boot services, Kafka message brokers, SQL/NoSQL databases, and middleware components.

Familiarity with middleware technologies such as Kafka in distributed environments.

Excellent leadership, problem-solving, and communication skills, with experience working cross-functionally with development teams, infrastructure teams, and business stakeholders.

 

 

