Lead SRE Engineer

⭐ - Featured Role | Apply direct with Data Freelance Hub

This role is for a Lead SRE Engineer. The contract length and pay rate are not specified. Candidates should have 7+ years of experience in SRE or DevOps, expertise in Kubernetes and OpenShift, and banking industry experience.

🌎 - Country

United States

💱 - Currency

$ USD

💰 - Day rate

Unknown

🗓️ - Date discovered

September 13, 2025

🕒 - Project duration

Unknown

🏝️ - Location type

Unknown

📄 - Contract type

Unknown

🔒 - Security clearance

Unknown

📍 - Location detailed

Dallas, TX

🧠 - Skills detailed

#Cloud #Monitoring #Azure #Istio #Scripting #GitLab #Bash #Scala #Leadership #Deployment #Jenkins #Datadog #DevOps #Grafana #Observability #Python #Linux #Programming #Terraform #Automation #GCP (Google Cloud Platform) #Logging #Prometheus #AWS (Amazon Web Services) #Compliance #Security #PCI (Payment Card Industry) #Splunk #Kubernetes #Ansible

Role description

Position Overview

We are seeking a Site Reliability Engineering (SRE) Lead to design, scale, and optimize our containerized platforms and critical banking applications. The ideal candidate will have deep expertise in Kubernetes and OpenShift, a strong background in banking or financial services, and a proven track record of leading reliability, automation, and performance engineering initiatives in mission-critical environments. This is a hands-on technical leadership role that balances architecture, engineering, and mentoring responsibilities while ensuring the resilience, availability, and security of our digital platforms.

Key Responsibilities

Platform & Infrastructure Engineering
• Lead the design, deployment, and optimization of Kubernetes and OpenShift clusters supporting critical financial services applications.
• Ensure high availability, scalability, and security of containerized workloads in hybrid/multi-cloud environments.
• Drive automation for provisioning, configuration management, and CI/CD pipelines.
• Optimize system performance through capacity planning, observability, and proactive tuning.

Reliability & Operations
• Establish and own SLOs, SLIs, and SLAs for banking systems.
• Lead incident response for critical outages, ensuring rapid recovery and root cause analysis.
• Build and enhance observability platforms (logging, monitoring, tracing, alerting).
• Collaborate with security teams to integrate compliance and risk controls into infrastructure design.

Leadership & Collaboration
• Mentor and guide a team of SRE engineers, fostering a culture of automation, ownership, and reliability.
• Partner with DevOps, Application Development, and Security teams to embed SRE best practices across the organization.
• Provide technical thought leadership in modern infrastructure, DevOps, and platform engineering strategies.

Required Qualifications
• 7+ years of site reliability, DevOps, or infrastructure engineering experience.
• 3+ years in a leadership or senior technical role with team mentoring responsibilities.
• Deep hands-on expertise with Kubernetes and OpenShift.
• Strong background in banking, financial services, or other regulated industries.
• Proficiency with CI/CD pipelines (Jenkins, GitLab, ArgoCD) and automation tools (Ansible, Terraform, Helm).
• Strong skills in observability stacks (Prometheus, Grafana, ELK, Splunk, Datadog, etc.).
• Solid experience with Linux systems, networking, and container runtimes.

Preferred Qualifications
• Cloud experience with AWS, Azure, or GCP.
• Knowledge of compliance and regulatory standards (SOX, PCI DSS, FFIEC, ISO 27001).
• Expertise with service mesh technologies (Istio, Linkerd).
• Programming/scripting skills in Python, Go, or Bash.
• Experience with resiliency engineering, chaos testing, and performance tuning.
