

Lead SRE Engineer
Featured Role | Apply direct with Data Freelance Hub
This role is for a Lead SRE Engineer. The contract length and pay rate are not specified. Candidates should have 7+ years of experience in SRE or DevOps, expertise in Kubernetes and OpenShift, and banking industry experience.
Country: United States
Currency: $ USD
Day rate: Unknown
Date discovered: September 13, 2025
Project duration: Unknown
Location type: Unknown
Contract type: Unknown
Security clearance: Unknown
Location detailed: Dallas, TX
Skills detailed:
#Cloud #Monitoring #Azure #Istio #Scripting #GitLab #Bash #Scala #Leadership #Deployment #Jenkins #Datadog #DevOps #Grafana #Observability #Python #Linux #Programming #Terraform #Automation #GCP (Google Cloud Platform) #Logging #Prometheus #AWS (Amazon Web Services) #Compliance #Security #PCI (Payment Card Industry) #Splunk #Kubernetes #Ansible
Role description
Position Overview
We are seeking a Site Reliability Engineering (SRE) Lead to design, scale, and optimize our containerized platforms and critical banking applications. The ideal candidate will have deep expertise in Kubernetes and OpenShift, a strong background in banking or financial services, and a proven track record of leading reliability, automation, and performance engineering initiatives in mission-critical environments.
This is a hands-on technical leadership role that balances architecture, engineering, and mentoring responsibilities while ensuring the resilience, availability, and security of our digital platforms.
Key Responsibilities
Platform & Infrastructure Engineering
• Lead the design, deployment, and optimization of Kubernetes and OpenShift clusters supporting critical financial services applications.
• Ensure high availability, scalability, and security of containerized workloads in hybrid/multi-cloud environments.
• Drive automation for provisioning, configuration management, and CI/CD pipelines.
• Optimize system performance through capacity planning, observability, and proactive tuning.
Reliability & Operations
• Establish and own SLOs, SLIs, and SLAs for banking systems.
• Lead incident response for critical outages, ensuring rapid recovery and root cause analysis.
• Build and enhance observability platforms (logging, monitoring, tracing, alerting).
• Collaborate with security teams to integrate compliance and risk controls into infrastructure design.
Leadership & Collaboration
• Mentor and guide a team of SRE engineers, fostering a culture of automation, ownership, and reliability.
• Partner with DevOps, Application Development, and Security teams to embed SRE best practices across the organization.
• Provide technical thought leadership in modern infrastructure, DevOps, and platform engineering strategies.
Required Qualifications
• 7+ years of site reliability, DevOps, or infrastructure engineering experience.
• 3+ years in a leadership or senior technical role with team mentoring responsibilities.
• Deep hands-on expertise with Kubernetes and OpenShift.
• Strong background in banking, financial services, or other regulated industries.
• Proficiency with CI/CD pipelines (Jenkins, GitLab, ArgoCD) and automation tools (Ansible, Terraform, Helm).
• Strong skills in observability stacks (Prometheus, Grafana, ELK, Splunk, Datadog, etc.).
• Solid experience with Linux systems, networking, and container runtimes.
Preferred Qualifications
• Cloud experience with AWS, Azure, or GCP.
• Knowledge of compliance and regulatory standards (SOX, PCI DSS, FFIEC, ISO 27001).
• Expertise with service mesh technologies (Istio, Linkerd).
• Programming/scripting skills in Python, Go, or Bash.
• Experience with resiliency engineering, chaos testing, and performance tuning.