DevOps Engineer


This role is for a DevOps Engineer on a 12+ month contract in Santa Clara, CA. Key skills include Kubernetes, Ansible, Python, and CI/CD pipelines. At least 2 years of experience in DevOps or a related field is required; familiarity with GPU-based environments is preferred.

🌎 - Country

United States

💱 - Currency

$ USD

💰 - Day rate

🗓️ - Date discovered

May 30, 2025

🕒 - Project duration

More than 6 months

🏝️ - Location type

On-site

📄 - Contract type

Unknown

🔒 - Security clearance

Unknown

📍 - Location detailed

San Jose, CA

🧠 - Skills detailed

#Security #ML (Machine Learning) #Ansible #Cloud #Prometheus #Scala #Infrastructure as Code (IaC) #Version Control #Python #Grafana #Monitoring #Deployment #DevOps #Jenkins #AI (Artificial Intelligence) #Automation #Terraform #Kubernetes #Observability #GitHub #Docker #Bash

Role description

Title: DevOps Engineer
Duration: 12+ months
Location: Santa Clara, CA (Onsite)

Description:
• We are seeking a skilled and motivated DevOps Engineer to join our team in building and maintaining high-performance infrastructure for GPU-based workloads.
• In this role, you'll be responsible for developing scalable, reliable systems across both on-premises and cloud environments.
• You'll work closely with engineering teams to streamline CI/CD pipelines, automate operations, and support advanced compute environments.

Key Responsibilities:
• Design and implement scalable infrastructure using Kubernetes across both on-prem and major cloud service providers (CSPs)
• Develop and maintain CI/CD pipelines with tools like Buildkite, GitHub Actions, and Jenkins to ensure smooth and reliable software delivery
• Automate infrastructure operations using Ansible, Python, and Bash to reduce manual toil and improve system consistency
• Manage service deployment within Kubernetes using Helm and GitOps-style workflows
• Configure and support GPU servers, including lifecycle management, health monitoring, and test automation
• Maintain node health and security, ensuring timely updates and proactive monitoring of GPU server fleets
• Provision, scale, and maintain Kubernetes clusters

Required Qualifications:
• 2+ years of experience in DevOps, Site Reliability Engineering (SRE), or Infrastructure Engineering
• Proficiency in Ansible, Python, and Bash for automation and tooling
• Solid hands-on experience with Kubernetes, Docker, and Helm
• Strong knowledge of CI/CD pipeline design, version control best practices, and build systems
• Experience with monitoring and observability tools (e.g., Prometheus, Grafana, Nagios)

Nice to Have:
• Familiarity with GPU-based compute environments and automated CI/test workflows
• Experience with infrastructure-as-code (IaC) tools such as Terraform
• Familiarity with container security practices and CVE scanning
• Background in high-performance computing (HPC), Slurm, or ML/AI training pipelines
