

EPITEC
Cloud Engineer
Featured Role | Apply direct with Data Freelance Hub
This role is for a Cloud Engineer; the contract length and pay rate are not specified. It is a hybrid position that requires expertise in NVIDIA GPU systems and Linux administration, plus experience with AI/ML frameworks and orchestration tools.
Country: United States
Currency: $ USD
Day rate: Unknown
Date: November 20, 2025
Duration: Unknown
Location: Hybrid
Contract: Unknown
Security: Unknown
Location detailed: Chicago, IL
Skills detailed:
#AI (Artificial Intelligence) #Docker #Grafana #GCP (Google Cloud Platform) #Documentation #Bash #Scala #Containers #Python #Kubernetes #Azure #Terraform #Ansible #Data Science #Compliance #Monitoring #Security #Automation #Linux #DevOps #PyTorch #TensorFlow #Scripting #ML (Machine Learning) #Cloud #AWS (Amazon Web Services) #Prometheus #Server Administration
Role description
Senior Server Administrator - AI Engineering Team
Onsite Requirement: 1 day/week initially, transitioning to 5 days/week
About the Role
We are seeking a highly skilled Senior Server Administrator to join our AI Engineering team. This role is pivotal in deploying, maintaining, and optimizing high-performance computing infrastructure leveraging NVIDIA GPU technologies. You will collaborate with AI researchers, data scientists, and software engineers to ensure our systems are robust, scalable, and tuned for cutting-edge machine learning workloads.
What You'll Do
• Administer and maintain GPU-accelerated servers and clusters (NVIDIA A100, H100, etc.).
• Manage and optimize NVIDIA software stack (CUDA, cuDNN, TensorRT, NCCL, NGC containers).
• Monitor system performance, troubleshoot issues, and ensure high availability.
• Collaborate with DevOps and AI teams to support containerized workflows (Docker, Kubernetes).
• Implement security best practices and compliance standards.
• Lead upgrades, patching, and lifecycle management of GPU servers.
• Provide documentation, automation scripts, and training for internal teams.
Why This Opportunity Stands Out
• Work on cutting-edge AI infrastructure projects.
• Collaborate with top-tier engineers and researchers.
• Gain hands-on experience with advanced GPU technologies and scalable systems.
What We're Looking For
• Education: Bachelor's Degree + 8 years of experience (5+ years in server administration, 3+ years with NVIDIA GPU systems).
Technical Skills:
• Linux system administration in HPC/AI environments.
• NVIDIA GPU drivers, CUDA toolkit, performance tuning.
• Slurm, Kubernetes, orchestration tools.
• Monitoring tools (Prometheus, Grafana), automation (Ansible, Terraform).
• Strong scripting (Bash, Python).
Preferred:
• NVIDIA Certified Professional.
• Multi-GPU/multi-node training experience.
• Familiarity with AI/ML frameworks (PyTorch, TensorFlow).
• Exposure to cloud GPU infrastructure (AWS, Azure, GCP).






