Quantum World Technologies Inc.

Sr. DevOps Engineer (GPU Bare Metal)

This role is for a Sr. DevOps Engineer (GPU Bare Metal). The contract length and pay rate are not specified. Key skills include bare metal orchestration, Linux/Unix, automation, and AI/ML experience. Strong communication skills are required, and HPC experience is a bonus.
🌎 - Country
United States
💱 - Currency
$ USD
💰 - Day rate
Unknown
🗓️ - Date
October 31, 2025
🕒 - Duration
Unknown
🏝️ - Location
Unknown
📄 - Contract
Unknown
🔒 - Security
Unknown
📍 - Location detailed
New Jersey, United States
🧠 - Skills detailed
#Deployment #Network Engineering #Debugging #Unix #Monitoring #DevOps #KVM (Kernel-based Virtual Machine) #Linux #Ansible #ML (Machine Learning) #Scala #Virtualization #AI (Artificial Intelligence) #Cloud #Security #Automation #Documentation #CHEF #Puppet #Alation
Role description
Role: Sr. DevOps Engineer (GPU Bare Metal)

Required Key Skills: Bare Metal, Open Source Virtualization (QEMU/KVM etc.), Linux/Unix, Automation/Orchestration (Puppet/Chef/Ansible etc.), DevOps, AI/ML

We need engineers skilled in multiple areas to support both GPU Bare Metal and GPU VM products. We also need AI/ML DevOps engineers and Linux sysadmins.

GPU Bare Metal

Required Skills
• Proven ability to orchestrate bare metal Linux systems at scale, including building automation for firmware updates, managing BIOS configuration, and configuring PXE environments.
• Deep Linux systems experience, including troubleshooting network interfaces, developing and applying configuration management, security best practices, and monitoring and alerting.
• Strong automation mindset. Expert knowledge of one or more orchestration tools such as MAAS, Salt, Chef, Ansible, or Puppet.
• Strong communication skills. Your job will involve writing detailed documentation for others to pick up or leading knowledge-sharing sessions with operations teams.

Bonus skills include:
• Hands-on experience in High Performance Computing (HPC) clustered environments from Nvidia or AMD. Experience performing automated, wide-scale testing on NCCL or other frameworks.
• Network engineering experience with VyOS platforms.

What You'll Be Working On:
• Provisioning and automating GPU Bare Metal deployments
• DevOps: assisting customer support and CloudOps teams with GPU-specific knowledge and debugging during customer escalations
• Performance testing, analysis, and monitoring
• Firmware, BIOS, and kernel upgrades and testing

GPU Virtual Machines

Required Skills
• Strong understanding of Linux-based operating systems
• Deep experience with the internals of QEMU, KVM, the Linux kernel, and libvirt. Strong proficiency in C.
• Strong knowledge of DO’s proprietary services and how they intersect with our virtualization stack.

What You'll Be Working On:
• Accelerate the virtualization of next-generation GPU-enabled platforms that power AI/ML workloads.
• Work with hardware engineering teams and vendors to validate GPU fabric performance. Optimize performance while maintaining DO’s high security standards.
• Collaborate with the open source Linux, QEMU, and libvirt communities to drive the evolution of Linux virtualization technologies and incorporate them into the DO fleet.
• Backport, build, and deploy software patches to support new features, fix bugs, and resolve security issues.