

BCforward
MLOps Engineer
Featured Role | Apply direct with Data Freelance Hub
This role is for an MLOps Engineer with 3+ years of experience, focusing on ML infrastructure and CI/CD pipelines. The contract is long-term, with an unspecified pay rate. The location is hybrid in San Diego, CA, or Indianapolis, IN.
Country
United States
Currency
$ USD
Day rate
Unknown
Date
January 7, 2026
Duration
Unknown
Location
Hybrid
Contract
Unknown
Security
Unknown
Location detailed
San Diego, CA
Skills detailed
#Cloud #GCP (Google Cloud Platform) #Langchain #GitHub #Logging #Python #AI (Artificial Intelligence) #Distributed Computing #Kubernetes #Programming #Generative Models #Monitoring #Version Control #ML Ops (Machine Learning Operations) #Data Processing #AWS (Amazon Web Services) #ETL (Extract, Transform, Load) #Scala #ML (Machine Learning) #DevOps #Databases #Data Engineering #TensorFlow #PyTorch #Model Deployment #Azure #Deployment #Datasets #Data Pipeline
Role description
BCforward is seeking a highly motivated MLOps Engineer to support its pharmaceutical client in San Diego, CA.
Expected Duration: Long-term
Location: San Diego, CA, or Indianapolis, IN (hybrid)
About the Role
We're seeking an experienced MLOps Engineer to build and scale the infrastructure powering next-generation in-silico protein design and engineering. In this role, you'll bridge the gap between cutting-edge AI research and production systems, working at the intersection of machine learning, computational biology, and high-performance computing. You'll collaborate closely with both computational scientists and platform engineers to accelerate the development and deployment of foundational models for protein engineering.
What You'll Do
• Build and maintain ML infrastructure: Design, implement, and optimize CI/CD pipelines using GitHub Actions for model training, evaluation, and deployment workflows
• Orchestrate compute resources: Manage and scale workloads across Kubernetes clusters and SLURM-based HPC environments, ensuring efficient resource utilization for large-scale model training
• Develop ML-ready data pipelines: Build robust, scalable data processing and loading pipelines for diverse biological data types, including protein structure files (PDBs), sequence databases, and experimental assay readouts. The primary focus is to optimize ML-ready data delivery for model training, testing, and benchmarking.
• Enable research velocity: Create tools and infrastructure that empower researchers to iterate quickly on protein language models, diffusion models, and flow-based generative models
• Optimize for scale: Architect systems that can handle multi-modal datasets and train large foundational models efficiently across distributed computing environments
• Monitor and maintain: Implement monitoring, logging, and alerting systems to ensure reliability and performance of production ML systems
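For illustration, the data-pipeline work above might start from a loading step like this: a minimal, standard-library-only sketch that extracts atom coordinates from the ATOM records of a PDB file. The column offsets follow the fixed-width PDB format; the function name and example records are hypothetical, and a production pipeline would more likely use a dedicated parser such as Biopython or gemmi.

```python
# Minimal sketch of an "ML-ready" loading step for PDB structure files,
# using only the Python standard library. Column slices follow the
# fixed-width PDB ATOM record layout (x: cols 31-38, y: 39-46, z: 47-54).

def parse_atom_coords(pdb_text: str) -> list[tuple[str, float, float, float]]:
    """Extract (atom_name, x, y, z) tuples from ATOM/HETATM records."""
    coords = []
    for line in pdb_text.splitlines():
        if line.startswith(("ATOM", "HETATM")):
            name = line[12:16].strip()  # atom name, cols 13-16
            x = float(line[30:38])      # x coordinate, cols 31-38
            y = float(line[38:46])      # y coordinate, cols 39-46
            z = float(line[46:54])      # z coordinate, cols 47-54
            coords.append((name, x, y, z))
    return coords

# Hypothetical two-atom fragment for demonstration only.
example = (
    "ATOM      1  N   MET A   1      38.428  13.104   6.364  1.00 54.69           N\n"
    "ATOM      2  CA  MET A   1      38.573  13.910   5.147  1.00 54.18           C\n"
)
coords = parse_atom_coords(example)
print(coords)
```

From tuples like these, a pipeline would typically batch coordinates into arrays for model training; the sketch stops at extraction to stay self-contained.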
Required Qualifications
• Strong MLOps foundation: 3+ years of experience in MLOps, ML infrastructure, or related roles
• CI/CD expertise: Demonstrated experience building and maintaining CI/CD pipelines, particularly with GitHub Actions
• Container orchestration: Hands-on experience deploying and managing containerized applications with Kubernetes; familiarity with container registries
• HPC systems: Proficiency with SLURM or similar job scheduling systems for high-performance computing environments
• Programming skills: Strong Python skills; experience with ML frameworks (PyTorch, TensorFlow, JAX)
• Data engineering: Experience building scalable data pipelines and ETL processes
• DevOps practices: Familiarity with infrastructure-as-code, version control, and collaborative development workflows
• Cloud platforms: Experience with cloud infrastructure (AWS, GCP, Azure) for ML workloads
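As a rough sketch of the SLURM side of these qualifications, the helper below renders an sbatch script for a multi-node training job. The #SBATCH directives shown (--job-name, --nodes, --gres, --time) are standard sbatch options; the job name, resource counts, and training command are illustrative placeholders, not the client's actual configuration.

```python
# Hypothetical helper that renders a SLURM batch script for a multi-node
# GPU training job. It only builds the script text; submission would be
# a separate step (e.g. passing the result to sbatch).

def render_sbatch(job_name: str, nodes: int, gpus_per_node: int,
                  time_limit: str, command: str) -> str:
    lines = [
        "#!/bin/bash",
        f"#SBATCH --job-name={job_name}",
        f"#SBATCH --nodes={nodes}",
        f"#SBATCH --gres=gpu:{gpus_per_node}",
        f"#SBATCH --time={time_limit}",
        # srun launches the command on each allocated node, as is typical
        # for torchrun-style distributed training setups.
        f"srun {command}",
    ]
    return "\n".join(lines)

script = render_sbatch("protein-lm-finetune", nodes=4, gpus_per_node=8,
                       time_limit="24:00:00",
                       command="python train.py --config configs/base.yaml")
print(script)
```

Generating scripts programmatically like this keeps resource requests versioned alongside the training code, which is one common way CI/CD and HPC scheduling meet in practice.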
Preferred Qualifications
• Performance optimization: Experience with distributed training, mixed precision, gradient checkpointing, and other optimization techniques
• Large model infrastructure: Experience training or deploying large-scale foundational models (billions of parameters)
• Biological data experience: Prior work with scientific data, particularly in computational biology, bioinformatics, or related fields
• Protein AI models: Understanding of protein language models (ESM, ProtGPT, etc.) and their training requirements
• Protein structure expertise: Experience working with protein structure data formats (PDB, mmCIF), structural bioinformatics tools, and 3D molecular data
• Generative models: Familiarity with diffusion models, flow-based models, or other generative approaches for molecular design
• Cross-functional collaboration: Experience working with experimental scientists and translating wet-lab requirements into computational solutions
• Agentic systems: Experience building and deploying MCP clients/servers, particularly for AI model-as-a-tool use cases, and integrating them with our internal LLMs
Bonus Points
• Publications or contributions to computational biology or protein engineering projects
• Experience with structure prediction tools (AlphaFold, ESMFold, RoseTTAFold)
• Familiarity with laboratory information management systems (LIMS) or assay data formats, particularly Benchling
• Contributions to open-source ML or bioinformatics projects
• Experience with vector databases or embedding-based retrieval systems
• Experience deploying multi-agent tooling using MCP, LangGraph, and Langchain
What Success Looks Like
• Researchers can train and evaluate models with minimal friction
• Data pipelines reliably handle TB-scale datasets with diverse formats
• Infrastructure scales seamlessly from prototype to production
• Experimental data flows efficiently from various sources to model training
• Model deployment cycles are measured in hours, not weeks






