

BCforward
MLOps Engineer
Featured Role | Apply direct with Data Freelance Hub
This role is for an MLOps Engineer with 3+ years of experience, focusing on ML infrastructure and CI/CD pipelines. The contract is long-term, with an unspecified pay rate. The location is hybrid in San Diego, CA, or Indianapolis, IN.
Country
United States
Currency
$ USD
Day rate
Unknown
Date
January 7, 2026
Duration
Unknown
Location
Hybrid
Contract
Unknown
Security
Unknown
Location detailed
San Diego, CA
Skills detailed
#Cloud #GCP (Google Cloud Platform) #Langchain #GitHub #Logging #Python #AI (Artificial Intelligence) #Distributed Computing #Kubernetes #Programming #Generative Models #Monitoring #Version Control #ML Ops (Machine Learning Operations) #Data Processing #AWS (Amazon Web Services) #ETL (Extract, Transform, Load) #Scala #ML (Machine Learning) #DevOps #Databases #Data Engineering #TensorFlow #PyTorch #Model Deployment #Azure #Deployment #Datasets #Data Pipeline
Role description
BCforward is seeking a highly motivated MLOps Engineer to support its pharmaceutical client in San Diego, CA.
Expected Duration: Long-term
Location: San Diego, CA, or Indianapolis, IN (hybrid)
About the Role
We're seeking an experienced MLOps Engineer to build and scale the infrastructure powering next-generation in-silico protein design and engineering. In this role, you'll bridge the gap between cutting-edge AI research and production systems, working at the intersection of machine learning, computational biology, and high-performance computing. You'll collaborate closely with both computational scientists and platform engineers to accelerate the development and deployment of foundational models for protein engineering.
What You'll Do
• Build and maintain ML infrastructure: Design, implement, and optimize CI/CD pipelines using GitHub Actions for model training, evaluation, and deployment workflows
• Orchestrate compute resources: Manage and scale workloads across Kubernetes clusters and SLURM-based HPC environments, ensuring efficient resource utilization for large-scale model training
• Develop ML-ready data pipelines: Build robust, scalable data processing and loading pipelines for diverse biological data types, including protein structure files (PDBs), sequence databases, and experimental assay readouts. The primary focus is to optimize ML-ready data delivery for model training, testing, and benchmarking.
• Enable research velocity: Create tools and infrastructure that empower researchers to iterate quickly on protein language models, diffusion models, and flow-based generative models
• Optimize for scale: Architect systems that can handle multi-modal datasets and train large foundational models efficiently across distributed computing environments
• Monitor and maintain: Implement monitoring, logging, and alerting systems to ensure reliability and performance of production ML systems
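For illustration, the data-pipeline work above might start from a loading step like this: a minimal, standard-library-only sketch that extracts atom coordinates from the ATOM records of a PDB file. The column offsets follow the fixed-width PDB format; the function name and example records are hypothetical, and a production pipeline would more likely use a dedicated parser such as Biopython or gemmi.

```python
# Minimal sketch of an "ML-ready" loading step for PDB structure files,
# using only the Python standard library. Column slices follow the
# fixed-width PDB ATOM record layout (x: cols 31-38, y: 39-46, z: 47-54).

def parse_atom_coords(pdb_text: str) -> list[tuple[str, float, float, float]]:
    """Extract (atom_name, x, y, z) tuples from ATOM/HETATM records."""
    coords = []
    for line in pdb_text.splitlines():
        if line.startswith(("ATOM", "HETATM")):
            name = line[12:16].strip()  # atom name, cols 13-16
            x = float(line[30:38])      # x coordinate, cols 31-38
            y = float(line[38:46])      # y coordinate, cols 39-46
            z = float(line[46:54])      # z coordinate, cols 47-54
            coords.append((name, x, y, z))
    return coords

# Hypothetical two-atom fragment for demonstration only.
example = (
    "ATOM      1  N   MET A   1      38.428  13.104   6.364  1.00 54.69           N\n"
    "ATOM      2  CA  MET A   1      38.573  13.910   5.147  1.00 54.18           C\n"
)
coords = parse_atom_coords(example)
print(coords)
```

From tuples like these, a pipeline would typically batch coordinates into arrays for model training; the sketch stops at extraction to stay self-contained.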
Required Qualifications
• Strong MLOps foundation: 3+ years of experience in MLOps, ML infrastructure, or related roles
• CI/CD expertise: Demonstrated experience building and maintaining CI/CD pipelines, particularly with GitHub Actions
• Container orchestration: Hands-on experience deploying and managing containerized applications with Kubernetes; familiarity with container registries
• HPC systems: Proficiency with SLURM or similar job scheduling systems for high-performance computing environments
• Programming skills: Strong Python skills; experience with ML frameworks (PyTorch, TensorFlow, JAX)
• Data engineering: Experience building scalable data pipelines and ETL processes
• DevOps practices: Familiarity with infrastructure-as-code, version control, and collaborative development workflows
• Cloud platforms: Experience with cloud infrastructure (AWS, GCP, Azure) for ML workloads
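As a rough sketch of the SLURM side of these qualifications, the helper below renders an sbatch script for a multi-node training job. The #SBATCH directives shown (--job-name, --nodes, --gres, --time) are standard sbatch options; the job name, resource counts, and training command are illustrative placeholders, not the client's actual configuration.

```python
# Hypothetical helper that renders a SLURM batch script for a multi-node
# GPU training job. It only builds the script text; submission would be
# a separate step (e.g. passing the result to sbatch).

def render_sbatch(job_name: str, nodes: int, gpus_per_node: int,
                  time_limit: str, command: str) -> str:
    lines = [
        "#!/bin/bash",
        f"#SBATCH --job-name={job_name}",
        f"#SBATCH --nodes={nodes}",
        f"#SBATCH --gres=gpu:{gpus_per_node}",
        f"#SBATCH --time={time_limit}",
        # srun launches the command on each allocated node, as is typical
        # for torchrun-style distributed training setups.
        f"srun {command}",
    ]
    return "\n".join(lines)

script = render_sbatch("protein-lm-finetune", nodes=4, gpus_per_node=8,
                       time_limit="24:00:00",
                       command="python train.py --config configs/base.yaml")
print(script)
```

Generating scripts programmatically like this keeps resource requests versioned alongside the training code, which is one common way CI/CD and HPC scheduling meet in practice.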
Preferred Qualifications
• Performance optimization: Experience with distributed training, mixed precision, gradient checkpointing, and other optimization techniques
• Large model infrastructure: Experience training or deploying large-scale foundational models (billions of parameters)
• Biological data experience: Prior work with scientific data, particularly in computational biology, bioinformatics, or related fields
• Protein AI models: Understanding of protein language models (ESM, ProtGPT, etc.) and their training requirements
• Protein structure expertise: Experience working with protein structure data formats (PDB, mmCIF), structural bioinformatics tools, and 3D molecular data
• Generative models: Familiarity with diffusion models, flow-based models, or other generative approaches for molecular design
• Cross-functional collaboration: Experience working with experimental scientists and translating wet-lab requirements into computational solutions
• Agentic systems: Experience building and deploying MCP clients/servers, particularly for AI model-as-a-tool use cases, and integrating them with our internal LLMs
Bonus Points
• Publications or contributions to computational biology or protein engineering projects
• Experience with structure prediction tools (AlphaFold, ESMFold, RoseTTAFold)
• Familiarity with laboratory information management systems (LIMS) or assay data formats, particularly Benchling
• Contributions to open-source ML or bioinformatics projects
• Experience with vector databases or embedding-based retrieval systems
• Experience deploying multi-agent tooling using MCP, LangGraph, and Langchain
What Success Looks Like
• Researchers can train and evaluate models with minimal friction
• Data pipelines reliably handle TB-scale datasets with diverse formats
• Infrastructure scales seamlessly from prototype to production
• Experimental data flows efficiently from various sources to model training
• Model deployment cycles are measured in hours, not weeks






