NextGenPros Inc

MLOps Engineer

⭐ - Featured Role | Apply direct with Data Freelance Hub
This role is for an MLOps Engineer in New York, NY (Hybrid) for 6 months; the pay rate is not listed. Key skills include AWS SageMaker, PyTorch, TensorFlow, and experience with ML inference systems and A/B testing frameworks.
🌎 - Country
United States
💱 - Currency
$ USD
💰 - Day rate
Unknown
🗓️ - Date
May 20, 2026
🕒 - Duration
More than 6 months
🏝️ - Location
Hybrid
📄 - Contract
Unknown
🔒 - Security
Unknown
📍 - Location detailed
New York, NY
🧠 - Skills detailed
#SageMaker #Datasets #PyTorch #Storage #AutoScaling #AWS SageMaker #Deployment #ML (Machine Learning) #AWS (Amazon Web Services) #Security #Monitoring #TensorFlow #Model Deployment #Observability #Neural Networks #BERT #Data Science #A/B Testing #DevOps #NLP (Natural Language Processing) #ETL (Extract, Transform, Load)
Role description
Title: MLOps Engineer
Location: New York, NY (Hybrid)
Duration: 6 Months

Partnering with ML Engineers, Data Scientists, and Platform Engineering, the MLOps Engineer owns the production lifecycle of machine‑learning systems. This role is responsible for deploying, operating, scaling, monitoring, and governing ML workloads so they run reliably, securely, and cost‑effectively in production. The MLOps Engineer ensures that models and inference pipelines built by ML Engineers can be safely promoted across Dev, QA, and Prod, meet operational SLAs, and evolve without introducing instability or uncontrolled cost. This is a production operations role, focused on runtime behavior, infrastructure, and reliability.

What You’ll Do
• Design, deploy, and operate end‑to‑end production ML pipelines across Dev, QA, and Prod environments.
• Set up and manage AWS SageMaker pipelines, endpoints, and monitoring for large‑scale inference workloads, including embedding generation, named entity recognition, reranking, and video processing.
• Own GPU and CPU infrastructure selection, scaling, and optimization, including instance benchmarking, autoscaling behavior, and load testing.
• Deploy, monitor, and operate inference services that support hundreds of thousands of queries per day across text, image, and video pipelines.
• Establish standardized ML deployment patterns at AP, including:
  • Containerization and orchestration strategies
  • Environment isolation (Dev / QA / Prod)
  • Versioned promotion, rollback, and recovery mechanisms
• Implement monitoring, alerting, drift detection, and evaluation metrics for production ML systems, tracking latency, error rates, throughput, and model/data drift.
• Enable A/B testing and controlled rollout strategies for ML models in production, in partnership with engineering and product teams.
• Partner closely with ML Engineers, Data Scientists, DevOps, and Platform teams to:
  • Operationalize new models and pipeline improvements
  • Promote systems across environments safely
  • Ensure deployments meet reliability, scale, and cost targets
• Manage high-throughput I/O and data movement for large collections of media assets (text, images, video), avoiding CPU, network, and storage bottlenecks.
• Reduce operational risk by enforcing reproducibility, observability, security, and cost controls across all production ML systems.

Required Skills & Experience
• Hands‑on experience deploying and operating ML inference systems in production.
• Strong experience with AWS SageMaker, including pipelines, endpoints, monitoring, and multi‑environment deployments.
• Expertise in deploying ML models using PyTorch and TensorFlow from an operational and serving perspective.
• Proven experience with model deployment and orchestration, including containerized inference and autoscaling.
• Experience selecting, evaluating, and optimizing compute resources (GPU/CPU) for production ML workloads.
• Experience setting up monitoring, evaluation metrics, and A/B testing frameworks for ML systems in production.
• Ability to collaborate effectively with ML Engineers, Data Scientists, and platform teams in a shared ownership model.

Strongly Preferred
• Operational experience supporting ML systems involving:
  • Transformer‑based NLP models (e.g., BERT‑family models)
  • Computer vision models
  • Ranking and reranking systems
• Familiarity with running systems that use common ML model types, such as:
  • Convolutional and feed‑forward neural networks
  • Ranking algorithms
  • Approximate Nearest Neighbor methods (e.g., HNSW)
• Experience running ML workloads over large‑scale text, image, and video datasets.