

InfoVision Inc.
MLOps Engineer
This is a remote MLOps Engineer role; the contract length and pay rate are not specified. Key skills required include Kubernetes, ML/DL model deployment, local LLM optimization, and proficiency in Python.
Country: United States
Currency: $ USD
Day rate: Unknown
Date: May 7, 2026
Duration: Unknown
Location: Unknown
Contract: Unknown
Security: Unknown
Location detailed: Dallas, TX
Skills detailed:
#Model Deployment #Monitoring #Deployment #API (Application Programming Interface) #Python #Scala #Load Balancing #Kubernetes #ML (Machine Learning) #Batch #ETL (Extract, Transform, Load) #AutoScaling
Role description
We're hiring an experienced MLOps Engineer to productionize and scale ML and GenAI systems, with a focus on LLM deployment, orchestration, and reliability in production environments.
Key Responsibilities
Deploy, manage, and scale ML/DL models in production
Build and operate Kubernetes-based infrastructure for ML workloads
Handle model packaging, serialization, and versioning
Design scalable inference systems (batch and real-time)
Deploy and optimize local LLMs (latency, throughput, cost)
Implement GenAI workflows (RAG, prompt pipelines, orchestration)
Build and manage agentic systems with tool integration
Design and manage LLM memory (short-term, long-term, vector stores)
Integrate and manage API gateways for model access, routing, and rate limiting
Monitor performance, drift, and system reliability
Requirements
Strong Kubernetes fundamentals (pods, services, autoscaling, deployments)
Hands-on experience with ML/DL models and serialization
Proven experience in model deployment, scaling, and monitoring
Experience with local LLM deployment and optimization
Solid understanding of LLM memory patterns (context windows, retrieval, persistence)
Experience with API gateways, load balancing, and service routing
Familiarity with GenAI workflows (RAG, orchestration frameworks)
Experience building agentic / multi-step LLM systems
Proficiency in Python and modern ML/infra tooling






