

New York Global Consultants Inc. (NYGCI)
Senior LLMOps Engineer
⭐ - Featured Role | Apply direct with Data Freelance Hub
This role is for a Senior LLMOps Engineer with a contract length of "unknown" and a pay rate of "unknown," located in Charlotte, NC or Jersey City, NJ. Key skills include Kubernetes, LLM deployment, and MLOps pipeline management.
🌎 - Country
United States
💱 - Currency
$ USD
-
💰 - Day rate
Unknown
-
🗓️ - Date
February 10, 2026
🕒 - Duration
Unknown
-
🏝️ - Location
Hybrid
-
📄 - Contract
Unknown
-
🔒 - Security
Unknown
-
📍 - Location detailed
Jersey City, NJ
-
🧠 - Skills detailed
#Deployment #AI (Artificial Intelligence) #Monitoring #ETL (Extract, Transform, Load) #Kubernetes #Load Balancing #Model Optimization #API (Application Programming Interface) #ML (Machine Learning) #Scala #Microservices
Role description
Position: Senior Consultant (AI/ML Platform)
Location: Charlotte, NC or Jersey City, NJ (Hybrid 3 days a week onsite)
Project Tasks:
AI Operations Platform Consultant
• Experience deploying, managing, operating, and troubleshooting containerized services at scale on Kubernetes (OpenShift) for mission-critical applications
• Experience deploying, configuring, and tuning LLMs using TensorRT-LLM and Triton Inference Server
• Managing, operating, and supporting MLOps/LLMOps pipelines that use TensorRT-LLM and Triton Inference Server to deploy inference services in production
• Setting up and operating monitoring of AI inference services for performance and availability
• Experience deploying and troubleshooting LLMs on a containerized platform, including monitoring and load balancing
• Experience with standard processes for operating a mission-critical system: incident management, change management, event management, etc.
• Managing scalable infrastructure for deploying and operating LLMs
• Deploying models in production environments, including containerization, microservices, and API design
• Knowledge of Triton Inference Server, including its architecture, configuration, and deployment
• Model optimization using Triton with TensorRT-LLM (TRT-LLM)
• Model optimization techniques, including pruning, quantization, and knowledge distillation





