New York Global Consultants Inc. (NYGCI)

Senior LLMOps Engineer

⭐ - Featured Role | Apply directly with Data Freelance Hub
This role is for a Senior LLMOps Engineer with an unspecified contract length and pay rate, located in Charlotte, NC or Jersey City, NJ (hybrid). Key skills include Kubernetes, LLM deployment, and MLOps pipeline management.
🌎 - Country
United States
-
💱 - Currency
$ USD
-
💰 - Day rate
Unknown
-
🗓️ - Date
February 10, 2026
-
🕒 - Duration
Unknown
-
🏝️ - Location
Hybrid
-
📄 - Contract
Unknown
-
🔒 - Security
Unknown
-
📍 - Location detailed
Jersey City, NJ
-
🧠 - Skills detailed
#Deployment #AI (Artificial Intelligence) #Monitoring #ETL (Extract, Transform, Load) #Kubernetes #Load Balancing #Model Optimization #API (Application Programming Interface) #ML (Machine Learning) #Scala #Microservices
Role description
Position: Senior Consultant (AI/ML Platform)
Location: Charlotte, NC or Jersey City, NJ (Hybrid, 3 days a week onsite)
Project Tasks: AI Operations Platform Consultant
• Experience deploying, managing, operating, and troubleshooting containerized services at scale on Kubernetes (OpenShift) for mission-critical applications
• Experience deploying, configuring, and tuning LLMs using TensorRT-LLM and Triton Inference Server
• Managing and operating MLOps/LLMOps pipelines that use TensorRT-LLM and Triton Inference Server to deploy inference services in production
• Setting up and operating AI inference service monitoring for performance and availability (see the monitoring sketch after this list)
• Experience deploying and troubleshooting LLM models on a containerized platform, including monitoring and load balancing
• Experience with standard processes for operating a mission-critical system: incident management, change management, event management, etc.
• Managing scalable infrastructure for deploying and managing LLMs
• Deploying models in production environments, including containerization, microservices, and API design
• Triton Inference Server, including its architecture, configuration, and deployment
• Model optimization using Triton with TensorRT-LLM (TRT-LLM)
• Model optimization techniques, including pruning, quantization, and knowledge distillation (see the quantization sketch after this list)
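For illustration, here is a minimal sketch of the availability-monitoring task listed above, assuming a Triton Inference Server reachable on its default HTTP port (8000) and Prometheus metrics port (8002); the model name llama-7b is a hypothetical placeholder.

```python
import requests

TRITON_HTTP = "http://localhost:8000"            # assumed default HTTP endpoint
TRITON_METRICS = "http://localhost:8002/metrics" # assumed default metrics endpoint
MODEL = "llama-7b"                               # hypothetical model name

def check_availability() -> bool:
    """Poll Triton's standard health endpoints (KServe v2 protocol)."""
    live = requests.get(f"{TRITON_HTTP}/v2/health/live", timeout=5)
    ready = requests.get(f"{TRITON_HTTP}/v2/health/ready", timeout=5)
    model_ready = requests.get(f"{TRITON_HTTP}/v2/models/{MODEL}/ready", timeout=5)
    return all(r.status_code == 200 for r in (live, ready, model_ready))

def scrape_inference_count() -> int:
    """Read the Prometheus-format metrics Triton exposes and pull one counter."""
    text = requests.get(TRITON_METRICS, timeout=5).text
    for line in text.splitlines():
        # Skip "# HELP"/"# TYPE" comment lines; match the success counter.
        if line.startswith("nv_inference_request_success"):
            return int(float(line.rsplit(" ", 1)[-1]))
    return 0

if __name__ == "__main__":
    print("available:", check_availability())
    print("successful requests so far:", scrape_inference_count())
```

The endpoints polled here (/v2/health/live, /v2/health/ready, /v2/models/&lt;name&gt;/ready) are part of the KServe v2 inference protocol that Triton implements, so the same probes can back Kubernetes liveness and readiness checks.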
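Likewise, a minimal sketch of one of the model-optimization techniques named above: post-training dynamic quantization via PyTorch's built-in API, applied to a toy model (layer sizes are arbitrary). TensorRT-LLM has its own quantization paths; this only illustrates the general technique.

```python
import torch
import torch.nn as nn

# Toy stand-in for a model to be optimized; sizes are arbitrary.
model = nn.Sequential(
    nn.Linear(512, 1024),
    nn.ReLU(),
    nn.Linear(1024, 512),
)
model.eval()

# Post-training dynamic quantization: weights of the listed module types
# are converted to int8; activations are quantized on the fly at inference.
quantized = torch.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8
)

x = torch.randn(1, 512)
with torch.no_grad():
    print("fp32 output shape:", model(x).shape)
    print("int8 output shape:", quantized(x).shape)
```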