Rivago Infotech Inc

AI Platform with LLM Infrastructure

⭐ - Featured Role | Apply direct with Data Freelance Hub

This role is for a Senior AI Platform / LLM Infrastructure Engineer in Charlotte, NC (Hybrid) for a long-term project. Key skills include LLM inference frameworks, model optimization, Kubernetes, GPU orchestration, and Python programming.

🌎 - Country

United States

💱 - Currency

$ USD

💰 - Day rate

Unknown

🗓️ - Date

June 9, 2026

🕒 - Duration

Unknown

🏝️ - Location

Hybrid

📄 - Contract

Unknown

🔒 - Security

Unknown

📍 - Location detailed

Charlotte, NC

🧠 - Skills detailed

#ETL (Extract, Transform, Load) #Python #Kubernetes #Observability #Batch #Deployment #Scala #AI (Artificial Intelligence) #Programming #Grafana #Prometheus #ML (Machine Learning) #Model Optimization #Monitoring

Role description

Role: Senior AI Platform / LLM Infrastructure Engineer
Location: Charlotte, NC (Hybrid)
Duration: Long Term Project

We are hiring a Senior AI Platform Engineer to build and optimize on-prem LLM inference platforms. The role focuses on high-performance model serving, GPU workloads, and scalable ML infrastructure using modern inference frameworks and Kubernetes.

Must-Have Skills
• LLM Inference Frameworks: vLLM, TensorRT-LLM, Triton Inference Server, SGLang
• Model Optimization: Continuous Batching, Speculative Decoding, KV Cache / Prefix Caching, FP8 / AWQ / GPTQ
• Distributed/Parallel Systems: Tensor Parallelism
• Platform & Orchestration: Kubernetes, KServe, OpenShift AI, Helm / Operators
• GPU & Performance: CUDA, NCCL, MIG, GPU Orchestration (Run:AI)
• Monitoring: Prometheus, Grafana, ML Observability
• Programming: Python
• GenAI Tools: Arize AI, Claude (CoWork)
• Load / performance testing: GuideLLM, Locust

Key Responsibilities
• Build and manage LLM inference platforms on on-prem GPU infrastructure
• Optimize model performance using advanced inference techniques (batching, caching, quantization)
• Deploy and operate ML workloads on Kubernetes (KServe/OpenShift AI)
• Enable GPU scheduling and orchestration for large-scale workloads
• Implement monitoring and performance benchmarking frameworks
• Drive SRE practices for platform reliability and scalability (observability, incident handling)
• Collaborate with AI/ML teams to enable production-grade GenAI deployments

