Jobs via Dice

GenAI Engineer

⭐ - Featured Role | Apply directly with Data Freelance Hub
This role is for a GenAI Engineer based in Philadelphia, Pennsylvania, offering a 6+ month contract at an unspecified pay rate. Key skills required include on-prem LLM deployment, Python proficiency, vector databases, and an understanding of data privacy.
🌎 - Country
United States
💱 - Currency
$ USD
-
💰 - Day rate
Unknown
-
🗓️ - Date
February 10, 2026
🕒 - Duration
More than 6 months
-
🏝️ - Location
On-site
-
📄 - Contract
Unknown
-
🔒 - Security
Unknown
-
📍 - Location detailed
Philadelphia, PA
-
🧠 - Skills detailed
#Deployment #C++ #"ETL (Extract, Transform, Load)" #Docker #Kubernetes #LangChain #Python #Security #Documentation #Databases #HuggingFace #Transformers #Logging #Metadata #DataPrivacy
Role description
Dice is the leading career destination for tech experts at every stage of their careers. Our client, 1 Point System, is seeking the following. Apply via Dice today!

GenAI Engineer (final round: in-person interview)
Location: Philadelphia, Pennsylvania
Duration: 6+ month contract

Core Experience
Consultant Requirements – On-Prem LLM & Vector DB Implementation
• Hands-on experience deploying open-source LLMs such as Meta Llama 3 and Mistral/Mixtral in on-prem or private environments
• Strong proficiency in Python for LLM inference, prompt engineering, and integration
• Experience with CPU-based inference, model quantization, and performance tuning

Vector Databases & RAG
• Practical experience with open-source vector databases such as Qdrant, Chroma, Milvus, or pgvector
• Proven implementation of Retrieval-Augmented Generation (RAG) pipelines
• Experience generating and managing embeddings and metadata filtering

Security & Governance
• Understanding of data privacy, air-gapped deployments, and enterprise security requirements
• Experience implementing access controls and audit logging

Nice to Have
• Experience with LangChain or LlamaIndex
• Exposure to Rust, Go, or C++ for high-performance services
• Familiarity with Docker and Kubernetes for on-prem deployments
• Knowledge of inference frameworks (e.g., vLLM, llama.cpp, Hugging Face Transformers)
• Prior work in regulated or enterprise environments

Deliverables
• Reference architecture and deployment guidance
• Working prototype (LLM + vector DB + RAG)
• Documentation and knowledge transfer to internal teams