

STAFFXPERT LLC
GenAI Engineer
This is a 6+ month contract role for a GenAI Engineer, based on-site in Philadelphia, Pennsylvania. Key skills include Python, LLM deployment, RAG solutions, and vector databases. Experience in secure enterprise environments is required. Pay rate is competitive.
🌎 - Country
United States
💱 - Currency
$ USD
💰 - Day rate
Unknown
🗓️ - Date
April 9, 2026
🕒 - Duration
More than 6 months
🏝️ - Location
On-site
📄 - Contract
Unknown
🔒 - Security
Unknown
📍 - Location detailed
Philadelphia, PA
🧠 - Skills detailed
#C++ #Security #Metadata #Indexing #Programming #Databases #Compliance #Docker #ML (Machine Learning) #Database Architecture #Kubernetes #Hugging Face #Data Privacy #AI (Artificial Intelligence) #Scala #LangChain #ETL (Extract, Transform, Load) #Python #Documentation #Transformers #Logging #Data Pipeline #Deployment
Role description
About the Company
STAFFXPERT LLC is seeking a GenAI Engineer on behalf of our client in Philadelphia, Pennsylvania. This role focuses on designing and implementing on-premise Large Language Model (LLM) solutions and vector database architectures.
About the Role
The ideal candidate will have strong hands-on experience with open-source LLMs, Retrieval-Augmented Generation (RAG) pipelines, and secure enterprise deployments.
Responsibilities
• Deploy and optimize open-source LLMs such as Llama 3 and Mistral / Mixtral in on-premise or private environments
• Develop and integrate LLM-based applications using Python, including prompt engineering and inference workflows
• Implement CPU-based inference, model quantization, and performance tuning techniques
• Design and build scalable Retrieval-Augmented Generation (RAG) pipelines
• Work with vector databases to manage embeddings, indexing, and metadata filtering
• Ensure security, data privacy, and compliance in air-gapped or enterprise environments
• Collaborate with cross-functional teams to deliver architecture, prototypes, and documentation
Qualifications
• Strong proficiency in Python for AI/ML application development
• Hands-on experience with vector databases such as Qdrant, Chroma, Milvus, or pgvector
• Proven experience implementing RAG solutions
• Experience deploying LLMs in on-premise or secure environments
• Strong understanding of embeddings, semantic search, and data pipelines
• Knowledge of enterprise security practices, including access controls and audit logging
Required Skills
• Experience with LangChain or LlamaIndex
• Familiarity with containerization tools such as Docker and Kubernetes
• Exposure to inference frameworks like vLLM, llama.cpp, or Hugging Face Transformers
• Experience with high-performance programming languages (Rust, Go, or C++)
• Prior experience in regulated or enterprise environments
Pay range and compensation package
Contract (6+ Months)
Equal Opportunity Statement
We are committed to diversity and inclusivity.






