

Tranzeal Incorporated
Llama Developer (Generative AI / LLM Engineer)
⭐ - Featured Role | Apply direct with Data Freelance Hub
This role is for a Llama Developer (Generative AI / LLM Engineer); the contract length and pay rate are unspecified. Required skills include Python, Llama models, vector databases, and experience with AWS/GCP/Azure. The position is fully remote.
🌎 - Country
United States
💱 - Currency
$ USD
-
💰 - Day rate
Unknown
-
🗓️ - Date
November 20, 2025
🕒 - Duration
Unknown
-
🏝️ - Location
Remote
-
📄 - Contract
Unknown
-
🔒 - Security
Unknown
-
📍 - Location detailed
United States
-
🧠 - Skills detailed
#AI (Artificial Intelligence) #Docker #Hugging Face #GCP (Google Cloud Platform) #C++ #Scala #Python #Kubernetes #Azure #Data Science #FastAPI #Compliance #Microservices #Monitoring #API (Application Programming Interface) #PyTorch #Datasets #Knowledge Graph #Databases #Langchain #ML (Machine Learning) #Cloud #AWS (Amazon Web Services) #Logging
Role description
Role: Llama Developer (Generative AI / LLM Engineer)
Location: Remote
About the Role
We are looking for a highly skilled Llama Developer to design, develop, and deploy LLM-powered applications. The ideal candidate has hands-on experience fine-tuning LLMs, building intelligent agents, integrating LLM APIs, and optimizing inference for production workloads.
Key Responsibilities
Build AI applications using Llama (Llama 3 / Llama Stack / Llama API / local LLM inference).
Fine-tune and evaluate Llama models on proprietary and domain-specific datasets.
Implement Retrieval-Augmented Generation (RAG) pipelines using vector databases (a minimal sketch follows this list).
Develop conversational agents, copilots, or knowledge assistants for business workflows.
Optimize model performance via quantization, prompt engineering, and latency reduction.
Integrate LLM capabilities into back-end services, microservices, APIs, or cloud platforms.
Ensure generated content meets compliance, safety, and responsible-AI standards.
Collaborate with Data Science, MLOps, and Product teams to deploy scalable AI products.
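For illustration only (not a deliverable of this posting): the sketch below shows a minimal RAG flow in the spirit of the responsibilities above. It assumes the sentence-transformers, faiss-cpu, and llama-cpp-python packages are installed and that a quantized Llama GGUF file exists at the placeholder path; the corpus and question are toy examples.

```python
# Minimal RAG sketch: embed a tiny corpus, retrieve relevant chunks with FAISS,
# and ground a local Llama model's answer in the retrieved context.
import numpy as np
import faiss
from sentence_transformers import SentenceTransformer
from llama_cpp import Llama

# 1. Embed a small document corpus and build a FAISS index.
docs = [
    "Our refund policy allows returns within 30 days of purchase.",
    "Support hours are 9am-5pm EST, Monday through Friday.",
    "Enterprise customers receive a dedicated account manager.",
]
embedder = SentenceTransformer("all-MiniLM-L6-v2")
doc_vecs = embedder.encode(docs, normalize_embeddings=True)
index = faiss.IndexFlatIP(doc_vecs.shape[1])  # inner product == cosine on normalized vectors
index.add(np.asarray(doc_vecs, dtype="float32"))

# 2. Retrieve the top-k chunks for a user question.
question = "How long do I have to return an item?"
q_vec = embedder.encode([question], normalize_embeddings=True)
_, ids = index.search(np.asarray(q_vec, dtype="float32"), 2)
context = "\n".join(docs[i] for i in ids[0])

# 3. Answer with a local Llama model, constrained to the retrieved context.
llm = Llama(model_path="models/llama-3-8b-instruct.Q4_K_M.gguf", n_ctx=4096)  # placeholder path
prompt = (
    "Answer the question using only the context below.\n\n"
    f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"
)
result = llm(prompt, max_tokens=128, temperature=0.2)
print(result["choices"][0]["text"].strip())
```

Normalizing the embeddings lets an inner-product FAISS index act as cosine similarity; in a production pipeline the chunking, index, and storage would typically be handled by a managed vector database such as Pinecone, Weaviate, or ChromaDB.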
Required Skills & Experience
3–8+ years of software engineering or machine-learning experience.
Proven experience with Llama models.
Proficiency in Python (FastAPI, LangChain, LlamaIndex, Hugging Face ecosystem).
Experience with vector databases (FAISS, Pinecone, Weaviate, ChromaDB).
Strong understanding of prompt engineering, model fine-tuning, and LoRA / QLoRA (an illustrative adapter setup follows this list).
Hands-on experience with GPU computing, PyTorch, Docker, Kubernetes.
Familiarity with MLOps practices: CI/CD for ML, model monitoring, logging.
Experience deploying models on AWS / GCP / Azure / on-prem GPU clusters.
Knowledge of RAG architectures, knowledge graphs, and document parsing pipelines.
Understanding of model safety, hallucination mitigation, and red-team testing.
Experience with llama.cpp, vLLM, Ollama, or NVIDIA Triton.
Contributions to open-source LLMs or AI frameworks.
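As a hedged illustration of the LoRA / QLoRA item above, the sketch below shows a typical adapter setup with Hugging Face transformers and peft. The meta-llama/Meta-Llama-3-8B-Instruct checkpoint is used only as an example model id; it is gated and requires accepting Meta's license on the Hub.

```python
# Minimal LoRA setup sketch: attach low-rank adapters to a Llama base model
# so that only the small adapter matrices are trained during fine-tuning.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

base_id = "meta-llama/Meta-Llama-3-8B-Instruct"  # example / gated model id
tokenizer = AutoTokenizer.from_pretrained(base_id)
model = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype=torch.bfloat16)

# Adapters on the attention projections; ranks and dropout are typical starting values.
lora_cfg = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_cfg)
model.print_trainable_parameters()  # typically well under 1% of the base weights

# From here the adapted model can be trained with a standard Trainer/SFT loop
# over a domain-specific dataset, and the adapters saved separately, e.g.:
# model.save_pretrained("adapters/llama3-domain-lora")
```

For QLoRA, the same LoraConfig is paired with a 4-bit quantized base model (loaded via bitsandbytes), which is what makes fine-tuning an 8B+ model feasible on a single GPU.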
Please send your updated resume to subhd@tranzeal.com.






