Salt

Generative AI Engineer

⭐ - Featured Role | Apply directly with Data Freelance Hub
This role is for a Generative AI Engineer on a long-term contract (more than 6 months) in Oakland, CA (hybrid, 3x/week). Key skills include deep learning, NLP, prompt engineering, and model optimization, specifically within the utility industry.
🌎 - Country
United States
💱 - Currency
$ USD
-
💰 - Day rate
560
-
🗓️ - Date
November 7, 2025
🕒 - Duration
More than 6 months
-
🏝️ - Location
Hybrid
-
📄 - Contract
Unknown
-
🔒 - Security
Unknown
-
📍 - Location detailed
Oakland, CA
-
🧠 - Skills detailed
#Deep Learning #AI (Artificial Intelligence) #NLP (Natural Language Processing) #PyTorch #Databases #Datasets #Documentation #Data Engineering #ETL (Extract, Transform, Load) #Hugging Face #TensorFlow #Transformers #Scala #Libraries #Deployment #Monitoring #Model Optimization
Role description
We are seeking a GenAI Engineer with deep expertise in Large Language Models to drive the development, optimization, and deployment of advanced LLM capabilities across the organization. This role focuses on fine-tuning foundation models, developing scalable prompt engineering frameworks, and building retrieval-augmented generation (RAG) solutions tailored to the utility industry. You will work closely with Data Engineering, MLOps, and Product teams to ensure our AI systems are accurate, efficient, governed, and performant in production. This is a long-term contract role, hybrid in Oakland, CA (3x/week).
Key Responsibilities
• Implement and optimize model fine-tuning approaches (LoRA, PEFT, QLoRA) to adapt foundation models to domain-specific needs.
• Develop structured prompt engineering methodologies aligned with utility operations, regulatory requirements, and technical documentation workflows.
• Create and maintain reusable prompt templates and shared prompt libraries for consistent usage across applications.
• Build and maintain prompt testing frameworks to quantitatively evaluate and continuously improve prompt performance.
• Define and enforce prompt versioning and governance standards to ensure high-quality outputs across teams and models.
• Apply model optimization techniques (knowledge distillation, quantization, pruning) to improve efficiency and reduce inference cost.
• Address memory and compute constraints using strategies like sharded data parallelism, GPU offloading, and hybrid CPU+GPU execution.
• Architect and deploy RAG pipelines using vector databases, embedding pipelines, and optimized chunking strategies for retrieval performance.
• Design advanced prompting strategies such as chain-of-thought reasoning, agent orchestration, and multi-step task decomposition.
• Collaborate with MLOps to deploy, monitor, and retrain LLMs in production environments.
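For candidates less familiar with the fine-tuning approaches named above: the core idea behind LoRA is that the frozen base weight W is adapted by a small, trainable low-rank update scaled by alpha/r. The sketch below illustrates just that arithmetic in pure Python; it is a toy illustration, not the actual workflow (real fine-tuning would go through PyTorch and the Hugging Face `peft` library, and the matrices here are made-up examples).

```python
# Minimal sketch of a LoRA-style low-rank weight update (pure Python,
# illustrative only). In LoRA, only the small matrices A (r x in) and
# B (out x r) are trained; the frozen base weight W stays untouched.

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def lora_update(W, A, B, alpha, r):
    """Return W + (alpha / r) * (B @ A), the effective weight after
    applying a rank-r LoRA adaptation with scaling factor alpha."""
    delta = matmul(B, A)
    scale = alpha / r
    return [[W[i][j] + scale * delta[i][j] for j in range(len(W[0]))]
            for i in range(len(W))]

# Toy example: a 2x2 base weight adapted with rank r = 1.
W = [[1.0, 0.0], [0.0, 1.0]]
A = [[1.0, 1.0]]        # r x in
B = [[0.5], [0.5]]      # out x r
W_eff = lora_update(W, A, B, alpha=2.0, r=1)
```

Because r is much smaller than the weight dimensions in practice, only a tiny fraction of parameters are trained, which is what makes these approaches attractive for domain adaptation on constrained GPU budgets.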
Expected Skillset
• Deep Learning & NLP: Strong proficiency with PyTorch/TensorFlow, Hugging Face Transformers, and modern LLM training workflows (e.g., LoRA, PEFT, QLoRA).
• Prompt Engineering & RAG: Experience designing structured prompts and implementing retrieval-augmented pipelines with vector stores.
• GPU & Compute Optimization: Hands-on experience with multi-GPU training, model parallelism, memory optimization, and handling large-scale model workloads.
• LLMOps: Understanding of deploying and monitoring LLM-based systems in production environments.
• Research Adaptability: Ability to interpret research papers and rapidly apply emerging model optimization techniques.
• Domain Adaptation: Experience preparing and curating domain-specific datasets for fine-tuning and evaluation.
If you enjoy solving hard problems in scalable AI systems and want to shape the next generation of enterprise LLM capabilities, we'd like to hear from you.
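The retrieval-augmented pipelines mentioned in the skillset boil down to one step: embed the query, rank stored chunks by similarity, and pass the top matches to the model as context. The sketch below shows that retrieval step in pure Python; it is a toy, not a production design (a real pipeline would use a vector database and a learned embedding model, whereas the bag-of-words "embedding" and the sample utility-domain chunks here are assumptions made to keep it self-contained).

```python
import math

# Minimal sketch of the retrieval step in a RAG pipeline (illustrative only).

def embed(text, vocab):
    """Toy embedding: count how often each vocab word appears in the text."""
    words = text.lower().split()
    return [words.count(w) for w in vocab]

def cosine(u, v):
    """Cosine similarity between two vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def retrieve(query, chunks, vocab, k=1):
    """Rank stored chunks by similarity to the query embedding and
    return the top-k, mirroring a vector-store lookup."""
    q = embed(query, vocab)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c, vocab)),
                    reverse=True)
    return ranked[:k]

# Hypothetical utility-domain documents, chunked and indexed.
vocab = ["outage", "transformer", "billing", "rates"]
chunks = [
    "transformer outage reported in the north substation",
    "billing rates updated for residential customers",
]
top = retrieve("what caused the transformer outage", chunks, vocab, k=1)
```

The "optimized chunking strategies" in the responsibilities list would govern how the `chunks` list is produced from source documents before indexing.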