Bernard Nickels & Associates

LLM Engineer

⭐ - Featured Role | Apply direct with Data Freelance Hub
This role is for an LLM Engineer on a W2 contract in San Jose, CA, lasting until 12/31/2025, with a pay rate of $85 to $95 per hour. Requires 5+ years in machine learning, strong Python skills, and experience with LLMs.
🌎 - Country
United States
💱 - Currency
$ USD
💰 - Day rate
760
🗓️ - Date
November 14, 2025
🕒 - Duration
More than 6 months
🏝️ - Location
Hybrid
📄 - Contract
W2 Contractor
🔒 - Security
Unknown
📍 - Location detailed
San Jose, CA
🧠 - Skills detailed
#Consulting #ML (Machine Learning) #Azure #Programming #Cloud #Deployment #GCP (Google Cloud Platform) #NLP (Natural Language Processing) #Data Science #Data Pipeline #NLU (Natural Language Understanding) #TensorFlow #ETL (Extract, Transform, Load) #AI (Artificial Intelligence) #Python #PyTorch #AWS (Amazon Web Services) #Deep Learning #Databases #Scala
Role description
Job Title: LLM Engineer
Job Type: Contract (W2 Only)
Contract Duration: ASAP through 12/31/2025 (with good potential for extension into 2026)
Work Location: San Jose, CA (hybrid role; onsite 2 days per week)
Work Schedule/Hours: Monday–Friday, 8 hours per day, 40 hours per week (standard business hours)
Compensation: $85 to $95 per hour

Overview:
A leading Big Four consulting firm is seeking a highly skilled LLM Engineer to design, train, and optimize large language models that drive cutting-edge applications in generative AI and natural language understanding. This role offers the opportunity to work on advanced model development, scalable deployment systems, and innovative research alongside cross-functional product and engineering teams.

Responsibilities:

Model Development & Optimization
• Design, train, fine-tune, and evaluate large language models (LLMs) to ensure high performance, efficiency, and alignment with research or product goals.
• Optimize model architectures, tokenization strategies, and data pipelines to enhance throughput and model accuracy.

Systems Integration & Deployment
• Build and maintain scalable inference pipelines for production environments.
• Optimize serving infrastructure using techniques such as quantization, caching, pruning, and distillation.
• Integrate trained models into enterprise applications, APIs, or end-user products.

Research & Cross-Functional Collaboration
• Lead experimentation with new architectures, retrieval-augmented generation (RAG) frameworks, and prompt-engineering techniques.
• Collaborate closely with product managers, data scientists, and ML operations teams to translate research into production-grade solutions.
• Stay current with advancements in transformer architectures, fine-tuning methods, and LLM safety/alignment best practices.

Qualifications:

Required:
• High school diploma or GED required; Bachelor's degree or higher preferred.
• 5+ years of experience in machine learning, NLP, or large-scale model development.
• Strong understanding of deep learning frameworks such as PyTorch or TensorFlow.
• Experience building, training, or fine-tuning large language models (e.g., GPT, LLaMA, PaLM, Falcon).
• Solid programming skills in Python, with experience in distributed training and cloud-based ML infrastructure (AWS, GCP, or Azure).
• Strong problem-solving and communication skills, with the ability to work cross-functionally in fast-paced environments.

Preferred:
• Experience with retrieval systems, vector databases, or RAG pipelines.
• Familiarity with model alignment, evaluation metrics, and responsible AI practices.