Ampstek

Staff Machine Learning Engineer - LLM Fine Tuning (Verilog/RTL Applications with Cloud Bedrock/SageMaker)

This role is for a Staff Machine Learning Engineer focused on LLM fine-tuning for Verilog/RTL applications, based on-site in San Jose, CA. The engagement type is C2C, W2, or FTE; the pay rate and contract length are not specified. It requires 10+ years of engineering experience, deep LLM skills, and AWS expertise.
🌎 - Country
United States
💱 - Currency
$ USD
💰 - Day rate
Unknown
🗓️ - Date
October 25, 2025
🕒 - Duration
Unknown
🏝️ - Location
On-site
📄 - Contract
W2 Contractor
🔒 - Security
Unknown
📍 - Location detailed
San Jose, CA
🧠 - Skills detailed
#IP (Internet Protocol) #Security #Compliance #GitLab #Regression #SageMaker #MLflow #PyTorch #REST (Representational State Transfer) #Deployment #ECR (Elastic Container Registry) #OpenSearch #VPC (Virtual Private Cloud) #Observability #S3 (Amazon Simple Storage Service) #AI (Artificial Intelligence) #Cloud #Licensing #Transformers #ML (Machine Learning) #Datasets #Java #Hugging Face #ETL (Extract, Transform, Load) #Batch #C++ #IAM (Identity and Access Management) #EC2 #AWS (Amazon Web Services) #AutoScaling #Python #Data Privacy #GitHub
Role description
Role: Staff Machine Learning Engineer - LLM Fine Tuning (Verilog/RTL Applications with Cloud Bedrock/SageMaker)
Location: San Jose, CA (Onsite)
Job Type: C2C/W2/FTE

Job Description

We’re building privacy-preserving LLM capabilities that help hardware design teams reason over Verilog/SystemVerilog and RTL artifacts: code generation, refactoring, lint explanation, constraint translation, and spec-to-RTL assistance. We’re looking for a Staff-level engineer to technically lead a small, high-leverage team that fine-tunes and productizes LLMs for these workflows in a strict enterprise data privacy environment. You don’t need to be a Verilog/RTL expert to start; curiosity, drive, and deep LLM craftsmanship matter most. Any HDL/EDA fluency is a strong plus.

Responsibilities

• Own the technical roadmap for Verilog/RTL-focused LLM capabilities, from model selection and adaptation to evaluation, deployment, and continuous improvement.
• Lead a hands-on team of applied scientists/engineers: set direction, unblock technically, review designs/code, and raise the bar on experimentation velocity and reliability.
• Fine-tune and customize models using state-of-the-art techniques (LoRA/QLoRA, PEFT, instruction tuning, preference optimization/RLAIF) with robust HDL-specific evals:
  • Compile/lint/simulate-based pass rates, pass@k for code generation, constrained decoding to enforce syntax, and “does it synthesize” checks.
• Design privacy-first ML pipelines on AWS:
  • Training/customization and hosting using Amazon Bedrock (including Anthropic models) where appropriate; SageMaker (or EKS + KServe/Triton/DJL) for bespoke training needs.
  • Artifacts in S3 with KMS CMKs; isolated VPC subnets & PrivateLink (including Bedrock VPC endpoints), IAM least privilege, CloudTrail auditing, and Secrets Manager for credentials.
  • Enforce encryption in transit/at rest, data minimization, and no public egress for customer/RTL corpora.
• Stand up dependable model serving: Bedrock model invocation where it fits, and/or low-latency self-hosted inference (vLLM/TensorRT-LLM), autoscaling, and canary/blue-green rollouts.
• Build an evaluation culture: automatic regression suites that run HDL compilers/simulators, measure behavioral fidelity, and detect hallucinations/constraint violations; model cards and experiment tracking (MLflow/Weights & Biases).
• Partner deeply with hardware design, CAD/EDA, Security, and Legal to source/prepare datasets (anonymization, redaction, licensing), define acceptance gates, and meet compliance requirements.
• Drive productization: integrate LLMs with internal developer tools (IDEs/plug-ins, code review bots, CI), retrieval (RAG) over internal HDL repos/specs, and safe tool use/function calling.
• Mentor & uplevel: coach ICs on LLM best practices, reproducible training, critical paper reading, and building secure-by-default systems.

Minimum Qualifications

• 10+ years total engineering experience with 5+ years in ML/AI or large-scale distributed systems; 3+ years working directly with transformers/LLMs.
• Proven track record shipping LLM-powered features in production and leading ambiguous, cross-functional initiatives at Staff level.
• Deep hands-on skill with PyTorch, Hugging Face Transformers/PEFT/TRL, distributed training (DeepSpeed/FSDP), quantization-aware fine-tuning (LoRA/QLoRA), and constrained/grammar-guided decoding.
• AWS expertise to design and defend secure enterprise deployments, including:
  • Amazon Bedrock (model selection, Anthropic model usage, model customization, Guardrails, Knowledge Bases, Bedrock runtime APIs, VPC endpoints)
  • SageMaker (Training, Inference, Pipelines), S3, EC2/EKS/ECR, VPC/Subnets/Security Groups, IAM, KMS, PrivateLink, CloudWatch/CloudTrail, Step Functions, Batch, Secrets Manager.
• Strong software engineering fundamentals: testing, CI/CD, observability, performance tuning; Python a must (bonus for Go/Java/C++).
• Demonstrated ability to set technical vision and influence across teams; excellent written and verbal communication for execs and engineers.

Preferred Qualifications

• Familiarity with Verilog/SystemVerilog/RTL workflows: lint, synthesis, timing closure, simulation, formal, test benches, and EDA tools (Synopsys/Cadence/Mentor).
• Experience integrating static analysis/AST-aware tokenization for code models or grammar-constrained decoding.
• RAG at scale over code/specs (vector stores, chunking strategies), tool use/function calling for code transformation.
• Inference optimization: TensorRT-LLM, KV cache optimization, speculative decoding; throughput/latency trade-offs at batch and token levels.
• Model governance/safety in the enterprise: model cards, red teaming, secure eval data handling; exposure to SOC2/ISO 27001/NIST frameworks.
• Data anonymization, DLP scanning, and code de-identification to protect IP.

Tech you’ll touch

• Modeling: PyTorch, HF Transformers/PEFT/TRL, DeepSpeed/FSDP, vLLM, TensorRT-LLM
• AWS & MLOps: Amazon Bedrock (Anthropic and other FMs, Guardrails, Knowledge Bases, Runtime APIs), SageMaker (Training/Inference/Pipelines), MLflow/W&B, ECR, EKS/KServe/Triton, Step Functions
• Platform/Security: S3 + KMS, IAM, VPC/PrivateLink (incl. Bedrock), CloudWatch/CloudTrail, Secrets Manager
• Tooling (Nice to Have): HDL toolchains for compile/simulate/lint, vector stores (pgvector/OpenSearch), GitHub/GitLab CI

Thanks & Regards
Alok Ranjan Pathak | Team Lead - US Staffing
Email: Alok.ranjan@ampstek.com | Desk: (609) 360-2613
Ampstek LLC – Global IT Partner | www.ampstek.com