

Staff Machine Learning Engineer
⭐ - Featured Role | Apply direct with Data Freelance Hub
This role is for a Staff Machine Learning Engineer on a 12-month contract, paying £1000 – £1200 per day, hybrid in London. Requires 7+ years in ML systems, strong Python/PyTorch skills, and experience with distributed training frameworks and GPU programming.
🌎 - Country
United Kingdom
💱 - Currency
£ GBP
💰 - Day rate
£1000 – £1200
🗓️ - Date discovered
May 20, 2025
🕒 - Project duration
More than 6 months
🏝️ - Location type
Hybrid
📄 - Contract type
Outside IR35
🔒 - Security clearance
Unknown
📍 - Location detailed
London Area, United Kingdom
🧠 - Skills detailed
#Cloud #PyTorch #Transformers #"Generative Models" #"ETL (Extract, Transform, Load)" #Kubernetes #Programming #"ML (Machine Learning)" #"AI (Artificial Intelligence)" #Scala #Python
Role description
We’re teaming up with one of the leading names in AI, known for pushing the boundaries of what’s possible with large-scale generative models and next-gen cloud infrastructure. We're offering a rare opportunity to step into a Staff Machine Learning Engineer role and play a key part in shaping the platforms powering millions of users across the globe.
You'll be joining a team of exceptional researchers and engineers, all passionate about advancing the field and delivering world-class AI experiences.
Location: London, Oxford Street (hybrid; onsite once a week)
Rate: £1000 – £1200 per day - Outside IR35
Start date: ASAP, 12-month contract
What you'll be doing
• Leading the development of scalable, reliable systems for training and fine-tuning transformer-based models
• Optimising inference pipelines for real-time applications — aiming for low latency and high throughput
• Exploring and applying advanced fine-tuning methods like LoRA, prefix-tuning, and adapters
• Tuning performance across GPUs and systems using tools like DeepSpeed, Triton, TensorRT, and even custom kernels
• Working closely with research, platform, and product teams to deliver new features and enhance the developer experience
• Identifying and resolving bottlenecks through profiling, benchmarking, and performance tuning
• Helping to define best practices for building, testing, and maintaining production ML services and APIs
• Mentoring other engineers and helping to foster a culture of technical excellence and innovation
What our client is looking for
• 7+ years of experience building and deploying large-scale ML systems in production
• Strong Python and PyTorch skills, with a deep understanding of transformers, LLMs, and multimodal models
• Hands-on experience with distributed training frameworks like DeepSpeed or FSDP
• Solid background in GPU programming (CUDA, ROCm) and inference optimisation
• Practical experience with parameter-efficient fine-tuning techniques in real-world applications
• Familiarity with container orchestration tools (Kubernetes, Kubeflow) and cloud-native environments
• Knowledge of serving frameworks like Triton, vLLM, or similar
• Clean, maintainable coding style and a strong testing discipline
• Great communication skills and a collaborative mindset
For more information contact Cam Dalziel, cameron.dalziel@aspirerecruitmentgroup.com, or apply today.