

On-Device AI Runtime Engineer
Featured Role | Apply directly with Data Freelance Hub
This role is for an On-Device AI Runtime Engineer on a 3-month contract, with an hourly pay rate of $78.00 - $93.00 (DOE), located in San Diego or Sunnyvale, CA. Key skills include Swift, Metal Performance Shaders, and ML model optimization.
Country: United States
Currency: $ USD
Day rate: 744
Date discovered: August 30, 2025
Project duration: 3 to 6 months
Location type: On-site
Contract type: Unknown
Security clearance: Unknown
Location detailed: Cupertino, CA
Skills detailed: #TensorFlow #DataScience #DataAnalysis #ML (Machine Learning) #PyTorch #AI (Artificial Intelligence) #ModelOptimization #Scala #Consulting
Role description
A globally leading technology company is looking for an On-Device AI Runtime Engineer to join its cutting-edge AI team. In this position, you will be a key contributor to building high-performance machine learning inference systems, developing optimized AI drivers for edge devices, creating scalable model lifecycle management solutions, and delivering efficient on-device runtimes for AI inference across a wide range of hardware platforms.
This role will focus on optimizing and deploying machine learning models on edge and mobile devices. Please note that this is not a Data Science or Data Analyst role.
Job Responsibilities:
β’ Design and implement robust Core ML model optimization pipelines for deploying large-scale ML models on resource-constrained devices.
β’ Support product engineering teams by consulting on AI model performance, iterating on inference solutions to solve real-world mobile/edge AI problems, and developing/delivering custom on-device AI frameworks.
β’ Interface with hardware and platform teams to ensure optimal utilization of neural processing units (NPUs), GPUs, and specialized AI accelerators across the device ecosystem.
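As a toy illustration of one stage such an optimization pipeline might include, magnitude-based weight pruning zeroes out the smallest weights so a model better fits resource-constrained devices. This is a minimal sketch with illustrative names and values, using plain Python lists; production pipelines operate on tensors through frameworks such as Core ML Tools:

```python
# Hypothetical sketch of magnitude-based weight pruning.
# Values and function names are illustrative, not from any specific toolchain.

def prune_by_magnitude(weights, sparsity):
    """Zero out the smallest-magnitude weights until roughly `sparsity`
    fraction of them are zero (ties at the threshold are also dropped)."""
    if not 0.0 <= sparsity < 1.0:
        raise ValueError("sparsity must be in [0, 1)")
    k = int(len(weights) * sparsity)  # number of weights to zero out
    if k == 0:
        return list(weights)
    # Magnitude threshold below which weights are zeroed.
    threshold = sorted(abs(w) for w in weights)[k - 1]
    return [0.0 if abs(w) <= threshold else w for w in weights]

# At 50% sparsity, the three smallest-magnitude weights are zeroed.
pruned = prune_by_magnitude([0.9, -0.05, 0.4, 0.01, -0.7, 0.02], 0.5)
```

In practice the sparsity pattern also has to match what the target accelerator can exploit; unstructured sparsity like this often needs hardware or runtime support to yield real speedups.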
Minimum Qualifications:
β’ Strong proficiency in Swift/Objective-C and Metal Performance Shaders.
β’ Familiarity with various ML model formats such as Core ML, ONNX, TensorFlow Lite, and PyTorch Mobile.
β’ Strong critical thinking, performance optimization, and low-level system design skills.
β’ Experience with model quantization, pruning, and hardware-aware neural architecture optimization.
β’ Experience with real-time inference pipelines and latency-critical AI applications.
β’ Understanding of mobile device thermal management, power consumption patterns, and compute resource allocation for AI workloads.
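To make the "model quantization" qualification concrete, here is a minimal sketch of symmetric int8 post-training quantization of a weight tensor, using plain Python and illustrative values. Real toolchains (e.g. Core ML Tools or TensorFlow Lite) typically do this per-channel with calibration data rather than per-tensor:

```python
# Hypothetical sketch: symmetric int8 quantization with a single scale factor.

def quantize_int8(values):
    """Map floats to int8 codes with a symmetric scale; return (codes, scale)."""
    max_abs = max(abs(v) for v in values)
    scale = max_abs / 127.0 if max_abs else 1.0
    codes = [max(-128, min(127, round(v / scale))) for v in values]
    return codes, scale

def dequantize(codes, scale):
    """Recover approximate floats from int8 codes."""
    return [c * scale for c in codes]

codes, scale = quantize_int8([0.6, -1.0, 0.25])
approx = dequantize(codes, scale)  # close to the originals, within one scale step
```

The round trip loses at most one quantization step per weight, which is why 4x-smaller int8 weights usually cost little accuracy; asymmetric schemes add a zero point to handle skewed distributions.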
Type: Contract
Duration: 3 months (with a possibility for extension)
Work Location: San Diego or Sunnyvale, CA (On-site)
Pay rate: $78.00 - $93.00 (DOE)