Crossing Hurdles

Machine Learning Engineer (CUDA) | $250/hr Remote

⭐ - Featured Role | Apply direct with Data Freelance Hub
This is a remote contract role for a Machine Learning Engineer (CUDA), with a commitment of 10-40 hours/week at $120–$250/hour. Key skills include CUDA expertise, knowledge of GPU architecture, and performance optimization. Familiarity with PyTorch or TensorFlow is preferred.
🌎 - Country
United States
💱 - Currency
$ USD
💰 - Day rate
250
🗓️ - Date
November 12, 2025
🕒 - Duration
Unknown
🏝️ - Location
Remote
📄 - Contract
Unknown
🔒 - Security
Unknown
📍 - Location detailed
United States
🧠 - Skills detailed
#TensorFlow #PyTorch #AI (Artificial Intelligence) #ML (Machine Learning) #Documentation
Role description
At Crossing Hurdles, we work as a referral partner. We refer candidates to Mercor, which collaborates with the world’s leading AI research labs to build and train cutting-edge AI models.

Organization: Mercor
Position: CUDA Kernel Optimizer – ML Engineer
Type: Hourly Contract
Compensation: $120–$250/hour
Location: Remote
Commitment: 10-40 hr/week, flexible hours

Role Responsibilities (Training support will be provided)
• Develop, optimize, and benchmark CUDA kernels for tensor and operator workloads
• Tune for occupancy, memory coalescing, instruction-level parallelism, and optimal warp scheduling
• Profile and diagnose performance bottlenecks with tools such as Nsight Systems and Nsight Compute
• Report performance results, analyze speedups, and propose architectural improvements
• Integrate kernels with PyTorch and collaborate asynchronously with operator specialists
• Produce reproducible benchmarks and write comprehensive performance documentation

Required Qualifications
• Deep expertise in CUDA, GPU architecture, and memory optimization
• Proven record of quantifiable performance improvements across hardware generations
• Proficiency with mixed precision, Tensor Core usage, and low-level numerical stability
• Familiarity with PyTorch, TensorFlow, or Triton (preferred but not required)
• Strong communication and independent problem-solving skills
• Demonstrated contributions to open-source projects, research, or performance benchmarking

Application Process:
1. Upload resume
2. AI interview based on your resume (15 min)
3. Submit form