AI Engineer – Video & Multimodal AI

⭐ - Featured Role | Apply direct with Data Freelance Hub

This is a remote contract role for an AI Engineer – Video & Multimodal AI with 10+ years of experience. Key skills include Python, PyTorch, TensorFlow, and MLOps. An MS or postgraduate degree is required (PhD preferred), with a strong preference for Ivy League graduates.

🌎 - Country

United States

💱 - Currency

$ USD

💰 - Day rate

🗓️ - Date discovered

June 26, 2025

🕒 - Project duration

Unknown

🏝️ - Location type

Remote

📄 - Contract type

Unknown

🔒 - Security clearance

Unknown

📍 - Location detailed

United States

🧠 - Skills detailed

#TensorFlow #Kubernetes #Leadership #Data Science #Deployment #Computer Science #Scala #PyTorch #IP (Internet Protocol) #Python #Docker #MLflow #Deep Learning #AI (Artificial Intelligence) #ML (Machine Learning)

Role description

Job Title: AI Engineer – Video & Multimodal AI
Location: USA-Remote
Experience Level: 10+ Years

About the Role:
We are hiring an AI Engineer to spearhead the design, fine-tuning, and scalable deployment of cutting-edge AI systems, with a focus on deep learning, video intelligence, and multi-modal (vision + language) models. The ideal candidate has a strong academic foundation, preferably from an Ivy League institution, and proven experience driving innovative AI solutions from research to production.

Key Responsibilities:
• Architect and lead the development of large-scale video AI and vision-language models (VLMs).
• Fine-tune and optimize Large Language Models (LLMs) and Multi-modal Large Language Models (MLLMs) for task-specific applications.
• Scale model training and evaluation across distributed systems, with an emphasis on GPU/accelerated environments.
• Build and maintain robust AI pipelines for training, evaluation, benchmarking, and deployment using state-of-the-art MLOps tools.
• Drive performance optimization of models for real-time inference using tools like TensorRT, ONNX, and NVIDIA Triton.
• Collaborate cross-functionally with data scientists, researchers, and platform engineers to align model development with business goals.
• Publish internal/external papers and contribute to IP creation and thought leadership in AI innovation.

Minimum Qualifications:
• MS or postgraduate degree in Computer Science or a related field (PhD preferred); strong preference for Ivy League graduates.
• 10+ years of industry or research experience in AI/ML, with a focus on deep learning, video AI, and multi-modal systems.
• Advanced proficiency in Python and deep learning frameworks such as PyTorch and TensorFlow.
• Deep expertise in fine-tuning LLMs and MLLMs, including prompt engineering, transfer learning, and embedding-based techniques.
• Proven experience scaling AI model training and inference across multi-GPU and distributed compute platforms.
• Strong hands-on knowledge of MLOps practices, including Docker, Kubernetes, MLflow, and model serving.

Preferred Skills:
• Familiarity with NVIDIA's AI ecosystem (TensorRT, Triton Inference Server, DeepStream SDK).
• Experience with retrieval-augmented generation (RAG), attention-based models, and real-time video inference.
• Prior experience leading AI teams or projects and mentoring junior researchers/engineers.
• Publications, patents, or open-source contributions in the field of AI/ML.
