Integrated Resources, Inc. (IRI)

Machine Learning Engineer

⭐ - Featured Role | Apply direct with Data Freelance Hub
This is a partially remote Machine Learning Engineer role for local candidates near Orlando, FL; Glendale, CA; Anaheim, CA; or Seattle, WA. The contract runs 22 months with a competitive pay rate. It requires 2-4 years in an Ops/Analytics role, strong cloud cost management (FinOps) experience, and working knowledge of GCP.
🌎 - Country
United States
💱 - Currency
$ USD
-
💰 - Day rate
680
-
🗓️ - Date
February 13, 2026
🕒 - Duration
More than 6 months
-
🏝️ - Location
Hybrid
-
📄 - Contract
Unknown
-
🔒 - Security
Unknown
-
📍 - Location detailed
Orlando, FL
-
🧠 - Skills detailed
#Scala #Grafana #Python #Computer Science #Visualization #Datadog #Documentation #Observability #Tableau #Model Evaluation #Monitoring #Security #Scripting #API (Application Programming Interface) #SQL (Structured Query Language) #AI (Artificial Intelligence) #Cloud #Prometheus #AWS (Amazon Web Services) #Langchain #Compliance #ETL (Extract, Transform, Load) #Databases #Data Analysis #Deployment #Looker #GCP (Google Cloud Platform) #Model Deployment #Base #Batch #DataOps #MLflow #Storage #Triggers #A/B Testing #ML (Machine Learning) #Azure
Role description
Job Title: Artificial Intelligence/Machine Learning Engineer
Location: Partially remote; local candidates required near Orlando, FL 32819 / Glendale, CA 91201 / Anaheim, CA 92802 / Seattle, WA 98104
Duration: 22 months (possible extension)

Job description:

AI/ML Operations
• Manage operational workflows for model deployments, updates, and versioning across GCP, Azure, and AWS
• Monitor model performance metrics: latency, throughput, error rates, token usage, and inference quality
• Track model drift, accuracy degradation, and performance anomalies, escalating to engineering as needed
• Support knowledge base operations, including vector embedding pipeline health, chunk quality, and refresh cycles in Vertex AI
• Maintain model inventory and documentation across multi-cloud environments
• Coordinate model evaluation cycles with Responsible AI and Core Engineering teams

Agent & MCP Server Operations
• Monitor AI agent health, performance, and reliability (AutoGen-based agents, MCP servers)
• Track agent execution metrics: task completion rates, tool call success/failure, latency, and error patterns
• Support agent deployment and configuration management workflows
• Document agent behaviors, known issues, and operational runbooks
• Coordinate with Core Engineering on agent updates, testing, and rollouts
• Monitor MCP server availability, connection health, and integration status

FinOps & Cost Management
• Track and analyze AI/ML cloud spend across GCP (Vertex AI), Azure (OpenAI), and AWS (Bedrock)
• Build cost dashboards with breakdowns by model, application team, use case, and environment
• Monitor token consumption, inference costs, and embedding/storage costs
• Identify cost optimization opportunities: model selection, caching, batching, rightsizing
• Provide cost allocation reporting for chargeback/showback to consuming application teams (an illustrative sketch of this kind of rollup follows the responsibilities below)
• Forecast spend trends and flag budget anomalies
• Partner with Infrastructure and Finance teams on AI cost governance

Monitoring, Dashboarding & Reporting
• Build and maintain dashboards for platform performance, model health, agent metrics, and operational KPIs
• Create executive and stakeholder reports on platform adoption, usage trends, and cost allocation
• Develop Responsible AI dashboards tracking hallucination rates, accuracy metrics, guardrail triggers, and safety incidents
• Monitor APIGEE gateway traffic patterns and API consumption trends
• Provide regular reporting to product management on use case performance

Release Operations Support
• Support release management processes with pre/post-deployment validation checks
• Track release health metrics for models, agents, and platform components
• Maintain release documentation, runbooks, and operational playbooks
• Coordinate with QA, Performance Engineering, and Infrastructure teams during releases

Responsible AI Operations
• Monitor guardrail effectiveness and flag anomalies to the Responsible AI team
• Track and report on hallucination detection, content safety triggers, and accuracy trends
• Support LLM Red Teaming efforts by collecting and organizing evaluation data
• Maintain audit logs and compliance documentation for AI governance

Cross-Functional Coordination
• Serve as operational point of contact for application teams consuming DxT AI APIs
• Coordinate with Corporate Security on audit requests and compliance reporting
• Partner with Infrastructure team on capacity tracking and resource utilization
• Support Performance Engineering with load test analysis and results documentation
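For context on the FinOps duties above, here is a minimal, illustrative Python sketch of the kind of per-team cost rollup a chargeback/showback report might start from. The model names, column names, and per-token rates are placeholder assumptions, not details from this posting.

```python
# Illustrative only: aggregate hypothetical token-usage records into a
# per-model / per-team cost summary like the dashboards described above.
import pandas as pd

# Hypothetical usage export: model, consuming app team, token counts.
usage = pd.DataFrame(
    [
        ("model-a", "guest-services", 120_000, 40_000),
        ("model-b", "guest-services", 80_000, 25_000),
        ("model-c", "commerce", 200_000, 60_000),
    ],
    columns=["model", "app_team", "prompt_tokens", "completion_tokens"],
)

# Assumed blended USD rate per 1K tokens by model (placeholder values).
rate_per_1k = {"model-a": 0.005, "model-b": 0.010, "model-c": 0.006}

usage["est_cost_usd"] = (
    (usage["prompt_tokens"] + usage["completion_tokens"]) / 1_000
) * usage["model"].map(rate_per_1k)

# Chargeback/showback-style rollup: estimated spend by consuming team.
print(usage.groupby("app_team")["est_cost_usd"].sum().round(2))
```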
Basic Qualifications
• 2-4 years in an Ops, Analytics, or Technical Operations role (MLOps, AIOps, DataOps, Platform Ops, or similar)
• Understanding of AI/ML concepts: models, inference, embeddings, vector databases, LLMs, tokens, prompts
• Experience with cloud cost management and FinOps: tracking, analyzing, and optimizing cloud spend
• Strong proficiency with dashboarding and visualization tools (Looker, Tableau, Grafana, or similar)
• Working knowledge of GCP (required); familiarity with Azure and AWS a plus
• Comfortable with SQL and basic Python for data analysis and scripting
• Experience with monitoring and observability platforms (Datadog, Prometheus/Grafana, Cloud Monitoring, or similar)
• Understanding of APIs and API gateways; able to read logs, trace requests, and analyze traffic
• Strong analytical and problem-solving skills with attention to detail
• Excellent communication skills; able to translate technical metrics into stakeholder insights
• College degree in Computer Science, BIS, MIS, EE, ME, or similar is required

Preferred Qualifications
• Hands-on experience with LLM platforms: Vertex AI, Azure OpenAI, AWS Bedrock
• Familiarity with AI agents and agentic architectures (AutoGen, LangChain, or similar)
• Exposure to MCP (Model Context Protocol) or agent-tool integration patterns
• Experience with vector databases and RAG (Retrieval-Augmented Generation) operations
• Understanding of the MLOps lifecycle: model registry, versioning, deployment patterns, A/B testing
• Experience with APIGEE or similar API management platforms
• Familiarity with Responsible AI metrics: hallucination, bias, content safety, guardrails
• FinOps certification or formal cloud cost management experience
• Experience supporting enterprise platform teams with multiple consuming applications

Nice to Have
• Familiarity with ML pipeline tools (Kubeflow, MLflow, Vertex AI Pipelines)
• Exposure to prompt management and evaluation frameworks
• ITIL or operational process framework experience
• Experience creating runbooks and operational documentation

Required Education: Bachelor's Degree in CS, BIS, MIS, EE, ME, or similar

Feel free to forward my email to friends or colleagues who might be available; we do offer a referral bonus. Thank you for your time and consideration. I look forward to hearing from you.