ALOIS Solutions

AI-Ops Engineer

⭐ - Featured Role | Apply direct with Data Freelance Hub

This role is for an AI-Ops Engineer based on-site in Stanford, CA. Contract length and pay rate are not specified. Key skills include AWS, DevOps, ML, and automation. The role requires a Bachelor's degree and 3+ years of relevant experience.

🌎 - Country

United States

💱 - Currency

$ USD

💰 - Day rate

Unknown

🗓️ - Date

December 20, 2025

🕒 - Duration

Unknown

🏝️ - Location

On-site

📄 - Contract

Unknown

🔒 - Security

Unknown

📍 - Location detailed

Stanford, CA

🧠 - Skills detailed

#VPC (Virtual Private Cloud) #Docker #Infrastructure as Code (IaC) #Observability #Version Control #Lambda (AWS Lambda) #S3 (Amazon Simple Storage Service) #Kubernetes #ML (Machine Learning) #Compliance #GIT #AI (Artificial Intelligence) #AWS (Amazon Web Services) #Scala #Agile #GDPR (General Data Protection Regulation) #Automation #EC2 #Cloud #Monitoring #Anomaly Detection #Deployment #Terraform #NLP (Natural Language Processing) #Computer Science #IAM (Identity and Access Management) #DevOps

Role description

Title: AI-Ops Engineer
Location: Stanford, CA

Job Description:

Position Overview:
• The AI-Ops Engineer is a key technical contributor responsible for evolving traditional DevOps into AI-Ops at CGOE.
• This role leverages AI and machine learning to automate and enhance IT operations, including performance monitoring, anomaly detection, root cause analysis, and automated remediation.
• Working at the intersection of cloud infrastructure, AI-driven automation, and operational excellence, the engineer embeds intelligence into infrastructure, deployment, and monitoring to ensure high availability, predictive issue resolution, and operational efficiency across CGOE's global online programs.

Key Responsibilities:

1. AI-Driven Operations & Automation:
• Implement AIOps solutions that use ML algorithms to automate performance monitoring, workload scheduling, and infrastructure management.
• Build anomaly detection systems that identify infrastructure issues before they impact users.
• Develop automated root cause analysis capabilities using ML to correlate events and filter noise from critical alerts.
• Create predictive maintenance workflows that analyze historical patterns to proactively mitigate issues.
• Design and implement automated remediation scripts that respond to incidents without human intervention.

2. Observability & Intelligent Monitoring:
• Architect comprehensive observability platforms that aggregate data from disparate sources into unified dashboards.
• Implement intelligent alerting systems using NLP and ML to reduce alert fatigue and surface actionable insights.
• Build real-time analytics dashboards for coordinated diagnosis across teams.
• Deploy application performance monitoring (APM) solutions integrated with AI-driven analytics.
• Ensure end-to-end visibility across cloud infrastructure, applications, and AI/ML workloads.

3. Cloud Infrastructure & DevOps:
• Design, build, and maintain scalable, secure AWS infrastructure using Infrastructure as Code (CloudFormation, Terraform, or CDK).
• Implement and manage containerized environments using Docker, AWS ECS, Fargate, and Kubernetes (EKS).
• Build CI/CD pipelines for continuous delivery, integrating AI-powered code quality and deployment optimization.
• Manage cloud automation and optimization to improve cost-efficiency and resource utilization.
• Ensure compliance with Stanford and regulatory standards (FERPA, GDPR) for secure data handling and governance.

4. Collaboration & Continuous Improvement:
• Partner with cross-functional teams to implement domain-agnostic AIOps solutions across the organization.
• Use Git-based version control and code review best practices as part of a collaborative, agile workflow.
• Document operational procedures, runbooks, and AIOps workflows for team knowledge sharing.
• Continuously evaluate and adopt emerging AIOps tools, AWS services, and AI-driven automation technologies.
• Contribute to building an AI-first operational culture that prioritizes automation and predictive capabilities.

Required Qualifications:

Education & Certifications:
• Bachelor's degree in Computer Science, DevOps, Cloud Engineering, or a related field (Master's preferred).
• AWS certification preferred (Solutions Architect, SysOps Administrator, or DevOps Engineer); Professional-level certification a plus.

Experience:
• 3+ years of experience in DevOps, SRE, or Cloud Engineering roles.
• 2+ years of hands-on experience with AWS infrastructure (EC2, ECS, Lambda, S3, IAM, VPC).
• Experience implementing monitoring, observability, and alerting solutions at scale.
• Familiarity with ML/AI concepts and their application to operational automation.
