

AIOps Engineer, Google Cloud
Featured Role | Apply direct with Data Freelance Hub
This role is for an AIOps Engineer specializing in Google Cloud, offering a 12-month remote contract at a competitive pay rate. Requires 3+ years in IT operations, strong programming skills, and familiarity with cloud platforms and machine learning. Google Cloud certifications preferred.
Country: United States
Currency: $ USD
Day rate: -
Date discovered: June 27, 2025
Project duration: More than 6 months
Location type: Remote
Contract type: Unknown
Security clearance: Unknown
Location detailed: United States
Skills detailed:
#Visualization #Splunk #Data Engineering #AWS (Amazon Web Services) #Kafka (Apache Kafka) #Kubernetes #AI (Artificial Intelligence) #Java #PyTorch #Apache Kafka #Automation #Grafana #ML (Machine Learning) #Model Deployment #C++ #Data Pipeline #Docker #Microservices #Data Processing #GCP (Google Cloud Platform) #Anomaly Detection #Programming #Prometheus #Deployment #Spark (Apache Spark) #Cloud #Azure #Python #DevOps #TensorFlow #Apache Spark #Monitoring
Role description
Dice is the leading career destination for tech experts at every stage of their careers. Our client, DynPro Inc., is seeking the following. Apply via Dice today!
Role: AIOps Engineer, Google Cloud
Location: Remote, USA
Duration: 12-month contract
About the Role:
Google Cloud is at the forefront of innovation, empowering businesses with cutting-edge cloud technologies. As an AIOps Engineer, you will play a critical role in enhancing the reliability, efficiency, and performance of Google Cloud's vast and complex infrastructure.
You will leverage your expertise in Artificial Intelligence and Machine Learning to design, implement, and optimize intelligent automation solutions for IT operations, ultimately improving the experience for our global customers.
This contract position offers a unique opportunity to work on challenging problems at scale, contribute to the evolution of cloud operations, and collaborate with world-class engineers and researchers at Google.
Responsibilities:
• Design, develop, and implement AIOps solutions to automate routine operational tasks, detect anomalies proactively, and enable self-healing capabilities across Google Cloud infrastructure.
• Apply machine-learning algorithms to large-scale operational data (logs, metrics, traces, events) to predict system failures, identify root causes, and optimize resource utilization.
• Build and maintain robust data pipelines for collecting, processing, and analyzing diverse IT operational data from various sources.
• Collaborate closely with Site Reliability Engineers (SREs), software developers, and infrastructure teams to integrate AIOps solutions into existing workflows and systems.
• Develop and implement monitoring and alerting systems that leverage AI-driven insights to ensure the reliability, availability, and performance of cloud services.
• Contribute to the continuous improvement of AIOps platforms and tools, staying current with industry trends and advancements in AI/ML, cloud computing, and IT operations.
• Troubleshoot and resolve complex platform-related issues, ensuring minimal impact on critical AI/ML operations and customer services.
• Generate reports and visualizations to provide actionable intelligence and communicate insights to stakeholders.
Minimum Qualifications:
• 3+ years of experience in platform engineering, DevOps, Site Reliability Engineering (SRE), or IT operations, with a focus on automation and system reliability.
• Strong programming skills in Python, Go, Java, or C++.
• Experience with cloud platforms (e.g., Google Cloud Platform, AWS, or Azure) and containerization technologies (e.g., Docker, Kubernetes).
• Familiarity with data processing frameworks (e.g., Apache Kafka, Apache Spark) and IT monitoring tools (e.g., Prometheus, Grafana, Splunk, ELK stack).
• Understanding of machine learning algorithms and concepts, with practical experience in applying them to operational data for anomaly detection, predictive analytics, or root cause analysis.
• Excellent problem-solving, analytical, and communication skills.
• Ability to work collaboratively in a fast-paced, dynamic environment.
Preferred Qualifications:
• Experience with MLOps practices, including model deployment, evaluation, and lifecycle management in production environments.
• Familiarity with large-scale distributed systems and microservices architectures.
• Knowledge of AI/ML frameworks such as TensorFlow, PyTorch, or scikit-learn.
• Experience in building self-healing systems and implementing automated remediation workflows.
• Google Cloud certifications (e.g., Professional Cloud Architect, Professional Data Engineer, Machine Learning Engineer).