

Cloud Engineer
This role is for a Cloud Engineer; the contract length and hourly pay rate are not specified. The position requires expertise in Prometheus, Grafana, AWS, and Python scripting. A background in Software Engineering is preferred, along with L3 support experience.
🌎 - Country
United States
💱 - Currency
$ USD
💰 - Day rate
Unknown
🗓️ - Date discovered
April 29, 2025
🕒 - Project duration
Unknown
🏝️ - Location type
Unknown
📄 - Contract type
W2 Contractor
🔒 - Security clearance
Unknown
📍 - Location detailed
Dallas, TX
🧠 - Skills detailed
#S3 (Amazon Simple Storage Service) #Programming #Cloud #EC2 #Scripting #Databases #Java #Datadog #Alation #Data Analysis #Kubernetes #AWS (Amazon Web Services) #Grafana #Scala #AWS EC2 (Amazon Elastic Compute Cloud) #Storage #Observability #Monitoring #Visualization #Python #Automation #Lambda (AWS Lambda) #Prometheus
Role description
W2 only
Special Instructions
For this role, we are looking for someone with really strong Prometheus and Grafana experience to help out as they migrate to an OTEL observability platform.
Ideally, they'd like someone who came from a Software Engineering background earlier in their career and then moved into the cloud space. This person will not do as much programming and development as the other roles, but will do some scripting and automation work with Python.
They'd be expected to provide L3 support as needed.
Top Skills:
• Prometheus
• Grafana
• AWS, EC2, S3, Lambda
Nice to Have: Datadog & Kubernetes
Description
• Environment: We use Prometheus, which monitors cloud-native systems such as Kubernetes. The data is visualized with Grafana and made available in dashboards, with alerts forwarded to the alerting system.
• We are looking for an experienced SME/Sr. Engineer with a deep understanding of Grafana and Prometheus to join our team. In this role, you will be responsible for optimizing and advancing our monitoring and observability systems. Your expertise will be critical in ensuring the reliability, performance, and scalability of our infrastructure. Additionally, we are looking for an engineer to do automation and build tools for the observability domain using open-source technology (OTEL, OpenSearch, Grafana, OpenTofu) and cloud technologies (EKS, EC2, S3, cloud networking). You should have expertise in one or more programming languages (Python, Go, Java).
Key responsibilities:
1. Monitoring and Alerting:
• Design and manage alerting rules for proactive issue identification and resolution.
• Continuously improve and expand monitoring coverage to meet evolving needs.
• Collaborate with teams to define alert thresholds and escalation procedures.
2. Data Analysis and Visualization:
• Analyze metrics data to identify performance bottlenecks and areas for improvement.
• Create meaningful visualizations and reports to provide insights for stakeholders.
• Contribute to the enhancement of data retention and archiving strategies.
3. Scaling and Optimization:
• Collaborate with the infrastructure team to ensure seamless integration and scalability of Grafana and Prometheus.
• Fine-tune configurations to achieve optimal resource utilization and performance.
• Proven experience as an L3 Engineer specializing in Grafana and Prometheus administration.
• Proficiency in creating custom Grafana dashboards and queries.
• Strong understanding of monitoring best practices, alerting, and data analysis.
• Knowledge of time-series databases and storage strategies.
4. Automation and Development:
• Scripting and automation skills for efficient system management.
• Building OTEL-based components for the observability stack.
• Building automation for observability query language conversions.