

Cloud Engineer - AWS Observability
Featured Role | Apply direct with Data Freelance Hub
This role is for a Cloud Engineer - AWS Observability. The contract length and pay rate are not specified. Key requirements include 7+ years in observability engineering, AWS expertise, and proficiency with observability tools. The work location is hybrid.
Country: United Kingdom
Currency: £ GBP
Day rate: Unknown
Date discovered: June 7, 2025
Project duration: Unknown
Location type: Hybrid
Contract type: Unknown
Security clearance: Unknown
Location detailed: Telford, England, United Kingdom
Skills detailed
#ML (Machine Learning) #Datadog #DevOps #S3 (Amazon Simple Storage Service) #AWS (Amazon Web Services) #Grafana #Bash #Strategy #Automation #Prometheus #Monitoring #Terraform #Security #Data Governance #Logging #Lambda (AWS Lambda) #Pega #Observability #Splunk #Scripting #Python #Athena #API (Application Programming Interface) #Compliance #Anomaly Detection #Data Integration #AI (Artificial Intelligence) #Redshift #Microservices #Data Lineage #Data Pipeline #Dynatrace #Infrastructure as Code (IaC) #Cloud
Role description
"We are looking for a technically proficient Observability Subject Matter Expert (SME) to architect, implement, and manage observability frameworks across a complex hybrid-cloud environment. This role will focus on AWS-native services (Connect, Data, Integration), enterprise platforms (Pega, Contact Center), and the underlying infrastructure, ensuring end-to-end visibility, performance optimization, and proactive incident response.
Key Responsibilities:
· Observability Architecture & Strategy:
  · Design and implement observability pipelines using AWS-native and third-party tools.
  · Define telemetry standards (metrics, logs, traces) across microservices, APIs, and data pipelines.
  · Establish SLIs/SLOs and integrate them into service health dashboards.
· AWS Workload Monitoring:
  · Implement observability for Amazon Connect (contact flows, agent metrics, call quality).
  · Monitor AWS data services (Glue, Redshift, Athena, S3, Lake Formation) for performance, throughput, and data lineage.
  · Instrument AWS integration services (API Gateway, EventBridge, Step Functions, Lambda) with distributed tracing and structured logging.
· Tooling & Automation:
  · Deploy and manage observability tools: CloudWatch, X-Ray, OpenTelemetry, Prometheus, Grafana, Datadog, Splunk, ELK.
  · Automate alerting, anomaly detection, and incident correlation using AI/ML-based tools.
  · Integrate observability into CI/CD pipelines and Infrastructure-as-Code (IaC) workflows.
· Incident Management & RCA:
  · Lead real-time diagnostics during major incidents using telemetry data.
  · Conduct post-incident reviews with detailed root cause analysis and observability insights.
· Collaboration & Governance:
  · Work closely with DevOps, Security, and Application teams to enforce observability standards.
  · Ensure compliance with data governance, retention, and security policies for telemetry data.
Required Skills & Experience:
· 7+ years in observability engineering.
· Deep expertise in AWS services, especially Amazon Connect, Glue, Lambda, API Gateway, and S3, as well as infrastructure and networking.
· Strong hands-on experience with observability stacks such as Dynatrace, OpenTelemetry, Prometheus, Grafana, Datadog, Splunk, ELK, and CloudWatch/X-Ray.
· Proficient in scripting (Python, Bash) and IaC (Terraform, CloudFormation).
· Experience with monitoring enterprise platforms like Pega and Contact Center systems.
· Solid understanding of distributed systems, networking, and application performance tuning."