

Searchability®
Engineering Lead
⭐ - Featured Role | Apply direct with Data Freelance Hub
This role is for an Engineering Lead focused on observability and operational resilience. It is a 6-month contract; the pay rate is not disclosed. It requires expertise in Prometheus and Grafana, along with 6+ years in SRE or DevOps within enterprise environments.
Country: United Kingdom
Currency: £ GBP
Day rate: Unknown
Date: February 12, 2026
Duration: Unknown
Location: Hybrid
Contract: Unknown
Security: Unknown
Location detailed: London
Skills detailed: #Data Ingestion #Deployment #Strategy #Grafana #Observability #Documentation #Logging #API (Application Programming Interface) #Leadership #DevOps #Monitoring #Automation #Prometheus
Role description
NEW CONTRACT ROLE - ENGINEERING LEAD (OBSERVABILITY / SRE) | ASAP START | UK (Remote / Hybrid) | 6-Month Contract | Possible Extension | London, Manchester, Birmingham or Edinburgh
THE OPPORTUNITY
We're looking for an experienced Engineering Lead to support a critical enterprise observability and operational resilience programme.
This role is focused on leading the uplift of monitoring, alerting, and end-to-end service visibility across business-critical applications. It's ideal for a senior, hands-on engineering lead with deep Prometheus and Grafana expertise, capable of guiding best practices across SRE, platform, and application teams.
THE ROLE
• Lead collaboration with Application Stewards and Site Reliability Engineers (SREs) to confirm critical services and assets in scope for monitoring verification and uplift
• Work with EMAS to analyse Prometheus scrape coverage, exporter deployment, and Grafana dashboard availability for critical applications
• Drive improvements across monitoring configuration, alert quality, metrics, dashboards, KPIs, SLIs, and SLOs
• Lead the optimisation of alerting to ensure alerts are reliable, actionable, and noise-optimised, applying Alertmanager best practices
• Oversee delivery of automated end-to-end business flow visibility through Grafana service maps, dependency visualisation, and topology integrations
• Review observability roles and responsibilities and recommend improvements aligned to Operational Resilience standards
• Champion automation and API-driven approaches for dashboard provisioning, alert management, and data ingestion
• Ensure clear documentation of standards, configurations, and improvements delivered
TECHNICAL SKILLS / REQUIREMENTS
Strong hands-on and leadership experience with:
Prometheus - instrumentation strategy, exporters, service discovery, custom metrics, PromQL, recording rules, alerting rules, HA architectures (Thanos, Cortex, Mimir)
Grafana - dashboard and panel design, alerting and routing, synthetic monitoring, Loki, real user monitoring (e.g. Grafana Faro)
Observability Ecosystem - integration of metrics, logs, and traces (Loki, Tempo, OpenTelemetry), APIs and automation
PROFILE
• Proven experience as a Senior Engineer, Technical Lead, or Engineering Lead within SRE, Observability, DevOps, or Platform Engineering
• Comfortable leading technical direction while remaining hands-on
• Strong stakeholder engagement and communication skills
• Experience operating in complex, enterprise-scale or regulated environments
• Typically 6+ years' experience in reliability engineering, monitoring, or observability-focused roles
KEYWORDS
Engineering Lead, Observability Engineering, Site Reliability Engineering, SRE, Prometheus, Grafana, Alertmanager, PromQL, Monitoring, Operational Resilience, DevOps, Platform Engineering, Metrics, Logging, Tracing, OpenTelemetry, Loki, Tempo, Thanos, Cortex, Mimir






