Senior SRE Engineer

This role is for a Senior SRE Engineer on a contract-to-perm basis in Hartford, CT (Hybrid). Requires 5+ years of SRE/DevOps experience, expertise in Grafana, Prometheus, GCP, and strong skills in Python, Kubernetes, and API management.
🌎 - Country
United States
💱 - Currency
$ USD
-
💰 - Day rate
-
πŸ—“οΈ - Date discovered
September 24, 2025
🕒 - Project duration
Unknown
-
🏝️ - Location type
Hybrid
-
📄 - Contract type
Unknown
-
🔒 - Security clearance
Unknown
-
📍 - Location detailed
Hartford, CT
-
🧠 - Skills detailed
#Prometheus #AI (Artificial Intelligence) #Monitoring #DevOps #Cloud #Logging #YAML (YAML Ain't Markup Language) #Docker #Splunk #Python #GIT #JSON (JavaScript Object Notation) #Automation #Security #Observability #Java #Grafana #Kubernetes #API (Application Programming Interface) #Linux #Deployment #Bash #GCP (Google Cloud Platform)
Role description
Hartford, CT (Hybrid)
Contract-to-perm role

MUST HAVE: GRAFANA, PROMETHEUS, CLOUD experience (GCP desired), LOGGING, TRACING, WEB PORTAL metrics

Looking for a senior-level resource who can grow into a lead role as the team expands.

Job Description
• Design and implement comprehensive SRE monitoring for the web portal on GCP
• Set up JVM metrics collection and performance monitoring for Java applications using GCP Monitoring
• Implement logging and tracing standards across all portal components using Cloud Logging and Cloud Trace
• Configure Apigee monitoring and API performance tracking for portal services
• Implement distributed tracing with W3C Trace Context headers and OpenTelemetry
• Create drill-down dashboards that correlate metrics, logs, and traces using GCP tools
• Integrate GCP Monitoring, Logging, and Trace with the existing Prometheus/Grafana stack
• Configure GMP (Google Managed Prometheus) for enhanced metrics collection
• Implement zero-code UI instrumentation for frontend monitoring and traceability
• Create RED (Rate, Errors, Duration) dashboards for the Performance and Production environments
• Build service health dashboards with drill-down capabilities and error message analysis
• Develop and maintain SRE automation and scripts within GKE namespaces (SRE and others) for monitoring, deployment, and troubleshooting
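For candidates less familiar with the W3C Trace Context requirement above, the following is a minimal Python sketch of the `traceparent` header format (version, trace ID, parent span ID, flags). The helper names `new_traceparent` and `parse_traceparent` are hypothetical and not part of the role's codebase; the sketch handles only version `00` and ignores the companion `tracestate` header. In practice an OpenTelemetry SDK would normally generate and propagate these values.

```python
import re
import secrets

# traceparent layout for version 00: version-traceid-parentid-flags,
# all lowercase hex, e.g. 00-<32 hex>-<16 hex>-01
_TRACEPARENT_RE = re.compile(
    r"(?P<version>[0-9a-f]{2})-(?P<trace_id>[0-9a-f]{32})-"
    r"(?P<parent_id>[0-9a-f]{16})-(?P<flags>[0-9a-f]{2})"
)


def new_traceparent(sampled: bool = True) -> str:
    """Build a version-00 traceparent value with random IDs (hypothetical helper)."""
    trace_id = secrets.token_hex(16)  # 16 random bytes -> 32 hex characters
    parent_id = secrets.token_hex(8)  # 8 random bytes -> 16 hex characters
    flags = "01" if sampled else "00"  # lowest bit of the flags byte marks "sampled"
    return f"00-{trace_id}-{parent_id}-{flags}"


def parse_traceparent(value: str):
    """Return the header fields as a dict, or None if the value is malformed."""
    match = _TRACEPARENT_RE.fullmatch(value.strip())
    if match is None:
        return None
    fields = match.groupdict()
    # Version "ff" and all-zero trace or parent IDs are invalid.
    if fields["version"] == "ff":
        return None
    if set(fields["trace_id"]) == {"0"} or set(fields["parent_id"]) == {"0"}:
        return None
    return fields
```

A frontend or gateway would attach the generated value as the `traceparent` HTTP header, and each backend service would parse it and reuse the same `trace_id` for its own spans.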
Experience: 5+ years in SRE/DevOps with proven implementation experience in JVM, Apigee, GCP observability, the Grafana stack, GKE, OpenTelemetry, and UI instrumentation

Clear Skills Needed
• Technical: Python, Linux, Prometheus, Grafana, Kubernetes, Docker, Loki, Tempo
• JVM Metrics: Java application monitoring, JVM performance tuning, heap analysis, garbage collection optimization for portal applications
• Logging & Tracing: Splunk, distributed tracing, log aggregation standards, correlation IDs across portal systems
• API Management: Apigee experience, API monitoring, rate limiting, security, performance tracking for portal APIs
• Infrastructure: CI/CD pipelines, AI tools such as GitHub Copilot, Cursor, etc.
• Observability Tools & Query Languages: PromQL and InfluxQL for querying metrics (Grafana)
• Strong experience with Kubernetes (GKE), including namespace management, RBAC, and deploying/maintaining SRE tools via code (Python, Bash, YAML, Helm)

Additional Critical Skills
• Distributed Tracing Standards: W3C Trace Context headers implementation
• Structured Logging: JSON format with specific fields (trace_id, service.name, log.level, customer.id, request.id)
• Performance Baseline Establishment: Ability to collect and analyze 2-4 weeks of historical data for performance baselines
• Dashboard Implementation: Drill-down capabilities, service mapping from trace data, correlation between metrics/logs/traces

GCP-Specific Observability Skills (CRITICAL)
• Google Cloud Monitoring: GMP (Google Managed Prometheus), Cloud Monitoring dashboards, alerting policies
• Google Cloud Logging: Centralized logging, log-based metrics, log exports
• OpenTelemetry (OTEL): Instrumentation, collectors, data collection from GCP services

UI Instrumentation & Frontend Monitoring (CRITICAL)
• UI Span Management: Naming conventions for UI-initiated spans, W3C Trace Context headers for frontend
• Frontend Observability: User session tracking, component-level monitoring, UI performance metrics
• Cross-Platform Tracing: End-to-end traceability from UI to backend services
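To illustrate the structured logging requirement listed above, here is a minimal sketch using only Python's standard `logging` and `json` modules. The five field names (`trace_id`, `service.name`, `log.level`, `customer.id`, `request.id`) come from the posting; the `JsonFormatter` class, the `timestamp` and `message` keys, and the convention of passing IDs through `extra=` are assumptions made for illustration, not the team's actual standard.

```python
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each log record as one JSON object per line (illustrative only)."""

    def __init__(self, service_name: str):
        super().__init__()
        self.service_name = service_name

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            # "timestamp" and "message" are assumed extras, not named in the posting.
            "timestamp": self.formatTime(record, "%Y-%m-%dT%H:%M:%S%z"),
            "message": record.getMessage(),
            "log.level": record.levelname,
            "service.name": self.service_name,
            # Values passed via `extra=` become attributes on the record;
            # fields that were not supplied are emitted as null.
            "trace_id": getattr(record, "trace_id", None),
            "customer.id": getattr(record, "customer_id", None),
            "request.id": getattr(record, "request_id", None),
        }
        return json.dumps(payload)


def build_logger(service_name: str) -> logging.Logger:
    """Create a logger that writes JSON lines to stderr (hypothetical helper)."""
    logger = logging.getLogger(service_name)
    handler = logging.StreamHandler()
    handler.setFormatter(JsonFormatter(service_name))
    logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    return logger
```

A call such as `build_logger("portal-api").info("login ok", extra={"trace_id": "4bf92f3577b34da6a3ce929d0e0e4736", "customer_id": "c-42", "request_id": "r-7"})` would emit a single JSON line carrying the correlation fields, which is what lets dashboards join logs to traces by `trace_id`. The service name and ID values here are placeholders.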