Patton Labs Inc

SRE - Google Distributed Cloud Edge (GDCE) Based Restaurant Workloads

⭐ - Featured Role | Apply direct with Data Freelance Hub

This role is for an SRE specializing in Google Distributed Cloud Edge for restaurant workloads. The contract length and pay rate are not specified. Key skills include Google Edge, Kubernetes, Terraform, and observability tools.

🌎 - Country

United States

💱 - Currency

$ USD

💰 - Day rate

Unknown

🗓️ - Date

March 4, 2026

🕒 - Duration

Unknown

🏝️ - Location

Unknown

📄 - Contract

Unknown

🔒 - Security

Unknown

📍 - Location detailed

Chicago, IL

🧠 - Skills detailed

#Cloud #Terraform #Monitoring #Logging #Alation #Automation #Deployment #Kubernetes #Leadership #Prometheus #Observability #Scala

Role description

Site Reliability Engineering (SRE) Leadership

• Leads the SRE framework for GDCE-based restaurant workloads, applying Google's core principles around SLIs, SLOs, error budgets, and golden signals (latency, traffic, errors, saturation).
• Defines end-to-end reliability objectives for the GDC Connect platform, ensuring consistent behavior across 24,000+ geographically distributed restaurant nodes.
• Establishes runbooks, playbooks, and automated remediation workflows, reducing MTTR and ensuring consistent responses across global operations.
• Implements proactive failure detection using distributed monitoring patterns and Google SRE best practices, enabling early identification of degraded services before they impact restaurant operations.

Platform Reliability Engineering

• Architects a resilient GDCE platform capable of operating in the low-bandwidth or intermittent-connectivity environments typical of QSR stores.
• Designs high-availability clusters using Google Distributed Cloud Edge capabilities such as local control planes, fleet registration, and secure edge node lifecycle management.
• Sets up self-healing infrastructure patterns using Kubernetes health checks, auto restarts, policy controls, and declarative GitOps-driven configuration.
• Enables versioned rollouts, canary deployments, and blue-green upgrades, ensuring zero-downtime restaurant service continuity during releases.

Observability, Monitoring, and Alerting (Aligned to Google Cloud Operations Suite)

• Integrates GDCE clusters with Cloud Logging, Cloud Monitoring, Managed Service for Prometheus, and Cloud Trace to establish full-stack observability.
• Creates centralized alarms and event consolidation for store-level services, covering GDC Connect modules, POS-facing services, ordering APIs, kiosk integration points, and network health.
• Defines multi-level alerting policies (store → region → global) to ensure the right stakeholders are notified with context-rich insights.
• Builds actionable dashboards and heatmaps for real-time fleet visibility and rollout readiness.

Operational Excellence & Platform Support

• Designs and operationalizes the Platform Support Model (L1.5/L2/L3) for GDCE-backed restaurant workloads.
• Establishes ticket triage workflows, escalation paths, incident swarming practices, and KPIs such as MTTA, MTTR, and platform uptime targets.
• Oversees the release certification pipeline, validating every store release through automated tests, conformance checks, resource baselines, and failure rollback mechanisms.
• Collaborates closely with McDonald's Global Operations, Google engineering groups, and Accenture to maintain a compliant, stable, and governed platform for all restaurant markets.

Automation & Tooling

• Drives automation using Terraform, Config Sync, GitOps pipelines, and Google-provided GDCE provisioning frameworks.
• Automates:
  o Cluster provisioning
  o Edge node onboarding
  o Software rollout orchestration
  o Store-level configuration sync
• Ensures fleet-wide consistency through declarative definitions and automated drift detection.

High-Scale Distributed Support Mindset

• Brings expertise in managing large, globally distributed footprints, ensuring:
  o Zero-impact upgrades
  o Predictable deployments
  o Scalable edge support
  o Efficient troubleshooting at store, region, and fleet levels
• Designs run-at-scale diagnostic routines, remote recovery actions, and self-service operations tooling for store support teams.

What are the mandatory skills and skill proficiencies required for this position?

Strong hands-on experience with Google Edge.
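
For readers less familiar with the error-budget idea named under SRE Leadership, the arithmetic behind it can be sketched roughly as follows. This is an illustrative sketch only and not part of the role's actual tooling: the function name, the 99.9% SLO target, and the request counts are all hypothetical.

```python
def error_budget_remaining(slo_target: float, total_requests: int,
                           failed_requests: int) -> float:
    """Return the unspent fraction of the error budget for one SLO window.

    1.0 means no budget has been used, 0.0 means it is exactly exhausted,
    and a negative value means the SLO has been breached.
    All inputs here are hypothetical illustration values.
    """
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be strictly between 0 and 1")
    if total_requests <= 0:
        raise ValueError("total_requests must be positive")
    if failed_requests < 0:
        raise ValueError("failed_requests must not be negative")

    # The error budget is the number of failures the SLO tolerates.
    allowed_failures = (1.0 - slo_target) * total_requests
    return 1.0 - failed_requests / allowed_failures


if __name__ == "__main__":
    # Assumed example: a 99.9% availability SLO over 1,000,000 requests
    # tolerates about 1,000 failures; 250 failures leaves roughly 75%.
    remaining = error_budget_remaining(0.999, 1_000_000, 250)
    print(f"Error budget remaining: {remaining:.1%}")
```

A value near zero or below would typically be the signal to slow rollouts and prioritize reliability work, which is the trade-off that SLO-driven release policies are meant to make explicit.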
