Idexcel

Observability – DevOps Architect

This role is for an "Observability – DevOps Architect" on a 16+ month contract, based in a hybrid setting in Washington, DC. It requires 5+ years in DevOps/SRE, expertise in AWS, Terraform, and observability stacks, plus strong scripting skills.
🌎 - Country
United States
💱 - Currency
$ USD
💰 - Day rate
Unknown
🗓️ - Date
October 8, 2025
🕒 - Duration
More than 6 months
🏝️ - Location
Hybrid
📄 - Contract
Unknown
🔒 - Security
Unknown
📍 - Location detailed
Washington, DC
🧠 - Skills detailed
#GitHub #AWS (Amazon Web Services) #GIT #Scripting #Deployment #DevOps #Observability #Bash #Cloud #Security #Monitoring #IAM (Identity and Access Management) #Python #Docker #Kubernetes #Grafana #Terraform #Leadership #Prometheus #Linux #Infrastructure as Code (IaC) #Datadog #Compliance #Computer Science #Documentation #Scala #Ansible #Jenkins #Automation
Role description
Job Title: Observability – DevOps Architect
Location: Hybrid – Washington, DC
Duration: 16+ Months

Job Description – Responsibilities:

Reliability Engineering: Define and maintain service-level objectives (SLOs), implement error budgeting, and lead incident response and postmortem analysis.
Infrastructure Automation: Use Terraform, Ansible, and other IaC tools to create secure, scalable, and repeatable environments.
CI/CD Optimization: Architect secure and efficient pipelines (e.g., GitHub Actions, Jenkins), incorporating automated rollback, canary/blue-green deploys, and artifact validation.
Observability: Build dashboards, alerts, synthetic checks, and telemetry pipelines that ensure visibility into system performance, availability, and cost.
Security & Compliance: Integrate security tooling (SAST, DAST, SBOM, secrets scanning) and enforce policy-as-code in deployment workflows.
Cost & Capacity Planning: Implement tooling and practices to monitor cloud cost trends, right-size infrastructure, and ensure high availability at optimal spend.
Internal Enablement: Develop reusable internal tools, shared playbooks, and self-service platforms that boost developer productivity and ensure consistent delivery.
Mentorship & Leadership: Serve as a technical mentor across platform, security, and engineering teams. Establish best practices in operational readiness, fault tolerance, and secure delivery.

Required Skill Set:

Bachelor’s degree in Computer Science, Engineering, or related technical discipline.
At least 5 years of experience in DevOps, SRE, or Platform Engineering roles with leadership experience in automation and infrastructure reliability.
3+ years hands-on experience in high-availability production environments with cloud-native security and observability tooling.
Deep expertise in AWS (or equivalent cloud platform), especially in compute, networking, IAM, and monitoring.
Proficiency in Terraform, CloudFormation, Kubernetes, Docker, and Linux systems.
Strong knowledge of observability stacks (Prometheus, Grafana, ELK, Datadog, CloudWatch).
Experience implementing and managing CI/CD systems with security tollgates and rollback logic.
Strong scripting skills in Python, Go, or Bash for automation and tooling.
In-depth understanding of SRE practices including incident response, SLO/SLA management, chaos engineering, and capacity modeling.
Familiarity with Git and GitOps patterns.
Proven track record of creating shared tooling and documentation that promotes operational excellence.