Lead Love

DevOps Engineer

⭐ - Featured Role | Apply direct with Data Freelance Hub
This role is a long-term contract for a DevOps Engineer; the pay rate is not listed. It is remote, open to candidates in the US (PST, MST) or Colombia (COT). Key requirements include 5+ years of DevOps/SRE experience, Terraform, Docker, GitHub Actions, and security controls.
🌎 - Country
United States
💱 - Currency
$ USD
-
💰 - Day rate
Unknown
-
🗓️ - Date
November 17, 2025
🕒 - Duration
Unknown
-
🏝️ - Location
Unknown
-
📄 - Contract
Unknown
-
🔒 - Security
Unknown
-
📍 - Location detailed
United States
-
🧠 - Skills detailed
#Compliance #Databases #Security #IAM (Identity and Access Management) #GitHub #Logging #Containers #AutoScaling #Cloud #DevOps #Deployment #Redis #SaaS (Software as a Service) #Terraform #Infrastructure as Code (IaC) #Observability #Firewalls #Docker
Role description
Contract (long-term), US (PST, MST) or Colombia (COT)

Role Overview
Own the reliability and release operations for our integration work. You’ll give developers a smooth path from code to production, keep environments healthy and secure, and make system health visible so issues are found and fixed fast. Over time, you’ll tune cost/performance, harden security, and evolve our standards so we can ship integrations predictably as we scale.

Responsibilities
• Own the platform lifecycle: maintain and improve our cloud setup (DigitalOcean preferred), databases (Postgres), caches/queues (Redis), and the way environments are created (Terraform/IaC).
• Operate releases: keep CI/CD fast and safe (GitHub Actions), enforce health checks and rollbacks, and make deploys predictable across multiple integration workstreams.
• Make reliability visible: centralize logs/metrics/traces, keep practical alerts in place, and publish clear runbooks so first responders know what to do.
• Strengthen security & compliance basics: secrets handling, least-privilege access, image scanning, patches, and simple evidence for audits when needed.
• Manage capacity, cost, and performance: right-size resources, set autoscaling policies, and keep cloud spend within plan.
• Enable the team: answer “how do we…?” questions, write concise docs, and collaborate closely with the Fractional CTO to unblock delivery.

Success Metrics
• Deployment success rate: % of deploys that complete without rollback. Target: ≥95%.
• Time to restore: median time to recover from a production incident. Target: ≤30 minutes.
• Operational visibility: core alerts verified monthly; runbooks exercised in a safe test. Target: 100% pass.
• Cost & capacity: stay within agreed monthly cloud budget while meeting performance targets.

Required Experience & Skills
• DevOps/SRE: 5+ years running cloud-hosted applications end-to-end.
• IaC & containers: Terraform, Docker; reproducible environments and change control.
• CI/CD: GitHub Actions (or similar) with build/test/scan/sign, blue/green or rolling deploys, and proven rollback.
• Data & queues: operating managed Postgres and Redis at production scale.
• Observability & ops: logging/metrics/alerts (OpenTelemetry or equivalent), incident triage, basic on-call hygiene.
• Security controls: secrets management, certs, firewalls, IAM; incident response.
• Multi-tenant integration patterns: per-tenant config, fairness/rate-limit.
• Apigee knowledge; OpenTelemetry; experience with multi-tenant SaaS and token-bucket rate limiting.

Mindset
• Pragmatic and service-oriented
• Automates toil
• Documents as they go
• Calm in ambiguity
• Explains choices in plain English
• Raises the bar without heavy process