

Lead Love
DevOps Engineer
This role is a long-term contract for a DevOps Engineer; the pay rate is not listed. It is remote and open to candidates in the US (PST, MST) or Colombia (COT). Key requirements include 5+ years of DevOps/SRE experience, Terraform, Docker, GitHub Actions, and security controls.
🌎 - Country: United States
💱 - Currency: $ USD
💰 - Day rate: Unknown
🗓️ - Date: November 17, 2025
🕒 - Duration: Unknown
🏝️ - Location: Unknown
📄 - Contract: Unknown
🔒 - Security: Unknown
📍 - Location detailed: United States
🧠 - Skills detailed:
#Compliance #Databases #Security #IAM (Identity and Access Management) #GitHub #Logging #Containers #AutoScaling #Cloud #DevOps #Deployment #Redis #SaaS (Software as a Service) #Terraform #Infrastructure as Code (IaC) #Observability #Firewalls #Docker
Role description
Contract (long-term), US (PST, MST) or Colombia (COT)
Role Overview
Own the reliability and release operations for our integration work. You’ll give developers a smooth path from code to production, keep environments healthy and secure, and make system health visible so issues are found and fixed fast. Over time, you’ll tune cost/performance, harden security, and evolve our standards so we can ship integrations predictably as we scale.
Responsibilities
• Own the platform lifecycle: maintain and improve our cloud setup (DigitalOcean preferred), databases (Postgres), caches/queues (Redis), and the way environments are created (Terraform/IaC).
• Operate releases: keep CI/CD fast and safe (GitHub Actions), enforce health checks and rollbacks, and make deploys predictable across multiple integration workstreams.
• Make reliability visible: centralize logs/metrics/traces, keep practical alerts in place, and publish clear runbooks so first responders know what to do.
• Strengthen security & compliance basics: secrets handling, least-privilege access, image scanning, patches, and simple evidence for audits when needed.
• Manage capacity, cost, and performance: right-size resources, set autoscaling policies, and keep cloud spend within plan.
• Enable the team: answer “how do we…?” questions, write concise docs, and collaborate closely with the Fractional CTO to unblock delivery.
Success Metrics
• Deployment success rate — % of deploys that complete without rollback. Target: ≥95%.
• Time to restore — median time to recover from a production incident. Target: ≤30 minutes.
• Operational visibility — core alerts verified monthly; runbooks exercised in a safe test. Target: 100% pass.
• Cost & capacity — stay within agreed monthly cloud budget while meeting performance targets.
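As a rough illustration of how the first two metrics above could be computed, here is a minimal Python sketch. The record shape (a `rolled_back` flag per deploy, recovery times in minutes) and the function names are assumptions made for this example; the posting does not say how the team actually tracks these numbers.

```python
from statistics import median


def deployment_success_rate(deploys):
    """Percentage of deploys that completed without a rollback."""
    if not deploys:
        raise ValueError("no deploys recorded")
    succeeded = sum(1 for d in deploys if not d["rolled_back"])
    return 100.0 * succeeded / len(deploys)


def median_time_to_restore(recovery_minutes):
    """Median time, in minutes, to recover from a production incident."""
    return median(recovery_minutes)


# Hypothetical sample data: 20 deploys with one rollback, three incidents.
deploys = [{"rolled_back": False}] * 19 + [{"rolled_back": True}]
rate = deployment_success_rate(deploys)
time_to_restore = median_time_to_restore([12, 45, 20])

# Targets taken from the posting: at least 95% and at most 30 minutes.
meets_targets = rate >= 95.0 and time_to_restore <= 30
```

Using the median rather than the mean keeps one unusually long incident from dominating the time-to-restore figure, which matches the wording of the target.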
Required Experience & Skills
• DevOps/SRE: 5+ years running cloud-hosted applications end-to-end.
• IaC & containers: Terraform, Docker; reproducible environments and change control.
• CI/CD: GitHub Actions (or similar) with build/test/scan/sign, blue/green or rolling deploys, and proven rollback.
• Data & queues: operating managed Postgres and Redis at production scale.
• Observability & ops: logging/metrics/alerts (OpenTelemetry or equivalent), incident triage, basic on-call hygiene.
• Security controls: secrets management, certs, firewalls, IAM; incident response.
• Multi-tenant integration patterns: per-tenant config, fairness/rate-limit.
• Apigee knowledge; OpenTelemetry; experience with multi-tenant SaaS and token-bucket rate limiting.
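For candidates less familiar with the per-tenant fairness and token-bucket rate limiting named above, the idea can be sketched as follows. This is an illustrative, in-memory Python example only; the class names, parameters, and injectable clock are hypothetical and say nothing about how the team's own limiter (or Apigee) is configured.

```python
import time


class TokenBucket:
    """Refills at `rate` tokens per second, up to `capacity` tokens."""

    def __init__(self, rate, capacity, now=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.tokens = capacity  # start full so short bursts are allowed
        self._now = now
        self.updated = now()

    def allow(self, cost=1):
        # Add the tokens earned since the last call, capped at capacity.
        now = self._now()
        elapsed = now - self.updated
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False


class TenantLimiter:
    """One bucket per tenant, so a busy tenant cannot starve the others."""

    def __init__(self, rate, capacity, now=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self._now = now
        self.buckets = {}

    def allow(self, tenant_id):
        bucket = self.buckets.get(tenant_id)
        if bucket is None:
            bucket = TokenBucket(self.rate, self.capacity, self._now)
            self.buckets[tenant_id] = bucket
        return bucket.allow()
```

A production setup would usually keep bucket state in a shared store such as Redis so that every application instance enforces the same limits; the in-memory dictionary here is a simplification.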
Mindset
Pragmatic and service-oriented: automates toil, documents as they go, stays calm in ambiguity, explains choices in plain English, and raises the bar without heavy process.






