

NLB Services
Lead Data Engineer
⭐ - Featured Role | Apply direct with Data Freelance Hub
This role is for a Lead Data Engineer with a contract length of "unknown," offering a pay rate of "unknown." Key skills include Databricks, Spark, Kafka, Delta Lake, Python, SQL, and AWS. Experience in healthcare or HIPAA is preferred.
🌎 - Country
United States
💱 - Currency
$ USD
-
💰 - Day rate
Unknown
-
🗓️ - Date
May 15, 2026
🕒 - Duration
Unknown
-
🏝️ - Location
Unknown
-
📄 - Contract
Unknown
-
🔒 - Security
Unknown
-
📍 - Location detailed
United States
-
🧠 - Skills detailed
#Datadog #Databricks #dbt (data build tool) #PostgreSQL #SQL (Structured Query Language) #Delta Lake #Data Engineering #Data Quality #IAM (Identity and Access Management) #Observability #AWS (Amazon Web Services) #Terraform #Cloud #PySpark #S3 (Amazon Simple Storage Service) #Scala #Spark (Apache Spark) #Kafka (Apache Kafka) #Data Pipeline #Debugging #Batch #Python #Documentation
Role description
Title: Senior Data Engineer
Main skills
· Databricks
· Spark/PySpark
· Kafka
· Delta Lake
· Python
· SQL
· PostgreSQL
· AWS
· Datadog
· PagerDuty
Role Summary:
· We are seeking a Senior Data Engineer to support the client’s production data platform operations. This role is responsible for end-to-end production incident response for critical data pipelines, from alert triage and root cause analysis through resolution or clean handoff.
Key Responsibilities:
· Respond promptly to PagerDuty and Datadog alerts, triaging incidents efficiently.
· Debug and resolve failures across streaming and batch data pipelines.
· Troubleshoot Databricks/Spark job failures, Kafka lag/connectivity issues, Delta Lake checkpoint failures, PostgreSQL sink issues, and schema-related failures.
· Restart jobs, apply configuration fixes, and escalate issues with detailed root cause analysis as needed.
· Execute operational runbooks and maintain thorough incident documentation and postmortems.
· Improve platform stability by reducing recurring incidents and alert noise.
· Coordinate effectively with downstream consumers and engineering teams during incidents.
Required Skills:
· Strong hands-on experience with Databricks, Spark/PySpark, Kafka, Delta Lake, Python, SQL, and PostgreSQL.
· Experience with AWS services including S3, IAM, Secrets Manager, and KMS.
· Knowledge of Datadog, PagerDuty, and production observability practices.
· Strong troubleshooting and debugging skills across distributed systems and streaming pipelines.
· Ability to distinguish between transient infrastructure issues, configuration fixes, and application/code defects.
Preferred Skills:
· Experience with Apache Flink and stateful streaming systems.
· Healthcare or HIPAA domain exposure.
· Experience with dbt, Great Expectations, or data quality frameworks.
· Terraform-managed cloud infrastructure experience.
What Success Looks Like:
· Quickly identify root causes for common production issues.
· Minimize MTTR and restore pipeline stability efficiently.
· Maintain clear and reusable incident documentation.
· Demonstrate strong ownership and operational accountability in a fast-paced production environment.
