

Data Engineer
⭐ - Featured Role | Apply directly with Data Freelance Hub
This role is for a Data Engineer; the contract length and hourly pay rate are unspecified. Remote work is permitted. Required skills include AWS, Azure, Databricks, SQL, Python, and healthcare compliance experience (e.g., HIPAA).
🌎 - Country
United States
💱 - Currency
$ USD
💰 - Day rate
-
🗓️ - Date discovered
August 13, 2025
🕒 - Project duration
Unknown
🏝️ - Location type
Unknown
📄 - Contract type
Unknown
🔒 - Security clearance
Unknown
📍 - Location detailed
United States
🧠 - Skills detailed
#Monitoring #Storage #Data Pipeline #Cloud #Data Storage #GIT #Azure #Data Processing #ML (Machine Learning) #Data Lake #dbt (data build tool) #Infrastructure as Code (IaC) #ADLS (Azure Data Lake Storage) #Docker #AI (Artificial Intelligence) #ETL (Extract, Transform, Load) #Security #Lambda (AWS Lambda) #Data Privacy #Data Science #Data Warehouse #Model Deployment #Code Reviews #Delta Lake #Python #Azure Data Factory #Automation #AWS RDS (Amazon Relational Database Service) #Logging #Azure Resource Manager #Data Engineering #Datadog #AWS (Amazon Web Services) #Computer Science #AWS S3 (Amazon Simple Storage Service) #Databricks #Terraform #Grafana #ADF (Azure Data Factory) #Redshift #Data Quality #Spark (Apache Spark) #Airflow #Data Access #Scala #SQL (Structured Query Language) #Data Modeling #Batch #Data Ingestion #ACID (Atomicity, Consistency, Isolation, Durability) #Prometheus #Compliance #Data Lakehouse #Leadership #Data Governance #DevOps #Data Architecture #Synapse #Deployment #Database Design #Kubernetes #Version Control
Role description
DATA ENGINEER
JOB SUMMARY
We are seeking an experienced and forward-thinking Data Engineer to design, implement, and optimize our evolving data infrastructure. In this pivotal role, you will lead initiatives across AWS, Azure, and Databricks, ensuring our data ecosystems are secure, scalable, and performance-driven. You will collaborate closely with data scientists, DevOps, and software engineering teams to deliver robust data solutions that support advanced analytics, AI/ML workflows, and critical healthcare compliance requirements.
YOUR ASSIGNMENTS
Data Architecture & Pipeline Development
• Architect & Implement: Design and implement scalable, multi-cloud data pipelines (AWS and Azure) that handle data ingestion, transformation, and integration across diverse sources.
• Data Warehouses & Data Lakes: Lead the development and maintenance of data lakehouses, warehouses, and lake architectures using platforms like Databricks (Delta Lake, Iceberg), Azure Data Lake Storage (ADLS), AWS S3, Redshift, and more.
• ETL/ELT Processes: Build and optimize end-to-end data pipelines using dbt, Airflow, Dagster, or similar orchestration tools to ensure reliability, consistency, and high performance (a minimal orchestration sketch follows this list).
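To make the orchestration bullet above concrete, here is a minimal sketch of a daily ingest-and-transform pipeline using Airflow's TaskFlow API; the DAG name, task bodies, and S3 path are hypothetical placeholders, not this team's actual pipeline.
```python
# Minimal Airflow 2.x DAG sketch: a daily ingest -> transform pipeline.
# The names (claims_pipeline, raw path) are hypothetical.
from datetime import datetime

from airflow.decorators import dag, task


@dag(schedule="@daily", start_date=datetime(2025, 1, 1), catchup=False)
def claims_pipeline():
    @task
    def ingest() -> str:
        # Pull a day's worth of source files into the raw zone (stubbed).
        return "s3://example-bucket/raw/claims/"  # hypothetical path

    @task
    def transform(raw_path: str) -> None:
        # Conform the raw data into a staging table (stubbed).
        print(f"transforming {raw_path} -> stg_claims")

    transform(ingest())


claims_pipeline()
```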
Databricks & Advanced Data Solutions
• Spark Development: Leverage Databricks to manage large-scale data processing, batch/streaming jobs, and ML model deployments.
• Modern Table Formats: Implement and optimize Delta Lake or Iceberg for fast, ACID-compliant transactions and scalable data analytics (see the upsert sketch after this list).
• Collaboration & Best Practices: Promote best practices for Spark job creation, resource utilization, and distributed data processing to ensure efficient use of Databricks clusters.
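To illustrate the ACID-transaction bullet above, here is a sketch of an atomic upsert (MERGE) with the delta-spark API; the table paths and the claim_id join key are assumptions made for the example.
```python
# Sketch: ACID upsert (MERGE) into a Delta table via delta-spark.
# Paths and column names are hypothetical.
from delta.tables import DeltaTable
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

updates = spark.read.parquet("s3://example-bucket/stg/claims/")  # hypothetical staging data
target = DeltaTable.forPath(spark, "s3://example-bucket/delta/claims/")

(
    target.alias("t")
    .merge(updates.alias("u"), "t.claim_id = u.claim_id")
    .whenMatchedUpdateAll()     # update changed rows...
    .whenNotMatchedInsertAll()  # ...and insert new ones in the same transaction
    .execute()
)
```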
Cloud Infrastructure & Integration
• Multi-Cloud Expertise: Utilize services from both AWS (RDS, Lambda, Glue, S3, Redshift) and Azure (Data Factory, Data Lake Storage, Synapse) to build resilient, cost-effective data solutions.
• Infrastructure as Code (IaC): Work with DevOps to implement and maintain infrastructure via Terraform, CloudFormation, or ARM/Bicep templates where applicable (an IaC sketch follows this list).
• Cross-Functional Collaboration: Partner with DevOps to ensure robust monitoring, logging, and alerting solutions are in place (e.g., Datadog, ELK Stack, CloudWatch, Azure Monitor) for all data pipelines.
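The IaC bullet above names Terraform, CloudFormation, and ARM/Bicep; to keep all examples in Python, here is the same idea sketched with AWS CDK v2, which the nice-to-haves below also mention. The stack and bucket names are hypothetical.
```python
# Sketch: a data-lake bucket defined as code with AWS CDK v2 (Python).
# Stack/bucket names are hypothetical; Terraform or Bicep would express
# the same resource declaratively.
import aws_cdk as cdk
from aws_cdk import Stack, aws_s3 as s3
from constructs import Construct


class DataLakeStack(Stack):
    def __init__(self, scope: Construct, construct_id: str, **kwargs) -> None:
        super().__init__(scope, construct_id, **kwargs)
        # Encrypted, versioned, non-public raw zone -- sensible defaults
        # for PHI-adjacent storage.
        s3.Bucket(
            self,
            "RawZone",
            versioned=True,
            encryption=s3.BucketEncryption.S3_MANAGED,
            block_public_access=s3.BlockPublicAccess.BLOCK_ALL,
        )


app = cdk.App()
DataLakeStack(app, "DataLakeStack")
app.synth()
```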
Data Governance, Security & Compliance
• Data Governance & Quality: Develop and enforce data governance policies, standards, and frameworks—ensuring high data quality, lineage, and stewardship.
• Healthcare Compliance: Ensure compliance with healthcare data regulations (e.g., HIPAA), implementing robust access controls, audit trails, and encryption strategies (see the masking sketch after this list).
• Security Monitoring: Implement advanced security measures and monitor data access patterns, collaborating with the InfoSec team to conduct scans, penetration tests, and incident response drills.
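One common building block behind the compliance bullet above, sketched in PySpark: deterministically hashing a direct identifier so analyst-facing tables remain joinable without exposing it. Table and column names are hypothetical, and on its own this is not a complete HIPAA de-identification strategy.
```python
# Sketch: masking a direct identifier before exposing data to analysts.
# Names are hypothetical; access controls, audit trails, and encryption
# at rest/in transit are still required alongside this.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.getOrCreate()
claims = spark.read.table("raw.claims")  # hypothetical source table

masked = (
    claims
    # A deterministic hash preserves joinability without the raw SSN.
    .withColumn("patient_key", F.sha2(F.col("ssn"), 256))
    .drop("ssn", "patient_name")  # drop direct identifiers outright
)
masked.write.mode("overwrite").saveAsTable("curated.claims_deidentified")
```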
Performance Optimization & Troubleshooting
• Data Performance: Continuously evaluate and optimize data storage/queries for performance and cost efficiency at scale, including SQL tuning, partitioning strategies, and caching (see the partitioning sketch after this list).
• Monitoring & Alerting: Set up proactive alerting and monitoring systems (e.g., Datadog, Prometheus, Grafana, CloudWatch Metrics) to promptly identify and address pipeline bottlenecks and failures.
• Incident Management: Investigate and resolve data-related issues, working closely with DevOps, Data Science, and Software Engineering teams to minimize downtime and maintain SLAs.
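As one example of the partitioning strategies mentioned above, here is a sketch of a date-partitioned Delta write, so that date-filtered queries skip whole partitions instead of scanning the full table; the paths and column names are hypothetical.
```python
# Sketch: date-partitioned Delta write for partition pruning.
# Paths and columns are hypothetical.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.getOrCreate()
events = spark.read.table("curated.events")  # hypothetical table

(
    events
    .withColumn("event_date", F.to_date("event_ts"))
    .write.format("delta")
    .partitionBy("event_date")  # filters on event_date prune whole partitions
    .mode("overwrite")
    .save("s3://example-bucket/delta/events/")
)
```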
Technical Leadership & Mentorship
• Team Mentorship: Guide junior and mid-level data engineers, providing code reviews, technical direction, and professional development opportunities.
• Stakeholder Communication: Work closely with Product, Analytics, and AI teams to capture requirements, translate business needs into technical solutions, and communicate progress effectively.
• Innovation & Thought Leadership: Evaluate emerging technologies, tools, and frameworks; recommend and implement solutions that enhance the data platform’s capabilities.
WHAT YOU HAVE ALREADY ACHIEVED
• Education: Bachelor’s or Master’s degree in Computer Science, Data Engineering, or a related field.
• Experience: 5+ years in data engineering roles with a proven record of leading complex data projects in both AWS and Azure environments.
• Cloud Expertise: Advanced proficiency in AWS (RDS, Lambda, Glue, Redshift, S3) and Azure (Data Factory, ADLS).
• Databricks & Spark: Hands-on experience with Databricks, Spark, and modern table formats (Delta, Iceberg).
• ETL/ELT & Orchestration: Strong background in building and maintaining pipelines using dbt, Airflow, Dagster, or similar tools.
• SQL & Python: Expert-level skills in SQL and Python for data processing, transformation, and automation.
• Version Control: Proficiency with Git for version control and CI/CD workflows.
• Data Modeling: Deep understanding of data modeling (3NF, star schema) and database design principles (see the star-schema sketch after this list).
• Healthcare Compliance: Familiarity with HIPAA or similar regulatory frameworks, ensuring data privacy and security.
• Communication & Collaboration: Excellent verbal and written communication skills, with the ability to work effectively in cross-functional teams.
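For the data-modeling bullet above, a small PySpark sketch of populating a star-schema fact table by resolving surrogate keys from conformed dimensions; every table and column name here is hypothetical.
```python
# Sketch: building a star-schema fact table from staged data and dimensions.
# All names are hypothetical.
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()
claims = spark.read.table("stg.claims")
dim_patient = spark.read.table("dw.dim_patient")    # maps patient_key -> patient_sk
dim_provider = spark.read.table("dw.dim_provider")  # maps provider_key -> provider_sk

fact_claims = (
    claims
    .join(dim_patient, "patient_key")   # resolve surrogate keys from the dims
    .join(dim_provider, "provider_key")
    .select("claim_id", "patient_sk", "provider_sk", "claim_amount", "service_date")
)
fact_claims.write.mode("overwrite").saveAsTable("dw.fact_claims")
```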
ATTRIBUTES NEEDED FOR THE ROLE
• Infrastructure as Code: Experience with Terraform, CloudFormation, Azure Resource Manager (ARM) templates, or AWS CDK.
• AI/ML Pipelines: Exposure to AI/ML workflows, model training, and model hosting on Databricks or other platforms.
• Containerization & Orchestration: Familiarity with Docker, Kubernetes, or similar technologies for packaging and deploying data applications.
• Performance Tuning: Experience optimizing large-scale data warehouse/lake architectures for cost, speed, and reliability.
• Mentorship & Leadership: Previous experience in a senior or lead capacity, mentoring junior engineers and driving team-wide best practices.