

Brooksource
Data Engineer
⭐ - Featured Role | Apply direct with Data Freelance Hub
This role is for a Senior Data Engineer, remote, with a contract length of "unknown" and a pay rate of "unknown." Required skills include Databricks, PostgreSQL, Python, and healthcare data experience. A Bachelor's or Master's degree and 5–8+ years in data engineering are essential.
🌎 - Country
United States
💱 - Currency
$ USD
-
💰 - Day rate
480
-
🗓️ - Date
June 27, 2026
🕒 - Duration
Unknown
-
🏝️ - Location
Remote
-
📄 - Contract
Unknown
-
🔒 - Security
Unknown
-
📍 - Location detailed
Louisville Metropolitan Area
-
🧠 - Skills detailed
#FHIR (Fast Healthcare Interoperability Resources) #GIT #Data Access #Data Lineage #Batch #SQL (Structured Query Language) #Azure Data Factory #PySpark #Compliance #Apache Spark #Data Governance #Data Pipeline #Python #Data Security #Programming #DevOps #Spark (Apache Spark) #Version Control #Data Processing #Data Warehouse #Scala #Schema Design #Strategy #Databricks #PostgreSQL #Cloud #Data Engineering #AWS (Amazon Web Services) #Azure DevOps #Datasets #EDW (Enterprise Data Warehouse) #Data Quality #ETL (Extract, Transform, Load) #Normalization #Snowflake #Storage #Airflow #Kafka (Apache Kafka) #Leadership #Data Exploration #Microsoft Power BI #Azure #BI (Business Intelligence) #Data Transformations #Data Strategy #Visualization #Data Science #ADF (Azure Data Factory) #Delta Lake #ML (Machine Learning) #Data Modeling #AI (Artificial Intelligence) #Security #GitHub #Computer Science
Role description
Senior Data Engineer
Fortune 50 Healthcare
Brooksource
Remote
Overview
Our Fortune 50 Healthcare client is seeking a Senior Data Engineer to support our mission of improving the health and well-being of our members. This role will focus on building scalable, secure, data-centric solutions and compliant data platforms that power analytics, clinical insights, and business decision-making across the enterprise.
The ideal candidate will have strong experience with cloud-based data platforms, Databricks, PostgreSQL, and healthcare data, with a passion for delivering high-quality, trusted data solutions in a regulated environment.
Key Responsibilities
Data Engineering & Platform Development
• Design, develop, and scale data pipeline solutions using Databricks (Spark) and cloud-native services
• Build and optimize ETL/ELT workflows for ingesting structured and unstructured healthcare data (claims, clinical, provider, and member data)
• Develop and maintain data models in PostgreSQL and enterprise data warehouses
• Support Lakehouse architecture leveraging Databricks, Delta Lake, and cloud storage
• Improve performance, reliability, and cost-efficiency of data platforms
Healthcare Data & Compliance
• Work with healthcare datasets, including producer/agent, broker, commission, and distribution data, ensuring proper ingestion, normalization, and optimization for analytics and reporting
• Ensure compliance with HIPAA, HITECH, and enterprise data governance policies
• Implement data security, encryption, masking, and access controls
• Maintain data lineage, auditability, and regulatory reporting readiness
Advanced Data Processing
• Build real-time and batch pipelines for analytics and operational use cases
• Develop data transformations using PySpark and SQL within Databricks
• Leverage PostgreSQL for transactional and analytical workloads where applicable
• Integrate data from APIs, third-party vendors, and internal systems
Collaboration & Stakeholder Engagement
• Partner with business stakeholders to support data-driven initiatives and member acquisition strategies
• Translate insurance distribution, agent/producer, and marketing requirements into scalable, high-quality data solutions
• Support downstream consumers, including Power BI, marketing analytics teams, and operational reporting stakeholders, by delivering curated, analytics-ready datasets
Technical Leadership
• Lead design and architecture discussions for enterprise data solutions
• Establish and enforce best practices in data engineering, testing, and CI/CD
• Contribute to enterprise data strategy and platform modernization
AI & Advanced Analytics (Databricks Genie)
• Leverage Databricks Genie (AI/BI capabilities) to enable natural language querying and democratize data access for business stakeholders
• Design and optimize semantic layers and governed datasets that power Genie-driven insights with trusted, high-quality data
• Collaborate with stakeholders to translate business questions into AI-assisted analytics workflows using Databricks
• Ensure AI outputs are accurate, explainable, and compliant with healthcare data governance and HIPAA requirements
• Leverage large language models (LLMs), including Anthropic Claude, to enhance data exploration, automate insight generation, and support conversational analytics use cases
• Integrate Genie capabilities with Delta Lake and curated data models to support near real-time insights and decision-making
• Partner with data scientists and analytics teams to enhance AI-driven use cases, including producer performance insights, marketing attribution, and member engagement analysis
Required Qualifications
• Bachelor’s or Master’s degree in Computer Science, Engineering, or related field
• 5–8+ years of experience in data engineering
• Strong programming in Python (PySpark) and advanced SQL
• Hands-on experience with:
  • Databricks (core requirement)
  • PostgreSQL
  • Distributed data processing frameworks (Apache Spark)
• Experience with cloud platforms (Azure preferred; AWS acceptable)
• Proficiency in building and maintaining ETL/ELT pipelines
• Strong understanding of data modeling and warehousing concepts
Preferred Qualifications
• Experience in healthcare or insurance industry (payer experience strongly preferred)
• Familiarity with healthcare standards (e.g., FHIR, HL7)
• Experience with:
  • Delta Lake / Lakehouse architecture
  • Orchestration tools (Airflow, Azure Data Factory)
  • Streaming (Kafka, Event Hubs)
• Knowledge of DevOps and CI/CD pipelines (Azure DevOps, GitHub Actions)
• Experience supporting machine learning pipelines
Key Skills & Competencies
• Deep understanding of data pipelines at scale
• Strong experience with Databricks ecosystem and Spark optimization
• Expertise in PostgreSQL performance tuning and schema design
• Strong attention to data quality, governance, and compliance
• Excellent communication skills, especially with non-technical stakeholders
• Ability to work in a highly regulated healthcare environment
Typical Technology Stack
• Data Platform: Databricks, Delta Lake
• Database: PostgreSQL, Snowflake (optional)
• Cloud: Azure, Google Cloud, AWS
• Languages: Python, SQL
• Orchestration: Airflow, Azure Data Factory
• Visualization: Power BI
• Version Control: Git
KPIs / Success Metrics
• Reliability and performance of Databricks pipelines
• Data quality and compliance adherence (HIPAA standards)
• Time-to-delivery for new data products
• Query performance improvements in PostgreSQL and data warehouse systems
• Stakeholder adoption and satisfaction





