

Concepta Innovation Services
Information Data Engineer - AWS - Databricks
⭐ - Featured Role | Apply directly with Data Freelance Hub
This role is for an Information Data Engineer specializing in AWS and Databricks, requiring 10+ years of data engineering experience. The contract is on-site in Washington, DC, for over 6 months, with a pay rate of $70-$80 per hour.
🌎 - Country
United States
💱 - Currency
$ USD
-
💰 - Day rate
640
-
🗓️ - Date
January 15, 2026
🕒 - Duration
More than 6 months
-
🏝️ - Location
On-site
-
📄 - Contract
1099 Contractor
-
🔒 - Security
Unknown
-
📍 - Location detailed
Washington, DC 20036
-
🧠 - Skills detailed
#Data Lifecycle #IAM (Identity and Access Management) #Apache Spark #Datasets #Observability #Cloud #Security #Deployment #AWS (Amazon Web Services) #Data Access #Data Pipeline #Version Control #Debugging #CLI (Command-Line Interface) #Batch #ETL (Extract, Transform, Load) #Spark (Apache Spark) #Data Modeling #PySpark #R #Python #Delta Lake #Pytest #Metadata #Clustering #SQL (Structured Query Language) #Scala #GIT #Databricks #Data Engineering #S3 (Amazon Simple Storage Service) #Data Quality #Kafka (Apache Kafka) #Documentation #Snowflake #Schema Design #Logging
Role description
Job Summary
We are seeking a dynamic and highly skilled Data Engineer specializing in AWS cloud services and the Databricks platform to drive our data transformation initiatives. The candidate will play a pivotal role in designing, building, and operating both batch and streaming data pipelines on the Enterprise Data Platform (EDP) using Databricks and Apache Spark. The successful candidate will have hands-on experience in Python and R, supporting scalable engineering workflows that drive analytics and research initiatives. You will work in partnership with product, architecture, governance, and mission teams to deliver secure, high-performance, and reliable data pipelines and trusted datasets for our client.
Key Responsibilities
· Build and maintain end-to-end data pipelines in Databricks using Spark (PySpark) for ingesting, transforming, and publishing curated datasets (see the batch pipeline sketch after this list).
· Implement streaming and near-real-time data patterns using Spark Structured Streaming.
· Design incremental processing, partitioning strategies, and data layout/file sizing for performance and cost optimization.
· Develop reusable pipeline components and standardized patterns to accelerate delivery.
· Create and operationalize workflows in Python and R for data preparation and analysis.
· Package code for repeatable execution, ensuring reliable dependency management and job configuration.
· Implement robust data quality controls for both batch and streaming data.
· Establish pipeline observability through logging, metrics, alerting, and dashboards; support incident response and root-cause analysis.
· Develop runbooks and operational procedures for critical pipelines and streaming services.
· Ensure secure handling of sensitive data with least-privilege principles.
· Document dataset definitions, lineage notes, and operational procedures for auditability.
· Utilize version control and CI/CD practices for code and deployment.
· Collaborate with stakeholders to refine requirements, define SLAs, and deliver measurable outcomes.
· Implement Lakeflow/Delta Live Tables (DLT) pipelines and maintain declarative ETL workflows (see the DLT sketch after this list).
· Design and implement medallion architecture patterns with appropriate data quality gates and optimization techniques.
· Develop and maintain comprehensive testing strategies using frameworks such as Great Expectations or deequ.
· Perform data modeling and schema design for dimensional models and analytical structures.
· Contribute to Unity Catalog governance by registering datasets and implementing security controls.
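The following is a minimal sketch of the kind of batch ingest-transform-publish pipeline described above, assuming hypothetical paths and table names (s3://example-bucket/raw/events/, analytics.curated_events); the actual EDP sources and targets are not specified in this posting.

```python
# Minimal batch ingest-transform-publish sketch for Databricks (PySpark).
# Paths, column names, and the target table are illustrative placeholders.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.getOrCreate()  # Databricks provides this session automatically

raw = spark.read.format("json").load("s3://example-bucket/raw/events/")  # hypothetical landing zone

curated = (
    raw.filter(F.col("event_type").isNotNull())          # simple data quality gate
       .withColumn("event_date", F.to_date("event_ts"))  # derived partition column
       .dropDuplicates(["event_id"])                      # keeps re-runs idempotent
)

(
    curated.write.format("delta")
           .mode("overwrite")
           .partitionBy("event_date")                     # layout choice for pruning and cost
           .saveAsTable("analytics.curated_events")       # hypothetical curated table
)
```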
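And here is a minimal Lakeflow/Delta Live Tables sketch with declarative expectations; it assumes a hypothetical orders feed and landing path, and it only runs inside a Databricks DLT pipeline, where the dlt module and the spark session are provided.

```python
# Minimal Delta Live Tables (DLT) sketch: bronze ingestion plus a validated silver table.
# Table names and the landing path are illustrative placeholders.
import dlt
from pyspark.sql import functions as F

@dlt.table(comment="Raw orders landed from the ingestion zone (bronze).")
def orders_bronze():
    return (
        spark.readStream.format("cloudFiles")               # Auto Loader incremental ingestion
             .option("cloudFiles.format", "json")
             .load("s3://example-bucket/landing/orders/")   # hypothetical landing prefix
    )

@dlt.table(comment="Validated, deduplicated orders (silver).")
@dlt.expect_or_drop("valid_order_id", "order_id IS NOT NULL")  # declarative quality gates
@dlt.expect_or_drop("positive_amount", "amount > 0")
def orders_silver():
    return (
        dlt.read_stream("orders_bronze")
           .withColumn("ingested_at", F.current_timestamp())
           .dropDuplicates(["order_id"])   # in production, a watermark would bound this state
    )
```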
Required Qualifications
· Bachelor’s degree in a related field or equivalent experience.
· 10+ years of data engineering experience, including production Spark-based batch pipelines and streaming implementations.
· Strong proficiency in Python and R for data engineering and analytical workflows.
· Extensive hands-on experience with Databricks and Apache Spark, including Structured Streaming concepts.
· Advanced SQL skills for data transformation and validation.
· Proven experience building production-grade pipelines: idempotency, incremental loads, backfills, schema evolution, and error handling.
· Experience with data quality checks and validation for batch and event streams.
· Skills in logging, metrics, alerting, troubleshooting, and performance tuning.
· Proficiency with Git and CI/CD concepts for data pipelines, Databricks asset bundling, deployments, and CLI usage.
· Knowledge of Lakehouse table formats and patterns (Delta tables), compaction, optimization, and lifecycle management.
· Familiarity with orchestration (Databricks Workflows/Jobs) and dependency management.
· Experience with governance controls (catalog permissions, secure data access, metadata/lineage).
· Knowledge of message/event platforms and streaming ingestion (Kafka/Kinesis equivalents).
· Ability to collaborate with research/analytics stakeholders and translate needs into data products.
· Strong problem-solving and debugging skills across the data lifecycle.
· Clear technical communication and documentation abilities.
· Experience working with cross-functional teams in a regulated environment.
· Expertise in Delta Lake features, liquid clustering, and table maintenance.
· Experience with Lakeflow/Delta Live Tables (DLT), expectations framework, and declarative pipeline design.
· Proficiency with testing frameworks (pytest, Great Expectations, deequ) and test-driven development for data pipelines (see the pytest sketch after this list).
· Data modeling skills, including dimensional modeling (star/snowflake schemas), medallion architecture, and SCD patterns.
· AWS data services experience including S3 optimization, IAM configuration, CloudWatch integration, and cost management.
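As one example of the test-driven approach mentioned above, here is a minimal pytest sketch that exercises a small, hypothetical transformation (dedupe_events) against a local SparkSession; real pipeline tests would target the project's actual transformation functions.

```python
# Minimal pytest sketch for a PySpark transformation.
# `dedupe_events` is a stand-in for a real pipeline function.
import pytest
from pyspark.sql import SparkSession


@pytest.fixture(scope="session")
def spark():
    # Local session for tests; avoids needing a Databricks cluster in CI.
    return SparkSession.builder.master("local[2]").appName("pipeline-tests").getOrCreate()


def dedupe_events(df):
    """Keep one row per event_id (illustrative transformation under test)."""
    return df.dropDuplicates(["event_id"])


def test_dedupe_events_removes_duplicates(spark):
    df = spark.createDataFrame(
        [("e1", "click"), ("e1", "click"), ("e2", "view")],
        ["event_id", "event_type"],
    )
    result = dedupe_events(df)
    assert result.count() == 2                                    # duplicate "e1" collapsed
    assert {row.event_id for row in result.collect()} == {"e1", "e2"}
```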
Preferred Certifications
· Databricks Certified Apache Spark Developer Associate
· Databricks Certified Data Engineer Associate or Professional
· AWS Certified Developer Associate
· AWS Certified Data Engineer Associate
· AWS Certified Solutions Architect Associate
Additional Details
· Location: On-site, Washington, DC
· Contract Type: 1099
· Citizenship: US citizenship required
· Job Types: Full-time, Contract
· Pay: $70.00 - $80.00 per hour
· Expected hours: 40 per week
Application Question(s):
1. Are you a US Citizen?
2. Do you have more than 10 years of data engineering experience?
Required Certifications
· Databricks Certified Apache Spark Developer Associate
· Databricks Certified Data Engineer Associate or Professional
· AWS Certified Developer Associate
· AWS Certified Data Engineer Associate
· AWS Certified Solutions Architect Associate
Work Location: In person