

Open Systems Inc.
Big Data Engineer: Various Levels @ Atlanta, GA
⭐ - Featured Role | Apply directly with Data Freelance Hub
This role is for a Big Data Engineer (Various Levels) in Atlanta, GA, offering a 6+ month contract at a competitive pay rate. Key skills include Databricks, Apache Spark, Delta Lake, and AWS. Candidates need a bachelor's degree and 5+ years of data engineering experience; the client operates in the rail transportation industry.
🌎 - Country
United States
💱 - Currency
$ USD
💰 - Day rate
Unknown
🗓️ - Date
May 5, 2026
🕒 - Duration
More than 6 months
🏝️ - Location
Hybrid
📄 - Contract
Unknown
🔒 - Security
Unknown
📍 - Location detailed
United States
🧠 - Skills detailed
#"ETL (Extract #Transform #Load)" #Deployment #Data Science #Computer Science #Spark (Apache Spark) #Data Engineering #Databricks #AWS S3 (Amazon Simple Storage Service) #Batch #Scrum #Spark SQL #Data Pipeline #Complex Queries #Apache Kafka #NoSQL #Databases #IAM (Identity and Access Management) #Observability #Data Governance #Data Lifecycle #S3 (Amazon Simple Storage Service) #Apache Spark #Data Quality #Scala #Agile #Data Architecture #Kanban #PySpark #Monitoring #ML (Machine Learning) #Datasets #AWS (Amazon Web Services) #Snowflake #Data Modeling #Big Data #SQL (Structured Query Language) #Kafka (Apache Kafka) #Delta Lake #BI (Business Intelligence)
Role description
Title: Big Data Engineer, Various Levels (Lead, Senior, Intermediate)
Location: Atlanta, GA 303038 (Hybrid: 2x/week)
Contract: 6+ months, long-term.
Industry: Rail transportation.
• If not in the Atlanta area, then 100% REMOTE.
Overview
Client Corporation is seeking a Senior Data Engineer to collaborate across the organization and deliver reliable, scalable data solutions. You will join a high-performing team focused on modern data platforms, helping design, build, and operate a Databricks-based lakehouse architecture and streaming analytics ecosystem.
This role combines strong business acumen with hands-on engineering expertise. You will work closely with stakeholders to understand business challenges, translate them into technical requirements, design pragmatic data architectures, and deliver production-grade data pipelines. Ownership spans the full lifecycle, from design and development through deployment and ongoing operations.
You will partner with data modelers, business intelligence teams, and cross-functional stakeholders to define requirements, align on scope, and ensure high-quality delivery while promoting engineering best practices and consistency.
Key Responsibilities
• Define and document data requirements; ingest, integrate, and process large volumes of structured, semi-structured, and unstructured data.
• Design and build scalable data pipelines for ingestion, transformation, validation, and enrichment to support downstream analytics and BI use cases.
• Develop and maintain standardized datasets while supporting ad hoc analytical needs.
• Implement robust data quality frameworks and continuously improve trust and reliability of datasets.
• Contribute to data governance practices, including access control, data retention, and handling of sensitive data in alignment with enterprise policies.
• Collaborate with data science and BI teams to deliver data models and pipelines for reporting, analytics, and machine learning.
• Build and optimize pipelines that clean, transform, aggregate, and publish data into curated layers within the lakehouse architecture (a minimal ingestion sketch follows this list).
• Utilize Databricks, Apache Spark, SQL, and AWS services to integrate and process data efficiently.
• Apply sound data architecture principles to balance performance, scalability, cost, and maintainability.
• Champion best practices in data engineering, including testing, observability, and operational readiness.
• Participate in Agile development processes, including backlog refinement, sprint planning, and cross-team coordination.
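To make the ingest-transform-publish responsibilities above concrete, here is a minimal PySpark Structured Streaming sketch: read events from a Kafka topic, parse and validate them, and append them to a curated Delta table. The broker address, topic, payload schema, and table names are placeholders invented for illustration, not details from this posting.

```python
from pyspark.sql import SparkSession, functions as F, types as T

spark = SparkSession.builder.getOrCreate()  # provided automatically on Databricks

# Hypothetical payload schema, for illustration only.
schema = T.StructType([
    T.StructField("event_id", T.StringType()),
    T.StructField("asset_id", T.StringType()),
    T.StructField("event_ts", T.TimestampType()),
    T.StructField("status", T.StringType()),
])

# Ingest: subscribe to a Kafka topic as a stream (broker and topic are placeholders).
raw = (spark.readStream.format("kafka")
       .option("kafka.bootstrap.servers", "broker:9092")
       .option("subscribe", "example-events")
       .load())

# Transform/validate: parse the JSON payload and drop rows missing their key.
events = (raw.select(F.from_json(F.col("value").cast("string"), schema).alias("e"))
          .select("e.*")
          .where(F.col("event_id").isNotNull()))

# Publish: append into a curated Delta table; the checkpoint enables exactly-once delivery.
(events.writeStream.format("delta")
 .option("checkpointLocation", "/tmp/checkpoints/example_events")
 .toTable("curated.example_events"))
```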
Required Qualifications
• Bachelor’s degree in Computer Science, Information Systems, Engineering, or a related field (or equivalent practical experience).
• 5+ years of professional experience in data engineering, working with large-scale datasets in production environments.
Technical Expertise:
• Databricks (Required): Hands-on experience building and operating Databricks workflows, notebooks, and production pipelines using Apache Spark (Spark SQL and/or PySpark).
• Delta Lake (Required): Experience designing and maintaining Delta Lake tables, including incremental processing, merges/upserts, schema evolution, and performance optimization (see the upsert sketch after this list).
• Delta Live Tables (DLT) (Required): Experience building and managing DLT pipelines with a focus on dependencies, incremental processing, and monitoring.
• Databricks Governance (Required): Experience with Unity Catalog or similar governance frameworks, including secure data sharing and access control.
• Apache Spark: 4+ years of experience building and optimizing batch and/or streaming data pipelines.
• Streaming Technologies: 3+ years of experience with Apache Kafka or managed equivalents (e.g., Confluent), including scaling, throughput, and fault tolerance.
• AWS Ecosystem: 3+ years of experience with AWS services (e.g., S3, IAM, and related analytics integrations).
• SQL: Strong proficiency in writing and optimizing complex queries and translating business logic into data models.
• Proven experience delivering ETL/ELT pipelines in a lakehouse environment, including incremental loads and data quality enforcement.
• Experience working in Agile environments (Scrum, Kanban, SAFe, or similar).
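As a rough illustration of the incremental merge/upsert work named in the Delta Lake bullet above, the sketch below applies a batch of changed rows to an existing Delta table using the documented DeltaTable merge API. The table and column names are hypothetical, and the sketch assumes the target table already exists.

```python
from pyspark.sql import SparkSession
from delta.tables import DeltaTable

spark = SparkSession.builder.getOrCreate()  # provided automatically on Databricks

# Stand-in batch of changed rows arriving from upstream (hypothetical data).
updates = spark.createDataFrame(
    [("o-100", "shipped"), ("o-101", "created")],
    ["order_id", "status"],
)

# Assumes curated.orders already exists as a Delta table.
target = DeltaTable.forName(spark, "curated.orders")

# Incremental upsert: update rows whose key matches, insert the rest.
(target.alias("t")
 .merge(updates.alias("s"), "t.order_id = s.order_id")
 .whenMatchedUpdateAll()
 .whenNotMatchedInsertAll()
 .execute())
```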
Preferred Qualifications
• Experience with Snowflake or other large-scale analytical databases.
• Familiarity with NoSQL databases such as Cassandra.
• Experience with enterprise messaging systems (e.g., TIBCO EMS, IBM MQ) in addition to Kafka.
Role Summary
• Senior Data Engineers are responsible for the architecture, design, and delivery of scalable analytics and streaming solutions that transform enterprise data into governed, high-quality datasets for business intelligence and advanced analytics.
• They operate across the full data lifecycle—partnering with business stakeholders, defining requirements, designing lakehouse architectures, and building production-grade pipelines using technologies such as Databricks, Delta Lake, Delta Live Tables, Unity Catalog, Apache Spark, Kafka, AWS services, and SQL.
• Success in this role requires deep technical expertise, strong problem-solving skills, and the ability to communicate effectively across technical and business teams to deliver measurable, data-driven outcomes.
Senior Data Engineers (Lead/Senior/Intermediate) are responsible for designing, building, and operating scalable data pipelines and lakehouse architectures using Databricks, Apache Spark (Spark SQL/PySpark), Delta Lake, Delta Live Tables, Unity Catalog, Kafka, AWS (S3, IAM), and SQL. Core duties include defining data requirements; ingesting, integrating, and processing large structured and unstructured datasets; developing standardized and ad hoc datasets; implementing data quality and governance frameworks (access control, retention, sensitive data handling); optimizing batch and streaming pipelines; and collaborating with data science, BI, and business stakeholders to deliver analytics and machine learning solutions within Agile environments.
Candidates must hold a bachelor’s degree in a related field and have 5+ years of data engineering experience, including 4+ years with Spark, 3+ years with Kafka and AWS, and proven expertise in ETL/ELT pipelines, incremental processing, and production-grade systems in lakehouse environments. Required skills include advanced SQL, data modeling, performance optimization, and operational best practices (testing, observability). Preferred qualifications include experience with Snowflake, NoSQL databases (e.g., Cassandra), and enterprise messaging systems (e.g., TIBCO EMS, IBM MQ).
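Since the summary leans heavily on Delta Live Tables, one way to picture that requirement is a small declarative pipeline written with the documented `dlt` Python API, which is available only inside a Databricks DLT pipeline. This is a sketch under invented names, not the client's actual pipeline: DLT infers the dependency between the two tables and enforces the data-quality expectation automatically.

```python
import dlt  # available inside a Databricks Delta Live Tables pipeline
from pyspark.sql import functions as F

@dlt.table(comment="Raw events streamed from a placeholder Kafka topic.")
def raw_events():
    return (spark.readStream.format("kafka")  # `spark` is predefined in DLT
            .option("kafka.bootstrap.servers", "broker:9092")
            .option("subscribe", "example-events")
            .load())

# DLT tracks the dependency on raw_events and drops rows failing the expectation.
@dlt.table(comment="Cleaned events behind a basic quality gate.")
@dlt.expect_or_drop("valid_key", "event_key IS NOT NULL")
def cleaned_events():
    return (dlt.read_stream("raw_events")
            .select(F.col("key").cast("string").alias("event_key"),
                    F.col("value").cast("string").alias("payload"),
                    F.col("timestamp").alias("event_ts")))
```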






