Data Engineer III

⭐ - Featured Role | Apply directly with Data Freelance Hub
This role is a Data Engineer III position for a 6-month contract, offering a pay rate of "$X/hour". Candidates must have hands-on ETL experience with Databricks, proficiency in SQL and Python, and expertise in AWS services and cloud infrastructure.
🌎 - Country
United States
πŸ’± - Currency
$ USD
πŸ’° - Day rate
-
πŸ—“οΈ - Date discovered
September 12, 2025
πŸ•’ - Project duration
Unknown
🏝️ - Location type
Unknown
πŸ“„ - Contract type
Unknown
πŸ”’ - Security clearance
Unknown
πŸ“ - Location detailed
New York, NY
🧠 - Skills detailed
#Programming #"ETL (Extract #Transform #Load)" #Data Pipeline #Data Architecture #Kubernetes #Code Reviews #Trino #SQL Queries #Databricks #Monitoring #Storage #Data Processing #SQL (Structured Query Language) #Data Lake #Automation #Computer Science #Python #Data Integration #Scala #GitLab #Security #Docker #Deployment #RDS (Amazon Relational Database Service) #Terraform #AWS (Amazon Web Services) #Databases #Data Analysis #SaaS (Software as a Service) #Cloud #AWS Databases #Data Quality #Data Engineering #S3 (Amazon Simple Storage Service) #Apache Spark #Spark (Apache Spark) #Data Ingestion #DynamoDB #PySpark #Agile #EC2 #Debugging
Role description
Your role as a Senior Data Engineer
• Migrate applications from on-premises environments to cloud service providers.
• Develop products and services on the latest technologies through contributions in development, enhancements, testing, and implementation.
• Develop, modify, and extend code for building cloud infrastructure, and automate it using CI/CD pipelines.
• Partner with business stakeholders and peers in the pursuit of solutions that achieve business goals through an agile software development methodology.
• Perform problem analysis, data analysis, reporting, and communication.
• Work with peers across the system to define and implement best practices and standards.
• Assess applications and help determine the appropriate application infrastructure patterns.
• Apply best practices and knowledge of internal and external drivers to improve products or services.
Qualifications: What we are looking for
• Hands-on experience building ETL on the Databricks SaaS infrastructure.
• Experience developing data pipeline solutions to ingest and exploit new and existing data sources.
• Expertise in SQL, a programming language such as Python, and ETL tools such as Databricks.
• Ability to perform code reviews to ensure requirements, optimal execution patterns, and adherence to established standards.
• Computer Science or equivalent.
• Expertise in AWS Compute (EC2, EMR), AWS Storage (S3, EBS), AWS Databases (RDS, DynamoDB), and AWS Data Integration (Glue).
• Advanced understanding of container orchestration services, including Docker and Kubernetes, and a variety of AWS tools and services.
• Good understanding of AWS Identity and Access Management (IAM), AWS networking, and AWS monitoring tools.
• Proficiency in CI/CD and deployment automation using GitLab pipelines.
• Proficiency in cloud infrastructure provisioning tools, e.g., Terraform.
• Proficiency in one or more programming languages, e.g., Python, Scala.
• Experience with Starburst, Trino, and building SQL queries in a federated architecture (see the Trino sketch after this list).
• Good knowledge of Lakehouse architecture.
• Design, develop, and optimize scalable ETL/ELT pipelines using Databricks and Apache Spark (PySpark and Scala); a minimal PySpark sketch follows this list.
• Build data ingestion workflows from various sources (structured, semi-structured, and unstructured).
• Develop reusable components and frameworks for efficient data processing.
• Implement best practices for data quality, validation, and governance.
• Collaborate with data architects, analysts, and business stakeholders to understand data requirements.
• Tune Spark jobs for performance and scalability in a cloud-based environment.
• Maintain a robust data lake or Lakehouse architecture.
• Ensure high availability, security, and integrity of data pipelines and platforms.
• Support troubleshooting, debugging, and performance optimization in production workloads.
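For context only, and not part of the listing's requirements: a minimal, illustrative PySpark sketch of the kind of ETL/ELT pipeline work described above, reading raw JSON from S3, applying simple data-quality rules, and landing a Delta table in the Lakehouse. The bucket, paths, table, and column names are hypothetical placeholders; the real patterns will depend on the client's Databricks setup.

```python
# Illustrative PySpark job: ingest raw JSON from S3, validate, write Delta.
# All paths, table names, and columns below are hypothetical examples.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("orders_etl").getOrCreate()

# Ingest semi-structured source data (hypothetical bucket/prefix); records are
# assumed to carry order_id, customer_id, amount, and order_date fields.
raw = spark.read.json("s3://example-raw-bucket/orders/2025/09/")

# Basic data-quality rules: drop duplicates, require a key, reject negative amounts.
clean = (
    raw.dropDuplicates(["order_id"])
       .filter(F.col("order_id").isNotNull() & (F.col("amount") >= 0))
       .withColumn("ingested_at", F.current_timestamp())
)

# Land the curated data in the Lakehouse as a partitioned Delta table.
(clean.write.format("delta")
      .mode("append")
      .partitionBy("order_date")
      .saveAsTable("curated.orders"))
```

In a role like this, a job of this shape would typically be scheduled through Databricks workflows and promoted via GitLab CI/CD, with Spark tuning (partition sizing, join strategies) done against production-scale data.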
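Likewise, a hedged sketch of a federated SQL query run through Trino/Starburst using the open-source trino Python client. The coordinator host, catalogs, schemas, and table names are assumptions for illustration only; authentication and connection details vary by deployment.

```python
# Illustrative federated query via the Trino Python client (pip install trino).
# Host, credentials, catalogs, and table names are placeholders.
import trino

conn = trino.dbapi.connect(
    host="trino.example.internal",  # hypothetical coordinator
    port=443,
    user="data_engineer",
    http_scheme="https",
    catalog="hive",
    schema="curated",
)

cur = conn.cursor()
# Join a Lakehouse table with an operational store exposed through another catalog.
cur.execute("""
    SELECT o.order_id, o.amount, c.segment
    FROM hive.curated.orders o
    JOIN postgresql.crm.customers c
      ON o.customer_id = c.customer_id
    WHERE o.order_date >= DATE '2025-01-01'
""")
for row in cur.fetchmany(10):
    print(row)
```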