The Custom Group of Companies

Data Engineer - III

⭐ - Featured Role | Apply directly with Data Freelance Hub
This role is for a Data Engineer - III with a contract length of "unknown," offering a pay rate of "$XX/hour." Key skills required include ETL development using Databricks, AWS expertise, and proficiency in Python and SQL.
🌎 - Country
United States
💱 - Currency
$ USD
-
💰 - Day rate
Unknown
-
🗓️ - Date
February 3, 2026
🕒 - Duration
Unknown
-
🏝️ - Location
Unknown
-
📄 - Contract
Unknown
-
🔒 - Security
Unknown
-
📍 - Location detailed
New York, NY
-
🧠 - Skills detailed
#Security #EC2 #Storage #Data Ingestion #AWS Databases #Apache Spark #Deployment #Docker #Agile #Data Architecture #Data Lake #ETL (Extract, Transform, Load) #Spark (Apache Spark) #RDS (Amazon Relational Database Service) #Trino #SQL Queries #Data Analysis #Data Pipeline #Data Processing #Scala #Code Reviews #Databricks #DynamoDB #SaaS (Software as a Service) #Data Engineering #GitLab #SQL (Structured Query Language) #Automation #Debugging #Data Quality #Terraform #Kubernetes #Python #Monitoring #PySpark #S3 (Amazon Simple Storage Service) #AWS (Amazon Web Services) #Data Integration #Databases #Programming #Cloud #Computer Science
Role description
Your role as a Senior Data Engineer:
• Work on migrating applications from on-premises locations to cloud service providers.
• Develop products and services on the latest technologies through contributions in development, enhancement, testing, and implementation.
• Develop, modify, and extend code for building cloud infrastructure, and automate it using CI/CD pipelines.
• Partner with business and peers in the pursuit of solutions that achieve business goals through an agile software development methodology.
• Perform problem analysis, data analysis, reporting, and communication.
• Work with peers across the system to define and implement best practices and standards.
• Assess applications and help determine the appropriate application infrastructure patterns.
• Use best practices and knowledge of internal or external drivers to improve products or services.
What we are looking for:
• Hands-on experience in building ETL using Databricks SaaS infrastructure.
• Experience in developing data pipeline solutions to ingest and exploit new and existing data sources.
• Expertise in leveraging SQL, programming languages such as Python, and ETL tools such as Databricks.
• Ability to perform code reviews to ensure requirements, optimal execution patterns, and adherence to established standards.
• Computer Science degree or equivalent.
• Expertise in AWS Compute (EC2, EMR), AWS Storage (S3, EBS), AWS Databases (RDS, DynamoDB), and AWS Data Integration (Glue).
• Advanced understanding of container orchestration services, including Docker and Kubernetes, and a variety of AWS tools and services.
• Good understanding of AWS Identity and Access Management (IAM), AWS networking, and AWS monitoring tools.
• Proficiency in CI/CD and deployment automation using GitLab pipelines.
• Proficiency in cloud infrastructure provisioning tools, e.g., Terraform.
• Proficiency in one or more programming languages, e.g., Python, Scala.
• Experience with Starburst/Trino and building SQL queries in a federated architecture.
• Good knowledge of Lakehouse architecture.
Responsibilities:
• Design, develop, and optimize scalable ETL/ELT pipelines using Databricks and Apache Spark (PySpark and Scala); a minimal illustrative sketch follows this list.
• Build data ingestion workflows from various sources (structured, semi-structured, and unstructured).
• Develop reusable components and frameworks for efficient data processing.
• Implement best practices for data quality, validation, and governance.
• Collaborate with data architects, analysts, and business stakeholders to understand data requirements.
• Tune Spark jobs for performance and scalability in a cloud-based environment.
• Maintain a robust data lake or Lakehouse architecture.
• Ensure high availability, security, and integrity of data pipelines and platforms.
• Support troubleshooting, debugging, and performance optimization in production workloads.
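To make the day-to-day work concrete, here is a minimal sketch of the kind of Databricks/PySpark ETL pipeline described above. The S3 path, column names, and target table are illustrative assumptions rather than details from this posting, and the Delta write assumes a Databricks (or other Delta-enabled Spark) environment.

```python
# Minimal PySpark ETL sketch: ingest raw JSON, apply basic data-quality rules,
# and append to a Delta (Lakehouse) table. All paths, columns, and table names
# below are hypothetical examples, not taken from this job posting.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("events_etl").getOrCreate()

# Extract: read semi-structured source data from S3 (hypothetical bucket/prefix).
raw = spark.read.json("s3://example-data-lake/raw/events/")

# Transform: basic validation and type normalization.
events = (
    raw.filter(F.col("event_id").isNotNull())         # drop records without a key
       .withColumn("event_ts", F.to_timestamp("event_ts"))
       .withColumn("ingest_date", F.current_date())   # partition column
       .dropDuplicates(["event_id"])
)

# Load: append to a Lakehouse table, partitioned for downstream queries.
(
    events.write.format("delta")
          .mode("append")
          .partitionBy("ingest_date")
          .saveAsTable("analytics.events_clean")      # hypothetical catalog table
)
```

In practice, logic like this would typically run as a scheduled Databricks job or notebook task, with Spark tuning (partitioning, cluster sizing, file compaction) handled per workload, in line with the performance and reliability responsibilities listed above.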