

Alex James Digital
Scala/Databricks Engineer
Featured Role | Apply direct with Data Freelance Hub
This role is for a Scala/Databricks Engineer on a contract basis in New York, paying a day rate of $960 (USD). The position requires 5+ years in Scala, 3+ years with Apache Spark, and experience with cloud platforms such as Azure, AWS, or GCP.
Country
United States
Currency
$ USD
-
Day rate
960
-
Date
October 16, 2025
Duration
Unknown
-
Location
Unknown
-
Contract
Unknown
-
Security
Unknown
-
Location detailed
New York City Metropolitan Area
-
Skills detailed
#GCP (Google Cloud Platform) #Data Lake #Compliance #Scala #Presto #Data Modeling #AWS S3 (Amazon Simple Storage Service) #Jenkins #Apache Spark #BigQuery #Cloud #Data Pipeline #Spark (Apache Spark) #Data Quality #Delta Lake #Python #Kafka (Apache Kafka) #Databricks #DevOps #Data Engineering #ML (Machine Learning) #ETL (Extract, Transform, Load) #S3 (Amazon Simple Storage Service) #Azure #Data Science #MLflow #Azure DevOps #Big Data #Programming #SQL (Structured Query Language) #AWS (Amazon Web Services) #SaaS (Software as a Service) #Automation #GitHub
Role description
We're partnering with a fast-growing technology company in New York that is scaling its data engineering and analytics platforms.
They are seeking a highly skilled Scala/Databricks Engineer on a contract basis to design, optimize, and maintain large-scale data pipelines that power mission-critical insights across the business.
This role sits within the company's core Data Engineering team and will be central to modernizing its big data ecosystem, building production-grade pipelines, and enabling advanced analytics and machine learning use cases.
Key Responsibilities
• Design and build scalable, distributed data pipelines in Databricks (Spark) using Scala (a minimal sketch follows this list).
• Develop and optimize ETL/ELT workflows for structured and unstructured data sources.
• Implement best practices in data modeling, partitioning, and performance tuning.
• Collaborate with Data Science and Analytics teams to productionize ML pipelines.
• Work with cloud-native data platforms (Azure, AWS, or GCP) to deploy and monitor workloads.
• Ensure data quality, governance, and compliance across the pipeline ecosystem.
• Contribute to CI/CD automation for data engineering workflows.
• Troubleshoot and optimize Spark jobs to improve efficiency and reduce cost.
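To illustrate the kind of batch pipeline work described above, here is a minimal sketch of a Databricks-style Spark job in Scala. It assumes a Spark 3.x runtime with Delta Lake available (as on Databricks); the storage paths, column names, and partition key are hypothetical placeholders, not details from the client's environment.

import org.apache.spark.sql.{SparkSession, functions => F}

object DailyOrdersEtl {
  def main(args: Array[String]): Unit = {
    // On Databricks a SparkSession already exists; getOrCreate() reuses it when run as a job.
    val spark = SparkSession.builder().appName("daily-orders-etl").getOrCreate()

    // Read raw, semi-structured input landed in cloud storage (hypothetical path).
    val raw = spark.read.json("/mnt/raw/orders/")

    // Basic cleansing and typing: drop malformed rows, derive a partition column.
    val cleaned = raw
      .filter(F.col("order_id").isNotNull)
      .withColumn("amount", F.col("amount").cast("decimal(18,2)"))
      .withColumn("order_date", F.to_date(F.col("order_ts")))

    // Write a partitioned Delta table so downstream SQL and BI workloads can query it efficiently.
    cleaned.write
      .format("delta")
      .mode("overwrite")
      .partitionBy("order_date")
      .save("/mnt/curated/orders/")

    spark.stop()
  }
}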
Required Skills & Experience
• 5+ years of professional experience in Scala development, with a strong background in functional programming.
• 3+ years of hands-on experience with Apache Spark (preferably in Databricks).
• Strong expertise in building and tuning large-scale ETL pipelines.
• Experience with cloud data platforms such as Azure Data Lake, AWS S3, or GCP BigQuery.
• Solid knowledge of SQL and distributed query engines (e.g., Hive, Presto), plus table formats such as Delta Lake.
• Familiarity with ML pipeline integration and working alongside Data Science teams.
• Strong understanding of CI/CD tools (Jenkins, GitHub Actions, or Azure DevOps).
• Excellent problem-solving skills, with the ability to work independently and in fast-paced environments.
Preferred Skills
• Experience with Delta Lake and Databricks MLflow.
• Knowledge of Python for data engineering tasks.
• Background in financial services, fintech, or large-scale SaaS data environments.
• Familiarity with streaming frameworks (Kafka, Structured Streaming); see the sketch after this list.
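For the streaming item above, here is a minimal sketch of a Structured Streaming job in Scala that reads from Kafka and appends to a Delta table. The broker address, topic name, and paths are hypothetical, and it assumes the Spark Kafka connector and Delta Lake are available on the cluster (both ship with the Databricks runtime).

import org.apache.spark.sql.SparkSession

object EventsStream {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("events-stream").getOrCreate()

    // Consume events from a Kafka topic (broker and topic names are placeholders).
    val events = spark.readStream
      .format("kafka")
      .option("kafka.bootstrap.servers", "broker-1:9092")
      .option("subscribe", "orders_events")
      .load()
      .selectExpr("CAST(value AS STRING) AS payload", "timestamp")

    // Append to a Delta table; the checkpoint lets the query restart without duplicating the sink.
    events.writeStream
      .format("delta")
      .option("checkpointLocation", "/mnt/checkpoints/orders_events/")
      .outputMode("append")
      .start("/mnt/curated/orders_events/")
      .awaitTermination()
  }
}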