

TerraGiG
AWS Data Engineer
Featured Role | Apply direct with Data Freelance Hub
This role is for an AWS Data Engineer with 8+ years of experience, offering a 2-month remote contract (extendable) at a competitive pay rate. Key skills include Python, PySpark, ETL pipelines, AWS EMR, and data governance expertise.
Country
United Kingdom
Currency
£ GBP
Day rate
Unknown
Date
October 17, 2025
Duration
1 to 3 months
Location
Remote
Contract
Unknown
Security
Unknown
Location detailed
United Kingdom
Skills detailed
#Storage #Databricks #Programming #PySpark #Monitoring #Data Engineering #AWS EMR (Amazon Elastic MapReduce) #Scala #Data Storage #Apache Spark #AWS S3 (Amazon Simple Storage Service) #Spark (Apache Spark) #Data Pipeline #Python #S3 (Amazon Simple Storage Service) #Deployment #AWS (Amazon Web Services) #Data Lake #ETL (Extract, Transform, Load) #Data Governance #API (Application Programming Interface)
Role description
Role
AWS Data Engineer
Experience
8+ years
Location
Remote
Time Zone
UK
Duration
2 months (Extendable)
Job Description
• Design, development, and implementation of performant ETL pipelines using the Python API of Apache Spark (PySpark) on AWS EMR (an illustrative sketch follows this list).
• Writing reusable, testable, and efficient code.
• Integration of data storage solutions in Spark, especially AWS S3 object storage; performance tuning of PySpark scripts.
• Ensure overall build quality and on-time delivery at all times.
• Able to handle customer meetings with ease.
• Excellent communication skills for interacting with the customer.
• Be a team player, willing to work in an onsite-offshore model, and mentor other team members (both onsite and offshore).
• 5+ years of programming experience with Python; strong proficiency in Python.
• Familiarity with functional programming concepts.
• 3+ years of hands-on experience developing ETL data pipelines using PySpark on AWS EMR.
• Experience building pipelines and data lakes for large enterprises on AWS.
• Good understanding of Spark's DataFrame API.
• Experience configuring EMR clusters on AWS.
• Experience working with AWS S3 object storage from Spark.
• Experience troubleshooting Spark jobs; knowledge of monitoring Spark jobs using the Spark UI.
• Performance tuning of Spark jobs.
• Understanding of the fundamental design principles behind business processes.
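For illustration only (not a client requirement): a minimal sketch of the kind of PySpark-on-EMR ETL job described above. The bucket names, columns, and aggregation are hypothetical placeholders, not part of the role description.

from pyspark.sql import SparkSession
from pyspark.sql import functions as F

def main():
    # On EMR, the SparkSession picks up cluster configuration (YARN, S3 access via instance roles).
    spark = SparkSession.builder.appName("orders-daily-etl").getOrCreate()

    # Read raw data from S3 object storage (hypothetical bucket/prefix).
    raw = spark.read.parquet("s3://example-raw-bucket/orders/")

    # Example transformation: drop invalid rows, derive a date column, aggregate per customer/day.
    cleaned = (
        raw.filter(F.col("amount") > 0)
           .withColumn("order_date", F.to_date("order_ts"))
    )
    daily = (
        cleaned.groupBy("customer_id", "order_date")
               .agg(F.sum("amount").alias("daily_amount"),
                    F.count("*").alias("order_count"))
    )

    # Write results back to S3, partitioned for downstream consumers.
    (daily.write
          .mode("overwrite")
          .partitionBy("order_date")
          .parquet("s3://example-curated-bucket/orders_daily/"))

    spark.stop()

if __name__ == "__main__":
    main()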
Process Knowledge and Expertise:
• Demonstrated experience in change management processes, including understanding of governance frameworks and preparation of supporting artefacts required for approvals.
• Strong clarity on the path to production, with hands-on involvement in deployments, testing cycles, and obtaining business sign-offs.
• Proven track record in technical solution design, with the ability to provide architectural guidance and support implementation strategies.
Databricks-Specific Skills:
• Experience developing and delivering at least one end-to-end Proof of Concept (POC) solution covering the following (an illustrative sketch follows this list):
• Basic proficiency in Databricks, including creating jobs and configuring clusters.
• Exposure to connecting external data sources (e.g., Amazon S3) to Databricks.
• Understanding of Unity Catalog and its role in data governance.
• Familiarity with notebook orchestration and implementing modular code structures to enhance scalability and maintainability.
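Again for illustration only: a sketch of a Databricks notebook cell touching the POC items above (an external S3 source, a Unity Catalog table as the target). Catalog, schema, table, and path names are hypothetical; in a Databricks notebook the spark session is provided by the runtime.

# Read an external S3 source into a DataFrame (hypothetical bucket/prefix).
events = spark.read.format("json").load("s3://example-landing-bucket/events/")

# Light cleanup before publishing.
cleaned = events.dropDuplicates(["event_id"]).where("event_ts IS NOT NULL")

# Unity Catalog addresses tables with a three-level namespace: <catalog>.<schema>.<table>.
(cleaned.write
        .mode("overwrite")
        .saveAsTable("poc_catalog.bronze.events"))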
Important Pointers:
• Candidates must have actual hands-on work experience, not just home projects or academic exercises.
• Profiles should clearly state how much experience they have in each skill area, as this helps streamline the interview process.
• Candidates must know their CV/profile inside out, including all projects and responsibilities listed. Any ambiguity or lack of clarity on the candidate's part can lead to immediate rejection, as we value accuracy and ownership.
• They should be able to confidently explain their past experience, challenges handled, and technical contributions.