

Hays
Data Engineer
⭐ - Featured Role | Apply direct with Data Freelance Hub
This role is a Data Engineer position for a 6-month contract, offering a pay rate of "$X per hour." It requires strong Python and PySpark skills, experience with Behave testing, Delta Lake optimization, and Azure services.
🌎 - Country
United Kingdom
💱 - Currency
£ GBP
-
💰 - Day rate
Unknown
-
🗓️ - Date
November 29, 2025
🕒 - Duration
Unknown
-
🏝️ - Location
Unknown
-
📄 - Contract
Unknown
-
🔒 - Security
Unknown
-
📍 - Location detailed
England, United Kingdom
-
🧠 - Skills detailed
#Data Ingestion #Data Processing #Data Lake #Cloud #Documentation #Storage #Python #Unit Testing #Compliance #Vault #Version Control #Data Security #"ETL (Extract, Transform, Load)" #Azure #Azure Blob Storage #Docker #Programming #Data Science #Agile #PySpark #DevOps #Security #Data Governance #Databricks #Azure DevOps #Deployment #Delta Lake #Synapse #"ACID (Atomicity, Consistency, Isolation, Durability)" #Azure Cloud #Spark (Apache Spark) #Scala #Data Engineering
Role description
We are seeking a highly skilled Python Data Engineer with hands-on experience in Behave-based unit testing, PySpark development, Delta Lake optimization, and Azure cloud services. This role involves designing, developing, and deploying scalable data processing solutions in a containerized environment, with an emphasis on maintainable, configurable, and test-driven code delivery.
Key Responsibilities:
• Develop and maintain data ingestion, transformation, and validation pipelines using Python and PySpark.
• Implement unit and behavior-driven testing using Behave, ensuring robust mocking and patching of dependencies.
• Design and maintain Delta Lake tables for optimized query performance, ACID compliance, and incremental data loads.
• Build and manage containerized environments using Docker for consistent development, testing, and deployment.
• Develop configurable, parameter-driven codebases to support modular and reusable data solutions.
• Integrate Azure services, including Azure Functions for serverless transformation logic, Azure Key Vault for secure credential management, and Azure Blob Storage for data lake operations.
• Collaborate closely with cloud architects, data scientists, and DevOps teams to ensure seamless CI/CD workflows, version control, and environment consistency.
• Troubleshoot and optimize Spark jobs for performance and scalability in production environments.
• Maintain technical documentation and adhere to best practices in cloud security and data governance.
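To illustrate the testing responsibility above: Behave step implementations typically call ordinary Python functions whose external dependencies are mocked or injected, so scenarios run without touching real data sources. A minimal sketch using only the standard library's `unittest.mock` (the function and client names are hypothetical, not from this posting):

```python
from unittest.mock import MagicMock

# Hypothetical pipeline function: fetches raw rows from an injected
# source client and keeps only rows with a non-null id.
def load_valid_rows(source_client):
    rows = source_client.fetch()
    return [r for r in rows if r.get("id") is not None]

def test_load_valid_rows_filters_missing_ids():
    # Mock the external dependency instead of hitting a real data source.
    client = MagicMock()
    client.fetch.return_value = [{"id": 1}, {"id": None}, {"id": 2}]

    result = load_valid_rows(client)

    assert result == [{"id": 1}, {"id": 2}]
    client.fetch.assert_called_once()
```

In a Behave suite, the `@given`/`@when`/`@then` step functions would wire up the same mock and call `load_valid_rows` in exactly this way; the dependency-injection style shown here is what makes that wiring straightforward.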
Required Skills and Experience:
• Strong proficiency in Python programming with emphasis on modular and test-driven design.
• Demonstrated experience in writing unit tests and BDD scenarios using Behave or similar frameworks.
• In-depth understanding of mocking, patching, and dependency injection in Python testing.
• Proficiency in PySpark with hands-on experience in distributed data processing and performance tuning.
• Solid understanding of Delta Lake concepts, transactional guarantees, and schema evolution.
• Experience with Docker for development, testing, and deployment workflows.
• Familiarity with Azure components such as Azure Functions, Key Vault, Blob Storage, and Data Lake Storage Gen2.
• Ability to implement configuration-driven applications for flexible deployment across environments.
• Experience with CI/CD pipelines (Azure DevOps or similar) and infrastructure-as-code tools is a plus.
• Strong problem-solving skills and ability to work independently in fast-paced, agile environments.
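As a sketch of the configuration-driven requirement above: the same codebase reads a different config document per environment, with deployment-time overrides, rather than hard-coding paths or batch sizes. All names and values here are hypothetical, using only the standard library:

```python
import json

# Hypothetical dev-environment config; prod would supply its own document.
DEV_CONFIG = """
{
  "environment": "dev",
  "input_path": "/data/landing",
  "output_path": "/data/curated",
  "batch_size": 500
}
"""

def load_config(raw_json, overrides=None):
    """Parse a JSON config and apply optional per-deployment overrides."""
    config = json.loads(raw_json)
    config.update(overrides or {})
    return config

def build_job_args(config):
    """Turn a config dict into the argument list a pipeline entry point expects."""
    return [
        f"--input={config['input_path']}",
        f"--output={config['output_path']}",
        f"--batch-size={config['batch_size']}",
    ]
```

For example, `build_job_args(load_config(DEV_CONFIG, {"batch_size": 1000}))` produces the same argument shape in every environment while the values come entirely from configuration.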
Preferred Qualifications:
• Experience developing in Databricks or Synapse with Delta Lake integration.
• Knowledge of best practices in data security and governance within Azure ecosystems.
• Strong communication skills and experience collaborating with distributed teams.
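The Delta Lake incremental-load work mentioned above centers on MERGE (upsert) semantics: update rows whose keys already exist in the target, insert the rest. In practice this is written with `DeltaTable.merge` in PySpark; the standard-library sketch below only illustrates the semantics, and the row/key names are hypothetical:

```python
def merge_upsert(target, updates, key="id"):
    """Plain-Python sketch of Delta MERGE semantics over lists of dict rows.

    A real pipeline would express this as DeltaTable.merge with
    whenMatchedUpdate / whenNotMatchedInsert clauses; this sketch only
    shows the resulting upsert behavior.
    """
    by_key = {row[key]: row for row in target}
    for row in updates:
        by_key[row[key]] = row  # matched key -> update, unmatched -> insert
    return sorted(by_key.values(), key=lambda r: r[key])

target = [{"id": 1, "v": "a"}, {"id": 2, "v": "b"}]
updates = [{"id": 2, "v": "B"}, {"id": 3, "v": "c"}]
# -> [{'id': 1, 'v': 'a'}, {'id': 2, 'v': 'B'}, {'id': 3, 'v': 'c'}]
merged = merge_upsert(target, updates)
```

Delta's transaction log is what makes the real version ACID and replayable for incremental loads; the upsert logic itself is no more complicated than this.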






