Hays

Data Engineer

⭐ - Featured Role | Apply direct with Data Freelance Hub
This role is a Data Engineer position on a 6-month contract, offering a pay rate of "$X per hour". It requires strong Python and PySpark skills, along with experience in Behave testing, Delta Lake optimization, and Azure services.
🌎 - Country
United Kingdom
💱 - Currency
£ GBP
-
💰 - Day rate
Unknown
-
🗓️ - Date
November 29, 2025
🕒 - Duration
Unknown
-
🏝️ - Location
Unknown
-
📄 - Contract
Unknown
-
🔒 - Security
Unknown
-
📍 - Location detailed
England, United Kingdom
-
🧠 - Skills detailed
#Data Ingestion #Data Processing #Data Lake #Cloud #Documentation #Storage #Python #Unit Testing #Compliance #Vault #Version Control #Data Security #ETL (Extract, Transform, Load) #Azure #Azure Blob Storage #Docker #Programming #Data Science #Agile #PySpark #DevOps #Security #Data Governance #Databricks #Azure DevOps #Deployment #Delta Lake #Synapse #ACID (Atomicity, Consistency, Isolation, Durability) #Azure Cloud #Spark (Apache Spark) #Scala #Data Engineering
Role description
We are seeking a highly skilled Python Data Engineer with hands-on experience in Behave-based unit testing, PySpark development, Delta Lake optimization, and Azure cloud services. This role involves designing, developing, and deploying scalable data processing solutions in a containerized environment, with an emphasis on maintainable, configurable, and test-driven code delivery.

Key Responsibilities:
• Develop and maintain data ingestion, transformation, and validation pipelines using Python and PySpark.
• Implement unit and behavior-driven testing using Behave, ensuring robust mocking and patching of dependencies.
• Design and maintain Delta Lake tables for optimized query performance, ACID compliance, and incremental data loads.
• Build and manage containerized environments using Docker for consistent development, testing, and deployment.
• Develop configurable, parameter-driven codebases to support modular and reusable data solutions.
• Integrate Azure services, including Azure Functions for serverless transformation logic, Azure Key Vault for secure credential management, and Azure Blob Storage for data lake operations.
• Collaborate closely with cloud architects, data scientists, and DevOps teams to ensure seamless CI/CD workflows, version control, and environment consistency.
• Troubleshoot and optimize Spark jobs for performance and scalability in production environments.
• Maintain technical documentation and adhere to best practices in cloud security and data governance.

Required Skills and Experience:
• Strong proficiency in Python programming with an emphasis on modular and test-driven design.
• Demonstrated experience writing unit tests and BDD scenarios using Behave or similar frameworks.
• In-depth understanding of mocking, patching, and dependency injection in Python testing.
• Proficiency in PySpark with hands-on experience in distributed data processing and performance tuning.
• Solid understanding of Delta Lake concepts, transactional guarantees, and schema evolution.
• Experience with Docker for development, testing, and deployment workflows.
• Familiarity with Azure components such as Azure Functions, Key Vault, Blob Storage, and Data Lake Storage Gen2.
• Ability to implement configuration-driven applications for flexible deployment across environments.
• Experience with CI/CD pipelines (Azure DevOps or similar) and infrastructure-as-code tools is a plus.
• Strong problem-solving skills and ability to work independently in fast-paced, agile environments.

Preferred Qualifications:
• Experience developing in Databricks or Synapse with Delta Lake integration.
• Knowledge of best practices in data security and governance within Azure ecosystems.
• Strong communication skills and experience collaborating with distributed teams.
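The mocking and patching of dependencies that the testing requirements describe can be sketched with Python's built-in unittest.mock; the same pattern applies inside a Behave step implementation. The function names here (fetch_rows, validate, run_pipeline) and the storage path are hypothetical, not taken from the role.

```python
from unittest.mock import patch

# Hypothetical pipeline under test: fetch_rows would talk to real storage
# in production, so tests replace it rather than call it.
def fetch_rows(source: str) -> list[dict]:
    raise NotImplementedError("talks to real storage in production")

def validate(rows: list[dict]) -> list[dict]:
    # Drop rows that are missing a primary key.
    return [r for r in rows if r.get("id") is not None]

def run_pipeline(source: str) -> list[dict]:
    return validate(fetch_rows(source))

# Inside a Behave step (or a plain unit test), the external dependency is
# patched so behaviour can be asserted without touching real storage.
with patch(f"{__name__}.fetch_rows", return_value=[{"id": 1}, {"id": None}]):
    result = run_pipeline("abfss://container/path")

print(result)  # → [{'id': 1}]
```

Patching the function where it is looked up (the module under test) rather than where it is defined is the detail interviewers usually probe when they ask about "robust mocking and patching".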
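The configuration-driven, parameter-driven design the listing asks for can be sketched as one frozen config object per environment, so identical pipeline code runs against dev and prod. The environment names, storage accounts, and path layout below are illustrative assumptions only.

```python
from dataclasses import dataclass

# Hypothetical configuration: one immutable dataclass instance per environment.
@dataclass(frozen=True)
class PipelineConfig:
    storage_account: str
    container: str
    batch_size: int = 1000

ENVIRONMENTS = {
    "dev":  PipelineConfig(storage_account="devstore", container="raw"),
    "prod": PipelineConfig(storage_account="prodstore", container="raw",
                           batch_size=50_000),
}

def input_path(cfg: PipelineConfig) -> str:
    # Azure Blob Storage path built from configuration, never hard-coded,
    # so the same job deploys unchanged across environments.
    return f"abfss://{cfg.container}@{cfg.storage_account}.dfs.core.windows.net/"

cfg = ENVIRONMENTS["dev"]
print(input_path(cfg))  # → abfss://raw@devstore.dfs.core.windows.net/
```

In practice the ENVIRONMENTS mapping would be loaded from a config file or Azure Key Vault rather than defined inline; the point is that code paths take a config object as a parameter instead of branching on environment names.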