Infoplus Technologies UK Limited

Data Test Automation SME

⭐ - Featured Role | Apply direct with Data Freelance Hub
This role is for a Data Test Automation SME with 8–12+ years of experience, focused on data engineering test automation. Contract length and day rate are unspecified. Key skills include Python, SQL, PySpark, and cloud data platforms (AWS/Azure/GCP).
🌎 - Country
United Kingdom
💱 - Currency
£ GBP
💰 - Day rate
Unknown
🗓️ - Date
March 14, 2026
🕒 - Duration
Unknown
🏝️ - Location
Unknown
📄 - Contract
Unknown
🔒 - Security
Unknown
📍 - Location detailed
Norwich, England, United Kingdom
🧠 - Skills detailed
#Metadata #Kafka (Apache Kafka) #Strategy #Data Profiling #Spark (Apache Spark) #AWS Glue #Data Pipeline #Databricks #Cloud #Scala #Automated Testing #Data Modeling #ETL (Extract, Transform, Load) #dbt (data build tool) #GCP (Google Cloud Platform) #PySpark #SQL (Structured Query Language) #Datasets #Automation #Airflow #Batch #Jenkins #Data Quality #IICS (Informatica Intelligent Cloud Services) #Azure DevOps #DevOps #Collibra #Informatica #Azure #GitHub #Data Transformations #Anomaly Detection #Debugging #Data Lake #Talend #Alation #AWS (Amazon Web Services) #Data Engineering #Python #MDM (Master Data Management) #Data Ingestion
Role description
Summary: We are seeking a Data Test Automation SME with deep expertise in designing, implementing, and scaling automated testing frameworks for data pipelines, data transformations, metadata, and quality validation across modern cloud data platforms. This role is focused on data engineering test automation, not UI/front-end testing. You will drive automation strategy, create reusable frameworks, and enable teams to deliver high-quality, reliable data products.

Ideal for: A senior test automation expert who has built enterprise-grade data testing solutions (in-house or using COTS/OOTS tools) and has strong hands-on skills in Python, SQL, PySpark, DQ frameworks, CI/CD, and cloud data services (AWS/Azure/GCP).

Your responsibilities:
• Define and own the end-to-end test automation strategy for data platforms, pipelines, and transformations.
• Architect and implement scalable, reusable automation frameworks covering:
 • Data ingestion (batch, streaming)
 • ETL/ELT pipelines
 • Data transformations and SCD patterns
 • Quality rules & DQ checks
 • Schema evolution and metadata validation
• Establish standards for test modularity, data generation, assertions, and environment management.
• Build automated tests for pipelines running on technologies such as:
 • AWS Glue / EMR / Databricks / Spark
 • Airflow / MWAA / Step Functions
 • Kafka / Kinesis / EventBridge integrations
• Validate transformations, aggregations, partitioning, performance, idempotency, bookmarks, and error handling.
• Implement Data Quality automation using either in-house frameworks or COTS tools such as:
 • Great Expectations, Deequ, dbt tests, Informatica DQ, Collibra DQ
• Automate:
 • Recon (row counts, checksums, match rates)
 • Business rule validations
 • Data profiling and anomaly detection
 • Golden Record validations (for MDM)
• Automate validation of:
 • Schema evolution
 • Catalog/metadata completeness
 • Lineage propagation across ETL
• Integrate testing with catalog/lineage tools such as Glue Catalog, Collibra, Alation, OpenLineage, or Marquez.
• Integrate test suites with CI/CD pipelines (GitHub Actions, Jenkins, Azure DevOps, AWS CodePipeline).
• Build automated gates to prevent bad data from being promoted to higher environments.
• Enable continuous testing with synthetic and production-masked datasets.
• Work with Data Engineers, Architects, Data Stewards, and QA/Test Leads to refine test requirements.
• Provide coaching and SME guidance to test engineers on automation best practices.
• Perform root cause analysis of data defects and drive prevention strategies.
• Contribute to sprint ceremonies, coverage reviews, and release readiness checks.

Essential skills/knowledge/experience:
• 8–12+ years overall experience, with 5+ years in data test automation.
• Strong hands-on expertise in:
 • Python, SQL, PySpark
 • ETL testing (Glue/Spark/Informatica/Talend/IICS/etc.)
 • Streaming testing (Kafka/Kinesis)
 • Data quality automation frameworks
• Experience designing automation frameworks from scratch or customizing OOTS/COTS tools.
• Strong understanding of:
 • SCD types, partitioning, data modeling
 • Data lake/lakehouse patterns
 • Test case design for large data volumes
• Experience validating data pipelines on AWS, Azure, or GCP.
• Strong debugging, analytical, and data profiling skills.
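To give candidates a sense of the reconciliation work described above (row counts, checksums, match rates), here is a minimal, framework-agnostic sketch in plain Python. The function names and the dict-based rows are illustrative assumptions, not the employer's actual framework; in practice this logic would typically run over PySpark DataFrames.

```python
import hashlib


def row_checksum(row, columns):
    """Stable per-row checksum over the selected columns (illustrative)."""
    payload = "|".join(str(row.get(c)) for c in columns)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def reconcile(source_rows, target_rows, key, columns):
    """Compare source vs. target datasets: counts, keyed checksum match rate."""
    src = {r[key]: row_checksum(r, columns) for r in source_rows}
    tgt = {r[key]: row_checksum(r, columns) for r in target_rows}
    matched = sum(1 for k, h in src.items() if tgt.get(k) == h)
    return {
        "source_count": len(src),
        "target_count": len(tgt),
        "matched": matched,
        "match_rate": matched / len(src) if src else 1.0,
    }


# Example: one of two rows drifts between source and target.
report = reconcile(
    [{"id": 1, "amt": 10}, {"id": 2, "amt": 20}],
    [{"id": 1, "amt": 10}, {"id": 2, "amt": 25}],
    key="id",
    columns=["amt"],
)
```

The same shape of report (counts, matches, rate) is what tools like Great Expectations or Deequ produce at scale.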
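The "automated gates to prevent bad data from being promoted" responsibility can be sketched as a simple promotion gate that aggregates check outcomes; the `CheckResult` type and `can_promote` function are hypothetical names for illustration, not part of any named tool.

```python
from dataclasses import dataclass


@dataclass
class CheckResult:
    """Outcome of one data quality check (illustrative structure)."""
    name: str
    passed: bool
    blocking: bool = True  # non-blocking checks warn but do not stop promotion


def can_promote(results):
    """Allow promotion only if every blocking check passed."""
    failures = [r.name for r in results if r.blocking and not r.passed]
    return len(failures) == 0, failures


# Example: a failed blocking check stops promotion to the next environment.
ok, failed = can_promote([
    CheckResult("row_count_recon", True),
    CheckResult("null_rate_under_threshold", False),
    CheckResult("freshness_sla", False, blocking=False),
])
```

In a CI/CD pipeline (GitHub Actions, Jenkins, etc.) a gate like this would run after the test suite and fail the job when `ok` is false.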