Arkhya Tech. Inc.

Data Quality Manager

⭐ - Featured Role | Apply directly with Data Freelance Hub
This is a remote contract role for a Data Quality Engineer, focused on a Provider 360 data product in healthcare. Key skills include ETL development, data quality rule implementation, and proficiency in SQL, Python, and cloud platforms. Contract length and pay rate are unspecified.
🌎 - Country
United States
💱 - Currency
$ USD
-
💰 - Day rate
Unknown
-
🗓️ - Date
April 14, 2026
🕒 - Duration
Unknown
-
🏝️ - Location
Remote
-
📄 - Contract
Unknown
-
🔒 - Security
Unknown
-
📍 - Location detailed
United States
-
🧠 - Skills detailed
#Leadership #Data Lineage #Deployment #ETL (Extract, Transform, Load) #Agile #Programming #Data Quality #GIT #Scala #Data Ingestion #Integration Testing #Automated Testing #Data Dictionary #Documentation #AWS (Amazon Web Services) #Datasets #Jenkins #Azure #Data Architecture #Version Control #Logging #Data Engineering #PySpark #Cloud #GCP (Google Cloud Platform) #Spark (Apache Spark) #Data Modeling #Metadata #AWS Glue #ADF (Azure Data Factory) #Azure Data Factory #Scrum #SQL (Structured Query Language) #Data Lake #Data Storage #Indexing #Big Data #Jira #Automation #DevOps #Storage #Data Processing #Python #Data Pipeline #Anomaly Detection #Data Transformations #Data Governance #Data Framework #Data Management #Azure DevOps #Dataflow #Apache Spark #Physical Data Model #Data Analysis
Role description
Role: Data Quality Engineer
Location: Remote
Contract Overview:
We are seeking a Data Quality Engineer to support a Provider 360 data product program for a leading healthcare client. In this role, you will join a cross-functional team (Tech Lead, Data Modeler, Data Engineers) to deliver a high-quality data pipeline and data quality framework within a tight timeline, operating in a dynamic, fast-moving development lifecycle. The focus is on building robust ingestion pipelines, enforcing strict data quality rules, implementing thorough unit and integration testing, and contributing to a comprehensive data dictionary and metadata governance standards.
Key Responsibilities:
• Design & Build Ingestion Pipelines – Develop end-to-end data pipelines from source systems through bronze, silver, and gold layers, adhering to a medallion (multi-tier) data architecture. This includes setting up source data ingestion (bronze layer) with initial quality checks (schema validation, completeness), transforming and refining data in intermediate layers (silver), and preparing curated datasets in the gold layer aligned with business use cases.
• Implement Data Quality Controls – Define and embed data quality rules into the pipelines (e.g. checks for data completeness, consistency, and accuracy) and configure threshold-based alerts for data quality metrics. Ensure that any data anomalies trigger logging and notifications, with mechanisms for handling and recovering from data errors (e.g. retry logic, error-handling procedures). A minimal illustrative sketch follows this list.
• Testing & Validation – Establish a robust testing framework for the data pipeline. Develop automated unit tests to cover at least 90% of data transformation logic, and create integration tests to validate end-to-end data flows and dependencies. Collaborate on performance testing (throughput, latency) to ensure the data pipelines meet or exceed SLAs and can scale for future growth. An illustrative unit-test sketch appears after the requirements list below.
• Metadata Management & Documentation – Contribute to the data dictionary and metadata standards. Document all critical data elements and transformations, capturing field definitions, data types, sources, and owners for the gold layer (and relevant bronze/silver fields). Help establish clear metadata conventions (naming standards, data lineage, data quality metrics) and ensure that all documentation (dictionary, lineage, quality rules) is reviewed and approved by project stakeholders. A sample dictionary entry appears after the requirements list below.
• Collaboration & Agile Delivery – Work closely with Data Engineers, the Data Modeler, and the Tech Lead in an Agile environment to meet sprint commitments and project milestones. Communicate progress, issues, and solutions effectively with both technical team members and project leadership, aligning with best practices and project standards for data engineering.
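To illustrate the kind of quality gate described above, here is a minimal PySpark sketch of a bronze-to-silver step with a schema check, a completeness check, and a threshold-based alert. The table paths, column names (npi, provider_name, specialty), and the 5% threshold are illustrative assumptions, not details from this posting.

```python
# Minimal sketch of a bronze -> silver quality gate, assuming a Spark environment,
# hypothetical storage paths, and illustrative provider column names.
from pyspark.sql import SparkSession, functions as F

REQUIRED_COLUMNS = ["npi", "provider_name", "specialty"]
NULL_RATE_THRESHOLD = 0.05  # alert if more than 5% of rows fail the completeness check

spark = SparkSession.builder.appName("provider_bronze_to_silver").getOrCreate()

# Hypothetical bronze-layer location.
bronze = spark.read.parquet("s3://example-bucket/bronze/provider/")

# Schema validation: fail fast if an expected column is missing from the source feed.
missing = [c for c in REQUIRED_COLUMNS if c not in bronze.columns]
if missing:
    raise ValueError(f"Bronze schema check failed, missing columns: {missing}")

# Completeness check: measure the share of rows with a null key identifier.
total = bronze.count()
null_rows = bronze.filter(F.col("npi").isNull()).count()
null_rate = (null_rows / total) if total else 0.0

if null_rate > NULL_RATE_THRESHOLD:
    # In a real pipeline this would log and notify via the team's alerting channel;
    # printing is just a stand-in for that mechanism.
    print(f"ALERT: npi null rate {null_rate:.2%} exceeds {NULL_RATE_THRESHOLD:.0%}")

# Route good and bad records separately so errors can be inspected and replayed later.
silver = bronze.filter(F.col("npi").isNotNull()).dropDuplicates(["npi"])
quarantine = bronze.filter(F.col("npi").isNull())

silver.write.mode("overwrite").parquet("s3://example-bucket/silver/provider/")
quarantine.write.mode("overwrite").parquet("s3://example-bucket/quarantine/provider/")
```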
Skillset & Proficiency Requirements:
The ideal candidate will have the following skills and experience (with required proficiency levels):
• Data Pipeline Development (ETL) – Expert: Designing and building end-to-end ETL pipelines; strong ability to develop data ingestion and transformation processes.
• Medallion Architecture (Bronze/Silver/Gold) – Intermediate: Understanding of medallion (multi-layer) data lake architecture for organizing raw, refined, and curated data.
• Data Quality Rule Implementation – Expert: Defining and coding data quality rules (completeness, validity, consistency) and anomaly detection with alerts.
• Unit & Integration Testing – Expert: Developing automated unit tests for data transformations and integration tests for pipelines, ensuring ~90% coverage.
• Data Dictionary & Metadata Management – Intermediate: Documenting data definitions, lineage, and metadata; contributing to data governance and metadata standards.
• SQL & Data Modeling – Intermediate: Proficiency in SQL for data analysis/validation; understanding of logical/physical data models and relational schemas (including indexing, partitioning).
• Big Data Frameworks (Spark, etc.) – Expert: Hands-on experience with Apache Spark/PySpark or similar for large-scale data processing and pipeline development.
• Cloud Data Platforms (Azure/AWS/GCP) – Intermediate: Experience building data pipelines on cloud platforms (e.g., Azure Data Factory, AWS Glue, or GCP Dataflow) and using cloud data storage/processing services.
• Programming (Python/Scala) – Expert: Strong coding skills for data engineering tasks (data transformation, automation scripts, quality checks).
• Tools: Version Control & CI/CD – Intermediate: Familiarity with Git for version control and CI/CD pipelines (e.g., Jenkins or Azure DevOps) for automated testing and deployment.
• Project Tools (Jira, Test Management) – Intermediate: Experience working in Agile/Scrum teams using tools like Jira and test management suites (e.g., IBM Rational) for tracking tasks and defects.
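As a rough illustration of the unit-testing expectation in the Testing & Validation responsibility, here is a small pytest sketch for a single PySpark transformation. The standardize_specialty function, column name, and local-mode session are assumptions made for the example, not part of the role description.

```python
# Minimal sketch of a unit test for one data transformation, assuming pytest and
# a local SparkSession; the transformation and data are illustrative placeholders.
import pytest
from pyspark.sql import SparkSession, functions as F


def standardize_specialty(df):
    """Trim and upper-case the specialty column so downstream joins stay consistent."""
    return df.withColumn("specialty", F.upper(F.trim(F.col("specialty"))))


@pytest.fixture(scope="module")
def spark():
    session = SparkSession.builder.master("local[1]").appName("dq-tests").getOrCreate()
    yield session
    session.stop()


def test_standardize_specialty_trims_and_uppercases(spark):
    df = spark.createDataFrame([(" cardiology ",), ("Oncology",)], ["specialty"])
    result = [row.specialty for row in standardize_specialty(df).collect()]
    assert result == ["CARDIOLOGY", "ONCOLOGY"]
```

Tests like this, one per transformation rule, are what make a coverage target such as ~90% of transformation logic measurable.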
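For the data dictionary and metadata responsibility, an entry might be captured as structured metadata along these lines; the field name, source path, owner, and quality rules shown are hypothetical placeholders, not values from the program.

```python
# Illustrative sketch of a single gold-layer data dictionary entry; every value
# here is a placeholder used only to show the shape of the documentation.
provider_gold_dictionary = {
    "provider_npi": {
        "type": "string",
        "source": "bronze.provider_raw.npi",
        "owner": "Provider 360 data product team",
        "description": "National Provider Identifier, validated for completeness and uniqueness",
        "quality_rules": ["not_null", "unique", "10_digit_format"],
    }
}
```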