

Biblioso
Data Foundations & Lineage Engineer
⭐ - Featured Role | Apply directly with Data Freelance Hub
This role is for a Data Foundations & Lineage Engineer on a 12 to 18 month contract, paying $50-$60 hourly. Remote work is required during Pacific hours. Key skills include SQL, Azure Data Lake, and data governance, with 4+ years of relevant experience preferred.
🌎 - Country
United States
💱 - Currency
$ USD
-
💰 - Day rate
480
-
🗓️ - Date
January 8, 2026
🕒 - Duration
More than 6 months
-
🏝️ - Location
Remote
-
📄 - Contract
Unknown
-
🔒 - Security
Unknown
-
📍 - Location detailed
United States
-
🧠 - Skills detailed
#Azure #Data Modeling #Data Lineage #ETL (Extract, Transform, Load) #Data Lake #Data Quality #Data Catalog #SQL (Structured Query Language) #Data Engineering #BI (Business Intelligence) #Semantic Models #Data Design #Data Architecture #Databricks #Microsoft Power BI #AI (Artificial Intelligence) #Synapse #Documentation #Scala #Datasets #Metadata #Data Analysis #Data Governance
Role description
Job Title: Data Foundations & Lineage Engineer
Location: Remote, working Pacific hours
Contract Type: 12 to 18 month contract
Start Date: Immediate
Compensation: $50-$60 hourly
Position not open for C2C or any third-party arrangements.
Summary
Seeking a Data Foundations & Lineage Engineer to build, document, and maintain the core data ecosystem that drives Learning Data Intelligence. This role involves defining the structure, lineage, quality, and meaning of datasets within the Learning Lake (including HCM, Finance, HRDP, and FDL). The engineer will ensure every dataset is discoverable, well-documented, and trustworthy.
This position requires hands-on work across the lakehouse, including mapping schemas, tracing lineage, profiling quality, eliminating manual dependencies, and constructing a durable documentation layer that serves engineering, analytics, AI agents, and business stakeholders.
Data Discovery & Documentation
• Perform deep, brute‑force exploration of all Learning Lake schemas and tables to understand their meaning, business purpose, and dependencies.
• Build a comprehensive documentation repository describing dataset definitions, column‑level semantics, business logic, refresh cadences, source systems, and downstream consumption patterns.
• Translate implicit, tribal‑knowledge data flows into explicit, searchable documentation consistent with guidance.
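To illustrate the kind of data‑profiling work these responsibilities describe, here is a minimal sketch of column‑level profiling (row count, nulls, distinct values) using Python and an in‑memory SQLite table. The table and column names are hypothetical stand‑ins for Learning Lake datasets, not part of the actual environment.

```python
import sqlite3

def profile_column(conn, table, column):
    """Summarize one column: total rows, null count, distinct non-null values."""
    cur = conn.execute(
        f"SELECT COUNT(*), COUNT({column}), COUNT(DISTINCT {column}) FROM {table}"
    )
    total, non_null, distinct = cur.fetchone()
    return {
        "table": table,
        "column": column,
        "rows": total,
        "null_count": total - non_null,
        "distinct_count": distinct,
    }

# Toy dataset standing in for a Learning Lake table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE course_completions (learner_id TEXT, course_id TEXT)")
conn.executemany(
    "INSERT INTO course_completions VALUES (?, ?)",
    [("u1", "c1"), ("u2", "c1"), ("u2", None)],
)

print(profile_column(conn, "course_completions", "course_id"))
# → {'table': 'course_completions', 'column': 'course_id', 'rows': 3,
#    'null_count': 1, 'distinct_count': 1}
```

In practice the same queries would run against the lakehouse (Databricks, Synapse, or Fabric) rather than SQLite, with results written into the documentation repository.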
Data Lineage & Architecture Clarity
• Develop end‑to‑end lineage for Learning datasets, mapping sources, transformations, pipelines, and consumption (Power BI, semantic models, AI agents, etc.).
• Identify and eliminate manual or undocumented data feeds, aligning with the Manual Dependency Elimination initiative.
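End‑to‑end lineage of the sort described above is often represented as a directed graph from each dataset to its direct upstream sources. The sketch below, with entirely hypothetical dataset names, shows how transitive upstream sources can be traced from such a map.

```python
# Hypothetical lineage map: each dataset -> its direct upstream sources.
LINEAGE = {
    "powerbi_learning_report": ["learning_gold"],
    "learning_gold": ["learning_silver"],
    "learning_silver": ["hcm_raw", "hrdp_raw"],
    "hcm_raw": [],
    "hrdp_raw": [],
}

def upstream_sources(dataset, lineage):
    """Walk the lineage graph and return every transitive upstream dataset."""
    seen = set()
    stack = [dataset]
    while stack:
        node = stack.pop()
        for parent in lineage.get(node, []):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen

print(sorted(upstream_sources("powerbi_learning_report", LINEAGE)))
# → ['hcm_raw', 'hrdp_raw', 'learning_gold', 'learning_silver']
```

A map like this also makes manual feeds visible: any dataset whose upstream entry is empty but is known to be refreshed by hand is a candidate for the Manual Dependency Elimination work.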
Collaboration & Stakeholder Alignment
• Work closely with the DRI team as subject‑matter partners; escalate questions and validate assumptions.
• Partner with analytics, engineering, content, and program teams to ensure data design supports downstream reporting, modeling, and AI use cases.
Enablement & Self‑Service
• Build the foundational metadata that powers data discovery, semantic models, and self‑service analytics.
• Produce guides, readme files, and onboarding materials for all teams relying on Learning Lake.
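As a sketch of the foundational metadata mentioned above, a catalog entry per dataset might capture name, source system, refresh cadence, owner, and column‑level meaning. The fields and values below are illustrative assumptions, not the actual catalog schema.

```python
from dataclasses import dataclass, field, asdict

@dataclass
class DatasetRecord:
    """Minimal catalog entry for one Learning Lake dataset."""
    name: str
    source_system: str
    refresh_cadence: str
    owner: str
    columns: dict = field(default_factory=dict)  # column name -> business meaning

record = DatasetRecord(
    name="course_completions",
    source_system="HCM",
    refresh_cadence="daily",
    owner="learning-data-team",
    columns={
        "learner_id": "Employee identifier",
        "course_id": "Catalog course key",
    },
)
print(asdict(record)["refresh_cadence"])  # → daily
```

Serializing such records (e.g., to JSON or a catalog tool) is what makes datasets discoverable to semantic models, AI agents, and self‑service users.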
Required Qualifications
• 4+ years of experience in data engineering, data analysis, data governance, or related fields.
• Expert SQL and data‑profiling skills, with the ability to reverse‑engineer undocumented or ambiguous datasets.
• Hands‑on experience with Azure Data Lake, Microsoft Fabric, Databricks, or Synapse in production environments.
• Familiarity with metadata systems, data cataloging, lineage tooling, and orchestration best practices.
• Demonstrated ability to operate effectively in ambiguous, poorly documented, and fast‑changing data environments.
Preferred Qualifications
• Experience working across large‑scale data ecosystems with shifting taxonomies and inconsistent data quality, combined with strong foundations in data modeling, documentation systems, data product ownership, or semantic model design.
• Proven ability to partner with engineering teams on data governance, lineage, metadata standards, and quality frameworks to improve reliability and trust.
• Exposure to Learning or HR data domains (e.g., HCM, HRDP, Finance, Skills/Learning datasets), including familiarity with soft‑skilling, competency models, or employee capability frameworks.
• Experience or working knowledge of data architecture concepts (lakehouse, domain‑driven design, data contracts, schema governance).
• A strategic thinker who can link data foundations to business impact and AI‑driven outcomes, with strong prioritization and cross‑functional influence.
Benefits
At Biblioso, we are committed to the well-being of our employees and offer a competitive benefits package to support their needs, including:
• 401(k) retirement plan
• Disability coverage
• Employee Assistance Program (EAP)
• Life insurance
• Health insurance
• Paid sick time
We believe that investing in our team's well-being is essential for the success of our company.
Team Environment
In this role, the nature of the work is dynamic and requires a collaborative attitude. While you have specific duties, it's important to understand that the entire team is responsible for the final delivery, and this may occasionally involve taking on additional tasks outside your primary responsibilities. The ability to adapt and contribute wherever needed is key to succeeding in this environment.
Contact
Abier Nupen | abier@biblioso.com
This role is not open for C2C or any third-party arrangements.






