

VRK IT Vision Inc.
Data Engineer Architect
Featured Role | Apply direct with Data Freelance Hub
This role is for a Data Engineer Architect; contract length and pay rate are unspecified. Work is remote. Key skills include Azure Databricks, PySpark, Azure Data Factory, and SQL. Experience in data governance and Azure cloud integration is required.
Country: United States
Currency: USD ($)
Day rate: Unknown
Date: June 24, 2026
Duration: Unknown
Location: Unknown
Contract: Unknown
Security: Unknown
Location detailed: United States
Skills detailed:
#DevOps #ADLS (Azure Data Lake Storage) #Azure cloud #Delta Lake #Azure #Documentation #Azure Data Factory #REST (Representational State Transfer) #Agile #Kafka (Apache Kafka) #Data Engineering #Code Reviews #REST API #Data Ingestion #Scrum #Scala #ADF (Azure Data Factory) #AutoScaling #Synapse #Data Lake #Data Architecture #Azure Databricks #Spark SQL #Azure Synapse Analytics #Storage #GIT #Azure DevOps #Triggers #ETL (Extract, Transform, Load) #Data Quality #SQL (Structured Query Language) #Databricks #Debugging #Security #Vault #Spark (Apache Spark) #Cloud #PySpark #Data Cleansing #Azure SQL #Data Governance
Role description
Key Responsibilities:
Data Engineering & Pipeline Development
Design, develop, and optimize ETL/ELT pipelines using Azure Databricks (PySpark).
Build scalable data ingestion workflows from various structured and unstructured sources.
Implement transformation logic, data cleansing, enrichment, and validation frameworks.
Work with Delta Lake to build medallion architecture (Bronze/Silver/Gold layers).
Develop reusable Databricks notebooks and jobs for production data workflows.
Azure Cloud & Integration
Build and orchestrate pipelines using Azure Data Factory (ADF).
Integrate Databricks with other Azure services: ADLS, Azure SQL, Event Hub, Key Vault, Synapse.
Optimize compute environments (clusters, pools, autoscaling).
Implement DevOps processes using Git, CI/CD, and Azure DevOps.
Performance, Quality & Governance
Optimize PySpark jobs for performance and cost efficiency.
Implement best practices for data governance, security, and access control.
Troubleshoot production issues and perform root-cause analysis.
Conduct code reviews ensuring coding standards and data quality.
Collaboration & Documentation
Work with Data Architects to define architecture and design patterns.
Prepare technical documents, solution diagrams, and runbooks.
Collaborate with business stakeholders to understand requirements and translate them into technical solutions.
Mandatory Skills:
Azure Databricks: notebooks, jobs, workflows, Delta Lake.
PySpark: DataFrames, Spark SQL, optimization & debugging.
Azure Data Factory (ADF): triggers, pipelines, integration runtime.
Data Lake Storage (ADLS Gen2): folder structures, partitioning, security.
CI/CD: Git (branching strategies), Azure DevOps pipelines.
SQL: strong proficiency in writing optimized queries.
Good-to-Have Skills:
Azure Synapse Analytics
Azure Event Hub / Kafka
Azure Functions
Databricks REST APIs
Streaming pipelines (Structured Streaming)
Experience with data modelling
Knowledge of Lakehouse architecture
Behavioral & Soft Skills:
Strong analytical and problem-solving skills.
Ability to work independently and in cross-functional teams.
Good communication skills for stakeholder interaction.
Comfortable working in Agile/Scrum models.






