MSR Technology Group

Data Engineer

⭐ - Featured Role | Apply direct with Data Freelance Hub
This role is for a Data Engineer; the contract length and pay rate are listed as unknown. Key skills include Azure Data Factory, PySpark, and data modeling. 5+ years of data warehouse development experience is preferred; Medicaid domain knowledge is a plus.
🌎 - Country
United States
💱 - Currency
$ USD
-
💰 - Day rate
440
-
πŸ—“οΈ - Date
June 17, 2026
🕒 - Duration
Unknown
-
🏝️ - Location
Unknown
-
📄 - Contract
Unknown
-
🔒 - Security
Unknown
-
πŸ“ - Location detailed
United States
-
🧠 - Skills detailed
#Azure SQL #Terraform #Data Bricks #Monitoring #PySpark #Data Modeling #SQL (Structured Query Language) #Azure Databricks #Data Warehouse #Informatica #Azure #Vault #Slowly Changing Dimensions #DevOps #Azure Data Factory #Bash #Triggers #SQL Server #Teradata #Python #Logging #ADLS (Azure Data Lake Storage) #Snowflake #REST (Representational State Transfer) #YAML (YAML Ain't Markup Language) #Pandas #ERWin #Storage #Azure DevOps #EDW (Enterprise Data Warehouse) #Databricks #Data Engineering #Migration #Unit Testing #Datasets #API (Application Programming Interface) #REST API #ETL (Extract, Transform, Load) #Spark SQL #Azure ADLS (Azure Data Lake Storage) #Azure Synapse Analytics #Cloud #Spark (Apache Spark) #Data Pipeline #KQL (Kusto Query Language) #Delta Lake #Informatica PowerCenter #ADF (Azure Data Factory) #Data Vault #Oracle #UAT (User Acceptance Testing) #Synapse #Data Lake
Role description
Key Responsibilities:

Pipeline Design & Development
• Design and build robust, reusable, parameter-driven ingestion and transformation pipelines using Azure Data Factory, Synapse Pipelines, Databricks and/or Microsoft Fabric Data Factory.
• Implement medallion architecture (Bronze / Silver / Gold) on Azure Data Lake Storage Gen2 using Delta Lake, Parquet, and structured streaming patterns.
• Build performant ELT workflows that leverage pushdown to source systems (Synapse Dedicated SQL Pool, Azure SQL, Teradata) where appropriate.
• Develop and optimize PySpark notebooks and jobs on Azure Databricks or Synapse Spark.

Data Modeling & Warehousing
• Design dimensional models (Kimball star/snowflake) and data vault patterns for analytics consumption.
• Implement Slowly Changing Dimensions (Type 1/2/3), Change Data Capture, and late-arriving data patterns.
• Tune distributed SQL workloads in Synapse Dedicated SQL Pool / Fabric Warehouse, including distribution keys, partitioning, and clustered columnstore indexes.

Platform Engineering & DevOps
• Implement CI/CD for data pipelines using Azure DevOps (YAML pipelines, ARM/Bicep/Terraform) across Dev / SIT / UAT / Prod environments.
• Instrument pipelines with robust logging, auditing, and monitoring using Azure Monitor, Log Analytics, and KQL.

Enterprise Data Warehouse (EDW) ETL/Informatica Developer
• Define and enforce coding standards, code review practices, branching strategies, and release management.

Migration & Modernization
• Lead or contribute to legacy-to-cloud migrations (e.g., Informatica PowerCenter to Azure Data Factory; on-premises Teradata / Oracle / SQL Server to Synapse or Fabric).
• Perform workload assessment, capacity planning, and cost modeling for target-state architectures.
• Production incident response for critical pipelines.

Required Qualifications:
• Deep hands-on expertise with Azure Data Factory: pipelines, datasets, linked services, triggers, parameterization, mapping data flows, and all three Integration Runtime types (Azure, Self-hosted, SSIS).
• Strong experience in Databricks and PySpark.
• Production experience with one or more of: Azure Synapse Analytics (Dedicated and Serverless SQL Pools, Spark Pools) OR Azure Databricks (Delta Lake, Unity Catalog) OR Microsoft Fabric (Warehouse, Lakehouse, OneLake).
• Strong working knowledge of Azure Data Lake Storage Gen2 (hierarchical namespace, RBAC + ACLs, lifecycle management, security).
• Experience with Azure Key Vault, Azure AD / Entra ID (including managed identities and service principals), and private networking (VNet integration, private endpoints).
• Monitoring and troubleshooting with Azure Monitor, Log Analytics, and KQL.
• Advanced SQL: window functions, CTEs, query optimization, execution plan analysis, performance tuning.
• Strong Python for data engineering: pandas, PySpark, REST API integration, unit testing (pytest).
• Proficient in T-SQL; familiarity with Spark SQL, KQL, PowerShell, and Bash shell scripting.

Preferred Qualifications:
• 5+ years of data warehouse development experience.
• 5+ years of data modeling experience using ERWIN or similar tools.
• 2+ years of experience with Azure Data Factory and Snowflake.
• Medicaid domain knowledge is a plus.