

Lead Data Engineer
Key Responsibilities:
• Design, develop, and maintain scalable and reliable ETL/ELT pipelines on Azure.
• Work closely with data scientists, analysts, and other stakeholders to deliver end-to-end data solutions.
• Build and optimize big data pipelines using PySpark and Azure Databricks.
• Orchestrate data workflows using Apache Airflow or Azure Data Factory.
• Implement and support MLOps frameworks and deployment pipelines.
• Monitor and troubleshoot data pipelines to ensure data quality and reliability across systems.
• Drive architectural decisions for data platform components, ensuring best practices are followed.
• Mentor junior data engineers and contribute to knowledge sharing within the team.
Key Skills and Technologies:
• Cloud Platforms: Microsoft Azure (Azure Data Lake, Azure Databricks, Azure Synapse, Azure Blob Storage)
• Programming Languages: Python (strong), SQL
• Big Data Technologies: Apache Spark, PySpark
• ETL Tools: Azure Data Factory, custom ETL development
• Workflow Orchestration: Apache Airflow
• DevOps/MLOps: Azure ML, MLflow, CI/CD pipelines for ML models
• Version Control: Git, GitHub/Azure DevOps
• Other Tools (Preferred): Terraform, Docker, Kubernetes