

Rivago Infotech Inc
Data Engineer
⭐ - Featured Role | Apply directly with Data Freelance Hub
This role is for an AWS Databricks Data Engineer in Los Angeles, CA (Hybrid), hired as FTE or contract-to-hire (CTH). Key skills include SQL, Python, PySpark, and Databricks. Certifications such as Databricks Certified Data Engineer are optional.
🌎 - Country
United States
💱 - Currency
$ USD
-
💰 - Day rate
504
-
🗓️ - Date
February 25, 2026
🕒 - Duration
Unknown
-
🏝️ - Location
Hybrid
-
📄 - Contract
W2 Contractor
-
🔒 - Security
Unknown
-
📍 - Location detailed
Los Angeles Metropolitan Area
-
🧠 - Skills detailed
#Datasets #GIT #Data Governance #Deployment #Version Control #Delta Lake #Batch #S3 (Amazon Simple Storage Service) #BI (Business Intelligence) #SQL (Structured Query Language) #AWS (Amazon Web Services) #Cloud #Storage #Databricks #Data Pipeline #Security #Scala #DevOps #Triggers #PySpark #ETL (Extract, Transform, Load) #GitLab #Spark (Apache Spark) #Python #Data Engineering #IAM (Identity and Access Management) #Compliance #Databases
Role description
H1B workable
Nearby or relocation candidates only
Job Title: AWS Databricks Data Engineer
Location: Los Angeles, CA (Hybrid)
Hire type: FTE / CTH
Implementation partner -
End Client - Confidential
Interview mode: Video/Virtual
Job Description –
We are seeking a highly skilled AWS Data Engineer with strong expertise in SQL, Python, PySpark, Data Warehousing, and Cloud-based ETL to join our data engineering team. The ideal candidate will design, implement, and optimize large-scale data pipelines, ensuring scalability, reliability, and high performance. This role requires close collaboration with cross-functional teams and business stakeholders to deliver modern, efficient data solutions.
Key Responsibilities
1. Data Pipeline Development
• Build and maintain scalable ETL/ELT pipelines using Databricks on AWS (a minimal sketch follows this list).
• Leverage PySpark/Spark and SQL to transform and process large, complex datasets.
• Integrate data from multiple sources including S3, relational/non-relational databases, and AWS-native services.
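As a rough illustration of this kind of pipeline, the sketch below shows a minimal PySpark batch job on Databricks that reads raw JSON from S3, applies basic cleanup, and appends to a Delta table. The bucket, paths, table, and column names are hypothetical placeholders, not details from this role.

```python
from pyspark.sql import SparkSession, functions as F

# Minimal batch ETL sketch: S3 (raw) -> transform -> Delta table.
# All paths, table names, and columns below are hypothetical.
spark = SparkSession.builder.appName("orders-batch-etl").getOrCreate()

# Extract: read raw JSON landed in S3.
raw = spark.read.json("s3://example-raw-bucket/orders/2026/02/")

# Transform: basic filtering and typing.
orders = (
    raw
    .filter(F.col("order_id").isNotNull())
    .withColumn("order_ts", F.to_timestamp("order_ts"))
    .withColumn("amount", F.col("amount").cast("decimal(18,2)"))
)

# Load: append into a Delta table (Delta Lake ships with Databricks runtimes).
orders.write.format("delta").mode("append").saveAsTable("analytics.orders_bronze")
```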
2. Collaboration & Analysis
• Partner with downstream teams to prepare data for dashboards, analytics, and BI tools.
• Work closely with business stakeholders to understand requirements and deliver tailored, high‑quality data solutions.
3. Performance & Optimization
• Optimize Databricks workloads for cost, performance, and efficient compute utilization.
• Monitor and troubleshoot pipelines to ensure reliability, accuracy, and SLA adherence.
• Apply query optimization, Spark tuning, and shuffle minimization best practices when handling tens of millions of rows (see the tuning sketch after this list).
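One common tuning pattern, sketched below under assumptions: enabling adaptive query execution (AQE) and broadcasting a small dimension table so a large fact-table join avoids a full shuffle. The table and column names are hypothetical.

```python
from pyspark.sql import SparkSession
from pyspark.sql.functions import broadcast

spark = SparkSession.builder.appName("tuning-sketch").getOrCreate()

# Let adaptive query execution coalesce shuffle partitions and pick
# join strategies at runtime (enabled by default on recent runtimes).
spark.conf.set("spark.sql.adaptive.enabled", "true")
spark.conf.set("spark.sql.adaptive.coalescePartitions.enabled", "true")

facts = spark.table("analytics.orders_bronze")   # large fact table (hypothetical)
dims = spark.table("analytics.dim_customers")    # small dimension table (hypothetical)

# Broadcasting the small side turns a shuffle join into a map-side join,
# often the biggest single win when joining tens of millions of rows.
enriched = facts.join(broadcast(dims), "customer_id")
enriched.write.format("delta").mode("overwrite").saveAsTable("analytics.orders_enriched")
```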
4. Governance & Security
• Implement and manage data governance, access control, and security policies using Unity Catalog (illustrated in the sketch after this list).
• Ensure compliance with organizational and regulatory data‑handling standards.
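For context, Unity Catalog access control is expressed as SQL GRANT statements on securables such as catalogs, schemas, and tables; the minimal sketch below runs such grants from Python. The catalog, schema, table, and group names are hypothetical.

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Unity Catalog governance as SQL GRANTs; all object and group
# names below are hypothetical placeholders.
spark.sql("GRANT USE CATALOG ON CATALOG analytics TO `data-analysts`")
spark.sql("GRANT USE SCHEMA ON SCHEMA analytics.sales TO `data-analysts`")
spark.sql("GRANT SELECT ON TABLE analytics.sales.orders_gold TO `data-analysts`")

# Verify the effective grants on the table.
spark.sql("SHOW GRANTS ON TABLE analytics.sales.orders_gold").show(truncate=False)
```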
5. Deployment & DevOps
• Use Databricks Asset Bundles for deployment of jobs, notebooks, and configuration across environments (see the deployment sketch after this list).
• Maintain effective version control of Databricks artifacts using GitLab or similar tools.
• Use CI/CD pipelines to support automated deployments and environment setups.
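Databricks Asset Bundles are defined in a databricks.yml and driven by the Databricks CLI (databricks bundle validate / deploy). The sketch below is one possible thin Python wrapper a CI job could call; the target names are hypothetical and assume a databricks.yml that defines them.

```python
import subprocess

def deploy_bundle(target: str) -> None:
    """Validate and deploy a Databricks Asset Bundle to the given target.

    Assumes the Databricks CLI is on PATH and the repo contains a
    databricks.yml defining the targets used here (e.g. dev, prod).
    """
    subprocess.run(["databricks", "bundle", "validate", "-t", target], check=True)
    subprocess.run(["databricks", "bundle", "deploy", "-t", target], check=True)

if __name__ == "__main__":
    deploy_bundle("dev")  # hypothetical target defined in databricks.yml
```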
Technical Skills (Required)
• Strong expertise in Databricks (Delta Lake, Unity Catalog, Lakehouse Architecture, Table Triggers, Workflows, Delta Live Tables pipelines, Databricks Runtime, etc.).
• Proven ability to implement robust PySpark solutions.
• Hands‑on experience with Databricks Workflows & orchestration.
• Solid knowledge of Medallion Architecture (Bronze/Silver/Gold); a minimal Delta Live Tables sketch follows this list.
• Significant experience designing or rebuilding batch‑heavy data pipelines.
• Strong background in query optimization, performance tuning, and Spark shuffle optimization.
• Ability to handle and process tens of millions of records efficiently.
• Familiarity with Genie enablement concepts (understanding required; deep experience optional).
• Experience with CI/CD, environment setup, and Git-based development workflows.
• Solid understanding of AWS cloud, including:
  • IAM
  • Networking fundamentals
  • Storage integration (S3, Glue Catalog, etc.)
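To make the Medallion and Delta Live Tables items above concrete, here is a minimal DLT sketch of a bronze-to-silver flow. It assumes it runs inside a DLT pipeline on Databricks (where the dlt module and a spark session are provided), and the S3 path and table names are hypothetical.

```python
import dlt
from pyspark.sql import functions as F

# Bronze: ingest raw files incrementally with Auto Loader ("cloudFiles").
@dlt.table(comment="Raw orders landed from S3 (bronze). Hypothetical path.")
def orders_bronze():
    return (
        spark.readStream.format("cloudFiles")
        .option("cloudFiles.format", "json")
        .load("s3://example-raw-bucket/orders/")
    )

# Silver: cleaned, typed records with a basic quality expectation
# that drops rows missing an order_id.
@dlt.table(comment="Cleaned orders (silver).")
@dlt.expect_or_drop("valid_order_id", "order_id IS NOT NULL")
def orders_silver():
    return (
        dlt.read_stream("orders_bronze")
        .withColumn("order_ts", F.to_timestamp("order_ts"))
    )
```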
Preferred Experience
• Experience with Databricks Runtime configurations and advanced features.
• Knowledge of streaming frameworks such as Spark Structured Streaming (see the streaming sketch after this list).
• Experience developing real-time or near real-time data solutions.
• Exposure to GitLab pipelines or similar CI/CD systems.
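A hedged sketch of a near-real-time flow using Spark Structured Streaming with Databricks Auto Loader (the cloudFiles source, which is Databricks-specific): new files are picked up incrementally and appended to a Delta table with checkpointing. All paths and names are hypothetical.

```python
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("orders-stream").getOrCreate()

# Auto Loader incrementally discovers new files in S3; the paths and
# schema location below are hypothetical placeholders.
events = (
    spark.readStream.format("cloudFiles")
    .option("cloudFiles.format", "json")
    .option("cloudFiles.schemaLocation", "s3://example-bucket/_schemas/orders/")
    .load("s3://example-raw-bucket/orders/")
)

# Continuous append into Delta; the checkpoint gives exactly-once file handling.
query = (
    events.withColumn("ingested_at", F.current_timestamp())
    .writeStream
    .format("delta")
    .option("checkpointLocation", "s3://example-bucket/_checkpoints/orders/")
    .trigger(processingTime="1 minute")
    .toTable("analytics.orders_stream_bronze")
)
query.awaitTermination()
```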
Certifications (Optional)
• Databricks Certified Data Engineer Associate / Professional
• AWS Data Engineer or AWS Solutions Architect certification