

Pashtek • Salesforce and SAP Partner
Data Engineer
⭐ - Featured Role | Apply directly with Data Freelance Hub
This role is for a Data Engineer on a 6-month remote contract in the U.S. at a pay rate of "X". It requires 5+ years of data engineering experience and proficiency in AWS services, Apache Spark, and data governance.
🌎 - Country
United States
💱 - Currency
$ USD
💰 - Day rate
Unknown
🗓️ - Date
October 11, 2025
🕒 - Duration
Unknown
🏝️ - Location
Remote
📄 - Contract
W2 Contractor
🔒 - Security
Unknown
📍 - Location detailed
United States
🧠 - Skills detailed
#Informatica #Data Modeling #Snowflake #Snowpark #BI (Business Intelligence) #Observability #Oracle #Tableau #Terraform #AWS (Amazon Web Services) #Azure DevOps #PCI (Payment Card Industry) #Apache Spark #Cloud #DMS (Data Migration Service) #Metadata #S3 (Amazon Simple Storage Service) #IAM (Identity and Access Management) #Data Engineering #SAP BW #DevOps #Fivetran #GitHub #Azure #Athena #Redshift #ACID (Atomicity, Consistency, Isolation, Durability) #Kubernetes #Dremio #Teradata #SAP #Infrastructure as Code (IaC) #GitLab #Security #Databricks #Delta Lake #Vault #Collibra #Clustering #Spark (Apache Spark) #EDW (Enterprise Data Warehouse) #Compliance #Microsoft Power BI #Batch #Classification #GDPR (General Data Protection Regulation) #Data Pipeline #Hadoop #Migration #Kafka (Apache Kafka) #Replication #Apache Iceberg #dbt (data build tool) #SQL (Structured Query Language) #SSIS (SQL Server Integration Services) #AWS Glue #ETL (Extract, Transform, Load) #REST (Representational State Transfer) #Data Catalog #Data Vault #Looker #Trino #Airflow #SQL Server
Role description
Location: Remote (United States)
Employment Type: Contract
About the Role
As a Data Engineer, you’ll build and operate the pipelines, tables, and services that power our hybrid data platform across on-premises and AWS. You’ll implement lakehouse patterns, productionize batch and streaming workloads, and partner with security, platform, analytics, and application teams to deliver governed, high-performance data products at scale.
What You’ll Do
• Build data pipelines: Develop reliable ELT/ETL jobs in Spark/SQL/dbt/Airflow/Glue to ingest from on-prem and cloud sources into S3-backed lakes and warehouses.
• Implement lakehouse tables: Create and maintain Iceberg (or Delta/Hudi) tables using the appropriate catalog (AWS Glue, Hive Metastore, Polaris/REST) with ACID transactions, time travel, and schema evolution (see the ingest sketch after this list).
• Operate compute engines: Run and tune Spark (Databricks/EMR), Trino/Starburst, Dremio, and Snowflake workloads; leverage pushdown and query acceleration where applicable.
• Model data for analytics: Deliver dimensional models, semantic layers, and domain-oriented data products; document contracts and SLAs with consumers.
• Governance & security: Apply data cataloging, lineage, PII classification, Lake Formation permissions, IAM roles, and row/column-level security; contribute to audit readiness.
• Performance & cost tuning: Optimize partitioning, clustering/Z-order, predicate pushdown, file sizing/compaction, caching, and workload isolation; monitor and right-size clusters.
• Migrations: Execute migration workstreams from legacy/on-prem EDW (Informatica/SSIS/SAP BW, SQL Server/Oracle, Hadoop) to lakehouse patterns (Spark/dbt/ELT), including dual-run cutovers and reconciliation.
• Streaming & CDC: Build real-time and near-real-time pipelines using Kafka/MSK/Kinesis and Spark Structured Streaming; implement CDC with Debezium, DMS, or Fivetran.
• Quality & observability: Add unit/integration tests, expectations/rules, data contracts, lineage, alerting, and SLO dashboards; participate in on-call rotations.
• Platform guardrails: Contribute to standards for naming, zones, schemas, S3 layout, encryption, backup/DR, and multi-region replication; write clear runbooks and docs.
• DevOps for data: Use Terraform/CloudFormation and CI/CD (GitHub Actions/GitLab/Azure DevOps) to version, test, and deploy data assets.
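To make the first two bullets concrete, here is a minimal PySpark sketch of a batch ELT job that lands a raw S3 extract in an Iceberg table registered in the AWS Glue catalog. The catalog name ("glue"), bucket paths, database, table, and column names are hypothetical placeholders for illustration, not details from this posting.
```python
# Hypothetical batch ELT sketch: raw S3 extract -> Iceberg table in the Glue catalog.
# All names (catalog "glue", bucket, database, table, columns) are placeholders.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = (
    SparkSession.builder.appName("orders_daily_load")
    # Register an Iceberg catalog backed by AWS Glue (assumes the Iceberg
    # Spark runtime and AWS bundle jars are on the classpath).
    .config("spark.sql.catalog.glue", "org.apache.iceberg.spark.SparkCatalog")
    .config("spark.sql.catalog.glue.catalog-impl", "org.apache.iceberg.aws.glue.GlueCatalog")
    .config("spark.sql.catalog.glue.warehouse", "s3://example-lake/warehouse/")
    .config("spark.sql.extensions", "org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions")
    .getOrCreate()
)

# Extract: read one day's raw drop from the landing zone (path is made up).
raw = spark.read.parquet("s3://example-lake/landing/orders/ingest_date=2025-10-11/")

# Transform: dedupe, fix types, and project to the target schema.
orders = (
    raw.dropDuplicates(["order_id"])
       .withColumn("order_ts", F.to_timestamp("order_ts"))
       .withColumn("amount", F.col("amount").cast("decimal(18,2)"))
       .withColumn("ingest_date", F.to_date(F.lit("2025-10-11")))
       .select("order_id", "customer_id", "order_ts", "amount", "ingest_date")
)

# Load: append into a partitioned Iceberg table. Iceberg provides the ACID
# commits, snapshots (time travel), and schema evolution mentioned above.
spark.sql("""
    CREATE TABLE IF NOT EXISTS glue.analytics.orders (
        order_id    STRING,
        customer_id STRING,
        order_ts    TIMESTAMP,
        amount      DECIMAL(18, 2),
        ingest_date DATE
    )
    USING iceberg
    PARTITIONED BY (days(order_ts))
""")
orders.writeTo("glue.analytics.orders").append()
```
In practice a job like this would be parameterized and scheduled from Airflow or Glue, with the run date passed in rather than hard-coded.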
Required Experience
• 5+ years in data engineering (or equivalent), delivering production pipelines and tables on at least two large-scale platforms.
• Hands-on with AWS data services: S3, Glue/EMR, Lake Formation, IAM, and at least one warehouse (Snowflake or Redshift).
• Deep experience with Apache Spark and at least one of: Databricks, EMR Spark, Snowflake Snowpark, Dremio, or Starburst/Trino.
• Production experience with open table formats: Apache Iceberg (preferred), Delta Lake, or Apache Hudi; strong grasp of metadata/manifests, compaction, and schema evolution (see the maintenance sketch after this list).
• Comfortable with on-prem stacks: Hadoop/Hive, Spark on Kubernetes, SQL Server/SSIS and/or Oracle/Exadata; Netezza/Teradata a plus.
• Proven data modeling (3NF, dimensional, Data Vault), ELT/ETL design, and SQL performance tuning.
• Security/governance: RBAC/ABAC, row/column-level security, masking/tokenization, KMS/key management.
• IaC & CI/CD for data workloads.
• Excellent communicator who can collaborate with platform engineers, analysts, and stakeholders to meet SLAs and roadmap goals.
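As a companion to the ingest sketch above, here is a minimal example of the table maintenance the open-table-format bullet alludes to: an additive schema change and a small-file compaction pass, expressed as Spark SQL against the same hypothetical Iceberg table. It assumes the Iceberg Spark SQL extensions are enabled on the session; the names and target sizes are illustrative.
```python
# Hypothetical Iceberg table maintenance: schema evolution + compaction.
# Reuses the placeholder "glue" catalog and analytics.orders table from above.
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder.appName("orders_table_maintenance")
    .config("spark.sql.catalog.glue", "org.apache.iceberg.spark.SparkCatalog")
    .config("spark.sql.catalog.glue.catalog-impl", "org.apache.iceberg.aws.glue.GlueCatalog")
    .config("spark.sql.catalog.glue.warehouse", "s3://example-lake/warehouse/")
    .config("spark.sql.extensions", "org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions")
    .getOrCreate()
)

# Schema evolution: adding a nullable column is a metadata-only change;
# existing snapshots and readers are unaffected.
spark.sql("ALTER TABLE glue.analytics.orders ADD COLUMNS (sales_channel STRING)")

# Compaction: rewrite small files toward ~128 MB targets so manifests and
# scan planning stay healthy after many incremental appends.
spark.sql("""
    CALL glue.system.rewrite_data_files(
        table => 'analytics.orders',
        options => map('target-file-size-bytes', '134217728')
    )
""")

# Inspect snapshots to confirm the rewrite committed; any snapshot_id here
# is a valid time-travel target.
spark.sql(
    "SELECT committed_at, snapshot_id, operation FROM glue.analytics.orders.snapshots"
).show(truncate=False)
```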
Nice to Have
• dbt for ELT, Airflow orchestration, Great Expectations/Deequ for quality, OpenLineage/Marquez for lineage.
• Streaming experience with Kafka/MSK, Kinesis, or Flink (see the streaming sketch after this list).
• Catalogs/semantic: AWS Glue Data Catalog, Unity Catalog, Amundsen/DataHub, Atlan/Collibra.
• BI/serving: DuckDB, Athena, QuickSight, Tableau/Power BI/Looker.
• Compliance: SOC 2, HIPAA/PHI, GDPR, PCI; SSO/OIDC with Okta.
• Multi-tenant platforms or federated governance exposure.
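For the streaming item above (and the Streaming & CDC responsibility earlier), here is a minimal Spark Structured Streaming sketch that reads a Kafka topic and appends micro-batches to the same hypothetical Iceberg table. Broker addresses, topic name, checkpoint path, and the event schema are assumptions made for the example; a Debezium/DMS CDC feed would additionally need a MERGE-style upsert rather than a plain append.
```python
# Hypothetical near-real-time ingest: Kafka -> Spark Structured Streaming -> Iceberg.
# Requires the spark-sql-kafka and Iceberg runtime packages; all names are placeholders.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F
from pyspark.sql.types import StructType, StructField, StringType, TimestampType, DecimalType

spark = (
    SparkSession.builder.appName("orders_stream")
    .config("spark.sql.catalog.glue", "org.apache.iceberg.spark.SparkCatalog")
    .config("spark.sql.catalog.glue.catalog-impl", "org.apache.iceberg.aws.glue.GlueCatalog")
    .config("spark.sql.catalog.glue.warehouse", "s3://example-lake/warehouse/")
    .config("spark.sql.extensions", "org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions")
    .getOrCreate()
)

# Assumed shape of the JSON events on the topic.
event_schema = StructType([
    StructField("order_id", StringType()),
    StructField("customer_id", StringType()),
    StructField("order_ts", TimestampType()),
    StructField("amount", DecimalType(18, 2)),
])

# Source: subscribe to the Kafka topic and parse the JSON payloads.
events = (
    spark.readStream.format("kafka")
    .option("kafka.bootstrap.servers", "broker1:9092,broker2:9092")
    .option("subscribe", "orders")
    .option("startingOffsets", "latest")
    .load()
    .select(F.from_json(F.col("value").cast("string"), event_schema).alias("e"))
    .select("e.*")
    .withColumn("ingest_date", F.current_date())
)

# Sink: append each micro-batch to the Iceberg table; the checkpoint lets
# restarts resume without re-committing already-written batches.
query = (
    events.writeStream.format("iceberg")
    .outputMode("append")
    .option("checkpointLocation", "s3://example-lake/checkpoints/orders_stream/")
    .trigger(processingTime="1 minute")
    .toTable("glue.analytics.orders")
)
query.awaitTermination()
```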
Location & Work Style
Remote, with core hours in PST/CST.