

Jobs via Dice
Data Engineer (W2 Contract)
⭐ - Featured Role | Apply directly with Data Freelance Hub
This role is for a Data Engineer (W2 Contract) with 8+ years of experience in data engineering, focusing on AWS, PySpark, and Databricks. Contract length and pay rate are unspecified. Industry experience in cybersecurity and streaming data is essential.
🌎 - Country
United States
💱 - Currency
$ USD
-
💰 - Day rate
Unknown
-
🗓️ - Date
April 8, 2026
🕒 - Duration
Unknown
-
🏝️ - Location
Unknown
-
📄 - Contract
W2 Contractor
-
🔒 - Security
Unknown
-
📍 - Location detailed
Plano, TX
-
🧠 - Skills detailed
#Databricks #PySpark #Spark (Apache Spark) #Data Ingestion #AWS (Amazon Web Services) #Batch #Slowly Changing Dimensions #ETL (Extract, Transform, Load) #Java #Data Modeling #Data Quality #Kafka (Apache Kafka) #Data Science #Cybersecurity #Python #Automation #Leadership #Debugging #ML Ops (Machine Learning Operations) #Data Pipeline #Apache Spark #Scala #Data Engineering #Pandas #ML (Machine Learning) #S3 (Amazon Simple Storage Service) #Security
Role description
Dice is the leading career destination for tech experts at every stage of their careers. Our client, HPTech Inc., is seeking the following. Apply via Dice today!
We are seeking a strong, hands-on Data Engineer to join a fast-moving Cybersecurity organization focused on threat detection, correlation, and automated remediation. This role is heavily data-engineering focused (approximately 80% Data Engineering / 20% ML exposure) and requires deep fundamentals, not surface-level experience.
This team works with large-scale, high-volume data pipelines that support near-real-time security analytics and GenAI-driven tools used by Cyber Operations teams and executive leadership.
Key Responsibilities
• Design, build, and maintain scalable data pipelines handling large volumes of structured and semi-structured data
• Develop and optimize pipelines using PySpark and Databricks
• Implement data ingestion, transformation, and automation workflows in AWS
• Work with real-time and near-real-time data sources, including Kafka and APIs
• Design pipelines supporting high-volume processing (beyond simple micro-batching)
• Apply best practices around:
• Data quality
• Performance optimization
• Pipeline reliability and scalability
• Collaborate with cybersecurity, data science, and platform teams to support:
• Threat detection use cases
• Log analysis and security telemetry
• GenAI-powered data products
• Participate in technical and behavioral interviews, including hands-on discussions and screen-sharing exercises
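The ingest → transform → load flow described above can be sketched in miniature. This is an illustrative plain-Python example, not the team's actual stack (which the posting says is PySpark and Databricks on AWS); the function names and the sample security-log records are invented for the sketch.

```python
import json

# Minimal ingest -> transform -> load pipeline sketch in plain Python.
# A production version for this role would use PySpark on Databricks with
# S3/Kafka sources; the record shape and names here are illustrative only.

RAW_EVENTS = [                       # stand-in for semi-structured log input
    '{"src_ip": "10.0.0.1", "severity": "high", "bytes": 5120}',
    '{"src_ip": "10.0.0.2", "severity": "low", "bytes": 128}',
    'not-json',                      # malformed record: real pipelines must tolerate these
]

def parse_event(line):
    """Ingest step: parse one raw line, returning None on bad input."""
    try:
        return json.loads(line)
    except json.JSONDecodeError:
        return None

def transform(events):
    """Transform step: keep only high-severity events, add a derived field."""
    for e in events:
        if e is not None and e.get("severity") == "high":
            yield {**e, "kilobytes": e["bytes"] / 1024}

def run_pipeline(raw_lines):
    """Load step stub: collect transformed records (a real job would write out)."""
    return list(transform(parse_event(line) for line in raw_lines))

print(run_pipeline(RAW_EVENTS))
```

The same shape (parse, filter/derive, sink) carries over to Spark, where each step becomes a DataFrame transformation rather than a generator.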
Required Qualifications
Data Engineering Fundamentals (Must Have)
• 8+ years of professional data engineering experience (10+ years of experience is preferred)
• Strong Python skills for data engineering (pandas, DataFrames; not application development)
• Solid hands-on experience with PySpark / Apache Spark
• Proven experience building pipelines in Databricks
• Strong AWS experience, including:
• S3
• Core AWS services used in data pipelines
• Experience designing and automating end-to-end data pipelines
• Strong understanding of data modeling and SCD (Slowly Changing Dimensions)
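To illustrate the SCD requirement above: Type 2 dimensions preserve history by closing out the current row when an attribute changes and appending a new current row. This is a minimal stdlib sketch, assuming a list-of-dicts dimension table with invented column names; in Databricks this would typically be a Delta Lake MERGE instead.

```python
from datetime import date

# Slowly Changing Dimension Type 2 sketch in plain Python.
# Column names (key, is_current, start_date, end_date) are illustrative.

def scd2_upsert(dim_rows, key, incoming, today):
    """If `incoming` attributes differ from the current row for `key`,
    close that row out and append a new current row; otherwise no-op.
    Returns the updated dimension table."""
    out = []
    changed = True
    for row in dim_rows:
        if row["key"] == key and row["is_current"]:
            if all(row.get(k) == v for k, v in incoming.items()):
                changed = False                      # no change: keep row as-is
                out.append(row)
            else:
                out.append({**row, "is_current": False, "end_date": today})
        else:
            out.append(row)
    if changed:
        out.append({"key": key, **incoming,
                    "start_date": today, "end_date": None, "is_current": True})
    return out

dim = [{"key": "host-1", "owner": "team-a",
        "start_date": date(2024, 1, 1), "end_date": None, "is_current": True}]
dim = scd2_upsert(dim, "host-1", {"owner": "team-b"}, date(2025, 6, 1))
# dim now holds two rows: the closed-out team-a row and a current team-b row
```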
Streaming & Integration
• Experience working with streaming or near-real-time data
• Kafka (must understand how to consume data, even if not used daily)
• APIs
• Micro-batching with high-volume use cases
• Ability to articulate trade-offs between batch, micro-batch, and streaming architectures
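The batch/micro-batch/streaming trade-off named above can be shown in a few lines. This is a stdlib sketch, not a Spark or Kafka API: real engines (e.g. Spark Structured Streaming) trigger micro-batches on time as well as size, whereas this version batches by count only.

```python
from itertools import islice

# Micro-batching sketch: drain an event iterator in fixed-size batches.

def micro_batches(events, batch_size):
    """Yield lists of up to `batch_size` events until the source is exhausted."""
    it = iter(events)
    while True:
        batch = list(islice(it, batch_size))
        if not batch:
            return
        yield batch

# The trade-off in miniature: one big batch maximizes throughput at the cost
# of latency; batch_size=1 approximates per-event streaming with per-event
# overhead; micro-batching sits between the two.
stream = range(10)
print([len(b) for b in micro_batches(stream, 4)])   # batches of 4, 4, then 2
```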
Engineering Depth
• Able to clearly explain core skills and hands-on contributions
• Comfortable demonstrating fundamentals during screen-share sessions
• Strong problem-solving and debugging skills
Nice to Have
• Exposure to ML / ML Ops (not the primary focus)
• Java experience
• Experience working in security, analytics, or log-heavy environments
• Valid technical certifications (must be able to clearly demonstrate and explain them)
