

Data Scientist
⭐ - Featured Role | Apply directly with Data Freelance Hub
This role is for a Data Scientist/Data Engineer with 3–6+ years of experience, focused on building ETL pipelines with Spark on AWS. The contract runs more than 6 months, pays $70–$80/hour, and is based in Greenwood Village, CO, with hybrid work flexibility.
🌎 - Country
United States
💱 - Currency
$ USD
💰 - Day rate
640
🗓️ - Date discovered
June 6, 2025
🕒 - Project duration
More than 6 months
🏝️ - Location type
Hybrid
📄 - Contract type
Unknown
🔒 - Security clearance
Unknown
📍 - Location detailed
Greenwood Village, CO
🧠 - Skills detailed
#Spark (Apache Spark) #JSON (JavaScript Object Notation) #Apache Spark #Logistic Regression #Anomaly Detection #Debugging #AWS (Amazon Web Services) #Python #ETL (Extract, Transform, Load) #AWS EMR (Amazon Elastic MapReduce) #Cloud #Data Engineering #Programming #Regression #Firewalls #S3 (Amazon Simple Storage Service) #Scala #Data Science
Role description
Data Scientist / Data Engineer (Full Stack)
Location: Greenwood Village, CO (4 days on-site, 1 day remote preferred)
Type: Contract with potential for full-time conversion
Compensation: $70–$80/hour, depending on experience
About the Role
Our client is seeking two versatile, full-stack Data Scientists / Data Engineers to join a high-impact team focused on improving enterprise network reliability and user experience. This role is part of a new initiative in collaboration with the Enterprise Network team, analyzing system logs from network devices (e.g., routers, firewalls) to detect anomalies and preemptively identify potential failures. You’ll be working in a fast-paced, data-rich environment, helping to build and scale a pipeline that ingests up to 50TB of raw syslog data per day (roughly 0.6 GB per second, sustained). The project is currently in its proof-of-concept (POC) phase, with plans to operationalize by mid-to-late summer.
Key Responsibilities
• Design and implement ETL pipelines using Spark (on AWS EMR) to process raw JSON syslog data from S3 (a sketch follows this list)
• Parse and transform unstructured logs into structured, meaningful features for modeling
• Collaborate with data scientists and network engineers to develop anomaly detection models
• Contribute to both data engineering and modeling efforts (initially ~70% ETL / 30% modeling, with flexibility)
• Participate in debugging, optimization, and potential future real-time/streaming implementations
• Work closely with a cross-functional team including senior data scientists, network engineers, and project leads
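To make the first two responsibilities concrete, here is a minimal PySpark sketch of that ETL step. The bucket paths, field names (`device_id`, `severity`, `timestamp`), and the severity threshold are all hypothetical; the real schema will depend on the client's syslog sources.

```python
# Minimal sketch of the Spark ETL step, assuming hypothetical paths and fields.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("syslog-etl-poc").getOrCreate()

# Read a day's worth of raw JSON syslog records from S3 (path is hypothetical).
raw = spark.read.json("s3://example-bucket/raw/syslog/2025/06/06/")

# Parse the unstructured records into structured, hourly per-device features.
features = (
    raw.withColumn("ts", F.to_timestamp("timestamp"))
       .withColumn("hour", F.date_trunc("hour", "ts"))
       .groupBy("device_id", "hour")
       .agg(
           F.count("*").alias("msg_count"),
           # Syslog severities 0-3 (emergency..error) counted as errors -- an assumed convention.
           F.sum(F.when(F.col("severity") <= 3, 1).otherwise(0)).alias("error_count"),
       )
       .withColumn("error_rate", F.col("error_count") / F.col("msg_count"))
)

# Write the structured features back to S3 as Parquet for the modeling stage.
features.write.mode("overwrite").parquet("s3://example-bucket/features/syslog_hourly/")
```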
Ideal Candidate Profile
• 3–6+ years of experience in data science, data engineering, or full-stack analytics roles
• Strong programming skills in Python or Scala
• Hands-on experience with Apache Spark, especially in distributed environments (EMR preferred)
• Proficient with AWS services: S3, Glue, EMR, and general cloud-based data workflows
• Comfortable working with raw, unstructured data and building pipelines from scratch
• Familiarity with basic networking concepts (e.g., routers, firewalls, packets) at the application level
• Experience with modeling techniques such as logistic regression, XGBoost, or similar (see the example after this list)
• Ability to work independently and collaboratively in a dynamic, evolving project environment
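As an illustration of the modeling side, the sketch below fits a logistic regression (via Spark ML) over the hourly features from the earlier ETL sketch to score failure risk. The feature columns, the label definition, and the paths are assumptions, not part of the posting.

```python
# Illustrative modeling sketch, not the client's actual approach.
# Feature columns, label definition, and paths are hypothetical.
from pyspark.sql import SparkSession
from pyspark.ml.feature import VectorAssembler
from pyspark.ml.classification import LogisticRegression

spark = SparkSession.builder.appName("syslog-anomaly-model").getOrCreate()

# Hourly per-device features; `label` = 1 if the device failed shortly after (assumed labeling).
df = spark.read.parquet("s3://example-bucket/features/syslog_hourly_labeled/")

# Assemble the numeric columns into the single vector column Spark ML expects.
assembler = VectorAssembler(
    inputCols=["msg_count", "error_count", "error_rate"],
    outputCol="features",
)
lr = LogisticRegression(featuresCol="features", labelCol="label")
model = lr.fit(assembler.transform(df))

# `probability` holds [P(healthy), P(failure)] per device-hour; the second entry is the risk score.
scored = model.transform(assembler.transform(df))
scored.select("device_id", "hour", "probability").show(5)
```

XGBoost could stand in for this baseline on the same assembled features (e.g., via the Spark integration shipped with recent XGBoost releases), as the posting suggests.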
Nice to Have
• Prior experience with streaming data or real-time analytics
• Exposure to cost optimization in cloud environments
• Familiarity with Cribl or similar log ingestion tools
• Experience working in POC-to-production transitions