

Data Scientist
⭐ - Featured Role | Apply directly with Data Freelance Hub
This role is for a Data Scientist/Data Engineer with 3–6+ years of experience, focused on building ETL pipelines with Spark on AWS. The contract runs more than 6 months, pays $70–$80/hour, and is based in Greenwood Village, CO, with hybrid work flexibility.
🌎 - Country
United States
💱 - Currency
$ USD
💰 - Day rate
640
🗓️ - Date discovered
June 6, 2025
🕒 - Project duration
More than 6 months
🏝️ - Location type
Hybrid
📄 - Contract type
Unknown
🔒 - Security clearance
Unknown
📍 - Location detailed
Greenwood Village, CO
🧠 - Skills detailed
#Spark (Apache Spark) #JSON (JavaScript Object Notation) #Apache Spark #Logistic Regression #Anomaly Detection #Debugging #AWS (Amazon Web Services) #Python #ETL (Extract, Transform, Load) #AWS EMR (Amazon Elastic MapReduce) #Cloud #Data Engineering #Programming #Regression #Firewalls #S3 (Amazon Simple Storage Service) #Scala #Data Science
Role description
Data Scientist / Data Engineer (Full Stack)
Location: Greenwood Village, CO (4 days on-site, 1 day remote preferred)
Type: Contract with potential for full-time conversion
Compensation: $70–$80/hour, depending on experience
About the Role
Our client is seeking two versatile, full-stack Data Scientists / Data Engineers to join a high-impact team focused on improving enterprise network reliability and user experience. This role is part of a new initiative in collaboration with the Enterprise Network team, analyzing system logs from network devices (e.g., routers, firewalls) to detect anomalies and preemptively identify potential failures. You’ll be working in a fast-paced, data-rich environment, helping to build and scale a pipeline that ingests up to 50TB of raw syslog data per day (roughly 0.6 GB per second, sustained). The project is currently in its proof-of-concept (POC) phase, with plans to operationalize by mid-to-late summer.
Key Responsibilities
• Design and implement ETL pipelines using Spark (on AWS EMR) to process raw JSON syslog data from S3 (a sketch follows this list)
• Parse and transform unstructured logs into structured, meaningful features for modeling
• Collaborate with data scientists and network engineers to develop anomaly detection models
• Contribute to both data engineering and modeling efforts (initially ~70% ETL / 30% modeling, with flexibility)
• Participate in debugging, optimization, and potential future real-time/streaming implementations
• Work closely with a cross-functional team including senior data scientists, network engineers, and project leads
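To make the first two responsibilities concrete, here is a minimal PySpark sketch of that ETL step. The bucket paths, field names (`device_id`, `severity`, `timestamp`), and the severity threshold are all hypothetical; the real schema will depend on the client's syslog sources.

```python
# Minimal sketch of the Spark ETL step, assuming hypothetical paths and fields.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("syslog-etl-poc").getOrCreate()

# Read a day's worth of raw JSON syslog records from S3 (path is hypothetical).
raw = spark.read.json("s3://example-bucket/raw/syslog/2025/06/06/")

# Parse the unstructured records into structured, hourly per-device features.
features = (
    raw.withColumn("ts", F.to_timestamp("timestamp"))
       .withColumn("hour", F.date_trunc("hour", "ts"))
       .groupBy("device_id", "hour")
       .agg(
           F.count("*").alias("msg_count"),
           # Syslog severities 0-3 (emergency..error) counted as errors -- an assumed convention.
           F.sum(F.when(F.col("severity") <= 3, 1).otherwise(0)).alias("error_count"),
       )
       .withColumn("error_rate", F.col("error_count") / F.col("msg_count"))
)

# Write the structured features back to S3 as Parquet for the modeling stage.
features.write.mode("overwrite").parquet("s3://example-bucket/features/syslog_hourly/")
```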
Ideal Candidate Profile
• 3–6+ years of experience in data science, data engineering, or full-stack analytics roles
• Strong programming skills in Python or Scala
• Hands-on experience with Apache Spark, especially in distributed environments (EMR preferred)
• Proficient with AWS services: S3, Glue, EMR, and general cloud-based data workflows
• Comfortable working with raw, unstructured data and building pipelines from scratch
• Familiarity with basic networking concepts (e.g., routers, firewalls, packets) at the application level
• Experience with modeling techniques such as logistic regression, XGBoost, or similar (see the example after this list)
• Ability to work independently and collaboratively in a dynamic, evolving project environment
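As an illustration of the modeling side, the sketch below fits a logistic regression (via Spark ML) over the hourly features from the earlier ETL sketch to score failure risk. The feature columns, the label definition, and the paths are assumptions, not part of the posting.

```python
# Illustrative modeling sketch, not the client's actual approach.
# Feature columns, label definition, and paths are hypothetical.
from pyspark.sql import SparkSession
from pyspark.ml.feature import VectorAssembler
from pyspark.ml.classification import LogisticRegression

spark = SparkSession.builder.appName("syslog-anomaly-model").getOrCreate()

# Hourly per-device features; `label` = 1 if the device failed shortly after (assumed labeling).
df = spark.read.parquet("s3://example-bucket/features/syslog_hourly_labeled/")

# Assemble the numeric columns into the single vector column Spark ML expects.
assembler = VectorAssembler(
    inputCols=["msg_count", "error_count", "error_rate"],
    outputCol="features",
)
lr = LogisticRegression(featuresCol="features", labelCol="label")
model = lr.fit(assembler.transform(df))

# `probability` holds [P(healthy), P(failure)] per device-hour; the second entry is the risk score.
scored = model.transform(assembler.transform(df))
scored.select("device_id", "hour", "probability").show(5)
```

XGBoost could stand in for this baseline on the same assembled features (e.g., via the Spark integration shipped with recent XGBoost releases), as the posting suggests.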
Nice to Have
• Prior experience with streaming data or real-time analytics
• Exposure to cost optimization in cloud environments
• Familiarity with Cribl or similar log ingestion tools
• Experience working in POC-to-production transitions