Hive Science

Full-Stack Data Engineer - R&D

This role is a Full-Stack Data Engineer (R&D) for a 6-month contract in Durham, NC, offering a competitive pay rate. Key skills required include PostgreSQL/MySQL, AWS, Python, and modern data engineering practices. Candidates should have 5+ years in research-oriented data engineering roles.
🌎 - Country
United States
💱 - Currency
$ USD
💰 - Day rate
Unknown
🗓️ - Date
October 2, 2025
🕒 - Duration
Unknown
🏝️ - Location
On-site
📄 - Contract
Unknown
🔒 - Security
Unknown
📍 - Location detailed
Raleigh-Durham-Chapel Hill Area
🧠 - Skills detailed
#PostgreSQL #Monitoring #Scala #R #Python #Lambda (AWS Lambda) #Data Governance #Data Engineering #AWS (Amazon Web Services) #Pandas #Airflow #Data Science #Prometheus #REST (Representational State Transfer) #Firewalls #MySQL #AI (Artificial Intelligence) #API (Application Programming Interface) #NumPy #Cloud #React #S3 (Amazon Simple Storage Service) #Data Lineage #Grafana #Java #Security #ML (Machine Learning) #Athena #Agile #SciPy #Databases #ETL (Extract, Transform, Load) #IP (Internet Protocol) #Infrastructure as Code (IaC) #ML Ops (Machine Learning Operations) #Terraform #GraphQL
Role description
Hive Science is transforming how the world's largest brands understand consumer behavior through the intersection of psychological science and modern data engineering. We're building the future of behavioral intelligence, and we need an engineer-scientist who can architect the data systems that power breakthrough research.

The Role:
We're seeking a Full-Stack Data Engineer (R&D) who thinks like a scientist and builds like an engineer. You'll be the bridge between cutting-edge behavioral research and scalable data infrastructure, maintaining and evolving systems that process millions of survey responses, millions of multi-modal content items, and millions of behavioral signals into actionable insights. This isn't just about managing databases - it's about understanding how social scientists think about data and building the tools they need to uncover human truths at scale.

The role is also about our clients and our clients’ consumers: designing interaction interfaces and secure access to our intelligence (protecting our IP) so it can be used worldwide. This means we also need you to create and manage GPTs and MCP servers so our clients can access our intelligence and build it into their day-to-day activities.
Core Technical Requirements:

Research-Oriented Data Infrastructure
• Expert-level PostgreSQL/MySQL with deep understanding of survey data structures, panel management, and longitudinal study requirements
• AWS ecosystem mastery (S3, Glue, Athena, Lambda, Step Functions) with focus on research reproducibility and data lineage
• Python for scientific computing pipelines (pandas, numpy, scipy) integrated with modern orchestration tools (Airflow/Dagster)
• Experience with research-specific challenges: missing data patterns, survey weights, complex skip logic, and multi-wave panel designs

Modern Engineering Practices
• Infrastructure as Code (Terraform/CloudFormation) for reproducible research environments
• CI/CD, ELT, Containerisation and MLOps pipelines optimized for data science workflows
• Proficiency with modern data formats (Parquet, Arrow, Graph) and computation frameworks (Dask, Polars)
• Experience with vector databases and embedding pipelines for behavioral pattern analysis
• Architect and build modern web applications with responsive, performant UIs (React/Vue) and robust, secure backends (Python/Node/Java)

API & Integration Architecture
• Design RESTful and GraphQL APIs that expose complex research data in intuitive ways
• Architect, build, maintain and optimise GPTs / MCP servers [LLMs / Agentic / RAG techniques]
• Build real-time data streaming pipelines for behavioral event capture
• Create standardized interfaces for integrating third-party survey platforms and behavioral tracking tools
• Experience with webhook architectures and event-driven systems

Research Data Governance
• Implement monitoring, uptime tracking, and alerting systems (e.g., Prometheus, CloudWatch, Grafana) to maintain system reliability
• Implement differential privacy and statistical disclosure control methods
• Build consent management and data retention systems for human subjects research
• Create audit trails that satisfy both IRB requirements and enterprise security standards
• Experience with PII tokenization and secure data rooms for collaborative research
• Harden the platform with secure coding practices, firewalls, role-based access, encryption at rest and in transit, and secrets management (e.g., AWS Secrets Manager)
• Develop and enforce governance best practices for working with LLM APIs / MCP servers, including data redaction, prompt injection protection, and PII safeguards

The Engineer-Scientist Profile We're Seeking:
• 5+ years in data engineering roles at research-oriented organizations (market research, academic labs, or data-for-good initiatives)
• Demonstrated ability to translate between research methodologies and technical implementations
• Portfolio showing both technical depth (complex ETL pipelines, API design) and research understanding (survey methodology, experimental design)
• Experience with modern MLOps practices for deploying behavioral models
• Comfort with ambiguity - able to scope solutions when researchers say "we need to explore the data"
• Active engagement with the data engineering community (contributions to open source, technical writing, conference talks)

Cultural Fit:
You should be energized by:
• Building tools that accelerate scientific discovery
• Collaborating with scientists who push methodological boundaries
• Creating elegant solutions to complex data challenges
• Working in a high-velocity environment where your infrastructure directly impacts customer outcomes
• Being at the cutting edge of technological change / disruption
• Being agile and adaptable

Additional Details:
At Hive Science, you won't just be maintaining databases - you'll be building the computational foundation for the world’s first psychological intelligence technology platform.
Your work will directly enable researchers to uncover insights that transform how major brands understand and serve their customers. This is an opportunity to apply modern engineering practices to accelerate scientific discovery.

As a fast-paced startup, each day is different from the one before. We’re nimble and creative, and value intellectual humility. We work really hard because we’re all 100% dedicated to the future we’re building. Our work is stimulating, challenging, and exciting. And our team is awesome. At Hive we only hire exceptional people, so you’ll be in good company - surrounded by passionate, insanely smart people who want to build the future of customer intelligence.

Specifically, we’re looking for someone who will thrive in this type of environment:
• Fast-paced startup with competing demands and multiple ongoing priorities
• A ‘solve the problem’ mentality
• Scrappy and creative
• Strong passion for the Hive Science mission and a love of the scientific method

This is an in-person role in Durham, NC - we cannot consider candidates who do not currently live within commuting distance.

To apply, please send your resume to careers@hivescience.ai