

Theoris
Data Engineer
Featured Role | Apply direct with Data Freelance Hub
This role is for a Data Engineer with over 3 years of experience in data engineering and software development, focused on pharmaceutical data. The contract runs longer than 6 months, the work is remote, and the pay rate is not listed. Key skills include Python, AWS, ETL, and data visualization tools such as Power BI.
Country: United States
Currency: $ USD
Day rate: Unknown
Date: March 20, 2026
Duration: More than 6 months
Location: Remote
Contract: Unknown
Security: Unknown
Location detailed: Indianapolis, IN
Skills detailed: #AWS S3 (Amazon Simple Storage Service) #React #Pandas #BI (Business Intelligence) #Computer Science #Unit Testing #Base #S3 (Amazon Simple Storage Service) #GIT #Spotfire #Code Reviews #EC2 #Presto #Version Control #LDAP (Lightweight Directory Access Protocol) #Indexing #Visualization #TypeScript #Flask #Database Performance #Snowflake #Athena #Programming #Scala #AWS (Amazon Web Services) #PySpark #Metadata #Microsoft Power BI #Spark (Apache Spark) #JavaScript #Datasets #Documentation #Cloud #Data Manipulation #Data Quality #FastAPI #ML (Machine Learning) #RDS (Amazon Relational Database Service) #Data Pipeline #Libraries #Security #Automated Testing #Compliance #Trino #ETL (Extract, Transform, Load) #Deployment #AI (Artificial Intelligence) #Data Engineering #Data Management #Databases #API (Application Programming Interface) #Redshift #Python #Monitoring #PostgreSQL #AWS EC2 (Amazon Elastic Compute Cloud) #Lambda (AWS Lambda)
Role description
Job Title: Data/Software Engineer
Location: Remote
Industry: Pharmaceutical
• NO C2C

Job Description:
Theoris Services is assisting our client in their search for a Data/Software Engineer to add to their growing team. Our client is seeking someone with data visualization experience and strong software engineering skills (creating reusable libraries, applying best practices, troubleshooting).
Responsibilities:
Data Pipeline & Backend Development
• Design, build, and optimize scalable data pipelines and ETL/ELT processes to integrate and harmonize scientific data (compounds, assays, experiments) from 30+ heterogeneous sources.
• Implement and maintain lakehouse architectures on AWS (S3, Glue, Athena, Iceberg) to support multibillion-record datasets.
• Develop federated query capabilities using Trino (or similar distributed engines) for unified access across platforms such as PostgreSQL and Snowflake.
• Build robust backend services, RESTful APIs, and data services using Python (FastAPI or Flask preferred) to enable seamless data flow and integration with scientific tools (e.g., Benchling, computational chemistry systems, AI/ML endpoints).
Performance Optimization & Troubleshooting
• Optimize query and database performance for complex analytical workloads across PostgreSQL, Iceberg, Trino, and other platforms.
• Implement caching, indexing, and query tuning techniques to improve response times and scalability as data volumes and user bases grow.
• Apply reverse engineering and advanced troubleshooting skills to proactively debug complex data issues, pipeline bottlenecks, application failures, and performance problems.
• Monitor systems, identify root causes, and implement fixes for data and application reliability.
Data Visualization & User-Facing Analytics
• Design and develop interactive dashboards, visual analytics, and scientific data visualizations using Power BI and Spotfire (or equivalent tools).
• Create reusable visualization components and data-rich UIs (React/TypeScript preferred) that enable scientists to search, filter, explore, and interpret complex datasets, including dose-response curves, chemical structures, and analytical results.
• Translate scientific and engineering data into clear, actionable visual insights for researchers and stakeholders.
Software Engineering & Quality Practices
• Apply software engineering best practices: modular/reusable design, clean code principles, code reviews, comprehensive documentation, and maintainable libraries/services.
• Write high-quality unit, integration, and end-to-end tests; use mock data effectively to create reliable automated test cases and ensure code stability.
• Implement CI/CD pipelines for automated testing, deployment, and monitoring on AWS (EC2, ECS, Lambda, S3).
• Collaborate on full-stack features from database to frontend, ensuring end-to-end functionality, security (SSO/LDAP), and performance.
Collaboration & Governance
• Partner with scientists, UX designers, and cross-functional teams to gather requirements, conduct user testing, and iterate on usability.
• Implement data validation, quality checks, metadata management, and governance to ensure compliance and accuracy.
• Contribute to engineering best practices and foster a culture of quality and scalability.
Requirements:
Education & Experience
• Bachelor's degree in Computer Science, Data Engineering, Software Engineering, Information Systems, or a related technical field.
• 3+ years of professional experience in data engineering, full-stack development, or closely related roles.
• Proven track record of building and delivering production-grade data pipelines, platforms, and/or user-facing scientific applications.
Technical Skills
• Programming: Intermediate to strong proficiency in Python (core for pipelines, backend, and data manipulation with pandas/PySpark); familiarity with JavaScript/TypeScript for frontend work.
• Data Engineering: Hands-on experience creating scalable pipelines, ETL/ELT processes, and distributed processing (Spark, Trino/Presto).
• Databases & Querying: Deep expertise in relational databases (PostgreSQL), modern warehouses (Snowflake, Redshift), and query engines; strong focus on query performance and optimization.
• Cloud Platforms: Practical experience with AWS services (S3, Glue, Athena, Lambda, RDS, EC2/ECS).
• Data Visualization: Proven experience with Power BI and Spotfire (or similar) for scientific and analytical dashboards/visualizations.
• Frontend (preferred): Modern JavaScript/TypeScript frameworks (React preferred), responsive UI development, and component libraries.
• Testing & Quality: Strong unit testing skills; experience writing automated tests with mock data for robust coverage.
• Tools & Practices: Git for version control; RESTful API design; CI/CD; clean code and reusable library development.
Core Competencies
• Excellent reverse engineering and troubleshooting capabilities for complex data and system issues.
• Strong problem-solving skills, attention to detail, and a commitment to data quality and accuracy.
• Ability to work independently and collaboratively in cross-functional, scientific teams.
• Excellent communication skills to bridge technical concepts with non-technical stakeholders (scientists, researchers).
Best-in-Class Benefits:
We are in the people business; treating people right is our ONLY priority. Theoris Services consultants are full-time employees with full benefits, including:
• Robust health insurance
• 401(k) plan
About Theoris:
Our goal is to Fuel Your Career! As a Theoris team member, you join a culture based on people-centered values and an environment that fosters both personal and professional growth. We build long-term relationships with our clients and our consultants. With over 30 years of building strong relationships in the industry, we're uniquely positioned to make the right connections and find the right job placement. Our recruiting teams are experts dedicated to the information technology and engineering staffing space and are highly respected by our client base.





