

Insight Global
Data Engineer
⭐ - Featured Role | Apply direct with Data Freelance Hub
This role is for a Data Engineer. The contract length is not specified, and the pay rate is $60-70/hr. Candidates must have at least 2 years of experience in life sciences, pharmaceutical manufacturing, or scientific R&D, proficiency in GCP/BigQuery, and strong Python skills.
🌎 - Country
United States
💱 - Currency
$ USD
-
💰 - Day rate
560
-
🗓️ - Date
July 1, 2026
🕒 - Duration
Unknown
-
🏝️ - Location
Unknown
-
📄 - Contract
Unknown
-
🔒 - Security
Unknown
-
📍 - Location detailed
United States
-
🧠 - Skills detailed
#Datasets #Data Architecture #Python #Database Schema #Data Quality #SQL (Structured Query Language) #BigQuery #React #Data Modeling #GCP (Google Cloud Platform) #ML (Machine Learning) #Storage #Cloud #Data Engineering #Data Transformations #Documentation #Data Pipeline #AI (Artificial Intelligence) #Data Science #Version Control #Metadata #R #Scala #Database Design #Data Warehouse #ETL (Extract, Transform, Load)
Role description
• THIS POSITION REQUIRES SPECIFIC INDUSTRY EXPERIENCE IN LIFE SCIENCES, PHARMACEUTICAL MANUFACTURING, AND/OR SCIENTIFIC R&D. Please do NOT apply if you do not have at least 2 years of relevant industry experience from the last five years.
• The Senior Data Engineer will design, build, and deliver a new enterprise data product supporting the client's generative drug design and computational chemistry platforms. This role focuses on creating scalable, well‑structured data architecture from the ground up, with long‑term expansion and downstream AI/ML integration in mind. The ideal candidate combines strong data engineering expertise with an understanding of drug design, chemistry, and scientific data workflows.
Responsibilities include, but are not limited to:
• Design and implement a new enterprise data product, initially scoped as a standalone deliverable with future integration into broader AI‑driven drug discovery platforms.
• Build scalable data pipelines, schemas, and storage models capable of supporting large, complex scientific and chemistry‑derived datasets.
• Develop data solutions primarily on GCP / BigQuery, adhering to enterprise data engineering templates and standards.
• Implement data transformations and pipelines using Python, with a focus on data quality, traceability, and performance.
• Ensure the data architecture supports future expansion, additional datasets, and evolving analytical and computational needs.
• Collaborate closely with computational chemists, data scientists, and ML engineers to ensure data models align with generative design, molecular representations, and ML outputs.
• Apply an understanding of drug design and chemistry concepts (e.g., molecular properties, structure‑activity data, experimental outputs) to inform data modeling and integration decisions.
• Provide technical guidance on data structure, scalability, and long‑term maintainability in an enterprise environment.
The Data Engineer will take end‑to‑end ownership of a new cloud‑native data product on Google Cloud Platform, leveraging established in‑house templates and standards to deliver a robust, scalable, and production‑grade solution. The role involves designing and operating reliable ingestion pipelines for external data sources, integrating and harmonising key fields with parallel external data products, and delivering curated, analytics‑ready datasets that are readable, updatable, and trusted by downstream users. The engineer will apply strong expertise in Python, SQL, columnar data warehouses (e.g. BigQuery), schema and data‑model design, pipeline orchestration, and query optimization, while embedding best practices around data quality, testing, metadata, and documentation. Operating as part of a cross‑functional environment, the role requires a solid understanding of core cloud concepts, CI/CD, version control, and data reliability in production. Experience with scientific or R&D data—particularly chemistry or related life‑science domains—would be highly advantageous, enabling effective standardization, interpretation, and integration of complex domain‑specific datasets.
Required Skills & Experience
• Strong experience in data engineering, including database, schema, and data product design.
• Hands‑on experience with GCP and BigQuery.
• Proficiency in Python for building and maintaining data pipelines.
• CI/CD experience.
• Experience working with large, complex datasets at scale, ideally in scientific or R&D contexts.
• Background in life sciences, pharma, or scientific data platforms.
Nice to Have Skills & Experience
• Database Design experience.
• Experience supporting downstream analytics, ML pipelines, or AI‑driven platforms, particularly in R&D or discovery environments.
• Working knowledge or hands‑on exposure to drug design, chemistry, or computational chemistry data.
• Postgres experience.
Compensation
$60-70/hr. Exact compensation may vary based on several factors, including skills, experience, and education. Benefit packages for this role start on the first day of employment and include medical, dental, and vision insurance and access to a 401(k) retirement account. Employees in this role are also entitled to paid sick leave as provided by applicable state law.






