

AstraZeneca
Data Scientist
This role is for a Data Scientist (Associate Director) based in Cambridge, UK, for a 6-month contract (likely extending). Pay rate is competitive. Key skills include Python, PostgreSQL, AWS, and experience in big pharma R&D. A Master's degree (or equivalent experience) is expected; a Bachelor's degree is required.
🌎 - Country
United Kingdom
💱 - Currency
£ GBP
-
💰 - Day rate
Unknown
-
🗓️ - Date
December 16, 2025
🕒 - Duration
More than 6 months
-
🏝️ - Location
Hybrid
-
📄 - Contract
Inside IR35
-
🔒 - Security
Unknown
-
📍 - Location detailed
Cambridge, England, United Kingdom
-
🧠 - Skills detailed
#Data Quality #Microsoft Power BI #Monitoring #ML (Machine Learning) #PySpark #Model Evaluation #Computer Science #Security #R #Data Science #Metadata #React #AWS (Amazon Web Services) #MDM (Master Data Management) #Databases #Data Integration #SQL (Structured Query Language) #Statistics #Strategy #Datasets #Supervised Learning #Data Strategy #Python #PostgreSQL #Spark (Apache Spark) #Automation #Base #Lean #Redshift #Forecasting #AI (Artificial Intelligence) #BI (Business Intelligence)
Role description
Title: Associate Director; Data Science
Duration: 6 months (likely to extend)
Location: Cambridge, UK (Hybrid 2-3 days min)
IR35 Status: Outside
Overview of the Role
The Operational Data Strategy (ODS) function sits within Clinical Data and Insights (CDI) — a key division in R&D at AstraZeneca. ODS oversees how we collect, organize, validate, and analyze operational data across R&D, combining robust systems and databases with innovative data science to visualize decision‑making at scale and deliver timely, actionable insights to study teams.
We are seeking a Data Science Specialist to lead a flagship engagement: co‑developing and operationalizing an AI‑driven Site Identification Tool over a 6–12-month hybrid build. You’ll report to the Strategic Analytics and Enablement Lead and drive the data science stream to accelerate enrollment and improve site selection, delivering explainable, production‑ready predictive models that AstraZeneca will own and evolve. Expect hands‑on supervised learning for enrollment timelines and site‑level predictions (e.g., gradient boosting), TA‑specific feature engineering, protocol similarity, statistical sampling and optimization under constraints, and rigorous monitoring (drift, performance, stability). You’ll integrate internal data with external sources, operationalize pipelines/APIs in AZ’s tech stack (Python/PySpark, PostgreSQL/Redshift, NodeJS, React, AWS), and partner with Clinical Operations to embed analytics into feasibility/startup workflows. Power BI will be used to surface decision artifacts where helpful, but the emphasis is on advanced analytics. To thrive, you’ll model and champion our core traits: disciplined critical thinking, a growth mindset, grit through ambiguity, and resilience under pressure.
Core Accountabilities
• Model co‑development, ownership, and monitoring
• Generate and validate predictions of site-level enrollment rates based on prior experience and external data
• Create an optimal country and site list based on enrollment speed
• Co‑train and refine supervised models for early timeline planning and site‑level enrollment; define constraints/scenarios and the retraining cadence.
• Build predictive models for enrollment forecasting across all TAs
• Establish performance tracking (e.g., MAE/RMSE, calibration, lift), stability checks, and drift detection; document rationale for overrides and decisions.
• Data integration and feature engineering
• Build a single source of truth linking internal operational data with external sources
• Pressure-test model outputs and fine-tune based on TA‑specific features (disease prevalence, site congestion, quality signals); design data quality thresholds and coverage checks.
• Explainability and decision support
• Implement transparent feature importance, protocol similarity, and scenario analysis (high/low recruitment paths); capture rationale in logs for governance.
• Produce decision‑ready artifacts and, where useful, translate key insights into Power BI for portfolio and study visibility.
• Optimization and activation strategy
• Apply statistical sampling and constraint‑aware optimization for site/country activation; run scenario trade‑offs to support feasibility/startup decisions.
• Productionization and governance
• Collaborate with engineering to operationalize pipelines, APIs, and security (CI/CD, SSO) in AZ environments (Python/PySpark, PostgreSQL/Redshift, AWS).
• Transition models/UI to AZ production with versioning, lineage, and support playbooks; define retraining, monitoring, and incident processes.
Essential Skills
• Experience working with forecasting models in complex data environments
• Experience in model evaluation/monitoring
• Experience in data integration across internal/external sources
• Experience in optimization under constraints
• Big pharma/large biotech R&D experience, including hands-on work with site identification/selection, country activation, and vendor datasets.
• Education: Master’s in data science, statistics, or computer science (or equivalent experience); Bachelor’s required.
Desirable
• Optimization under constraints, ML governance in regulated environments.
• Translating decision artifacts into Power BI for adoption; SQL automation.
• Education: Certifications (analytics/BI, DAMA, PMP, Lean Six Sigma); advanced coursework in optimization or causal inference.
• Experience: Big pharma or large-scale biotech R&D; hands-on exposure to drug development stages (feasibility, startup, enrollment, monitoring); direct experience with site identification/selection and country activation; work with external data vendors in regulated environments.
• Skills: Advanced monitoring (drift/stability), SQL automation, translating decision artifacts into Power BI for adoption, prompt design for analytics tools.
• Knowledge: Metadata/lineage, MDM/RDM; vendor ecosystems and data exchange standards; enterprise ML governance in GxP-adjacent contexts.






