Python AI Engineer (India)

⭐ - Featured Role | Apply direct with Data Freelance Hub
This role is for a Python AI Engineer with pharmaceutical experience, offering a 12-month remote contract from India. Key skills include Python proficiency, synthetic clinical trial data generation, and GxP compliance knowledge. Familiarity with CDISC standards and cloud platforms is preferred.
🌎 - Country
United States
💱 - Currency
$ USD
-
💰 - Day rate
-
🗓️ - Date discovered
September 12, 2025
🕒 - Project duration
More than 6 months
-
🏝️ - Location type
Remote
-
📄 - Contract type
Unknown
-
🔒 - Security clearance
Unknown
-
📍 - Location detailed
United States
-
🧠 - Skills detailed
#Programming #ETL (Extract, Transform, Load) #Data Pipeline #Compliance #Pytest #PyTorch #GDPR (General Data Protection Regulation) #Medidata Rave #Automation #Azure #Python #Pandas #Data Automation #Data Management #Datasets #Deployment #R #Data Governance #AWS (Amazon Web Services) #GCP (Google Cloud Platform) #Cloud #Oracle #AI (Artificial Intelligence) #ADaM (Analysis Data Model) #TensorFlow #CDISC (Clinical Data Interchange Standards Consortium) #NumPy #XML (eXtensible Markup Language) #Statistics #Documentation #ML (Machine Learning) #SAS #Langchain
Role description
Python AI Engineer – Automated Test Data Generation
Pharmaceutical experience required
Remote from India
12-month contract

Overview
We are seeking a Python AI Engineer with expertise in synthetic and automated test data generation to support clinical trials, regulatory submissions, and GxP-compliant systems across the life sciences domain. This role will focus on building intelligent frameworks that generate high-quality, audit-ready datasets for use in clinical data management (CDM), biostatistics, pharmacovigilance, and regulatory validation environments. By leveraging AI/ML techniques and Python-based automation, this role will help reduce dependency on production data, accelerate testing cycles, and ensure compliance with stringent regulatory requirements (FDA, EMA, ICH E6, 21 CFR Part 11).

Key Responsibilities

Synthetic Test Data Automation
• Design and implement Python frameworks to automatically generate synthetic clinical and operational datasets.
• Create SDTM- and ADaM-compliant datasets for testing downstream workflows (data transformation, analysis, and reporting).
• Generate domain-specific datasets (e.g., adverse events, lab results, demographics, drug exposure) with variability, edge cases, and referential integrity.
• Develop connectors for Define-XML, Pinnacle21 validation, and statistical programming workflows.

AI/ML Application in Data Generation
• Apply generative AI (LLMs, GANs, diffusion models) to simulate realistic clinical case scenarios.
• Build synthetic patient journeys for eCOA, EDC, CTMS, LIMS, or pharmacovigilance systems.
• Use AI-enabled automation to mimic rare or complex conditions for system stress testing.

Compliance & Data Governance
• Ensure generated data adheres to GxP, 21 CFR Part 11, and HIPAA/GDPR requirements.
• Implement data anonymization/obfuscation while maintaining clinical and statistical relevance.
• Document all processes, including validation protocols, for audit and inspection readiness.

Collaboration & Integration
• Partner with Clinical Data Management, Biostatistics, Pharmacovigilance, and Quality/Compliance teams to understand requirements.
• Integrate test data generation into CI/CD pipelines, clinical trial simulation frameworks, and automated validation environments.
• Support SMEs in assessing AI-generated datasets for plausibility and regulatory acceptability.

Required Skills & Qualifications
• Strong proficiency in Python (pandas, NumPy, scikit-learn, PyTorch/TensorFlow, LangChain, etc.).
• Experience generating synthetic clinical trial data or working with CDISC standards (SDTM, ADaM, Define-XML).
• Familiarity with clinical data management systems (Medidata Rave, Veeva CDB, Oracle InForm, etc.) and testing needs.
• Understanding of GxP/21 CFR Part 11 compliance and regulated data handling.
• Knowledge of test automation frameworks (PyTest, Robot Framework) and data validation tools (e.g., Pinnacle21).

Preferred Qualifications
• Experience supporting clinical trial systems (EDC, CTMS, PV, LIMS, eCOA).
• Understanding of statistical programming workflows (SAS, R) and integration with test data pipelines.
• Hands-on experience with cloud platforms (AWS, Azure, GCP) for secure AI/ML deployment.
• Familiarity with inspection readiness and validation documentation in a pharma context.

If this is a role that interests you and you’d like to learn more, click apply now and a recruiter will be in touch with you to discuss this great opportunity. We look forward to speaking with you!

About ManpowerGroup, Parent Company of: Manpower, Experis, Talent Solutions, and Jefferson Wells
ManpowerGroup® (NYSE: MAN), the leading global workforce solutions company, helps organizations transform in a fast-changing world of work by sourcing, assessing, developing, and managing the talent that enables them to win. We develop innovative solutions for hundreds of thousands of organizations every year, providing them with skilled talent while finding meaningful, sustainable employment for millions of people across a wide range of industries and skills. Our expert family of brands – Manpower, Experis, Talent Solutions, and Jefferson Wells – creates substantial value for candidates and clients across more than 75 countries and territories and has done so for over 70 years. We are recognized consistently for our diversity - as a best place to work for Women, Inclusion, Equality and Disability and in 2022 ManpowerGroup was named one of the World's Most Ethical Companies for the 13th year - all confirming our position as the brand of choice for in-demand talent.