Bioinformatics Data Scientist

This role is for a Bioinformatics Data Scientist, full-time for 1 year, remote during PST hours. Requires a PhD in a quantitative field, expertise in Python/R, ML modeling, and experience with multimodal datasets. Strong communication skills and a publication record are essential.
🌎 - Country
United States
💱 - Currency
$ USD
-
💰 - Day rate
-
🗓️ - Date discovered
September 4, 2025
🕒 - Project duration
More than 6 months
-
🏝️ - Location type
Remote
-
📄 - Contract type
Unknown
-
🔒 - Security clearance
Unknown
-
📍 - Location detailed
United States
-
🧠 - Skills detailed
#Statistics #Mathematics #Computer Science #ETL (Extract, Transform, Load) #AI (Artificial Intelligence) #ML (Machine Learning) #Programming #R #Python #Data Manipulation #Datasets #NLP (Natural Language Processing) #Data Science
Role description
Role: ML / Bioinformatics Data Scientist
Duration: Full-time, 1 year to start, with a possibility of extension.
Location: Remote is acceptable, but candidates must be available during normal PST business hours.

About the Role
We are seeking a highly motivated and collaborative Bioinformatics/ML scientist to join the Computational Biology & Medicine department in the Computational Sciences COE (Center of Excellence) within Research and Early Development (gRED). The successful candidate will contribute to a cross-functional project that applies Machine Learning (ML) models to multi-modal datasets collected from clinical trials. This role requires a deep understanding of how to apply ML models, a background in biology, a passion for innovation, and a commitment to improving healthcare outcomes through cutting-edge technology. We are looking for exceptional researchers with a passion for interdisciplinary research and technical problem-solving, and a proven ability to develop and implement research ideas. To be considered, candidates must have previously worked on ML modeling projects and applied them to multi-modal datasets.

About the Project
The goal of this project is to develop a machine learning model that predicts a patient's risk of drug-induced liver toxicity from a wide variety of patient characteristics, including clinical, genetic, omics, and safety lab data. The focus will be on harmonizing these diverse data sources, deriving new features, and building machine learning models designed to identify a predictive signature that can distinguish between at-risk and not-at-risk patient populations.

Key Responsibilities
• Data centralization and harmonization
• Applying ML methods to the assembled dataset to identify patients' risk of drug-induced liver toxicity
• Collaborating with interdisciplinary and cross-functional teams, including biologists, chemists, data scientists, and other stakeholders

Qualifications
Educational Background: PhD in a quantitative field (e.g., Computer Science, Computational Biology, Bioinformatics, Statistics, Mathematics)

Experience:
• Proven track record of working with statistical modeling techniques, including ML methods (required)
• Demonstrated interest in problems across biology as applied to the discovery and development of treatments for disease (preferred)

Technical Skills:
• Data Science & Programming: Expertise in Python/R for data manipulation, statistical analysis, and ML model building (required)
• Multimodal Data & Modeling: Proven ability to work with diverse data types (omics, clinical, imaging) (required)
• Knowledge of statistics and experience with survival analysis (required)
• Domain & AI-specific Skills: Experience with NLP/LLMs for feature extraction from unstructured text, and a strong background in neuroscience (preferred)

Soft Skills: Excellent communication, collaboration, and problem-solving skills (required).
Publications: Strong publication record and experience contributing to research communities.

Best Regards,
Santosh Cherukuri
Email: scherukuri@bayonesolutions.com