

Data Scientist – AI-Powered Biomedical Data Analysis - DSHR
⭐ - Featured Role | Apply direct with Data Freelance Hub
This role is for a Data Scientist specializing in AI-Powered Biomedical Data Analysis, based in the Bronx, NY, on a 1-year contract. It requires a Master’s or PhD, 3+ years of AI/ML experience, strong Python skills, and experience with NLP and LLMs.
🌎 - Country
United States
💱 - Currency
$ USD
💰 - Day rate
800
🗓️ - Date discovered
May 29, 2025
🕒 - Project duration
More than 6 months
🏝️ - Location type
Hybrid
📄 - Contract type
Unknown
🔒 - Security clearance
Unknown
📍 - Location detailed
Hempstead, NY
🧠 - Skills detailed
#Batch #ML (Machine Learning) #PyTorch #Deployment #Data Pipeline #Datasets #NLP (Natural Language Processing) #Scala #R #Classification #Python #TensorFlow #Libraries #Computer Science #Data Science #AI (Artificial Intelligence) #Programming #Data Analysis
Role description
Job Title: Data Scientist – AI-Powered Biomedical Data Analysis
Location: Bronx, NY 10458 (Hybrid)
Job Type: 1-Year Contract (with potential for extension)
About the Role:
We are seeking an experienced and innovative Data Scientist to lead the development and deployment of AI-driven solutions for biomedical literature and synthetic data analysis. This hybrid role is based in the Bronx, NY, and offers the opportunity to contribute to cutting-edge research at the intersection of artificial intelligence, life sciences, and healthcare.
The successful candidate will work on designing and optimizing scalable AI models using open-source large language models (LLMs) to classify biomedical literature across multiple dimensions. This role plays a key part in generating insights that support evidence synthesis, systematic reviews, and clinical research.
Key Responsibilities:
Design and refine multi-dimensional classification frameworks (e.g., relevance, data type, medical category, application area)
Implement Chain-of-Thought (CoT) reasoning and Few-Shot Learning (FSL) techniques
Deploy and optimize open-source LLMs (e.g., LLaMA) for high-throughput biomedical text classification
Perform inference on 5,000+ biomedical abstracts with attention to performance and accuracy
Apply fine-tuning or low-rank adaptation (LoRA) for domain-specific model enhancement
Curate high-quality training/validation datasets in collaboration with biomedical domain experts
Standardize data preprocessing workflows to ensure consistency and integrity across datasets
Identify trends in synthetic biomedical data applications, including methodology, disease focus, and clinical relevance
Manage batch processing pipelines for large-scale text analysis using GPU-enabled infrastructure
Present findings through reports, dashboards, and scientific presentations
Required Qualifications:
Master’s or PhD in Computer Science, Data Science, Biomedical Informatics, or related field
3+ years of experience in AI/ML projects, especially NLP and text classification
Strong programming skills in Python
Proficiency with machine learning libraries (e.g., PyTorch, TensorFlow)
Experience with open-source LLMs (e.g., LLaMA, GPT)
Knowledge of advanced prompting methods, such as CoT and FSL
Ability to communicate effectively with both technical and non-technical stakeholders
Strong problem-solving mindset and adaptability in a fast-paced R&D environment
Preferred Skills:
Familiarity with biomedical ontologies and data standards
Experience with GPU-based model training and inference
Knowledge of tools for data pipeline orchestration and high-throughput processing
Prior work with systematic reviews, evidence synthesis, or literature mining