

Rangam
RCI-ABBV-31464 Bioinformatics Scientist (Omics Data/Python/SQL/R Programming/HPC/Nextflow/Snakemake/PostgreSQL/Metadata/Git)
⭐ - Featured Role | Apply direct with Data Freelance Hub
This role is for a Bioinformatics Scientist in Cambridge, MA, through the end of the year, with a pay rate of $720/day. Key skills include Python, SQL, R, and experience with omics data management. A BS/MS in a related field is required.
🌎 - Country
United States
💱 - Currency
$ USD
-
💰 - Day rate
720
-
🗓️ - Date
October 14, 2025
🕒 - Duration
More than 6 months
-
🏝️ - Location
Hybrid
-
📄 - Contract
Unknown
-
🔒 - Security
Unknown
-
📍 - Location detailed
Cambridge, MA
-
🧠 - Skills detailed
#Data Wrangling #PostgreSQL #SQL (Structured Query Language) #R #Storage #Data Pipeline #Data Governance #Scala #Data Science #Data Quality #Security #Data Layers #Cloud #GIT #GCP (Google Cloud Platform) #Compliance #Data Catalog #Data Engineering #Data Security #Docker #AWS (Amazon Web Services) #Datasets #Python #Version Control #Automation #NoSQL #Metadata #Databases #Data Management #Programming #Visualization #Data Access #Computer Science #ETL (Extract, Transform, Load)
Role description
Hybrid – Cambridge, MA
Through End of Year
Top 3–5 Skills Needed:
• Strong programming and data engineering skills (Python, SQL, R)
• Experience with large-scale omics data management and integration
• Knowledge of metadata standards and ontologies for biological data
• Experience designing or maintaining bioinformatics data pipelines or repositories
• Understanding of data governance, permissions, and FAIR data principles
Job Description:
• We are seeking a highly motivated Data Scientist to design and implement an internal GEO-like system for managing the Immune Discovery omics data assets.
• The successful candidate will build a centralized platform that integrates raw, processed, and metadata layers of multi-omics datasets (e.g., bulk and single-cell RNA-seq, spatial omics, CyTOF) and ensures that they are findable, accessible, well-documented, and permission-controlled.
• This role bridges bioinformatics, data engineering, and data governance, enabling researchers to efficiently submit, query, and reuse internal datasets while maintaining data quality and compliance.
Key Responsibilities:
• Design and implement scalable pipelines for ingestion, curation, and storage of raw and processed omics data.
• Build and maintain a searchable data catalog or portal to enable dataset discovery and visualization of metadata and QC metrics.
• Implement access controls and permission management systems to ensure appropriate data security and compliance.
• Work closely with Immunology Discovery and IR teams to integrate the system with existing compute and storage infrastructure.
• Develop and enforce metadata standards, ontologies, and schemas to ensure consistency and interoperability across studies.
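As an illustration of what enforcing metadata standards during ingestion can look like, here is a minimal Python sketch. All field names, the "IDS-" accession prefix, and the controlled assay vocabulary are assumptions for illustration, not the client's actual schema; a real system would draw terms from a formal ontology.

```python
from dataclasses import dataclass, field

# Hypothetical controlled vocabulary for assay types; a production system
# would source these terms from a biomedical ontology (e.g., OBI or EFO).
ALLOWED_ASSAYS = {"bulk RNA-seq", "single-cell RNA-seq", "spatial omics", "CyTOF"}

@dataclass
class DatasetRecord:
    """One entry in an internal, GEO-like data catalog (illustrative)."""
    dataset_id: str
    assay: str
    organism: str
    raw_path: str
    processed_path: str
    owners: list = field(default_factory=list)

    def validate(self) -> list:
        """Return a list of schema violations; an empty list means the record passes."""
        errors = []
        if not self.dataset_id.startswith("IDS-"):  # assumed accession prefix
            errors.append("dataset_id must start with 'IDS-'")
        if self.assay not in ALLOWED_ASSAYS:
            errors.append(f"unknown assay type: {self.assay!r}")
        if not self.owners:
            errors.append("at least one owner is required for permissioning")
        return errors

record = DatasetRecord(
    dataset_id="IDS-0001",
    assay="single-cell RNA-seq",
    organism="Homo sapiens",
    raw_path="/data/raw/IDS-0001",
    processed_path="/data/processed/IDS-0001",
    owners=["immunology-discovery"],
)
print(record.validate())  # [] — record passes validation
```

Validating records at submission time, rather than after the fact, is what keeps a shared catalog consistent and interoperable across studies.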
Impact:
• By developing this internal data platform, the candidate will transform how omics data are organized and shared across the client organization.
• The system will improve data visibility and reuse, enhance reproducibility, and accelerate scientific insights by enabling streamlined access to all relevant data layers: raw, processed, and annotated.
Qualifications:
• BS (5+ years) or MS (0–3 years) in Bioinformatics, Computational Biology, Data Science, Computer Science, or related field.
• Proficiency in Python and SQL, with experience in data wrangling, ETL pipelines, and automation.
• Hands-on experience managing large omics datasets.
• Strong understanding of metadata models, data provenance, and FAIR data principles.
• Excellent communication skills and ability to collaborate with cross-functional teams.
Preferred Technical Skills:
• Experience with cloud storage or compute environments (AWS, GCP, or on-prem HPC).
• Experience with workflow orchestration tools (Nextflow, Snakemake).
• Familiarity with relational and NoSQL databases (PostgreSQL).
• Familiarity with public repositories such as GEO or SRA and their metadata standards.
• Proficiency with Git for version control and collaboration.
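To make the "searchable data catalog" idea concrete, here is a small sketch using Python's built-in sqlite3 module as a lightweight stand-in for PostgreSQL. The table layout, column names, and accession IDs are invented for illustration only.

```python
import sqlite3

# In-memory catalog; a production deployment would use PostgreSQL with
# the same schema plus proper access controls and role-based permissions.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE datasets (
        dataset_id TEXT PRIMARY KEY,
        assay      TEXT NOT NULL,
        organism   TEXT NOT NULL,
        qc_passed  INTEGER NOT NULL   -- 1 = passed QC, 0 = failed
    )
""")
conn.executemany(
    "INSERT INTO datasets VALUES (?, ?, ?, ?)",
    [
        ("IDS-0001", "single-cell RNA-seq", "Homo sapiens", 1),
        ("IDS-0002", "CyTOF", "Homo sapiens", 0),
        ("IDS-0003", "bulk RNA-seq", "Mus musculus", 1),
    ],
)

# Dataset discovery: find all QC-passed human datasets.
rows = conn.execute(
    "SELECT dataset_id, assay FROM datasets "
    "WHERE organism = ? AND qc_passed = 1",
    ("Homo sapiens",),
).fetchall()
print(rows)  # [('IDS-0001', 'single-cell RNA-seq')]
```

Exposing queries like this behind a portal UI is what turns a storage bucket of omics files into a discoverable, reusable resource.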
Additional Technical Skills (a plus):
• Experience with containerization (Docker/Singularity) and CI/CD workflows.
• Understanding of web application frameworks or dashboarding tools for data portals.
• Exposure to single-cell or multi-omics integration workflows.
• Experience implementing data access and permission systems integrated with organizational identity management.