PolyLoop Recycling Corp.

ML Platform Creator - (Co-Principal Investigator for DOE Grant)

This role is for a Co-Principal Investigator and ML Platform Lead on a DOE grant, with a 9-month contract starting July 1, 2026. Pay is $120/hr. Key skills include machine learning, graph neural networks, and scientific computing. A PhD is preferred.
🌎 - Country
United States
💱 - Currency
$ USD
-
💰 - Day rate
960
-
🗓️ - Date
April 5, 2026
🕒 - Duration
More than 6 months
-
🏝️ - Location
Unknown
-
📄 - Contract
Unknown
-
🔒 - Security
Unknown
-
📍 - Location detailed
United States
-
🧠 - Skills detailed
#AWS SageMaker #Data Cleaning #Neural Networks #ML (Machine Learning) #Data Pipeline #Predictive Modeling #React #Datasets #Libraries #Cloud #Lean #ETL (Extract, Transform, Load) #GitHub #SageMaker #Documentation #Leadership #AI (Artificial Intelligence) #PyTorch #AWS (Amazon Web Services) #Python #A/B Testing #R
Role description
Co-Principal Investigator, ML Platform Lead

Cash Compensation: 1,045 hours at $120/hr = $125,400
Equity Compensation: 515 hours at $120/hr equivalent = $61,800 in stock
Total Value of 9-month engagement: $187,200
Annualized rate of $249,600
Start: July 1, 2026

Machine Learning | Graph Neural Networks | Active Learning | Scientific Computing | DOE-Funded Research

The Opportunity in One Sentence

Build a self-improving, closed-loop ML system that tells wet-lab chemists exactly which catalysts to synthesize next, so they spend 80% less time on dead ends.

About PolyLoop Recycling Corp.

PolyLoop Recycling Corp. is an early-stage deep-tech company pioneering AI-guided catalytic recycling of plastic waste. We are developing the Polymer Catalyst Intelligence Engine (PCIE), a closed-loop, machine learning-driven platform that accelerates the discovery of novel catalysts capable of converting mixed plastic waste into high-value chemical feedstocks and fuels.

Our research program is conducted in partnership with the University of California, Riverside (UCR), with PolyLoop's CTO, Nobel Laureate Prof. Richard R. Schrock (Chemistry, 2005), and Prof. Matthew P. Conley, whose group published the first broad-spectrum catalyst for aliphatic polymer breakdown. The company is pursuing Phase 1 funding under the DOE's Genesis Mission, focused on Agentic AI-Driven Chemical Manufacturing.

This is a founding-team role. You will help shape the technical direction of PolyLoop's core AI platform from the ground floor.

The Role

You will serve as Co-Principal Investigator on a DOE Phase 1 research award and be the sole architect and lead engineer of the Phase 1 PCIE prototype. This is not an AI research role; it is an applied ML engineering role with a defined deliverable, a fixed compute budget, and a scientific team counting on your platform to guide their experimental decisions.
The DOE's expectation is a credible, working prototype that demonstrates a measurable outcome: ML-guided candidate selection meaningfully reduces experimental burden compared to the team's existing manual benchmarks. You will not be building a general AI system. You will be building a self-improving experimental architecture that is lean, cost-efficient, and tightly coupled to wet-lab reality.

The Closed-Loop Architecture

Every component you build feeds the next. The loop is the product.

Component 1: Catalyst Candidate Generation Model
Train graph neural networks (GNNs) for molecular structure and gradient boosting models for tabular reaction data. These models serve as the primary filters for surfacing the most promising catalyst candidates from a defined chemical space.

Component 2: Property Prediction Models
Build models predicting catalyst activity and stability. Integrate with molecular dynamics (MD) and density functional theory (DFT) simulation workflows to computationally validate top candidates before committing lab resources. DFT is expensive and must be used selectively, and only the highest-ranked candidates proceed to wet-lab testing.

Component 3: Active Learning Loop
Implement the self-improving architecture that is the core DOE deliverable:
• ML models generate candidates and predict performance
• Active learning selects the most informative experiments to run
• UCR lab results are structured and ingested back into the training pipeline
• Models are retrained on updated data, improving predictive accuracy with each experimental cycle

Component 4: End-to-End Workflow Validation
• ~5 candidates experimentally validated per cycle
• Measurable improvement in screening efficiency vs. existing team benchmarks
• Documented architecture ready for Phase 2 scale-up

What This Is NOT

This phase does not involve large language models, fine-tuned LLMs, or massive GPU clusters.
The stack is intentionally lean and focused on validation, iteration, and cost efficiency. You will work within ~8,000 GPU hours on AWS, not 50,000. The goal is a credible, working prototype with a clear measurable outcome for the DOE, not a research exploration.

ML Platform Architecture & Engineering
• Own end-to-end technical architecture of the Phase 1 PCIE closed-loop prototype
• Design and train GNN-based molecular candidate generation models and gradient boosting models for tabular reaction data
• Build property prediction models for catalyst activity and stability; integrate outputs with MD and DFT simulation pipelines via DOE/national lab HPC (Genesis)
• Implement the active learning loop: candidate generation → performance prediction → experiment selection → structured result ingestion → model retraining
• Define and enforce structured data schemas for lab result capture so experimental outputs feed cleanly back into model training
• Manage cloud ML compute on AWS (A10/A100 GPU instances; ~8,000 GPU hours budgeted for Phase 1)
• Set up experiment tracking (Weights & Biases), model versioning, and reproducibility standards

Data Pipeline & Dataset Development
• Lead ML data acquisition: mine and clean published chemistry literature, structure chemical datasets, and curate annotated training data
• Work with the chemistry team to extract and format existing experimental data into ML-ready features (yield, conversion rate, selectivity, temperature profiles, catalyst degradation)
• Implement data cleaning, labeling, and quality validation pipelines in coordination with the UCR lab team

DOE Grant Obligations (Co-PI)
• Serve as Co-Principal Investigator on the DOE Phase 1 award, responsible for all ML/AI deliverables, technical reporting, and Stage Gate milestones
• Demonstrate at the Month 6/9 review that ML-guided candidate selection measurably enhances catalyst screening efficiency vs. baseline benchmarks
• Contribute validated datasets and model architectures to the American Science Cloud as required under DOE open science policy
• Support Phase 2 proposal preparation with updated architecture documentation and performance benchmarking data

Scientific Collaboration
• Work directly with Prof. Conley (UCR Co-PI) and the UCR postdoctoral researcher to design standardized experimental data capture protocols
• Translate wet-lab outputs into structured ML training features; serve as the bridge between experimental chemistry and the model stack
• Participate in scientific review sessions with Dr. Schrock and Prof. Conley to validate that model outputs are chemically meaningful
• Co-author technical manuscripts documenting the PCIE architecture and Phase 1 results

Qualifications

Required
• 5+ years of hands-on ML engineering experience with a strong foundation in scientific computing and predictive modeling
• Proficiency with gradient boosting frameworks (XGBoost, LightGBM) for structured/tabular scientific data
• Experience implementing active learning pipelines or iterative ML systems that incorporate experimental or human-in-the-loop feedback
• Proficiency in Python and the core ML stack: PyTorch, scikit-learn, RDKit (or equivalent cheminformatics tooling)
• Experience managing cloud GPU compute (AWS SageMaker, A10/A100 instances) with a focus on cost efficiency (you will work within a defined budget)
• Ability to design multi-component ML architectures where model outputs feed downstream simulation or experimental workflows
• Strong written communication skills for DOE technical reporting and co-authorship

Strongly Preferred
• Demonstrated experience building and training graph neural networks (GNNs) for molecular or materials data
• PhD or equivalent in Machine Learning, Computational Chemistry, Chemical Engineering, Materials Science, or a related field
• Experience in AI-for-science applications: molecular property prediction, materials discovery, catalyst screening, protein structure prediction, or related domains
• Familiarity with DFT or MD simulation tools (VASP, ORCA, LAMMPS) at a user/interface level (deep expertise is not required; the UCR team owns the chemistry)
• Experience with cheminformatics libraries (RDKit, DeepChem, DGL-LifeSci)
• Prior experience as a named investigator or technical lead on a government-funded research program (DOE, NSF, NIH, DARPA, or equivalent)

This Role Is NOT Right For You If...
• You are primarily an LLM engineer with limited experience in predictive ML on structured scientific data
• You need large GPU budgets and open-ended compute to work effectively; Phase 1 is deliberately constrained
• You are not comfortable engaging directly with wet-lab scientists and translating experimental results into ML features
• You prefer pure research roles without engineering delivery accountability against defined milestones
• You need a fully defined system to build; the PCIE prototype architecture will be co-designed with you

PolyLoop is a small team making big bets. We evaluate candidates on the quality of their thinking, not their credentials. We move quickly.

For Applications
• A brief description of a prior project involving GNN-based molecular modeling, active learning, or iterative ML system design (focus on data pipeline choices, architecture rationale, how experimental or human feedback was integrated into retraining, and what failed and how it was addressed)
• A GitHub profile, portfolio link, or publication that demonstrates hands-on ML engineering work

Interview Process
1. Initial Screen (30 min): Video call with PolyLoop leadership covering role overview, mutual fit, and logistics
2. Technical Screen (60 min): Discuss a prior GNN/active learning project in depth. Evaluated: architecture decisions, data pipeline design, experimental feedback integration, failure modes
3. Scientific Collaboration Session (45 min): Join a call with the UCR chemistry team to evaluate communication style and ability to translate between scientific and ML domains
4. Offer: We move to offer within one week, but the role will be contingent upon PolyLoop being awarded the grant

The successful candidate will contribute to the grant proposal prior to submission on April 28.

Apply to: chadwasilenkoff@gmail.com