Jobs via Dice

Data Scientist

⭐ - Featured Role | Apply direct with Data Freelance Hub
This is an on-site contract role for a Data Scientist in Mount Laurel, NJ. Requirements include 7-10 years of hands-on Python ML experience (XGBoost, scikit-learn), strong PySpark proficiency, and exposure to Java-based ML libraries. Strong debugging and communication skills are essential.
🌎 - Country
United States
💱 - Currency
$ USD
💰 - Day rate
Unknown
🗓️ - Date
October 28, 2025
🕒 - Duration
Unknown
🏝️ - Location
On-site
📄 - Contract
Unknown
🔒 - Security
Unknown
📍 - Location detailed
Mt. Laurel, NJ
🧠 - Skills detailed
#ETL (Extract, Transform, Load) #Data Analysis #Spark (Apache Spark) #Migration #Java #Datasets #Pandas #Python #ML (Machine Learning) #Data Ingestion #NumPy #Model Validation #Libraries #PySpark #Documentation #Data Science #Debugging
Role description
Dice is the leading career destination for tech experts at every stage of their careers. Our client, Synkriom, is seeking the following. Apply via Dice today!

Role Name: Data Scientist - XGBoost
Location: Mount Laurel, NJ (Onsite)
Contract - Onsite

Job Summary
We are seeking a technically strong and detail-oriented Data Scientist to support model migration, development, and validation across Java and Python ecosystems. The primary responsibility is to migrate and validate XGBoost-based ML models originally developed in Java (DL4J library) into Python using a custom-built internal Python framework. The role involves close collaboration with PySpark engineers, platform teams, and validation teams to ensure metric parity and functional equivalence between the Java and Python implementations.

Key Responsibilities
• Collaborate with customer and internal teams to understand the logic, structure, and parameters of existing Java-based XGBoost models.
• Interpret and document data transformation logic and validate feature pipelines from Java implementations.
• Convert and validate Python models by running them on historical datasets and comparing results against Java model benchmarks.
• Partner with model validation teams to review performance, analyze discrepancies, and explain any metric deviations.
• Design unit tests and validation scenarios to ensure migrated models meet sign-off criteria.
• Ingest model input data from parquet files using PySpark and pandas, reproducing training and scoring workflows.
• Conduct exploratory data analysis (EDA) and spot-check row-level predictions to ensure model consistency.
• Provide detailed documentation and validation reports supporting model readiness for production.

Skills & Qualifications
• 7-10 years of hands-on experience with Python for Machine Learning (especially XGBoost, scikit-learn, NumPy, pandas).
• Strong proficiency in PySpark for large-scale data ingestion, transformation, and analysis (parquet-based workflows).
• Experience in validating or reverse-engineering ML models from existing business logic or legacy systems.
• Exposure to or understanding of Java-based ML libraries (e.g., DL4J) and the ability to map model logic across languages.
• Experience with meta-modelling or custom ML frameworks in Python.
• Strong debugging and problem-solving skills with an eye for accuracy, reproducibility, and metric consistency.
• Excellent communication skills for collaboration with technical and validation stakeholders.