

Senior Data Scientist
Featured Role | Apply direct with Data Freelance Hub
This role is for a Senior Data Scientist. The contract length and pay rate are not specified. Candidates should have 5+ years of experience, a specialization in identity resolution, proficiency in Python or R, and knowledge of graph databases.
Country: United States
Currency: $ USD
Day rate: -
Date discovered: June 15, 2025
Project duration: Unknown
Location type: Unknown
Contract type: Unknown
Security clearance: Unknown
Location detailed: United States
Skills detailed:
#Pandas #Neo4J #Datasets #NLP (Natural Language Processing) #Databases #Mathematics #Python #Libraries #Visualization #Data Quality #Computer Science #Data Engineering #TensorFlow #PyTorch #Statistics #Graph Databases #Data Science #Data Analysis #HBase #Scala #SQL (Structured Query Language) #NumPy #ML (Machine Learning) #R #Classification #Clustering #Documentation
Role description
We are seeking a highly skilled Data Scientist with specialized expertise in Identity Resolution or Entity Resolution/Matching. The ideal candidate will leverage advanced data science techniques, including machine learning, probabilistic matching, and graph-based algorithms, to resolve and link entities across disparate data sources, ensuring high accuracy in identity and entity disambiguation. This role will involve working closely with cross-functional teams to drive insights, improve data quality, and support business objectives through robust entity resolution solutions.
Key Responsibilities
• Entity Resolution & Matching: Design, develop, and implement scalable identity and entity resolution algorithms to deduplicate, link, and disambiguate records across structured and unstructured datasets.
• Data Analysis & Modeling: Apply statistical and machine learning techniques (e.g., clustering, classification, natural language processing) to analyze complex datasets and improve matching accuracy.
• Feature Engineering: Create and optimize features for entity matching, such as name standardization, address parsing, and fuzzy matching techniques.
• Graph-Based Solutions: Utilize graph databases and network analysis to model relationships between entities and enhance resolution processes.
• Data Quality & Validation: Assess and improve data quality by identifying inconsistencies, missing values, or duplicates, and validate matching results against ground truth or external benchmarks.
• Collaboration: Partner with data engineers, product managers, and domain experts to integrate entity resolution pipelines into production systems and align solutions with business needs.
• Performance Optimization: Optimize algorithms and workflows for scalability, speed, and accuracy, ensuring they perform efficiently on large-scale datasets.
• Documentation & Reporting: Document methodologies, present findings, and provide actionable insights to stakeholders through clear visualizations and reports.
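For candidates unfamiliar with the terminology, the matching and linking work described above can be illustrated with a minimal sketch. It is not the employer's method: the helper names (`normalize`, `similarity`, `resolve`), the 0.85 threshold, and the sample records are all hypothetical. It standardizes names, scores pairs with a normalized Levenshtein similarity, and groups linked records as connected components using union-find.

```python
from itertools import combinations


def levenshtein(a: str, b: str) -> int:
    """Edit distance via the classic two-row dynamic program."""
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def similarity(a: str, b: str) -> float:
    """Normalized similarity in [0, 1]; 1.0 means identical strings."""
    if not a and not b:
        return 1.0
    return 1.0 - levenshtein(a, b) / max(len(a), len(b))


def normalize(name: str) -> str:
    """Hypothetical standardization: lowercase, drop punctuation, collapse spaces."""
    cleaned = "".join(ch for ch in name.lower() if ch.isalnum() or ch.isspace())
    return " ".join(cleaned.split())


def resolve(records: list[str], threshold: float = 0.85) -> list[set[int]]:
    """Link record pairs at or above the threshold; return connected components."""
    parent = list(range(len(records)))

    def find(x: int) -> int:
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    keys = [normalize(r) for r in records]
    # Assumption: all-pairs comparison, which is quadratic in the record count.
    for i, j in combinations(range(len(records)), 2):
        if similarity(keys[i], keys[j]) >= threshold:
            parent[find(i)] = find(j)

    clusters: dict[int, set[int]] = {}
    for i in range(len(records)):
        clusters.setdefault(find(i), set()).add(i)
    return sorted(clusters.values(), key=min)


# Hypothetical sample records: three spellings of one person, plus one other.
records = ["Jon Smith", "John Smith", "JOHN  SMITH.", "Jane Doe"]
clusters = resolve(records)
print(clusters)
```

On large datasets the all-pairs loop would typically be replaced by a blocking or indexing step, and the fixed threshold by a probabilistic or learned match model.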
Required Qualifications
Education: Master's or Ph.D. in Computer Science, Data Science, Statistics, Mathematics, or a related quantitative field.
Experience:
• 5+ years of experience in data science or a related role.
• 3+ years of hands-on experience with identity resolution, entity resolution, or record linkage projects.
Technical Skills:
• Proficiency in Python or R for data analysis and modeling (libraries: pandas, scikit-learn, NumPy, etc.).
• Experience with machine learning frameworks (e.g., TensorFlow, PyTorch) and probabilistic matching techniques.
• Familiarity with fuzzy matching tools (e.g., dedupe, fuzzywuzzy) and string similarity metrics (e.g., Levenshtein, Jaro-Winkler).
• Knowledge of graph databases (e.g., Neo4j) and network analysis for relationship modeling.
• Experience with SQL and handling large-scale datasets in relational or No