Data Scientist

This is a 6-month, UK-remote contract for a Data Scientist, with competitive pay. Key skills include Python proficiency, LLM/GenAI testing expertise, and test strategy development. A Bachelor's degree in a quantitative field is required.
🌎 - Country
United Kingdom
💱 - Currency
£ GBP
💰 - Day rate
-
🗓️ - Date discovered
September 30, 2025
🕒 - Project duration
More than 6 months
🏝️ - Location type
Remote
📄 - Contract type
Fixed Term
🔒 - Security clearance
Unknown
📍 - Location detailed
London Area, United Kingdom
🧠 - Skills detailed
#Data Pipeline #Python #Statistics #Base #AI (Artificial Intelligence) #Monitoring #Leadership #Mathematics #Automation #Pytest #Compliance #Deployment #Agile #Data Science #Scala #Automated Testing #Strategy #Computer Science
Role description
Title: Data Scientist
Duration: 6 months
Location: UK remote

Duties

Role Overview

Join a dynamic team driving a firm-wide GenAI initiative aimed at advancing our people solutions. Collaborate with engineers, data scientists, designers, product managers, and stakeholders to deliver a critical product that supports the development, engagement, and retention of exceptional talent. As the Data Scientist in charge of testing, you’ll ensure the quality, reliability, and performance of cutting-edge features during an exciting phase of growth, supporting a rapidly expanding user base. This role is ideally suited for candidates in US/Europe time zones.

As the Data Scientist for LLM testing, you will define and execute the technical vision and strategy for AI controls and testing. Your responsibilities will include continuous monitoring, evaluation, and reporting of LLM features to ensure compliance with internal standards, best practices, and external regulations. You’ll play a key role in risk assessment and mitigation, guiding the responsible development and deployment of LLMs.

You will design and implement test cases for LLM governance and development, enabling your team to define features and mitigate risks. Collaborating with cross-functional teams, you’ll develop tools, automation strategies, and data pipelines to support scalable LLM management. Additionally, you’ll create standardized reporting templates for both technical and senior leadership audiences, ensuring clear communication of results. Your work will involve close collaboration with tool owners and senior management to present findings, assess risk implications, and propose enhancements to AI tools.

Responsibilities

• Lead testing efforts for the platform, focusing on LLM output testing to ensure reliability, accuracy, and performance
• Develop and maintain a comprehensive and representative dataset of inputs and expected outputs for each prompt in the tool (i.e., a benchmark dataset)
• Develop and maintain comprehensive testing strategies, including semantic similarity, Q&A validation, claims verification, LLM judge evaluations, and metrics like ROUGE
• Collaborate with engineering, product, and data science teams to define testing requirements, thresholds, and standards
• Design and implement robust test cases aligned with business goals and user needs
• Write and maintain automated tests in Python using frameworks like pytest (prior experience with Opik is not required)
• Monitor and improve test stability to support application changes
• Establish and track QA KPIs, such as test coverage and stability, to measure and communicate platform quality
• Stay updated on industry best practices for GenAI/LLM testing and integrate them into QA processes

Skills

• Python Proficiency: Strong experience in writing and maintaining Python code
• LLM/GenAI Testing Expertise: Experience in testing LLM outputs, including semantic similarity, Q&A validation, claims verification, LLM judges, and evaluation metrics like ROUGE
• Testing Frameworks: Understanding of automated testing tools (e.g., pytest)
• Test Strategy Development: Proven ability to design and implement test strategies for complex systems

General

• Leadership: Demonstrated ability to lead own workstream and drive quality initiatives in fast-paced environments
• Stakeholder Collaboration: Strong communication skills to align technical and non-technical stakeholders on testing needs and standards
• Execution-Oriented: Self-driven with a “get stuff done” mindset, able to work independently and adapt quickly
• Agile Mindset: Familiarity with agile principles and product development processes
• Global Collaboration: Comfortable working with a global team and accommodating occasional early or late meetings

Education

A Bachelor's degree in a quantitative field such as Computer Science, Engineering, Statistics, Mathematics, or a related field is required. An advanced degree is a strong plus.