Randstad Digital

Data & AI - LLM Model Developer (PySpark Engineer)

⭐ - Featured Role | Apply directly with Data Freelance Hub
This role is a 5-month contract for a Lead PySpark Engineer focused on migrating legacy SAS data workflows to AWS. It requires 5+ years of PySpark experience and strong AWS knowledge; a financial services background is highly desirable. The role is fully remote and UK-based.
🌎 - Country
United Kingdom
💱 - Currency
£ GBP
💰 - Day rate
Unknown
🗓️ - Date
February 10, 2026
🕒 - Duration
5 months
🏝️ - Location
Remote
📄 - Contract
Fixed Term
🔒 - Security
Unknown
📍 - Location detailed
United Kingdom
🧠 - Skills detailed
#AI (Artificial Intelligence) #SQL (Structured Query Language) #Spark SQL #Python #Documentation #AWS (Amazon Web Services) #Data Accuracy #ETL (Extract, Transform, Load) #Quality Assurance #Migration #Base #Macros #S3 (Amazon Simple Storage Service) #GIT #Cloud #Debugging #Athena #Data Modeling #Spark (Apache Spark) #Data Mart #SAS #Scala #PySpark
Role description
Lead PySpark Engineer (Cloud Migration)
Role Type: 5-Month Contract
Location: Remote (UK-Based)
Experience Level: Lead / Senior (5+ years PySpark)

Role Overview
We are seeking a Lead PySpark Engineer to drive a large-scale data modernisation project, transitioning legacy data workflows into a high-performance AWS cloud environment. This is a hands-on technical role focused on converting legacy SAS code into production-ready PySpark pipelines within a complex financial services landscape.

Key Responsibilities
• Code Conversion: Lead the end-to-end migration of SAS code (Base SAS, Macros, DI Studio) to PySpark using automated tools (SAS2PY) and manual refactoring (see the illustrative sketch below).
• Pipeline Engineering: Design, build, and troubleshoot complex ETL/ELT workflows and data marts on AWS.
• Performance Tuning: Optimise Spark workloads for execution efficiency, partitioning, and cost-effectiveness.
• Quality Assurance: Implement clean coding principles, modular design, and robust unit/comparative testing to ensure data accuracy throughout the migration.
• Engineering Excellence: Maintain Git-based workflows, CI/CD integration, and comprehensive technical documentation.

Technical Requirements
• PySpark (P3): 5+ years of hands-on experience writing scalable, production-grade PySpark/Spark SQL.
• AWS Data Stack (P3): Strong proficiency in EMR, Glue, S3, Athena, and Glue Workflows.
• SAS Knowledge (P1): Solid foundation in SAS to enable the understanding and debugging of legacy logic for conversion.
• Data Modeling: Expertise in ETL/ELT, dimensions, facts, SCDs, and data mart architecture.
• Engineering Quality: Experience with parameterisation, exception handling, and modular Python design.

Additional Details
• Industry: Financial Services experience is highly desirable.
• Working Pattern: Fully remote with internal team collaboration days.
• Benefits: 33 days holiday entitlement (pro-rata).
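For candidates unfamiliar with this kind of migration work, the minimal sketch below shows roughly what converting a simple legacy SAS step (a filtered DATA step followed by a PROC SUMMARY-style aggregation) into a parameterised, partition-aware PySpark job can look like. It is illustrative only: the S3 paths, table, and column names are hypothetical placeholders, not taken from the client's environment.

```python
# Illustrative sketch only: hypothetical paths, tables, and columns.
from pyspark.sql import SparkSession, DataFrame
from pyspark.sql import functions as F


def build_monthly_balances(spark: SparkSession, source_path: str, target_path: str) -> DataFrame:
    """Rough PySpark equivalent of a legacy SAS DATA step + PROC SUMMARY:
    filter active accounts, derive a reporting month, and aggregate balances."""
    accounts = spark.read.parquet(source_path)

    active = (
        accounts
        .filter(F.col("account_status") == "ACTIVE")           # WHERE clause from the DATA step
        .withColumn("report_month", F.trunc("balance_date", "month"))
    )

    summary = (
        active
        .groupBy("report_month", "product_code")                # PROC SUMMARY class variables
        .agg(
            F.sum("balance_amount").alias("total_balance"),
            F.countDistinct("account_id").alias("account_count"),
        )
    )

    # Partition by month so downstream Athena/Glue queries can prune partitions
    summary.write.mode("overwrite").partitionBy("report_month").parquet(target_path)
    return summary


if __name__ == "__main__":
    spark = SparkSession.builder.appName("sas_to_pyspark_sketch").getOrCreate()
    try:
        build_monthly_balances(
            spark,
            source_path="s3://example-bucket/raw/accounts/",     # placeholder S3 locations
            target_path="s3://example-bucket/marts/monthly_balances/",
        )
    except Exception as exc:                                     # basic exception handling for the sketch
        print(f"Job failed: {exc}")
        raise
    finally:
        spark.stop()
```

In practice, output from automated tools such as SAS2PY would be refactored toward this kind of modular, parameterised structure, with unit and comparative testing against the legacy SAS results to confirm data accuracy.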