Stott and May

Generative AI Engineer

⭐ - Featured Role | Apply directly with Data Freelance Hub
This role is for a Generative AI Engineer (GenAI Data Engineer) based in London/Edinburgh (Hybrid, 2 days/week in the office). The contract runs for 6 months at a market day rate, Inside IR35. Key skills include PySpark, AWS, SQL, and hands-on experience with GenAI/LLM models.
🌎 - Country
United Kingdom
💱 - Currency
£ GBP
-
💰 - Day rate
Unknown
-
πŸ—“οΈ - Date
April 29, 2026
🕒 - Duration
6 months
-
🏝️ - Location
Hybrid
-
📄 - Contract
Inside IR35
-
🔒 - Security
Unknown
-
πŸ“ - Location detailed
London Area, United Kingdom
-
🧠 - Skills detailed
#Lambda (AWS Lambda) #AI (Artificial Intelligence) #S3 (Amazon Simple Storage Service) #Indexing #DynamoDB #SQL (Structured Query Language) #ETL (Extract, Transform, Load) #Delta Lake #PySpark #Data Pipeline #Scala #Data Engineering #Distributed Computing #ML (Machine Learning) #Data Processing #Storage #Agile #Schema Design #Spark (Apache Spark) #Snowflake #Python #Redshift #Databases #Data Quality #AWS (Amazon Web Services) #Automation
Role description
GenAI Data Engineer

Location: London / Edinburgh (Hybrid, 2 days per week in the office)
Day Rate: Market rate (Inside IR35)
Duration: 6 months

The Role
We are seeking a highly skilled GenAI Data Engineer to join a forward-thinking team delivering advanced data and AI solutions. The role focuses on designing scalable data platforms and integrating Generative AI capabilities into enterprise systems.

Key Responsibilities
• Design, build and maintain scalable data pipelines using PySpark, Python and distributed computing frameworks (a minimal sketch of this kind of pipeline follows below)
• Architect and optimise AWS-based data and AI infrastructure for secure, high-performance data processing
• Develop, fine-tune, benchmark and evaluate GenAI/LLM models, including custom training and inference optimisation
• Implement and maintain Retrieval-Augmented Generation (RAG) pipelines, vector databases and document processing workflows
• Build reusable frameworks for prompt management, evaluation and GenAI operations
• Collaborate with cross-functional teams to integrate GenAI solutions into production environments
• Ensure data quality, governance and operational reliability across systems

Essential Skills & Experience
• Strong experience with PySpark and large-scale distributed data processing (ETL/ELT pipelines)
• Advanced SQL expertise, including schema design (star/snowflake), indexing and optimisation techniques
• Proven experience implementing CDC and SCD (Type 1/2/3) in data warehousing environments (see the Type 2 sketch below)
• Advanced Python skills for data engineering, automation and AI/ML integration
• Hands-on experience with AWS services (e.g. S3, Glue, Lambda, EMR, Bedrock or custom model hosting)
• Practical experience developing and evaluating GenAI/LLM models
• Strong understanding of RAG architectures, embeddings and vector databases (see the retrieval sketch below)
• Experience handling structured and unstructured data (e.g. text, documents, logs, images)
• Knowledge of scalable storage solutions such as Delta Lake, Parquet, Redshift and DynamoDB
• Experience optimising AI models (e.g. quantisation, distillation, inference tuning)
• Strong troubleshooting and performance optimisation skills across distributed systems

Desirable Skills
• Additional experience across PySpark, Python, SQL, AWS and GenAI technologies

Personal Attributes
• Strong analytical and problem-solving abilities
• Effective communication skills and ability to work collaboratively
• Experience working in agile, cross-functional environments
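
For candidates sizing up the pipeline work above, here is a minimal PySpark ETL sketch: read raw JSON events from S3, apply basic cleaning, and write partitioned Parquet. All bucket names, paths and column names are hypothetical illustrations, not details of the client's estate.

from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("genai-etl-sketch").getOrCreate()

# Hypothetical source path; in practice this might land via Glue or EMR.
raw = spark.read.json("s3://example-bucket/raw/events/")

cleaned = (
    raw
    .filter(F.col("event_id").isNotNull())            # drop malformed rows
    .withColumn("event_date", F.to_date("event_ts"))  # derive a partition column
    .dropDuplicates(["event_id"])                     # basic data-quality step
)

# Partitioned Parquet keeps downstream reads cheap; hypothetical target path.
(cleaned.write
    .mode("overwrite")
    .partitionBy("event_date")
    .parquet("s3://example-bucket/curated/events/"))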
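The SCD requirement can be met several ways; one common pattern is a Type 2 merge on Delta Lake (expire the current row, append the new version). A sketch, assuming a Spark session already configured for Delta; the table, path and column names (customer_id, address, is_current, start_date, end_date) are hypothetical.

from delta.tables import DeltaTable
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("scd2-sketch").getOrCreate()

dim = DeltaTable.forPath(spark, "s3://example-bucket/dim_customer/")   # hypothetical
updates = spark.read.parquet("s3://example-bucket/staging/customers/")  # CDC feed

# Step 1: close out current rows whose tracked attribute changed.
(dim.alias("d")
 .merge(updates.alias("u"),
        "d.customer_id = u.customer_id AND d.is_current = true")
 .whenMatchedUpdate(
     condition="d.address <> u.address",
     set={"is_current": "false", "end_date": "current_date()"})
 .execute())

# Step 2: append the new current versions. For brevity this appends every
# staged row; a full implementation would filter to changed or new keys.
new_rows = (updates
            .withColumn("is_current", F.lit(True))
            .withColumn("start_date", F.current_date())
            .withColumn("end_date", F.lit(None).cast("date")))
new_rows.write.format("delta").mode("append").save("s3://example-bucket/dim_customer/")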
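On the RAG architecture point, the core retrieval step reduces to nearest-neighbour search over embeddings. A self-contained sketch in plain NumPy: embed() is a placeholder standing in for a real embedding model (e.g. one hosted on Bedrock), and a production system would use a proper vector database rather than an in-memory matrix.

import numpy as np

def embed(texts: list[str]) -> np.ndarray:
    # Placeholder: a real pipeline would call an embedding model here.
    rng = np.random.default_rng(0)
    return rng.normal(size=(len(texts), 384))

docs = ["contract terms...", "onboarding guide...", "pipeline runbook..."]
doc_vecs = embed(docs)
doc_vecs /= np.linalg.norm(doc_vecs, axis=1, keepdims=True)  # unit-normalise

def retrieve(query: str, k: int = 2) -> list[str]:
    q = embed([query])[0]
    q /= np.linalg.norm(q)
    scores = doc_vecs @ q                 # cosine similarity on unit vectors
    top = np.argsort(scores)[::-1][:k]    # indices of the k best matches
    return [docs[i] for i in top]

# The retrieved chunks would then be packed into the LLM prompt as context.
print(retrieve("how do I restart the pipeline?"))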