

Stott and May
Generative AI Engineer
Featured Role | Apply direct with Data Freelance Hub
This role is for a Generative AI Engineer (GenAI Data Engineer) based in London/Edinburgh (hybrid, 2 days per week in the office). The contract runs for 6 months at a market day rate. Key skills include PySpark, AWS, SQL, and experience with GenAI/LLM models.
Country: United Kingdom
Currency: £ GBP
Day rate: Unknown
Date: April 29, 2026
Duration: More than 6 months
Location: Hybrid
Contract: Inside IR35
Security: Unknown
Location detailed: London Area, United Kingdom
Skills detailed: #Lambda (AWS Lambda) #AI (Artificial Intelligence) #S3 (Amazon Simple Storage Service) #Indexing #DynamoDB #SQL (Structured Query Language) #ETL (Extract, Transform, Load) #Delta Lake #PySpark #Data Pipeline #Scala #Data Engineering #Distributed Computing #ML (Machine Learning) #Data Processing #Storage #Agile #Schema Design #Spark (Apache Spark) #Snowflake #Python #Redshift #Databases #Data Quality #AWS (Amazon Web Services) #Automation
Role description
GenAI Data Engineer
Location: London / Edinburgh (Hybrid - 2 days per week in the office)
Day Rate: Market rate (Inside IR35)
Duration: 6 months
The Role
We are seeking a highly skilled GenAI Data Engineer to join a forward-thinking team delivering advanced data and AI solutions. This role will focus on designing scalable data platforms and integrating Generative AI capabilities into enterprise systems.
Key Responsibilities
• Design, build and maintain scalable data pipelines using PySpark, Python and distributed computing frameworks
• Architect and optimise AWS-based data and AI infrastructure for secure, high-performance data processing
• Develop, fine-tune, benchmark and evaluate GenAI/LLM models, including custom training and inference optimisation
• Implement and maintain Retrieval-Augmented Generation (RAG) pipelines, vector databases and document processing workflows (a minimal retrieval sketch follows this list)
• Build reusable frameworks for prompt management, evaluation and GenAI operations
• Collaborate with cross-functional teams to integrate GenAI solutions into production environments
• Ensure data quality, governance and operational reliability across systems
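To make the RAG responsibility above concrete, here is a minimal, illustrative sketch of the retrieval step only. The embed() function is a toy hashed bag-of-words stand-in, not a real embedding model; a production pipeline of the kind this role describes would call a hosted embedding model (for example via Amazon Bedrock) and query a proper vector database rather than an in-memory NumPy index.

    # Minimal RAG retrieval sketch. embed() is a toy stand-in for a real
    # embedding model; the in-memory index stands in for a vector database.
    import numpy as np

    DIM = 256

    def embed(text: str) -> np.ndarray:
        # Toy embedding: hash each token into a fixed-size, L2-normalised vector.
        vec = np.zeros(DIM)
        for token in text.lower().split():
            vec[hash(token) % DIM] += 1.0
        norm = np.linalg.norm(vec)
        return vec / norm if norm else vec

    # "Index" a handful of documents (a real system would chunk and store these).
    docs = [
        "PySpark pipelines transform raw events into curated tables",
        "RAG systems ground LLM answers in retrieved documents",
        "Delta Lake adds ACID transactions to data lakes",
    ]
    index = np.stack([embed(d) for d in docs])

    def retrieve(query: str, k: int = 2) -> list[str]:
        # Cosine similarity over unit vectors reduces to a dot product.
        scores = index @ embed(query)
        return [docs[i] for i in np.argsort(scores)[::-1][:k]]

    # The retrieved passages would be spliced into the LLM prompt as context.
    print(retrieve("how do we ground LLM answers?"))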
Essential Skills & Experience
• Strong experience with PySpark and large-scale distributed data processing (ETL/ELT pipelines)
• Advanced SQL expertise, including schema design (star/snowflake), indexing and optimisation techniques
• Proven experience implementing CDC and SCD (Type 1/2/3) in data warehousing environments (see the SCD Type 2 sketch after this list)
• Advanced Python skills for data engineering, automation and AI/ML integration
• Hands-on experience with AWS services (e.g. S3, Glue, Lambda, EMR, Bedrock or custom model hosting)
• Practical experience developing and evaluating GenAI/LLM models
• Strong understanding of RAG architectures, embeddings and vector databases
• Experience handling structured and unstructured data (e.g. text, documents, logs, images)
• Knowledge of scalable storage solutions such as Delta Lake, Parquet, Redshift and DynamoDB
• Experience optimising AI models (e.g. quantisation, distillation, inference tuning)
• Strong troubleshooting and performance optimisation skills across distributed systems
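For the CDC/SCD requirement above, here is a minimal PySpark sketch of a Type 2 change under illustrative assumptions: the dimension layout (customer_id, effective_from, effective_to, is_current) is hypothetical, and a production warehouse would more likely apply this as a Delta Lake MERGE than as DataFrame unions.

    # SCD Type 2 sketch in PySpark: close the old row, open a new one.
    from pyspark.sql import SparkSession, functions as F

    spark = SparkSession.builder.appName("scd2-sketch").getOrCreate()

    # Current dimension state: one open-ended row per business key.
    dim = spark.createDataFrame(
        [(1, "Alice", "London", "2024-01-01", None, True)],
        "customer_id INT, name STRING, city STRING, "
        "effective_from STRING, effective_to STRING, is_current BOOLEAN",
    )

    # Change captured by CDC: Alice moved to Edinburgh.
    updates = spark.createDataFrame(
        [(1, "Alice", "Edinburgh", "2026-04-29")],
        "customer_id INT, name STRING, city STRING, effective_from STRING",
    )

    # Current rows whose tracked attribute actually changed.
    changed = (
        dim.filter("is_current")
           .join(updates.select("customer_id",
                                F.col("city").alias("new_city"),
                                F.col("effective_from").alias("new_from")),
                 "customer_id")
           .filter(F.col("city") != F.col("new_city"))
    )

    # Type 2 keeps history: expire the old version rather than overwrite it.
    expired = (changed.withColumn("effective_to", F.col("new_from"))
                      .withColumn("is_current", F.lit(False))
                      .drop("new_city", "new_from"))

    # Open a new, current version for each changed key.
    new_rows = (updates.withColumn("effective_to", F.lit(None).cast("string"))
                       .withColumn("is_current", F.lit(True))
                       .select(*dim.columns))

    # Reassemble: prior history, untouched current rows, expired rows, new rows.
    history = dim.filter(~F.col("is_current"))
    unchanged = dim.filter("is_current").join(
        changed.select("customer_id"), "customer_id", "left_anti")
    result = (history.unionByName(unchanged)
                     .unionByName(expired)
                     .unionByName(new_rows))
    result.show()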
Desirable Skills
• Additional experience across PySpark, Python, SQL, AWS and GenAI technologies
Personal Attributes
• Strong analytical and problem-solving abilities
• Effective communication skills and ability to work collaboratively
• Experience working in agile, cross-functional environments