Vallum Associates

Gen AI Data Engineer - PySpark/Python

⭐ - Featured Role | Apply directly with Data Freelance Hub
This role is for a Gen AI Data Engineer with strong PySpark, Python, and SQL skills, requiring experience in large-scale ETL/ELT pipelines and AWS services. The contract is hybrid for 6 months, based in London or Edinburgh, UK.
🌎 - Country
United Kingdom
πŸ’± - Currency
Β£ GBP
-
πŸ’° - Day rate
Unknown
-
πŸ—“οΈ - Date
April 29, 2026
πŸ•’ - Duration
Unknown
-
🏝️ - Location
Hybrid
-
πŸ“„ - Contract
Inside IR35
-
πŸ”’ - Security
Unknown
-
πŸ“ - Location detailed
London Area, United Kingdom
-
🧠 - Skills detailed
#Lambda (AWS Lambda) #AI (Artificial Intelligence) #S3 (Amazon Simple Storage Service) #Indexing #DynamoDB #SQL (Structured Query Language) #ETL (Extract, Transform, Load) #Model Optimization #Delta Lake #PySpark #Datasets #Scala #Data Engineering #Data Storage #ML (Machine Learning) #Data Processing #Storage #Schema Design #Spark (Apache Spark) #Snowflake #Python #Redshift #AWS (Amazon Web Services) #Automation
Role description
The Role: GenAI Data Engineer
Location: London or Edinburgh, UK
Position Type: Contract, Inside IR35
Remote Work Option: Hybrid – 2 days onsite

Job Description:

Essential skills/knowledge/experience:
• Strong experience with PySpark, distributed data processing, and large-scale ETL/ELT pipelines.
• Strong SQL expertise, including star/snowflake schema design, indexing strategies, writing optimized queries, and implementing CDC and SCD Type 1/2/3 patterns for reliable data warehousing.
• Advanced proficiency in Python for data engineering, automation, and ML/GenAI integration.
• Hands-on expertise with AWS services (S3, Glue, Lambda, EMR, Bedrock / custom model hosting).
• Practical experience with GenAI/LLM model creation, fine-tuning, benchmarking, and evaluation.
• Solid understanding of RAG architectures, embeddings, vector stores, and LLM evaluation methods.
• Experience working with structured and unstructured datasets (documents, logs, text, images).
• Familiarity with scalable data storage solutions (Delta Lake, Parquet, Redshift, DynamoDB).
• Understanding of model optimization techniques (quantization, distillation, inference optimization).
• Strong capability to debug, tune, and optimize distributed systems and AI pipelines.

Desirable skills/knowledge/experience:
• PySpark, Python, SQL, AWS, GenAI
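For candidates unfamiliar with the SCD Type 2 pattern called out in the essential skills, the idea can be sketched in a few lines of plain Python: rather than overwriting a changed dimension row (Type 1), the old version is closed out with an end date and a new current version is appended, preserving full history. The class, field names, and dates below are invented for illustration only; in practice this would typically be expressed as a Delta Lake or warehouse MERGE.

```python
from dataclasses import dataclass, replace
from datetime import date
from typing import Optional

# Hypothetical dimension row for the sketch; real schemas vary.
@dataclass
class DimRow:
    customer_id: int
    city: str
    valid_from: date
    valid_to: Optional[date] = None  # None means "current version"
    is_current: bool = True

def apply_scd2(dimension: list, updates: dict, as_of: date) -> list:
    """Apply SCD Type 2: expire changed current rows, append new versions."""
    out = []
    for row in dimension:
        changed = (
            row.is_current
            and row.customer_id in updates
            and updates[row.customer_id] != row.city
        )
        if changed:
            # Close out the old version instead of overwriting it.
            out.append(replace(row, valid_to=as_of, is_current=False))
            # Insert the new current version effective from `as_of`.
            out.append(DimRow(row.customer_id, updates[row.customer_id], as_of))
        else:
            out.append(row)
    return out

dim = [DimRow(1, "London", date(2024, 1, 1))]
dim = apply_scd2(dim, {1: "Edinburgh"}, date(2025, 6, 1))
for r in dim:
    print(r.customer_id, r.city, r.valid_from, r.valid_to, r.is_current)
```

After the update, the table holds two rows for customer 1: the expired London version and the current Edinburgh one, which is exactly the history a Type 1 overwrite would discard.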