Infoplus Technologies UK Limited

Data Engineer

This role is for a Data Engineer. The contract length and pay rate are not specified. Key skills include 7+ years of AWS data engineering, experience with S3, Glue, Athena, and OpenSearch, and strong metadata modelling experience in scientific domains.
🌎 - Country
United Kingdom
💱 - Currency
£ GBP
💰 - Day rate
Unknown
🗓️ - Date
May 27, 2026
🕒 - Duration
Unknown
🏝️ - Location
Unknown
📄 - Contract
Unknown
🔒 - Security
Unknown
📍 - Location detailed
United Kingdom
🧠 - Skills detailed
#AWS (Amazon Web Services) #OpenSearch #Visualization #Athena #Spark (Apache Spark) #S3 (Amazon Simple Storage Service) #AI (Artificial Intelligence) #API (Application Programming Interface) #Metadata #PySpark #UAT (User Acceptance Testing) #Python #Data Orchestration #Cloud #Data Engineering #JSON (JavaScript Object Notation) #ML (Machine Learning)
Role description
Key responsibilities on this engagement
• Run the Sprint 1 architecture review of the existing UAT codebase (S3 + Glue + S3 Tables + OpenSearch + Athena) and deliver written gap findings.
• Design the metadata schema, taxonomy, and field catalogue (Light, Brain, Power).
• Tune data orchestration: Glue jobs, Athena queries, S3 Tables config, scheduling.
• Lead the deep-dive technical sessions with analysts on visualization requirements.
• Build and validate the simulation data onboarding pipeline against real data, including the 30 GB-per-run acoustic spectra dataset.
• Configure and validate the OpenSearch k-NN vector store and the Bedrock embedding pipeline.
• Author the AI/ML data export format specification and the AI onboarding pattern document.
• Co-design the API middleware blueprint with the Cloud Infrastructure Architect.

Must-have
• Principal-level hands-on data engineering on AWS (7+ years)
• Deep production experience with S3, S3 Tables, Glue, Athena, and OpenSearch (including k-NN / vector search)
• Built and shipped vector embedding workloads
• Strong metadata modelling and data taxonomy design experience for scientific or engineering domains
• Comfort working with Parquet, JSON-LD, and large binary scientific data formats (mesh, time-series, spectra)
• Python proficiency; PySpark / Glue job tuning experience