Feuji

Data Engineer

⭐ - Featured Role | Apply directly with Data Freelance Hub
This role is for a Data Engineer with an unknown contract length and a day rate of $440 USD. Key skills include Scala, Python, Spark, and cloud platform experience. Familiarity with LangChain and Generative AI is preferred.
🌎 - Country
United States
💱 - Currency
$ USD
💰 - Day rate
440
🗓️ - Date
November 18, 2025
🕒 - Duration
Unknown
🏝️ - Location
Unknown
📄 - Contract
Unknown
🔒 - Security
Unknown
📍 - Location detailed
Atlanta, GA
🧠 - Skills detailed
#PySpark #Agile #Azure #Programming #Airflow #Apache Spark #Data Science #SQL (Structured Query Language) #GCP (Google Cloud Platform) #Kubernetes #Data Ingestion #AWS (Amazon Web Services) #Data Lake #Data Modeling #AI (Artificial Intelligence) #Observability #Python #ETL (Extract, Transform, Load) #Cloud #Kafka (Apache Kafka) #Big Data #Scala #Redshift #Docker #PostgreSQL #Data Architecture #Spark (Apache Spark) #Data Quality #Data Engineering #MySQL #ML (Machine Learning) #Data Pipeline #Databases #Snowflake #LangChain
Role description
Job Summary:
We are looking for a skilled and motivated Data Engineer with strong hands-on experience in Scala/PySpark and Python, and familiarity with LangChain and Generative AI technologies. The ideal candidate will build scalable data pipelines, design efficient data architectures, and integrate cutting-edge AI tools to drive data-driven solutions across the organization.

Key Responsibilities:
• Design, develop, and maintain scalable and efficient data pipelines using Scala, Python, and Spark (see the PySpark sketch after this description).
• Integrate structured and unstructured data from various sources to support downstream analytics and machine learning models.
• Work closely with Data Scientists and Machine Learning Engineers to deploy LLM- and GenAI-powered solutions.
• Explore and implement LangChain and other frameworks for LLM orchestration and prompt chaining (see the LangChain sketch below).
• Build and maintain ETL/ELT pipelines to support data ingestion, transformation, and loading from diverse sources (cloud, APIs, etc.).
• Ensure data quality, observability, and governance best practices across the pipeline (see the data-quality sketch below).
• Collaborate in Agile teams to deliver well-architected, high-performance data engineering solutions.

Required Skills:
• Strong programming skills in Python.
• Hands-on experience with Big Data tools such as Apache Spark, Kafka, and Airflow.
• Proficiency in SQL and experience with databases (PostgreSQL, MySQL, Redshift, Snowflake, etc.).
• Working knowledge of data modeling, data warehousing, and data lake architectures.
• Experience building and deploying pipelines on cloud platforms (AWS, Azure, GCP).

Good to Have:
• Familiarity with LangChain, OpenAI APIs, LlamaIndex, or similar frameworks.
• Exposure to Generative AI concepts, LLM-based app development, and prompt engineering.
• Experience with vector databases (Pinecone, FAISS, Weaviate).
• Experience with containerization (Docker/Kubernetes) and CI/CD pipelines.
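For a sense of the pipeline work the responsibilities describe, here is a minimal PySpark ETL sketch. The bucket paths, column names, and transformations are hypothetical placeholders for illustration, not details taken from this posting.

```python
from pyspark.sql import SparkSession, functions as F

# Hypothetical example: ingest raw JSON events, clean them, and write a
# curated dataset for downstream analytics. All names are placeholders.
spark = SparkSession.builder.appName("events-etl").getOrCreate()

raw = spark.read.json("s3://example-bucket/raw/events/")  # assumed source

cleaned = (
    raw
    .filter(F.col("event_id").isNotNull())            # drop malformed rows
    .withColumn("event_date", F.to_date("event_ts"))  # derive a partition key
    .dropDuplicates(["event_id"])                     # basic de-duplication
)

# Write partitioned Parquet for downstream consumers.
(cleaned.write
    .mode("overwrite")
    .partitionBy("event_date")
    .parquet("s3://example-bucket/curated/events/"))
```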
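The data-quality responsibility could be approached with simple pre-publish assertions that stop a pipeline before bad data lands. A minimal sketch, assuming a key column that must be non-null; the function name, demo data, and thresholds are illustrative only.

```python
from pyspark.sql import DataFrame, SparkSession, functions as F

def check_quality(df: DataFrame, key: str, min_rows: int = 1) -> None:
    """Fail fast if the frame is too small or the key column has nulls."""
    n = df.count()
    if n < min_rows:
        raise ValueError(f"expected at least {min_rows} rows, got {n}")
    null_keys = df.filter(F.col(key).isNull()).count()
    if null_keys:
        raise ValueError(f"{null_keys} rows have a null '{key}'")

# Tiny demo frame; in practice this would be the curated output above.
spark = SparkSession.builder.appName("dq-demo").getOrCreate()
events = spark.createDataFrame([(1, "ok"), (2, "ok")], ["event_id", "status"])
check_quality(events, key="event_id")
```

Failing loudly before the load step is one common way to keep bad records out of warehouse tables; teams often swap a sketch like this for a dedicated data-quality framework as pipelines grow.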
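For the LangChain orchestration item, the sketch below chains two prompts using LangChain's expression language (LCEL), where `|` composes prompt, model, and output parser into one runnable. Import paths and the model name vary by LangChain version and provider, so treat this as a rough outline rather than a drop-in example.

```python
# Minimal LangChain prompt-chaining sketch (LCEL style); illustrative only.
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(model="gpt-4o-mini")  # assumed model name

# Step 1: summarize a raw record.
summarize = ChatPromptTemplate.from_template(
    "Summarize the following record in one sentence:\n{record}"
)

# Step 2: classify the summary produced by step 1.
classify = ChatPromptTemplate.from_template(
    "Label this summary as 'finance', 'ops', or 'other':\n{summary}"
)

chain = (
    summarize | llm | StrOutputParser()
    | (lambda summary: {"summary": summary})  # re-key output for step 2
    | classify | llm | StrOutputParser()
)

print(chain.invoke({"record": "Q3 invoice totals exceeded forecast by 12%."}))
```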