

Feuji
Data Engineer
⭐ - Featured Role | Apply directly with Data Freelance Hub
This role is for a Data Engineer with a contract length of "unknown" and a day rate of $440 USD. Key skills include Scala, Python, Spark, and cloud platforms; familiarity with LangChain and Generative AI is preferred.
🌎 - Country
United States
💱 - Currency
$ USD
-
💰 - Day rate
440
-
🗓️ - Date
November 18, 2025
🕒 - Duration
Unknown
-
🏝️ - Location
Unknown
-
📄 - Contract
Unknown
-
🔒 - Security
Unknown
-
📍 - Location detailed
Atlanta, GA
-
🧠 - Skills detailed
#PySpark #Agile #Azure #Programming #Airflow #Apache Spark #Data Science #SQL (Structured Query Language) #GCP (Google Cloud Platform) #Kubernetes #Data Ingestion #AWS (Amazon Web Services) #Data Lake #Data Modeling #AI (Artificial Intelligence) #Observability #Python #ETL (Extract, Transform, Load) #Cloud #Kafka (Apache Kafka) #Big Data #Scala #Redshift #Docker #PostgreSQL #Data Architecture #Spark (Apache Spark) #Data Quality #Data Engineering #MySQL #ML (Machine Learning) #Data Pipeline #Databases #Snowflake #LangChain
Role description
Job Summary:
We are looking for a skilled and motivated Data Engineer with strong hands-on experience in Scala/PySpark and Python, and familiarity with LangChain and Generative AI technologies. The ideal candidate will build scalable data pipelines, design efficient data architectures, and integrate cutting-edge AI tools to drive data-driven solutions across the organization.
Key Responsibilities:
• Design, develop, and maintain scalable and efficient data pipelines using Scala, Python, and Spark (see the sketch after this list).
• Integrate structured and unstructured data from various sources to support downstream analytics and machine learning models.
• Work closely with Data Scientists and Machine Learning Engineers to deploy LLM and GenAI-powered solutions.
• Explore and implement LangChain and other frameworks for LLM orchestration and prompt chaining.
• Build and maintain ETL/ELT pipelines to support data ingestion, transformation, and loading from diverse sources (cloud, APIs, etc.).
• Ensure data quality, observability, and governance best practices across the pipeline.
• Collaborate in Agile teams to deliver well-architected, high-performance data engineering solutions.
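To give a concrete flavor of the pipeline work described above, here is a minimal PySpark sketch. The bucket paths, column names, and aggregation are hypothetical placeholders, not details of this role's actual systems.

    # Minimal PySpark ETL sketch. Paths, columns, and the aggregate
    # are illustrative assumptions only.
    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    spark = SparkSession.builder.appName("orders_pipeline").getOrCreate()

    # Extract: read raw JSON events from cloud storage (path is illustrative).
    raw = spark.read.json("s3://example-bucket/raw/orders/")

    # Transform: basic cleaning plus a daily aggregate for downstream analytics.
    clean = (
        raw.dropDuplicates(["order_id"])
           .filter(F.col("amount") > 0)
           .withColumn("order_date", F.to_date("created_at"))
    )
    daily = clean.groupBy("order_date").agg(F.sum("amount").alias("revenue"))

    # Load: write partitioned Parquet to a curated zone of the data lake.
    daily.write.mode("overwrite").partitionBy("order_date").parquet(
        "s3://example-bucket/curated/daily_revenue/"
    )

The same extract-clean-aggregate-write shape applies whether the sources are files, APIs, or Kafka topics; only the reader and writer change.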
Required Skills:
• Strong programming skills in Python.
• Hands-on experience with Big Data tools such as Apache Spark, Kafka, and Airflow (an orchestration sketch follows this list).
• Proficiency in SQL and experience with databases (PostgreSQL, MySQL, Redshift, Snowflake, etc.).
• Working knowledge of data modeling, data warehousing, and data lake architectures.
• Experience in building and deploying pipelines on cloud platforms (AWS, Azure, GCP).
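As one example of the orchestration experience listed above, the following sketch uses Airflow's TaskFlow API (assuming Airflow 2.4+ for the schedule parameter). The DAG id, schedule, and task bodies are illustrative assumptions, not this team's actual jobs.

    # Minimal Airflow TaskFlow sketch of a daily ingest job.
    # DAG id, task bodies, and targets are hypothetical.
    from datetime import datetime
    from airflow.decorators import dag, task

    @dag(schedule="@daily", start_date=datetime(2025, 1, 1), catchup=False)
    def daily_ingest():
        @task
        def extract() -> list[dict]:
            # Placeholder: pull records from an API or Kafka topic here.
            return [{"order_id": 1, "amount": 42.0}]

        @task
        def load(rows: list[dict]) -> None:
            # Placeholder: upsert into Postgres/Snowflake via a hook here.
            print(f"loaded {len(rows)} rows")

        load(extract())

    daily_ingest()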
Good to Have:
• Familiarity with LangChain, OpenAI APIs, LlamaIndex, or similar frameworks (see the sketch after this list).
• Exposure to Generative AI concepts, LLM-based app development, and prompt engineering.
• Experience with vector databases (Pinecone, FAISS, Weaviate).
• Experience with containerization (Docker/Kubernetes) and CI/CD pipelines.
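For the LangChain item above, here is a minimal prompt-chaining sketch in the LCEL pipe style. It assumes the langchain-core and langchain-openai packages and an OPENAI_API_KEY in the environment; the model name and prompt are illustrative, and LangChain's API evolves quickly, so treat this as directional rather than definitive.

    # Minimal LangChain (LCEL) prompt-chaining sketch. Model name and
    # prompt text are illustrative assumptions.
    from langchain_core.prompts import ChatPromptTemplate
    from langchain_core.output_parsers import StrOutputParser
    from langchain_openai import ChatOpenAI

    prompt = ChatPromptTemplate.from_template(
        "Summarize the data-quality issues in this table profile:\n{profile}"
    )
    llm = ChatOpenAI(model="gpt-4o-mini")

    # The pipe operator composes prompt -> model -> parser into one chain.
    chain = prompt | llm | StrOutputParser()

    print(chain.invoke({"profile": "orders: 2% null amounts, 40 dup order_ids"}))

Swapping in another provider's chat model, or chaining a second prompt onto the parser's output, follows the same pipe pattern.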