Feuji

Data Engineer

⭐ - Featured Role | Apply directly with Data Freelance Hub
This role is for a Data Engineer with an unknown contract length and a day rate of $440 USD. Key skills include Scala, Python, Spark, and cloud platform experience. Familiarity with LangChain and Generative AI is preferred.
🌎 - Country
United States
💱 - Currency
$ USD
💰 - Day rate
440
🗓️ - Date
November 18, 2025
🕒 - Duration
Unknown
🏝️ - Location
Unknown
📄 - Contract
Unknown
🔒 - Security
Unknown
📍 - Location detailed
Atlanta, GA
🧠 - Skills detailed
#PySpark #Agile #Azure #Programming #Airflow #Apache Spark #Data Science #SQL (Structured Query Language) #GCP (Google Cloud Platform) #Kubernetes #Data Ingestion #AWS (Amazon Web Services) #Data Lake #Data Modeling #AI (Artificial Intelligence) #Observability #Python #ETL (Extract, Transform, Load) #Cloud #Kafka (Apache Kafka) #Big Data #Scala #Redshift #Docker #PostgreSQL #Data Architecture #Spark (Apache Spark) #Data Quality #Data Engineering #MySQL #ML (Machine Learning) #Data Pipeline #Databases #Snowflake #LangChain
Role description
Job Summary:
We are looking for a skilled and motivated Data Engineer with strong hands-on experience in Scala/PySpark and Python, and familiarity with LangChain and Generative AI technologies. The ideal candidate will build scalable data pipelines, design efficient data architectures, and integrate cutting-edge AI tools to drive data-driven solutions across the organization.

Key Responsibilities:
• Design, develop, and maintain scalable and efficient data pipelines using Scala, Python, and Spark (see the PySpark sketch after this description).
• Integrate structured and unstructured data from various sources to support downstream analytics and machine learning models.
• Work closely with Data Scientists and Machine Learning Engineers to deploy LLM- and GenAI-powered solutions.
• Explore and implement LangChain and other frameworks for LLM orchestration and prompt chaining (see the LangChain sketch below).
• Build and maintain ETL/ELT pipelines to support data ingestion, transformation, and loading from diverse sources (cloud, APIs, etc.).
• Ensure data quality, observability, and governance best practices across the pipeline (see the data-quality sketch below).
• Collaborate in Agile teams to deliver well-architected, high-performance data engineering solutions.

Required Skills:
• Strong programming skills in Python.
• Hands-on experience with Big Data tools such as Apache Spark, Kafka, and Airflow.
• Proficiency in SQL and experience with databases (PostgreSQL, MySQL, Redshift, Snowflake, etc.).
• Working knowledge of data modeling, data warehousing, and data lake architectures.
• Experience building and deploying pipelines on cloud platforms (AWS, Azure, GCP).

Good to Have:
• Familiarity with LangChain, OpenAI APIs, LlamaIndex, or similar frameworks.
• Exposure to Generative AI concepts, LLM-based app development, and prompt engineering.
• Experience with vector databases (Pinecone, FAISS, Weaviate).
• Experience with containerization (Docker/Kubernetes) and CI/CD pipelines.
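For a sense of the pipeline work the responsibilities describe, here is a minimal PySpark ETL sketch. The bucket paths, column names, and transformations are hypothetical placeholders for illustration, not details taken from this posting.

```python
from pyspark.sql import SparkSession, functions as F

# Hypothetical example: ingest raw JSON events, clean them, and write a
# curated dataset for downstream analytics. All names are placeholders.
spark = SparkSession.builder.appName("events-etl").getOrCreate()

raw = spark.read.json("s3://example-bucket/raw/events/")  # assumed source

cleaned = (
    raw
    .filter(F.col("event_id").isNotNull())            # drop malformed rows
    .withColumn("event_date", F.to_date("event_ts"))  # derive a partition key
    .dropDuplicates(["event_id"])                     # basic de-duplication
)

# Write partitioned Parquet for downstream consumers.
(cleaned.write
    .mode("overwrite")
    .partitionBy("event_date")
    .parquet("s3://example-bucket/curated/events/"))
```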
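The data-quality responsibility could be approached with simple pre-publish assertions that stop a pipeline before bad data lands. A minimal sketch, assuming a key column that must be non-null; the function name, demo data, and thresholds are illustrative only.

```python
from pyspark.sql import DataFrame, SparkSession, functions as F

def check_quality(df: DataFrame, key: str, min_rows: int = 1) -> None:
    """Fail fast if the frame is too small or the key column has nulls."""
    n = df.count()
    if n < min_rows:
        raise ValueError(f"expected at least {min_rows} rows, got {n}")
    null_keys = df.filter(F.col(key).isNull()).count()
    if null_keys:
        raise ValueError(f"{null_keys} rows have a null '{key}'")

# Tiny demo frame; in practice this would be the curated output above.
spark = SparkSession.builder.appName("dq-demo").getOrCreate()
events = spark.createDataFrame([(1, "ok"), (2, "ok")], ["event_id", "status"])
check_quality(events, key="event_id")
```

Failing loudly before the load step is one common way to keep bad records out of warehouse tables; teams often swap a sketch like this for a dedicated data-quality framework as pipelines grow.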
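For the LangChain orchestration item, the sketch below chains two prompts using LangChain's expression language (LCEL), where `|` composes prompt, model, and output parser into one runnable. Import paths and the model name vary by LangChain version and provider, so treat this as a rough outline rather than a drop-in example.

```python
# Minimal LangChain prompt-chaining sketch (LCEL style); illustrative only.
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(model="gpt-4o-mini")  # assumed model name

# Step 1: summarize a raw record.
summarize = ChatPromptTemplate.from_template(
    "Summarize the following record in one sentence:\n{record}"
)

# Step 2: classify the summary produced by step 1.
classify = ChatPromptTemplate.from_template(
    "Label this summary as 'finance', 'ops', or 'other':\n{summary}"
)

chain = (
    summarize | llm | StrOutputParser()
    | (lambda summary: {"summary": summary})  # re-key output for step 2
    | classify | llm | StrOutputParser()
)

print(chain.invoke({"record": "Q3 invoice totals exceeded forecast by 12%."}))
```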