

Droisys
GenAI Data Engineer with Databricks and PySpark Experience
⭐ - Featured Role | Apply directly with Data Freelance Hub
This role is for a GenAI Data Engineer with 5–8+ years of experience, focused on Databricks, PySpark, and Python. It offers a contract in Plano, TX, at $40-$47/hr W2, requiring expertise in ETL/ELT pipelines and GenAI workflows.
🌎 - Country
United States
💱 - Currency
$ USD
-
💰 - Day rate
376
-
🗓️ - Date
November 22, 2025
🕒 - Duration
Unknown
-
🏝️ - Location
On-site
-
📄 - Contract
W2 Contractor
-
🔒 - Security
Unknown
-
📍 - Location detailed
Plano, TX
-
🧠 - Skills detailed
#Langchain #REST API #Strategy #Jenkins #Pytest #SageMaker #Databricks #Azure Databricks #Monitoring #Microservices #Schema Design #Azure #AI (Artificial Intelligence) #Delta Lake #Scala #REST (Representational State Transfer) #GIT #Data Engineering #Agile #Deployment #DevOps #Data Pipeline #Version Control #AWS (Amazon Web Services) #Automated Testing #SQL (Structured Query Language) #Data Quality #PySpark #Docker #ML (Machine Learning) #Python #Cloud #GCP (Google Cloud Platform) #Spark (Apache Spark) #ETL (Extract, Transform, Load) #Azure DevOps #Databases
Role description
About the Company
Droisys is an innovation technology company focused on helping companies accelerate their digital initiatives from strategy and planning through execution. We leverage deep technical expertise, Agile methodologies, and data-driven intelligence to modernize systems of engagement and simplify human/tech interaction.
Amazing things happen when we work in environments where everyone feels a true sense of belonging and when candidates have the requisite skills and opportunities to succeed. At Droisys, we invest in our talent and support career growth, and we are always on the lookout for amazing talent who can contribute to our growth by delivering top results for our clients. Join us to challenge yourself and accomplish work that matters.
GenAI Data Engineer with Databricks and PySpark experience
Plano, TX — Fully Onsite from Day 1
Interview Mode: Phone & Face-to-Face
Rate Range: $40 to $47/hr, W2, all-inclusive
Role Overview
We are seeking GenAI-focused Data Engineers with strong expertise in Databricks SQL, PySpark, Python, and Pytest to design, build, and maintain scalable data pipelines and testing frameworks. In this role, you will work closely with AI, data, and application engineering teams to support Generative AI solutions, LLM workflows, and web application integrations. The ideal candidate has a hybrid skill set spanning data engineering and application development, with hands-on experience building robust, production-grade systems.
Key Responsibilities
• Build, optimize, and maintain scalable data pipelines using Databricks, PySpark, and SQL.
• Design and implement data workflows supporting GenAI, LLM-based applications, and RAG pipelines.
• Develop Python-based application modules to support end-to-end AI/ML workflows.
• Create and maintain automated testing frameworks using Pytest for data pipelines, APIs, and GenAI components.
• Integrate data engineering pipelines with GenAI services, vector databases, and model APIs.
• Work with cross-functional teams to deliver high-quality, reliable data models and application components.
• Troubleshoot data quality issues, optimize performance, and ensure pipeline reliability.
• Implement CI/CD for data and application workflows, ensuring version control and deployment consistency.
• Document design patterns, architecture, and processes for ongoing maintenance and scalability.
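To give a concrete (hypothetical, not part of the posting) sense of the "automated testing frameworks using Pytest for data pipelines" responsibility above, a minimal sketch might look like the following. The `dedupe_records` function and its field names are invented for illustration, and plain Python dicts stand in for Spark DataFrames so the example stays self-contained; in practice the transform would operate on a PySpark DataFrame and the test would run against a local SparkSession:

```python
# Hypothetical pipeline step: deduplicate rows by "id" and drop rows
# missing required fields -- a stand-in for a PySpark transformation.
def dedupe_records(records, required=("id", "value")):
    seen, cleaned = set(), []
    for row in records:
        # Data-quality rule: reject rows with missing required fields.
        if any(row.get(field) is None for field in required):
            continue
        # Keep only the first occurrence of each id.
        if row["id"] in seen:
            continue
        seen.add(row["id"])
        cleaned.append(row)
    return cleaned


# Pytest auto-discovers and runs functions named test_*; plain
# `assert` statements are all it needs.
def test_drops_duplicates_and_nulls():
    rows = [
        {"id": 1, "value": "a"},
        {"id": 1, "value": "b"},   # duplicate id -> dropped
        {"id": 2, "value": None},  # null required field -> dropped
        {"id": 3, "value": "c"},
    ]
    cleaned = dedupe_records(rows)
    assert [r["id"] for r in cleaned] == [1, 3]
```

The same pattern scales up: each pipeline stage gets a pure transformation function plus a Pytest test feeding it small, hand-built inputs with known expected outputs.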
Required Skills & Qualifications
• 5–8+ years of experience as a Data Engineer / Application Engineer.
• Strong hands-on experience with Databricks SQL, PySpark, and Python.
• Experience designing and maintaining ETL/ELT pipelines in cloud environments (Azure/AWS/GCP).
• Proficiency in Pytest for automated testing of data pipelines and backend application logic.
• Solid understanding of GenAI workflows, LLM integration, APIs, and model-driven data pipelines.
• Experience with Delta Lake, Databricks Workflows, and data optimization techniques.
• Strong experience with SQL, schema design, and performance tuning.
• Familiarity with REST APIs, microservices, and Python-based application development.
Preferred Qualifications
• Experience with vector databases (Pinecone, Weaviate, Milvus, FAISS).
• Knowledge of RAG pipelines, embeddings, and LLM orchestration frameworks (LangChain, LlamaIndex).
• Knowledge of DevOps tools: Git, Jenkins, Azure DevOps, Docker.
• Experience with cloud-based ML/AI platforms such as Azure Databricks, Data Factory, SageMaker, or Vertex AI.
• Exposure to MLOps or AI model monitoring is a plus.
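As a hedged illustration of the vector-database and RAG retrieval pattern named in the preferred qualifications (not an example from the posting itself), the core idea is cosine-similarity search over document embeddings. The toy in-memory store and hand-written 3-dimensional vectors below are stand-ins for a real vector database (Pinecone, FAISS, etc.) and a real embedding model:

```python
import math

# Toy in-memory "vector store": {doc_id: embedding}. In a real RAG
# pipeline this would be Pinecone, Weaviate, Milvus, or FAISS, and the
# embeddings would come from an embedding model, not be hand-written.
STORE = {
    "doc_invoices": [0.9, 0.1, 0.0],
    "doc_pipelines": [0.1, 0.9, 0.2],
    "doc_hr_policy": [0.0, 0.2, 0.9],
}

def cosine(a, b):
    # Cosine similarity: dot product over the product of magnitudes.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def retrieve(query_vec, k=2):
    # Rank stored documents by similarity to the query embedding and
    # return the top k; their text would then be injected into the
    # LLM prompt as retrieval context.
    ranked = sorted(STORE, key=lambda d: cosine(query_vec, STORE[d]),
                    reverse=True)
    return ranked[:k]
```

A query embedding pointing roughly in the "pipelines" direction, e.g. `retrieve([0.2, 0.95, 0.1], k=1)`, returns `["doc_pipelines"]`; a production system swaps the dict for an approximate-nearest-neighbor index so retrieval stays fast at millions of documents.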
Droisys is an equal opportunity employer. We do not discriminate based on race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status or any other characteristic protected by law. Droisys believes in diversity, inclusion, and belonging, and we are committed to fostering a diverse work environment.





