

Zillion Technologies, Inc.
Senior Data Scientist
Featured Role | Apply direct with Data Freelance Hub
This role is for a Senior Data Scientist focused on Generative AI, based onsite in McLean, VA; the contract length is unspecified. Key skills include advanced proficiency in prompt engineering, LLMs, and AWS cloud deployments, along with strong programming in Python.
Country: United States
Currency: $ USD
Day rate: Unknown
Date: December 11, 2025
Duration: Unknown
Location: On-site
Contract: Unknown
Security: Unknown
Location detailed: McLean, VA
Skills detailed:
#Kubernetes #Cloud #Programming #Data Science #PySpark #AI (Artificial Intelligence) #MLflow #Python #Transformers #Base #Azure #Libraries #Agile #API (Application Programming Interface) #ML (Machine Learning) #Databases #Apache Spark #Data Engineering #Jupyter #Scala #Langchain #AWS (Amazon Web Services) #Spark (Apache Spark) #ETL (Extract, Transform, Load) #AWS SageMaker #SageMaker #Deployment
Role description
Locals only; in-person interview.
Job Title: Data Scientist Specialist
Location: Onsite in McLean, VA (5 days)
Overview:
We are seeking a highly experienced Principal Gen AI Scientist with a strong focus on Generative AI (GenAI) to lead the design and development of cutting-edge AI Agents, Agentic Workflows and Gen AI Applications that solve complex business problems. This role requires advanced proficiency in Prompt Engineering, Large Language Models (LLMs), RAG, Graph RAG, MCP, A2A, multi-modal AI, Gen AI Patterns, Evaluation Frameworks, Guardrails, data curation, and AWS cloud deployments. You will serve as a hands-on Gen AI (data) scientist and critical thought leader, working alongside full stack developers, UX designers, product managers and data engineers to shape and implement enterprise-grade Gen AI solutions.
Responsibilities:
• Architect and implement scalable AI Agents, Agentic Workflows and GenAI applications to address diverse and complex business use cases.
• Develop, fine-tune, and optimize lightweight LLMs; lead the evaluation and adaptation of models such as Claude (Anthropic), Azure OpenAI, and open-source alternatives.
• Design and deploy Retrieval-Augmented Generation (RAG) and Graph RAG systems using vector databases and knowledge bases.
• Curate enterprise data using connectors integrated with AWS Bedrock's Knowledge Base/Elastic.
• Implement solutions leveraging MCP (Model Context Protocol) and A2A (Agent-to-Agent) communication.
• Build and maintain Jupyter-based notebooks using platforms like AWS SageMaker and MLflow/Kubeflow on Kubernetes (EKS).
• Collaborate with cross-functional teams of UI and microservice engineers, designers, and data engineers to build full-stack Gen AI experiences.
• Integrate GenAI solutions with enterprise platforms via API-based methods and GenAI standardized patterns.
• Establish and enforce validation procedures with Evaluation Frameworks, bias mitigation, safety protocols, and guardrails for production-ready deployment.
• Design and build robust ingestion pipelines that extract, chunk, enrich, and anonymize data from PDFs, video, and audio sources for use in LLM-powered workflows, applying best practices such as semantic chunking and privacy controls.
• Orchestrate multimodal pipelines using scalable frameworks (e.g., Apache Spark, PySpark) for automated ETL/ELT workflows appropriate for unstructured media.
• Implement embedding pipelines: map media content to vector representations using embedding models, and integrate with vector stores (AWS Knowledge Base/Elastic/MongoDB Atlas) to support RAG architectures.
Qualifications:
• Experience in AI/ML, with applied GenAI or LLM-based solutions.
• Deep expertise in prompt engineering, fine-tuning, RAG, GraphRAG, vector databases (e.g., AWS Knowledge Base / Elastic), and multi-modal models.
• Proven experience with cloud-native AI development (AWS SageMaker, Amazon Bedrock, MLflow on EKS).
• Strong programming skills in Python and ML libraries (Transformers, LangChain, etc.).
• Deep understanding of Gen AI system patterns, architectural best practices, and Evaluation Frameworks.
• Demonstrated ability to work in cross-functional agile teams.






