

Brooksource
Data Scientist
⭐ - Featured Role | Apply directly with Data Freelance Hub
This role is for a Data Scientist in Houston, offering a 6-month ongoing contract with a competitive pay rate. Key skills include advanced Python, Generative AI expertise, and experience with Databricks/Snowflake. Domain experience in data-rich industries is preferred.
🌎 - Country
United States
💱 - Currency
$ USD
-
💰 - Day rate
Unknown
-
🗓️ - Date
November 11, 2025
🕒 - Duration
More than 6 months
-
🏝️ - Location
On-site
-
📄 - Contract
Unknown
-
🔒 - Security
Unknown
-
📍 - Location detailed
Greater Houston
-
🧠 - Skills detailed
#NoSQL #ML (Machine Learning) #Matplotlib #Spark (Apache Spark) #Databricks #Kubernetes #Prometheus #Indexing #ETL (Extract, Transform, Load) #SQL (Structured Query Language) #Mathematics #Data Engineering #Statistics #Deep Learning #AWS SageMaker #Containers #A/B Testing #AWS (Amazon Web Services) #Datasets #GitHub #Cloud #Databases #PySpark #Python #PyTorch #GCP (Google Cloud Platform) #GIT #Terraform #Deployment #Programming #Pandas #MongoDB #IoT (Internet of Things) #Snowflake #SageMaker #Calculus #Hugging Face #Libraries #Azure Machine Learning #Azure DevOps #Azure #GitLab #Kafka (Apache Kafka) #Airflow #NumPy #Visualization #Infrastructure as Code (IaC) #MLflow #Data Science #DevOps #AI (Artificial Intelligence) #Compliance #Docker #Langchain #TensorFlow #Transformers #Version Control #BitBucket #Observability
Role description
Data Scientist
Location: Houston
Contract Type: 6-Month Ongoing Contract (potential to convert to full-time hire)
Overview
• Design & deliver GenAI solutions: Architect and implement LLM/LVM applications (text and, where applicable, vision) including prompt strategies, guardrails, evaluation metrics, cost/latency optimization, and production rollout.
• Build robust RAG systems: Stand up end‑to‑end RAG pipelines (ingestion → chunking → embedding → retrieval → synthesis), with observability, feedback loops, and A/B testing for groundedness and hallucination control (a minimal pipeline sketch follows this list).
• Fine‑tune foundation models: Select, adapt, and fine‑tune open and hosted models for domain‑specific tasks using efficient techniques (LoRA/QLoRA, PEFT, parameter‑efficient adapters), and manage evaluation datasets.
• Develop agentic workflows: Implement multi‑step, tool‑using agents for real‑time, context‑aware operations; orchestrate planning, memory, and tool calling with safe‑execution policies.
• ML/LLM engineering: Build high‑quality Python code, reusable libraries, and APIs; implement CI/CD, testing, experiment tracking, and model/version governance.
• Data & platforms: Partner with data engineering to operationalize pipelines on enterprise platforms (e.g., Databricks/Snowflake) and integrate with cloud AI/ML services.
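The RAG bullet above compresses several moving parts into one line. Below is a minimal, library-agnostic sketch of that chunking → embedding → retrieval → synthesis flow; the `embed_texts` and `call_llm` callables are hypothetical stand-ins for whatever embedding model and LLM endpoint the actual stack provides.

```python
# Minimal RAG sketch: chunk -> embed -> retrieve -> synthesize.
# embed_texts() and call_llm() are assumed, injected dependencies.
from typing import Callable
import numpy as np

def chunk(text: str, size: int = 500, overlap: int = 50) -> list[str]:
    """Split a document into overlapping character windows."""
    step = size - overlap
    return [text[i:i + size] for i in range(0, max(len(text) - overlap, 1), step)]

def build_index(chunks: list[str], embed_texts: Callable[[list[str]], np.ndarray]) -> np.ndarray:
    """Embed chunks and L2-normalize so a dot product equals cosine similarity."""
    vecs = embed_texts(chunks)
    return vecs / np.linalg.norm(vecs, axis=1, keepdims=True)

def retrieve(query: str, chunks: list[str], index: np.ndarray,
             embed_texts: Callable[[list[str]], np.ndarray], k: int = 4) -> list[str]:
    q = embed_texts([query])[0]
    q = q / np.linalg.norm(q)
    scores = index @ q                      # cosine similarity per chunk
    top = np.argsort(scores)[::-1][:k]      # indices of the k best chunks
    return [chunks[i] for i in top]

def answer(query: str, chunks, index, embed_texts, call_llm) -> str:
    """Ground the LLM answer in retrieved context only."""
    context = "\n\n".join(retrieve(query, chunks, index, embed_texts))
    prompt = (
        "Answer strictly from the context below; say 'not found' otherwise.\n\n"
        f"Context:\n{context}\n\nQuestion: {query}"
    )
    return call_llm(prompt)
```

In production the dot-product step would typically be delegated to a vector store, and the prompt would carry the guardrail, evaluation, and feedback hooks the role description calls for.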
Required qualifications
Mathematics & foundations
• Statistics, Multivariate Calculus, Linear Algebra, Optimization (you can explain choices and trade‑offs in model behavior based on these principles).
Programming
• Advanced Python (clean architecture, typing, packaging, testing; performance profiling and async where appropriate).
Python ecosystem
• Data Handling/Visualization: pandas, NumPy, Seaborn/Matplotlib, PySpark
• Machine Learning: scikit‑learn, XGBoost, LightGBM
• Deep Learning: TensorFlow or PyTorch
• Generative AI: LangChain, LlamaIndex, Haystack, Hugging Face transformers
Generative AI expertise
• Large Language/Vision Models (LLM/LVM): Hands‑on with multiple providers/models (e.g., Gemini, GPT, Claude, Llama) and their APIs.
• Model fine‑tuning: Proven experience fine‑tuning foundation models for domain tasks, including evaluation design and data curation (see the adapter sketch after this list).
• Retrieval‑Augmented Generation (RAG): Ability to design and implement robust RAG systems for real‑time, context‑aware applications.
• Prompt engineering & agentic workflows: Advanced prompt design (system/task/reflection patterns) and building multi‑step AI agents.
• Vector databases/search & embeddings: Practical experience with vector indexing, similarity search, and embedding selection/management.
Software craftsmanship & platforms
• Version Control: Git (GitHub/GitLab/Bitbucket)
• Databases: SQL and NoSQL (e.g., MongoDB, Cassandra)
• Cloud: Hands‑on with one or more of AWS, Azure, GCP, specifically AI/ML & data services (e.g., AWS SageMaker, Azure Machine Learning, Google Vertex AI)
• Enterprise data engineering platforms: Databricks, Snowflake
• Azure AI Foundry: Hands‑on experience required
Preferred qualifications
• Vector stores/search: FAISS, Milvus, Weaviate, Pinecone; hybrid (BM25 + vector) retrieval; reranking (see the rank-fusion sketch after this list).
• LLMOps/observability: MLflow, LangSmith, OpenTelemetry, Prometheus, dashboards for cost/latency/quality; offline & online eval (e.g., RAGAS/DeepEval style).
• Orchestration & data: Airflow/Prefect; Kafka/Event Hubs; Delta/Parquet; Unity Catalog/governance.
• Containers & CI/CD: Docker, Kubernetes (AKS/EKS/GKE), GitHub Actions/Azure DevOps; infrastructure as code (Terraform).
• Safety & compliance: Content moderation, jailbreak resistance, prompt‑leakage mitigation, red‑teaming, privacy/PII handling.
• MLOps patterns: Canary/shadow deployments, feature flags, AB testing, blue‑green rollouts.
• Domain experience: Background in complex, data-rich industries (e.g., energy, manufacturing, industrial IoT) is a plus.
Eight Eleven Group provides equal employment opportunities (EEO) to all employees and applicants for employment without regard to race, color, religion, national origin, age, sex, citizenship, disability, genetic information, gender, sexual orientation, gender identity, marital status, amnesty or status as a covered veteran in accordance with applicable federal, state, and local laws.






