

ACL Digital
Data Scientist
Featured Role | Apply direct with Data Freelance Hub
This role is for a Data Scientist/Data Engineer in Columbus, OH, for 6 months, offering a competitive pay rate. Requires expertise in LLM integration, Python, SQL, data pipelines, and AI security. Must be a U.S. Citizen with a relevant degree.
Country: United States
Currency: $ USD
Day rate: Unknown
Date: January 15, 2026
Duration: More than 6 months
Location: On-site
Contract: Unknown
Security: Unknown
Location detailed: Columbus, OH
Skills detailed:
#Data Privacy #Model Deployment #Cloud #Databases #Security #Deployment #Data Pipeline #Data Framework #Langchain #Data Science #Data Management #Azure #Spark (Apache Spark) #Python #Computer Science #Azure Databricks #AI (Artificial Intelligence) #Delta Lake #ML (Machine Learning) #SQL (Structured Query Language) #Compliance #Data Governance #Scala #MLflow #Databricks #Data Engineering #Kafka (Apache Kafka) #Data Architecture
Role description
Job Title: Data Scientist/Data Engineer
Location: Columbus, OH
Duration: 6 Months
Eligibility: U.S. Citizens (USC) only
Description:
We are seeking a highly skilled and forward-thinking Data Engineer to drive the integration of Large Language Models (LLMs) and Generative AI systems into our data ecosystem. This role will focus on designing and operationalizing intelligent data pipelines and interfaces that enable seamless interaction between curated enterprise data and advanced AI models. You will play a key role in bridging data engineering and AI innovation, ensuring secure, scalable, and high-performance systems that power next-generation language-based applications.
Key Responsibilities
Design, build, and optimize data pipelines that serve as the backbone for LLM-powered systems and AI applications.
Integrate Generative AI and LLM technologies (e.g., OpenAI, Anthropic, Azure OpenAI, or open-source models like LLaMA or Mistral) with curated enterprise data.
Develop and maintain retrieval-augmented generation (RAG) pipelines to connect structured and unstructured data to model contexts.
Collaborate with data scientists, ML engineers, and AI researchers to ensure alignment between data readiness and model performance.
Implement agentic system architectures, including orchestration frameworks (e.g., LangChain, Semantic Kernel, or similar).
Enforce AI security, compliance, and data governance best practices to ensure responsible use of enterprise data in AI applications.
Automate LLM evaluation, model fine-tuning, and deployment workflows where applicable.
Monitor and troubleshoot AI data pipelines, ensuring high availability, scalability, and accuracy of responses.
Document design patterns, integration strategies, and operational playbooks for AI-driven data engineering.
Required Skills & Qualifications
Proven experience as a Data Engineer or ML Engineer with hands-on expertise in LLM or Generative AI system integrations.
Strong proficiency in Python, SQL, and distributed data frameworks (e.g., Spark, Databricks).
Practical understanding of RAG architectures, vector databases (e.g., Pinecone, Weaviate, Chroma, FAISS), and embedding pipelines.
Familiarity with LangChain, LlamaIndex, Semantic Kernel, or equivalent frameworks.
Experience implementing secure and compliant AI pipelines, with an understanding of AI security, prompt injection defenses, and data privacy.
Solid understanding of cloud-based AI infrastructure, preferably Azure AI Services, Azure Databricks, and Azure OpenAI Service.
Excellent problem-solving skills and ability to work across data, infrastructure, and AI teams.
Bachelor's degree in Computer Science, Engineering, or related field (or equivalent experience).
Preferred Qualifications
Experience fine-tuning or customizing LLMs for enterprise use cases.
Familiarity with MLflow, MLOps, and CI/CD for model deployment.
Knowledge of medallion data architecture and Delta Lake for AI-ready data management.
Experience with streaming data systems (e.g., Kafka, Event Hubs) for real-time AI applications.
Contributions to open-source AI frameworks or enterprise AI integrations.






