

VBeyond Corporation
Data Scientist with NLP
⭐ - Featured Role | Apply direct with Data Freelance Hub
This role is for a Data Scientist with NLP expertise in Houston, TX, on a long-term contract. Key skills include NLP, AWS Bedrock, and healthcare experience. Proficiency in Python, SQL, and big data technologies is required.
🌎 - Country
United States
💱 - Currency
$ USD
-
💰 - Day rate
Unknown
-
🗓️ - Date
January 6, 2026
🕒 - Duration
Unknown
-
🏝️ - Location
On-site
-
📄 - Contract
Unknown
-
🔒 - Security
Unknown
-
📍 - Location detailed
Houston, TX
-
🧠 - Skills detailed
#MySQL #Spark (Apache Spark) #Python #Langchain #Scala #PostgreSQL #Data Storage #AWS EMR (Amazon Elastic MapReduce) #Storage #AWS (Amazon Web Services) #ML (Machine Learning) #PySpark #AI (Artificial Intelligence) #Documentation #ETL (Extract, Transform, Load) #SQL (Structured Query Language) #Scripting #NLP (Natural Language Processing) #Deep Learning #FHIR (Fast Healthcare Interoperability Resources) #Data Science #Databases #Data Framework #Programming #Big Data #Automated Testing
Role description
Job Description
Job Title: Data Scientist
Location: Houston, TX (5 days onsite per week)
Type of Employment: Contract
Duration: Long-term
Must have: NLP, AWS Bedrock, and healthcare experience
Mandatory skills:
• Expertise in fine-tuning using AWS Nova.
• Proficiency in Python and scripting languages for NLP and machine learning development.
• Hands-on experience with large language models and agentic workflow tools such as LangGraph.
• Strong understanding of clinical NLP techniques and experience with machine learning and deep learning models.
• Expertise in SQL and big data technologies including AWS EMR and Spark/pySpark.
• Practical knowledge of AWS services, especially AWS Bedrock for generative AI applications.
• Experience with relational databases such as PostgreSQL or MySQL.
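To illustrate the AWS Bedrock requirement above, here is a minimal, hypothetical sketch of preparing a request body for a Bedrock text-generation call. The message/inferenceConfig shape, the parameter values, and the prompt are assumptions for illustration only; a real call would send this body through boto3's "bedrock-runtime" client (e.g. `client.invoke_model(...)`), which requires AWS credentials and a model ID.

```python
import json

def build_bedrock_request(prompt: str, max_tokens: int = 512,
                          temperature: float = 0.2) -> str:
    """Serialize a prompt into a JSON request body for a Bedrock call.

    The payload shape here is an assumption for illustration; check the
    target model's documented request schema before using it.
    """
    body = {
        "messages": [{"role": "user", "content": [{"text": prompt}]}],
        "inferenceConfig": {"maxTokens": max_tokens, "temperature": temperature},
    }
    return json.dumps(body)

# Build (but do not send) a request for a clinical-text summarization prompt.
request_body = build_bedrock_request(
    "Summarize the clinical note below in two sentences: ..."
)
```

Keeping the payload construction separate from the network call makes this piece unit-testable without AWS access.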
Good-to-have skills:
• Familiarity with generative AI applications in healthcare and related use cases.
• Understanding of healthcare data standards and terminologies such as HL7, FHIR, and CCDA.
• Experience in creating detailed documentation, user manuals, and technical specifications.
• Background in automated testing and validation frameworks for NLP outputs.
• Ability to collaborate effectively with cross-functional teams, including engineering and product.
• Exposure to LangChain or similar frameworks for building intelligent agent workflows.
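Regarding the healthcare data standards mentioned above, the sketch below shows reading a minimal FHIR Patient resource with only the Python standard library. The field names (`resourceType`, `name[].family`, `name[].given`) follow the FHIR specification; the sample data and the helper function are invented for illustration.

```python
import json

# A minimal, invented FHIR R4 Patient resource (JSON).
sample = """
{
  "resourceType": "Patient",
  "id": "example-1",
  "name": [{"family": "Doe", "given": ["Jane"]}]
}
"""

def patient_display_name(resource_json: str) -> str:
    """Return 'Given Family' from the first name entry of a Patient resource."""
    resource = json.loads(resource_json)
    if resource.get("resourceType") != "Patient":
        raise ValueError("expected a Patient resource")
    name = resource["name"][0]
    given = " ".join(name.get("given", []))
    return f'{given} {name.get("family", "")}'.strip()

print(patient_display_name(sample))  # Jane Doe
```

In practice a dedicated FHIR library would handle validation and the many optional fields; this sketch only demonstrates the resource shape.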
Responsibilities:
• Analyze and process clinical textual data using AI-powered NLP techniques and advanced machine learning models.
• Modify and improve current workflows by incorporating cutting-edge machine learning and deep learning algorithms, including leveraging large language models (LLMs) and tools like LangGraph for complex AI agentic workflows in healthcare contexts.
• Develop NLP modules within the NLP development team using programming or scripting languages such as Python.
• Conduct pre-processing and quality analysis for textual data inputs and validate performance of NLP outputs.
• Create systematic testing procedures, error-checking mechanisms, and user manuals for NLP modules.
• Build infrastructure for optimal extraction, transformation, and loading of data from diverse sources including MCP servers, using SQL and AWS big data frameworks such as EMR and Spark/pySpark.
• Collaborate with Engineering teams to ensure scalable and efficient data workflows using SQL and AWS big data technologies.
• Apply working knowledge of AWS services, particularly AWS Bedrock, to develop generative AI applications.
• Utilize relational databases such as PostgreSQL or MySQL for data storage and retrieval in NLP and AI workflows.
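The testing and error-checking responsibility above can be sketched as a small validation harness for NLP entity-extraction output. The entity schema (label, character span, surface text) and the allowed-label set are invented for illustration, not taken from the posting.

```python
# Allowed entity labels: an assumption for this sketch.
ALLOWED_LABELS = {"MEDICATION", "DOSAGE", "DIAGNOSIS"}

def validate_entities(text: str, entities: list) -> list:
    """Return human-readable validation errors for extracted entities.

    An empty list means all entities passed. Checks: known label,
    in-bounds span, and span text matching the source text.
    """
    errors = []
    for i, ent in enumerate(entities):
        if ent.get("label") not in ALLOWED_LABELS:
            errors.append(f"entity {i}: unknown label {ent.get('label')!r}")
        start, end = ent.get("start", -1), ent.get("end", -1)
        if not (0 <= start < end <= len(text)):
            errors.append(f"entity {i}: span ({start}, {end}) out of bounds")
        elif text[start:end] != ent.get("text"):
            errors.append(f"entity {i}: span text does not match source")
    return errors

note = "Patient started on metformin 500 mg daily."
good = [{"label": "MEDICATION", "start": 19, "end": 28, "text": "metformin"}]
print(validate_entities(note, good))  # []
```

Structural checks like these are cheap to run on every model output and catch span-offset bugs before downstream evaluation.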
