Brooksource

Senior Data Engineer

⭐ - Featured Role | Apply direct with Data Freelance Hub
This Senior Data Engineer role is hybrid, based in Minneapolis, MN. The contract length is unspecified, and the hourly pay rate is $70-75. Key skills include Databricks, unstructured data processing, PySpark, Python, SQL, and API development.
🌎 - Country
United States
💱 - Currency
$ USD
-
💰 - Day rate
600
-
🗓️ - Date
March 17, 2026
🕒 - Duration
Unknown
-
🏝️ - Location
Hybrid
-
📄 - Contract
Unknown
-
🔒 - Security
Unknown
-
📍 - Location detailed
Minneapolis, MN
-
🧠 - Skills detailed
#Observability #Scala #REST API #API (Application Programming Interface) #Langchain #Data Modeling #Consulting #SQL (Structured Query Language) #Datasets #AWS (Amazon Web Services) #Cloud #Cybersecurity #ETL (Extract, Transform, Load) #ML (Machine Learning) #Databases #PySpark #Azure #FastAPI #Security #Data Quality #REST (Representational State Transfer) #Automation #Data Processing #Spark (Apache Spark) #ADLS (Azure Data Lake Storage) #AI (Artificial Intelligence) #Delta Lake #Microservices #BI (Business Intelligence) #Python #Semantic Models #Data Engineering #Databricks #Data Ingestion #NLP (Natural Language Processing)
Role description
Senior Data Engineer
Hybrid - Minneapolis, MN
Hourly Pay Rate Range: $70-75/Hour

This project supports the firm’s Fixed Income organization, which currently relies on highly manual, Excel-based workflows to capture, maintain, and analyze deal and market data. These processes are time-intensive, introduce potential data quality risks, and limit the ability to leverage advanced automation and AI-driven insights. The initiative focuses on modernizing this ecosystem by developing a scalable full-stack application that integrates seamlessly with Databricks and internal data platforms. The solution will incorporate LLM-powered capabilities to streamline data ingestion, enhance data quality, surface actionable insights, and enable intelligent, multi-agent workflows.

Unstructured Data Platform
• Design and implement pipelines to ingest, parse, and transform unstructured data: PDFs, earnings transcripts, deal documents, research reports, and pitch materials
• Build scalable document processing workflows on Databricks using Delta Lake, AutoLoader, and Spark-based NLP/embedding pipelines
• Integrate with vector stores (e.g., LanceDB, Pinecone, Databricks Vector Search) to support RAG architectures and semantic search
• Partner with AI engineers to prep and ground document corpora for LLM-powered applications

Genie Spaces
• Architect and deploy Databricks Genie Spaces to enable natural language querying across deal, financial, and client data
• Define and curate certified datasets, semantic models, and trusted metrics layers to ensure accuracy of AI-generated answers
• Collaborate with business stakeholders to identify high-value use cases and onboard end users to Genie-powered analytics
• Monitor and tune Genie Space performance, model grounding quality, and response fidelity

API & Integration Layer
• Build and maintain REST and Model Serving APIs on Databricks to expose data and ML model outputs to downstream applications
• Integrate Databricks Model Serving endpoints with Azure API Management for governance, rate limiting, and routing
• Develop robust Python/FastAPI microservices and connectors linking Databricks assets to Cosmos DB, DealCloud, and Azure services
• Ensure API reliability, observability, and versioning standards across the platform

Required Qualifications
• 5+ years of data engineering experience with at least 2 years in Databricks (Unity Catalog, Delta Live Tables, Workflows)
• Hands-on experience with unstructured data processing: text extraction, chunking strategies, embedding pipelines
• Proficiency in PySpark, Python, and SQL; experience with Databricks SQL and serverless compute
• Working knowledge of Databricks Genie Spaces or equivalent NL-to-SQL/AI BI tools
• Experience building and deploying REST APIs; familiarity with FastAPI or equivalent frameworks
• Familiarity with vector databases and RAG pipeline design
• Strong understanding of data modeling, lineage, and governance best practices

Preferred Qualifications
• Experience integrating Databricks with Azure services (ADLS, API Management, Azure AI Foundry, Cosmos DB)
• Exposure to LLM orchestration frameworks (LangChain, DSPy, LlamaIndex)
• Background in financial services, investment banking, or professional services data environments
• Familiarity with DealCloud, S&P Capital IQ, or PitchBook data structures
• Databricks Certified Data Engineer Associate or Professional certification

About Us:
At Brooksource, relationships are the foundation of everything we do. Since 2000, we’ve built lasting partnerships with clients, consultants, and internal teams to deliver an exceptional experience across every engagement. As a trusted IT and Engineering services provider, Brooksource supports Fortune 500 organizations through Experience-Driven Staffing, Professional Services, and Elevate, our proprietary Workforce Transformation program.
Whether you're hiring for software development, cloud computing, cybersecurity, data analytics, or enterprise IT, our customized staffing solutions are designed to align with your company’s unique goals, culture, and technology stack. We offer flexible hiring models, including contract, contract-to-hire, and direct placement to meet your evolving business needs. We are a certified partner of leading platforms, including Salesforce, AWS, Microsoft, and Google Cloud, enabling us to deliver scalable, end-to-end technology solutions. With a growing national footprint, Brooksource is redefining expectations in IT consulting, engineering services, and technology workforce solutions.

EEO Statement:
Brooksource is an equal opportunity employer that does not discriminate on the basis of actual or perceived race, color, creed, religion, national origin, ancestry, citizenship status, age, sex or gender (including pregnancy, childbirth, lactation and related medical conditions), gender identity or gender expression, sexual orientation, marital status, military service and veteran status, physical or mental disability, protected medical condition as defined by applicable state or local law, genetic information, or any other characteristic protected by applicable federal, state, or local laws and ordinances.