Recru

Data Engineer

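This role is for a Data Engineer on a 6-month contract; the pay rate is not specified. Requirements include 5+ years of experience in data analysis, data engineering, and/or data integration, SQL/NoSQL proficiency, and hands-on experience with a cloud lake house platform (Databricks strongly preferred). A Bachelor's degree in IT or a related field, or equivalent demonstrated technical proficiency, is required.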
🌎 - Country
United States
💱 - Currency
$ USD
-
💰 - Day rate
Unknown
-
🗓️ - Date
June 30, 2026
🕒 - Duration
Unknown
-
🏝️ - Location
Unknown
-
📄 - Contract
Unknown
-
🔒 - Security
Unknown
-
📍 - Location detailed
Sugar Land, TX
-
🧠 - Skills detailed
#Snowflake #Data Analysis #Spark (Apache Spark) #Data Modeling #Monitoring #Datasets #Databases #ML (Machine Learning) #MS SQL (Microsoft SQL Server) #Data Engineering #AI (Artificial Intelligence) #Spark SQL #Automation #REST (Representational State Transfer) #Web Services #Oracle #Data Quality #Python #Web API #Synapse #Cloud #PySpark #SQL Server #Documentation #Automated Testing #Requirements Gathering #Data Integration #Batch #Data Warehouse #Databricks #NoSQL #SQL (Structured Query Language) #Data Governance #Data Lake #XML (eXtensible Markup Language) #ETL (Extract, Transform, Load) #SQL Queries
Role description
NO C2C Submissions

Design, develop, and support data engineering, data modeling, and data integrations, with a primary focus on accelerating data landing and curation in a Databricks data lake house. Build and maintain reliable, well-governed pipelines that ingest data from source systems into the lake house and curate it through a layered (medallion) architecture into analytics-ready, trusted datasets. The role also carries a strong reporting and data-analysis focus: partnering with business users to build semantic data models, dashboards, and reports, and performing hands-on analysis to answer business questions. The Data Engineer will help establish the data foundation that powers data-related AI and machine learning initiatives, ensuring high-quality, well-documented, AI-ready data products.

Key Responsibilities
• Build, optimize, and support pipelines that land data from source systems into the Databricks lake house and curate it through a layered (medallion) architecture into trusted, analytics-ready datasets.
• Produce and maintain high-quality, well-governed, documented, AI-ready data products that serve as the foundation for AI and machine learning initiatives.
• Implement data quality, governance, and monitoring controls (e.g., Unity Catalog, automated testing, alerting) across lake house pipelines.
• Develop and maintain reporting and analytics solutions (semantic data models, dashboards, and reports) and perform ad-hoc querying to support business decision-making.
• Gather requirements, design, and develop new data integrations or enhancements to existing code.
• Partner with business users and the Business Relationship Management team on requirements gathering, testing, and supporting existing integrations, analytics, and reporting.
• Create and maintain documentation and process flows for integration solutions.

Required Experience & Skills
• Minimum 5 years of IT/technology experience spanning data analysis, data engineering, and/or data integration, with a focus on building and curating pipelines in a cloud data lake or lake house environment.
• At least 3 years writing SQL/NoSQL queries, with specific experience in MS SQL Server, Oracle, and/or Postgres.
• Hands-on experience with a modern cloud data platform / lake house (Databricks, Microsoft Fabric, Snowflake, or comparable). Databricks strongly preferred.
• Demonstrated experience landing data from diverse source systems into a lake/lake house and curating it through a medallion (bronze-silver-gold) architecture into clean, conformed, analytics-ready datasets.
• Strong Python skills for data engineering, including PySpark.
• Working knowledge of data quality, data governance, and pipeline reliability practices: automated testing, monitoring, alerting, and orchestration of batch and incremental/streaming workloads.
• Experience designing simplified data models for integrations, analytics, and reporting; comfortable performing hands-on data analysis and ad-hoc querying.
• Experience extracting data from source systems via web services (SOAP, REST, Web APIs), XML, and CSV/Excel exports.
• Experience building the data foundation and automation pipelines for analytics and AI/ML initiatives, and partnering with business users on LLM/GenAI use cases.
• Bachelor's degree in Information Systems, IT, or a related technical discipline, or equivalent demonstrated technical proficiency.
• Strong interpersonal and communication skills; fluent in English (oral and written).

Preferred / Nice-to-Have
• Python, cloud data warehouse experience (e.g., Snowflake, Synapse), Spark SQL.
• Performance tuning, partitioning, and optimization.
• Modern LLM architectures and GenAI frameworks: retrieval-augmented generation (RAG), embeddings and vector databases, prompt orchestration, and integrating LLMs into data products and pipelines.
• Familiarity with using LLMs in automation development and with vector/embedding data.
• Experience in the Oil & Gas domain.