

Data Lakehouse Engineer
Featured Role | Apply direct with Data Freelance Hub
This role is for a Data Lakehouse Engineer with a contract length of "X months" and a pay rate of "$X per hour." Remote work is allowed. Key skills include AWS, Python, ETL, and data governance. Experience with HIPAA and sensitive data is required.
Country: United States
Currency: $ USD
Day rate: -
Date discovered: June 14, 2025
Project duration: Unknown
Location type: Unknown
Contract type: Unknown
Security clearance: Unknown
Location detailed: Montpelier, VT
Skills detailed
#SQL (Structured Query Language) #Metadata #Data Governance #Python #Strategy #SQL Server #MySQL #Data Strategy #Data Engineering #Security #S3 (Amazon Simple Storage Service) #AWS Glue #Data Lakehouse #Data Management #Storage #SharePoint #ETL (Extract, Transform, Load) #Scala #Automation #SaaS (Software as a Service) #AWS (Amazon Web Services) #Data Catalog #Data Lake #IAM (Identity and Access Management) #Data Ingestion #R #Data Security #Redshift
Role description
This SOW will implement a data lake that is scalable for use across State agencies and will support the secure storage and analytics of sensitive data such as HIPAA, MARS-E, PHI, PII, and other confidential data. ADS will work with the Contractor's Amazon Web Services (AWS) professional services team, including their data engineer, platform engineer, and solution architect, to design and implement the Lakehouse environment.
REQUIREMENTS:
Implement a Data Lakehouse in the AWS Commercial environment as a template solution that will serve as the model for the next generation of Vermont's data platform, building on existing work available as templates and deployed in AWS Gov for the Department of Public Safety. The following features are already implemented in that work; only adjustments are expected, to take advantage of features available in Commercial that are unavailable in Gov.
Approach to all layers:
• Design and implement a data lake in AWS that complies with requirements and best practices for protecting sensitive data, including health data and other sensitive data, in collaboration with the Vermont Technical Lead, using the latest Lakehouse standards and best practices.
• The lakehouse will be implemented in a way that the State can maintain, enhance, and expand on its own.
• The lakehouse will be implemented using templates that the State can use to refine and implement its data strategy in other domains.
• The lakehouse will be implemented using a medallion architecture.
• All layers will be scalable as needed so that capabilities and cost can be tuned.
• The State prefers Python-based solutions as much as possible, while also providing support for R and other data management languages.
• Outputs of the development work are not intended to be used for mission-critical operational decision support or control-system automation at this time. Data will be ingested in a read-only manner and will be aggregated for reporting and analytical purposes only.
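Since the State prefers Python, the medallion approach above can be sketched as a small, self-contained example: raw records land in bronze as-is, silver applies validation and normalization, and gold aggregates for reporting. Everything here is illustrative; the record shapes, key names, and promotion rules are assumptions, and a real implementation would operate on S3 objects via Glue or Spark rather than in-memory dicts.

```python
# Hypothetical sketch of medallion-layer promotion (bronze -> silver -> gold).
# Record shapes and cleaning rules are assumptions for illustration only.

def to_silver(bronze_records):
    """Promote bronze records to silver: normalize keys, drop malformed rows."""
    silver = []
    for rec in bronze_records:
        clean = {k.lower().strip(): v for k, v in rec.items()}
        if not clean.get("id"):  # reject records missing a primary key
            continue
        silver.append(clean)
    return silver

def to_gold(silver_records, group_key):
    """Aggregate silver records into a gold-layer summary (counts per key)."""
    summary = {}
    for rec in silver_records:
        summary[rec[group_key]] = summary.get(rec[group_key], 0) + 1
    return summary

bronze = [{"ID": 1, "Agency": "DPS"},
          {"ID": None, "Agency": "ADS"},   # malformed: no primary key
          {"ID": 2, "Agency": "DPS"}]
silver = to_silver(bronze)
gold = to_gold(silver, "agency")
```

Note that the source data is never modified, consistent with the read-only ingestion requirement: each layer derives a new copy from the one below it.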
Data Security Layer
• Design and implement Identity and Access Management (IAM) roles and processes.
• Design and implement data governance processes that are flexible to future security requirements and evolving use cases.
• Create technical templates to implement security controls.
• The existing templates are designed to comply with CJIS standards; the commercial lake will focus on complying with other data standards. Some adjustments may be necessary.
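One way such a security template could look is a function that stamps out least-privilege IAM policy documents per medallion layer. This is a hedged sketch only: the bucket name, Sid naming, and prefix layout are invented placeholders, not the State's real values, though the policy grammar itself (Version, Statement, Condition) follows the standard IAM document format.

```python
import json

# Illustrative template: a read-only IAM policy document scoped to one
# medallion-layer prefix. Bucket and prefix names are placeholders.

def lake_read_policy(bucket, layer):
    """Return an IAM policy document allowing read-only access to one layer."""
    prefix = f"{layer}/"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": f"Read{layer.capitalize()}Layer",
                "Effect": "Allow",
                "Action": ["s3:GetObject"],
                "Resource": [f"arn:aws:s3:::{bucket}/{prefix}*"],
            },
            {
                "Sid": "ListLayerPrefix",
                "Effect": "Allow",
                "Action": ["s3:ListBucket"],
                "Resource": [f"arn:aws:s3:::{bucket}"],
                # Restrict listing to this layer's prefix only.
                "Condition": {"StringLike": {"s3:prefix": [f"{prefix}*"]}},
            },
        ],
    }

policy = lake_read_policy("vt-lakehouse-demo", "silver")
print(json.dumps(policy, indent=2))
```

Generating policies from a template rather than hand-writing them keeps the controls consistent across layers and auditable as code, which supports the maintain-and-extend goal stated above.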
Data Catalog Layer
• Design and implement a solution to govern and catalog data using AWS Glue and other technologies as recommended by the implementer.
• The data catalog layer will resolve issues caused in other layers by data schema drift in AWS Glue, keeping catalog schemas usable for reporting and analytical needs.
  o Design and implement crawlers and a catalog to store and maintain schema information.
  o Consume the metadata catalog to control permissions and processing in other components of the solution.
  o Create crawler templates for data sources.
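A crawler template of the kind called for here could be as simple as a function that builds the parameter set for Glue's CreateCrawler API from a source name and S3 path. The parameter keys below match the real Glue API, but the role ARN, database name, and schedule are placeholder assumptions; in practice the returned dict would be passed to boto3, e.g. `glue_client.create_crawler(**cfg)`.

```python
# Illustrative crawler template for the Glue Data Catalog. The role ARN,
# database name, bucket, and schedule are placeholders, not real values.

def crawler_template(source_name, s3_path, database="vt_lake_catalog"):
    """Build CreateCrawler parameters for one data source."""
    return {
        "Name": f"crawl-{source_name}",
        "Role": "arn:aws:iam::123456789012:role/GlueCrawlerRole",  # placeholder
        "DatabaseName": database,
        "Targets": {"S3Targets": [{"Path": s3_path}]},
        # Handle schema drift: update changed columns in the catalog, and
        # log (rather than delete) columns that disappear, so downstream
        # reporting jobs are not silently broken.
        "SchemaChangePolicy": {
            "UpdateBehavior": "UPDATE_IN_DATABASE",
            "DeleteBehavior": "LOG",
        },
        "Schedule": "cron(0 6 * * ? *)",  # assumed daily crawl; tune per source
    }

cfg = crawler_template("mysql-extracts", "s3://vt-lakehouse-demo/bronze/mysql/")
```

The `SchemaChangePolicy` setting is where the drift requirement above is addressed: the catalog absorbs schema changes centrally instead of each consumer discovering them at query time.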
Bronze: Data Ingestion and Landing Zone
• This layer will be based on S3, combined with AWS warehouse technologies (Redshift and other technologies as recommended by the implementer).
• Design, implement, and templatize ingestion approaches for a variety of sources, including:
  o Operational database sources (MySQL, SQL Server)
  o SaaS applications
  o File shares (SharePoint, OneDrive, and on-prem)
  o Streaming data sources
  o System templates of ingestion processes
• Data Extract, Load, and Transform (ELT) processes for loading from source systems.
• Templates for creating future ELT processes.
• Implement in a way that can be integrated with a future State data mastery solution.
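The templatized ingestion requirement can be sketched as a single generator that stamps out job configs for each source family listed above. All field names, defaults, and the bucket layout are illustrative assumptions; the point is the pattern, where one template covers databases, SaaS, file shares, and streams, and every job lands read-only in the bronze layer.

```python
# Illustrative ingestion-job template. Field names, defaults, and the
# bucket/prefix layout are assumptions for the sketch, not real values.

SOURCE_DEFAULTS = {
    "database":  {"mode": "incremental", "format": "parquet"},
    "saas":      {"mode": "full",        "format": "json"},
    "fileshare": {"mode": "full",        "format": "as-is"},
    "stream":    {"mode": "continuous",  "format": "avro"},
}

def ingestion_job(source_type, name, landing_bucket="vt-lakehouse-demo"):
    """Build a read-only ingestion job config landing in the bronze layer."""
    if source_type not in SOURCE_DEFAULTS:
        raise ValueError(f"unknown source type: {source_type}")
    cfg = dict(SOURCE_DEFAULTS[source_type])
    cfg.update({
        "name": name,
        "source_type": source_type,
        "target": f"s3://{landing_bucket}/bronze/{name}/",
        "read_only": True,  # per the SOW, sources are never written back to
    })
    return cfg

job = ingestion_job("database", "sqlserver-claims")
```

Because each job is generated rather than hand-built, the same template can later be re-parameterized for new domains, which is the reuse the SOW asks the templates to enable.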
Current System Design Diagram:
Proposed Services – Work Plan
a) Proposed Services: A description of the Contractor's proposed services to
accomplish the specified work requirements, including dates of completion.
b) Risk Assessment: An assessment of any risks inherent in the work requirements and
actions to mitigate these risks.
c) Proposed Tools: A description of proposed tools that may be used to facilitate the
work.
d) Tasks and Deliverables: A description of and the schedule for each task and
deliverable, illustrated by a Gantt chart. Start and completion dates for each task,
milestone, and deliverable shall be indicated. Must include deliverables specified in
SOW-RFP as well as other deliverables that may be proposed by Contractor.
e) Work Breakdown Structure: A detailed work breakdown structure and staffing
schedule, with labor hours by skill category that will be applied to meet each
milestone and deliverable, and to accomplish all specified work requirements.
Proposed Personnel
a) Identify all personnel who will be working on the project by their skill set, and include resumes. Please do not provide personnel names.
b) Certification that all proposed personnel meet the minimum required qualifications and
possess the required certifications to complete the work as required.
c) Provide the titles of all key management personnel who will be involved with supervising
the services rendered under the Agreement. Please do not provide personnel names.