

E-IT
AWS Data Architect (Healthcare/Life Science Exp)
Featured Role | Apply direct with Data Freelance Hub
This role is for an AWS Data Architect with 12+ years of experience in data engineering, focusing on healthcare/life sciences. Contract length and pay rate are not specified. Key skills include PySpark, AWS Redshift, and Python.
Country
United States
Currency
$ USD
Day rate
Unknown
Date
October 17, 2025
Duration
Unknown
Location
On-site
Contract
Unknown
Security
Unknown
Location detailed
Tarrytown, NY
Skills detailed
#Kubernetes #Elasticsearch #Databricks #PySpark #Security #Data Ingestion #Data Engineering #Airflow #Virtualization #Docker #Compliance #SQL (Structured Query Language) #DevOps #Looker #Data Security #Data Architecture #Redshift #Apache Spark #Apache Airflow #Dremio #Spark (Apache Spark) #Jenkins #Data Migration #Strategy #Python #GIT #AWS (Amazon Web Services) #Data Governance #Migration #ML (Machine Learning)
Role description
Role: AWS Data Architect
Location: Tarrytown, NY 10591 (100% Onsite)
Contract
Skills: PySpark, AWS Redshift, EMR, Databricks, Python, Healthcare/Pharma/Life Sciences domain
Job Description:
• 12+ years of strong experience in data engineering architecture for large-scale platforms.
• Contribute to defining the platform roadmap, architecture, and solution design, including POCs, prototypes, technical evaluations for tech stack finalization, and guiding principles for best practices.
• Define the data platform development, data migration, and data validation strategies; review code, checklists, and coding standards.
• Create data models that reduce system complexity, increase efficiency, and reduce cost.
• Expertise with data ingestion/orchestration tools and working experience with a real-time processing framework (Apache Spark), PySpark, AWS Redshift, Apache Airflow, and EMR.
• Strong coding background in Python, SQL, and PySpark. Proficiency in data virtualization (Dremio or similar).
• Experience with data governance and access control frameworks (Privacera, Apache Ranger, etc.).
• Knowledge of search & discovery platforms (Solr, Elasticsearch, Looker).
• Solid understanding of data security, authentication (Okta), and compliance frameworks.
• Familiarity with CI/CD pipelines and DevOps practices (Jenkins, Git, Docker, Kubernetes).
• Prior experience designing enterprise data platforms in healthcare, pharma, or regulated industries.
• Knowledge of machine learning pipelines and their integration into data platforms.