Data Architect

⭐ - Featured Role | Apply directly with Data Freelance Hub
This role is for a Data Architect; the contract length and pay rate are not specified. Key requirements include 15+ years in software engineering, expertise in Big Data technologies, and strong leadership skills.
🌎 - Country
United States
πŸ’± - Currency
$ USD
πŸ’° - Day rate
-
πŸ—“οΈ - Date discovered
August 21, 2025
πŸ•’ - Project duration
Unknown
🏝️ - Location type
Unknown
πŸ“„ - Contract type
Unknown
πŸ”’ - Security clearance
Unknown
πŸ“ - Location detailed
New York, United States
🧠 - Skills detailed
#Java #Datasets #DevOps #Storage #Data Engineering #Apache Kafka #GCP (Google Cloud Platform) #Leadership #Synapse #Redshift #Data Quality #Hadoop #Lambda (AWS Lambda) #SQL (Structured Query Language) #AWS (Amazon Web Services) #Python #Big Data #PySpark #BigQuery #Data Lake #Databricks #Cloud #S3 (Amazon Simple Storage Service) #Data Automation #HDFS (Hadoop Distributed File System) #Data Analysis #Spark (Apache Spark) #Kafka (Apache Kafka) #AWS Kinesis #Anomaly Detection #Azure #Automation #Scala #Data Architecture #Spark SQL #Data Governance #Apache Spark #Apache Iceberg #Computer Science #Delta Lake
Role description
Required Qualifications
• Bachelor's or Master's degree in Computer Science, Engineering, or a related quantitative field.
• 15+ years of progressive experience in software engineering, with at least 5+ years in a Technical Architect, Lead Data Architect, or Principal Data Engineer role, specifically focused on data quality, data governance, or data platform architecture.
• Exceptional hands-on proficiency and deep architectural understanding of the Big Data ecosystem:
  • Apache Spark (PySpark, Scala, or Java): expert-level experience with Spark SQL, DataFrames/Datasets, streaming, and advanced performance tuning techniques.
  • Distributed Storage & Processing: Hadoop, HDFS, S3, Delta Lake, Apache Iceberg, or similar data lake technologies.
  • Streaming Technologies: Apache Kafka, AWS Kinesis, or similar high-throughput messaging systems.
  • Cloud Data Platforms: extensive experience designing and implementing solutions on AWS (e.g., EMR, Glue, Redshift, Lambda, Step Functions, S3), Azure (e.g., Databricks, Synapse Analytics, Data Lake Storage), or GCP (e.g., Dataproc, BigQuery, Cloud Storage).
• Expert-level hands-on experience with advanced SQL for complex data analysis, validation, and optimization.
• Expert-level hands-on experience with Python for data engineering, automation, and developing robust data quality solutions.
• Proven track record of defining, designing, and implementing large-scale data automation frameworks.
• Demonstrated expertise in data quality engineering principles, methodologies, and tools (profiling, validation, cleansing, reconciliation, anomaly detection); a minimal illustration follows this list.
• Experience leading and mentoring technical teams, fostering a culture of technical excellence and continuous improvement.
• Strong understanding of the software development lifecycle (SDLC), DevOps practices, and integrating quality gates into CI/CD pipelines.
• Excellent communication, presentation, and interpersonal skills, with the ability to articulate complex technical concepts to diverse audiences, including senior leadership and non-technical stakeholders.
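As a rough illustration of the data quality engineering referenced above (profiling, validation, reconciliation), the following minimal PySpark sketch shows the kind of check this role would design at much larger scale. The sample data, column names, and the 10% null-rate threshold are hypothetical and not taken from the posting.

from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("dq-sketch").getOrCreate()

# Hypothetical source and target tables for a reconciliation check.
source = spark.createDataFrame([(1, "a"), (2, None), (3, "c")], ["id", "value"])
target = spark.createDataFrame([(1, "a"), (2, None)], ["id", "value"])

# Profiling: null rate per column in the source dataset.
total = source.count()
null_rates = {c: source.filter(F.col(c).isNull()).count() / total for c in source.columns}
print("null rates:", null_rates)

# Validation: flag columns whose null rate exceeds a hypothetical 10% threshold.
failed = [c for c, rate in null_rates.items() if rate > 0.10]
print("columns failing null-rate check:", failed)

# Reconciliation: compare row counts between source and target.
if source.count() != target.count():
    print("row-count mismatch:", source.count(), "vs", target.count())

spark.stop()

In practice, checks like these would be packaged into a reusable framework and wired into CI/CD quality gates rather than run as an ad hoc script.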