Santcore Technologies

AI Big Data Engineer

⭐ - Featured Role | Apply directly with Data Freelance Hub
This role is for an AI Big Data Engineer with a contract length of "unknown" and a pay rate of $560 per day. The position is based in Tysons, VA or Rockville, MD (hybrid, 3 days/week onsite); candidates must have 5+ years of experience, strong communication skills, and AWS certifications.
🌎 - Country
United States
💱 - Currency
$ USD
💰 - Day rate
560
🗓️ - Date
May 8, 2026
🕒 - Duration
Unknown
🏝️ - Location
Hybrid
📄 - Contract
Unknown
🔒 - Security
Unknown
📍 - Location detailed
Rockville, MD
🧠 - Skills detailed
#Hadoop #Lambda (AWS Lambda) #AWS (Amazon Web Services) #Storage #Data Ingestion #ETL (Extract, Transform, Load) #Data Processing #Java #Programming #Trino #Cloud #Agile #Kanban #System Testing #Automated Testing #GitHub #Spark (Apache Spark) #Computer Science #Big Data #SQL (Structured Query Language) #Scrum #Data Pipeline #Scala #Automation #Complex Queries #Datasets #Data Quality #Athena #Python #ChatGPT #AI (Artificial Intelligence) #Data Science #Data Engineering #S3 (Amazon Simple Storage Service) #Apache Spark
Role description
Job Title: AI Big Data Engineer
Location: Tysons, VA or Rockville, MD (Hybrid, 3 Days/Week Onsite)

Interview Process
• 1st Round: 1-hour Zoom with the hiring manager
• 2nd Round: Technical/panel interview, onsite in Rockville

Work Authorization: Any (H-1B allowed); strong communication required
Priority: Local candidates only (2 slots available; the first qualified submission is prioritized)

Role Overview
We are seeking a highly skilled AI Big Data Engineer to design, develop, and optimize large-scale data processing systems. The role involves building scalable data pipelines, implementing data integration solutions, and ensuring the performance, scalability, and reliability of big data platforms in a fast-paced Agile environment. The engineer will work closely with cross-functional teams, including data scientists and analysts, to deliver data-driven solutions and support AI-enabled workflows.

Key Responsibilities
• Design, develop, and maintain large-scale data processing pipelines using Hadoop, Spark, Python, and Scala.
• Implement data ingestion, storage, transformation, and analytics solutions, ensuring scalability and reliability.
• Optimize existing data pipelines for performance and efficiency.
• Collaborate with cross-functional teams to translate business requirements into technical solutions.
• Develop automated testing frameworks and ensure continuous data quality validation.
• Perform unit, integration, and system testing of data pipelines.
• Support production data pipelines and troubleshoot failures and performance issues.
• Work with data scientists and analysts to support data-driven decision-making.
• Stay current with emerging big data and AI technologies to improve architecture and workflows.

Required Qualifications
• Bachelor's degree in Computer Science, Information Systems, or a related field with 5+ years of experience (or equivalent).
• Strong experience with object-oriented programming and database concepts.
• Experience delivering enterprise-grade solutions in Agile environments.
• Strong experience with software engineering best practices, including test automation, build automation, and configuration management.
• Strong communication and collaboration skills.
• Ability to work in a fast-paced environment and manage competing priorities.
• Experience with Java, Scala, or Python.

Technical Skills

Big Data Technologies
• Hadoop, Spark, Hive, Trino
• Understanding of distributed systems challenges, including data skew, large-scale datasets, and resource limitations
• Experience troubleshooting Spark job failures and scalability issues

AI Tool Proficiency
• Hands-on experience with AI tools such as GitHub Copilot, ChatGPT, Claude, or similar
• Experience with AI-assisted development workflows and prompt engineering
• Ability to interpret AI-generated outputs and apply them effectively

SQL Skills
• Strong knowledge of SQL, including window functions, joins, and aggregations
• Ability to write and optimize complex queries on the spot

Apache Spark
• Strong understanding of Spark architecture (executors, stages, DAG, tasks)
• Experience with performance tuning (partitioning, caching, broadcast joins, etc.)
• Experience optimizing Spark jobs for large-scale datasets (see the sketch below)
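As a rough illustration of the SQL and Spark expectations above, here is a minimal PySpark sketch combining a window function with a broadcast join and caching. It is an example only: the trades/traders data and all column names are invented for illustration and do not come from this posting.

```python
# Minimal sketch: window function + broadcast join in PySpark.
# All dataset and column names below are hypothetical.
from pyspark.sql import SparkSession, functions as F
from pyspark.sql.window import Window

spark = SparkSession.builder.appName("skills-sketch").getOrCreate()

trades = spark.createDataFrame(
    [("T1", "AAPL", 100.0), ("T1", "AAPL", 120.0), ("T2", "MSFT", 90.0)],
    ["trader_id", "symbol", "amount"],
)
traders = spark.createDataFrame(
    [("T1", "Ana"), ("T2", "Raj")], ["trader_id", "name"]
)

# Window function: rank each trade by amount within its trader's partition
# (the SQL equivalent is ROW_NUMBER() OVER (PARTITION BY ... ORDER BY ...)).
w = Window.partitionBy("trader_id").orderBy(F.desc("amount"))
ranked = trades.withColumn("rank", F.row_number().over(w))

# Broadcast join: ship the small dimension table to every executor to
# avoid a shuffle; cache the result if it is reused downstream.
enriched = ranked.join(F.broadcast(traders), "trader_id").cache()
enriched.show()
```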
Cloud Technologies (AWS)
• Experience with AWS services such as S3, EMR, Glue, Lambda, and Athena
• Experience working with Spark on S3 (a short sketch appears at the end of this posting)
• Exposure to EKS and serverless architecture

Programming
• Strong coding experience in Python or Scala
• Knowledge of functional programming concepts (immutability, higher-order functions; illustrated at the end of this posting)

Preferred Skills
• Experience managing production ETL/data pipelines
• Experience with CI/CD pipelines
• Experience writing automated test cases
• AWS certifications (mandatory per the client's requirement)
• Experience with Agile methodologies (Scrum/Kanban)

Additional Requirements
• Experience in the financial services domain (preferred)
• Must provide an updated resume and a redacted photo ID
• Strong communication skills are critical
• Must be able to work onsite 3 days per week
• Must be able to participate in an onsite technical interview in Rockville
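A minimal sketch of the "Spark on S3" item above, assuming an EMR- or Glue-style environment where the s3a:// connector and credentials are already configured; the bucket, paths, and the event_date column are placeholders.

```python
# Hypothetical sketch: reading and writing partitioned Parquet on S3.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("s3-sketch").getOrCreate()

df = spark.read.parquet("s3a://example-bucket/raw/events/")

# Repartition by the write key so output files line up with downstream
# partition pruning (e.g., Athena queries filtered on event_date).
(df.repartition("event_date")
   .write.mode("overwrite")
   .partitionBy("event_date")
   .parquet("s3a://example-bucket/curated/events/"))
```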
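And a small, self-contained Python illustration of the functional programming concepts listed under Programming; the compose helper and sample data are invented for the example.

```python
# Hypothetical sketch: higher-order functions and immutable data in Python.
from functools import reduce

def compose(*fns):
    """Higher-order function: returns the left-to-right composition of fns."""
    return reduce(lambda f, g: lambda x: g(f(x)), fns)

clean = compose(str.strip, str.lower)

# Immutability: build a new tuple rather than mutating shared state.
raw = ("  AAPL ", " msft", "GOOG  ")
normalized = tuple(map(clean, raw))  # -> ('aapl', 'msft', 'goog')
```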