

Tential Solutions
Big Data Engineer
Featured Role | Apply directly with Data Freelance Hub
This role is for a Big Data Engineer; the contract length and pay rate are listed as unknown. The position requires expertise in Hadoop, Spark, Python or Scala, cloud technologies, and SQL, along with a Bachelor's degree and five years of experience.
Country: United States
Currency: $ USD
Day rate: Unknown
Date: May 7, 2026
Duration: Unknown
Location: Unknown
Contract: Unknown
Security: Unknown
Location detailed: Rockville, CT
Skills detailed: #Automated Testing #Data Science #Datasets #Cloud #System Testing #ChatGPT #Automation #"ETL (Extract, Transform, Load)" #Data Integration #Lambda (AWS Lambda) #AI (Artificial Intelligence) #Kanban #Data Ingestion #Computer Science #Debugging #Python #Programming #AWS (Amazon Web Services) #Athena #Data Processing #Data Quality #SQL (Structured Query Language) #Big Data #Data Architecture #Apache Spark #GitHub #Agile #Data Engineering #Data Analysis #Data Pipeline #S3 (Amazon Simple Storage Service) #Trino #Scala #Complex Queries #Scrum #Hadoop #Java #Storage #Spark (Apache Spark) #Quality Assurance
Role description
We are seeking a highly skilled and experienced Big Data Engineer to design, develop, and optimize large-scale data processing systems. In this role, you will work closely with cross-functional teams to architect data pipelines, implement data integration solutions, and ensure the performance, scalability, and reliability of big data platforms. The ideal candidate will have deep expertise in distributed systems, cloud platforms, and modern big data technologies such as Hadoop and Spark.
Responsibilities
• Design, develop, and maintain large-scale data processing pipelines using Big Data technologies (e.g., Hadoop, Spark, Python, Scala)
• Implement data ingestion, storage, transformation, and analysis solutions that are scalable, efficient, and reliable
• Stay current with industry trends and emerging Big Data technologies to continuously improve the data architecture
• Collaborate with cross-functional teams to understand business requirements and translate them into technical solutions
• Optimize and enhance existing data pipelines for performance, scalability, and reliability
• Develop automated testing frameworks and implement continuous testing for data quality assurance (see the sketch after this list)
• Conduct unit, integration, and system testing to ensure the robustness and accuracy of data pipelines
• Work with data scientists and analysts to support data-driven decision-making across the organization
• Write and maintain automated unit, integration, and end-to-end tests
• Monitor and troubleshoot data pipelines in production environments to identify and resolve issues
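As a rough illustration of the automated data-quality testing described above, here is a minimal pytest sketch for a PySpark pipeline. The table schema, column names, and checks are hypothetical examples, not part of this role's actual codebase.

```python
# Minimal data-quality checks for a PySpark pipeline, runnable with
# pytest. The orders schema here is a hypothetical placeholder.
import pytest
from pyspark.sql import SparkSession


@pytest.fixture(scope="session")
def spark():
    # A small local session is enough for unit-level pipeline tests.
    return SparkSession.builder.master("local[2]").appName("dq-tests").getOrCreate()


@pytest.fixture
def orders(spark):
    return spark.createDataFrame(
        [(1, "2026-05-01", 10.0), (2, "2026-05-02", 20.0)],
        ["order_id", "order_date", "amount"],
    )


def test_no_duplicate_order_ids(orders):
    # Primary-key uniqueness: row count equals distinct-key count.
    assert orders.count() == orders.select("order_id").distinct().count()


def test_amount_is_never_null_or_negative(orders):
    # Value constraints: no NULL or negative amounts should survive the pipeline.
    assert orders.filter("amount IS NULL OR amount < 0").count() == 0
```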
Essential Technical Skills
AI Tool Proficiency
• Hands-on experience with AI development tools (GitHub Copilot, Q Developer, ChatGPT, Claude, etc.)
Technical Background
• Strong software development background with the ability to contribute to technical discussions
Agile Methodology
• Extensive experience with Scrum, Kanban, and continuous improvement practices
Big Data Technologies
• Experience with big data technologies such as Hadoop, Spark, Hive, and Trino
• Understanding of common challenges such as:
  • Data skew and mitigation strategies (a salting sketch follows this list)
  • Working with massive data volumes (petabyte scale)
  • Troubleshooting job failures related to resource constraints, bad data, and scalability issues
• Ability to provide real-world debugging and mitigation examples
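The data-skew bullet above asks for concrete mitigation examples. Below is a hedged PySpark sketch of key salting, one common mitigation: the skewed side gets a random salt so a hot key spreads across partitions, and the small side is replicated once per salt value. All table and column names are invented for illustration.

```python
# Key salting to mitigate join skew: spread a hot key across buckets.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.master("local[4]").appName("skew-demo").getOrCreate()

SALT_BUCKETS = 8

# Skewed fact table: one key dominates the row count.
events = spark.createDataFrame(
    [("hot_key", i) for i in range(1000)] + [("cold_key", 1)],
    ["user_id", "event"],
)
users = spark.createDataFrame(
    [("hot_key", "A"), ("cold_key", "B")], ["user_id", "segment"]
)

# Add a random salt to the large, skewed side...
salted_events = events.withColumn("salt", (F.rand() * SALT_BUCKETS).cast("int"))

# ...and replicate the small side once per salt value so every bucket matches.
salts = spark.range(SALT_BUCKETS).withColumnRenamed("id", "salt") \
             .withColumn("salt", F.col("salt").cast("int"))
salted_users = users.crossJoin(salts)

# The join key now includes the salt, so hot_key's rows no longer
# all land in a single task.
joined = salted_events.join(salted_users, ["user_id", "salt"]).drop("salt")
joined.groupBy("segment").count().show()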
AI Skills
• Prompt engineering: ability to craft effective prompts for AI coding assistants and analysis tools
• AI workflow design: experience leveraging AI to redesign development processes
• Data analysis: ability to interpret AI-generated insights and translate them into actionable improvements
• Change management: experience supporting AI adoption and workflow transformation
SQL Skills
• Proficiency in SQL including window functions, multi-table joins, and aggregations
• Ability to write and optimize complex queries
• Experience handling edge cases such as NULLs, duplicates, and ordering (illustrated in the sketch below)
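For concreteness, the sketch below exercises the listed SQL skills from PySpark: a window function with explicit NULL ordering, plus a multi-table join. The customers/orders schema is hypothetical.

```python
# Window function + join + NULL ordering, run through Spark SQL.
from pyspark.sql import SparkSession

spark = SparkSession.builder.master("local[2]").appName("sql-demo").getOrCreate()

spark.createDataFrame(
    [(1, "alice"), (2, "bob")], ["customer_id", "name"]
).createOrReplaceTempView("customers")

spark.createDataFrame(
    [(10, 1, 99.0), (11, 1, None), (12, 2, 25.0)],
    "order_id INT, customer_id INT, amount DOUBLE",
).createOrReplaceTempView("orders")

spark.sql("""
    SELECT c.name,
           o.order_id,
           o.amount,
           -- Window function: rank each customer's orders by amount,
           -- pushing NULL amounts last instead of first.
           RANK() OVER (
               PARTITION BY o.customer_id
               ORDER BY o.amount DESC NULLS LAST
           ) AS amount_rank
    FROM orders o
    JOIN customers c ON c.customer_id = o.customer_id
""").show()
```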
Apache Spark
• Strong understanding of Spark architecture (executors, tasks, stages, DAG)
• Experience with performance tuning techniques (partitioning, caching, broadcast joins)
• Ability to troubleshoot slow or failing jobs and resolve resource bottlenecks
• Experience optimizing jobs for large-scale datasets (see the tuning sketch below)
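The tuning techniques named above can be shown in a short PySpark sketch: explicit repartitioning on a join key, caching a reused DataFrame, and a broadcast join. Sizes and names here are arbitrary illustrations, not tuned values for any real workload.

```python
# Repartition + cache + broadcast join: three of the tuning levers listed above.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.master("local[4]").appName("tuning-demo").getOrCreate()

facts = spark.range(1_000_000).withColumn("dim_id", (F.col("id") % 100).cast("int"))
dims = spark.createDataFrame(
    [(i, f"dim_{i}") for i in range(100)], ["dim_id", "label"]
)

# Partition by the join key so downstream work on dim_id shuffles less.
facts = facts.repartition(16, "dim_id")

# Cache: facts is reused twice below, so keep it in memory after first use.
facts.cache()

# Broadcast the small side; Spark ships it to every executor instead of
# shuffling the million-row side across the cluster.
joined = facts.join(F.broadcast(dims), "dim_id")

print(joined.count(), facts.filter("dim_id = 3").count())
```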
Cloud Technologies
• Experience with AWS services such as S3, EMR, Glue, Lambda, Athena, etc.
• Experience working with S3 in Spark environments (file formats, consistency challenges, etc.; see the sketch below)
• Familiarity with EKS and serverless technologies
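As a hedged sketch of working with S3 from Spark, the snippet below reads and writes partitioned Parquet via the s3a connector. The bucket, paths, and the event_date column are placeholders, and it assumes the hadoop-aws package and AWS credentials are already configured in the environment.

```python
# Reading and writing partitioned Parquet on S3 from Spark.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("s3-demo").getOrCreate()

# Hypothetical input path; s3a:// requires the hadoop-aws connector.
df = spark.read.parquet("s3a://example-bucket/raw/events/")

# Partitioned Parquet keeps S3 listings cheap and lets readers prune
# by date instead of scanning the whole prefix. event_date is assumed
# to exist in the input.
(df.write
   .mode("overwrite")
   .partitionBy("event_date")
   .parquet("s3a://example-bucket/curated/events/"))
```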
Programming (Python or Scala)
• Ability to write clean, modular, and performant code
• Experience with functional programming concepts (immutability, higher-order functions; see the sketch below)
• Understanding of collections, concurrency, and memory management
• Experience building scalable data processing systems
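To illustrate the functional-programming concepts listed above, here is a small, self-contained Python sketch using immutable records and a higher-order function. The Event type and helper are hypothetical examples.

```python
# Immutability + higher-order functions in plain Python.
from dataclasses import dataclass
from functools import reduce
from typing import Callable, Iterable


@dataclass(frozen=True)  # frozen=True makes instances immutable
class Event:
    user_id: str
    amount: float


def total_where(pred: Callable[[Event], bool], events: Iterable[Event]) -> float:
    # Higher-order function: takes a predicate, returns an aggregate,
    # with no mutation of shared state.
    return reduce(lambda acc, e: acc + e.amount, filter(pred, events), 0.0)


events = [Event("a", 10.0), Event("a", 5.0), Event("b", 2.5)]
print(total_where(lambda e: e.user_id == "a", events))  # 15.0
```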
Education & Experience
• Bachelor's degree in Computer Science, Information Systems, or a related discipline with at least five (5) years of related experience, or equivalent training and/or work experience
• Master's degree and financial services industry experience preferred
• Demonstrated technical expertise in object-oriented and database technologies leading to enterprise-quality solutions
• Experience developing enterprise solutions in an iterative or Agile environment
• Extensive knowledge of test automation, build automation, and configuration management frameworks
• Strong written and verbal communication skills
• Proven ability to build effective working relationships and improve the quality of work products
• Strong organizational skills with the ability to manage competing priorities
• Ability to learn new technologies quickly and work in a fast-paced environment
• Experience with object-oriented programming languages such as Java, Scala, or Python
Nice to Have
• Experience managing production data pipelines and ETL systems
• Experience with CI/CD pipelines
• Experience writing test cases
• AWS certifications
