Smart IT Frame LLC

Data Engineering - Apache Spark, PySpark, Scala, ETL and SQL Expertise

⭐ - Featured Role | Apply directly with Data Freelance Hub
This role is a Data Engineering position focused on Apache Spark, PySpark, Scala, ETL, and SQL expertise, offered as a 6-month contract-to-hire in Pittsburgh (Hybrid). Requires 8 years of experience, financial domain knowledge, and skills in AWS services.
🌎 - Country
United States
💱 - Currency
$ USD
-
💰 - Day rate
Unknown
-
🗓️ - Date
March 5, 2026
🕒 - Duration
6 months
-
🏝️ - Location
Hybrid
-
📄 - Contract
Contract-to-Hire
-
🔒 - Security
Unknown
-
📍 - Location detailed
Pittsburgh, PA
-
🧠 - Skills detailed
#PySpark #Data Quality #AWS EMR (Amazon Elastic MapReduce) #Hadoop #Cloud #Data Engineering #Apache Spark #AWS (Amazon Web Services) #Amazon ECS (Amazon Elastic Container Service) #Data Pipeline #EC2 #Python #Spark (Apache Spark) #S3 (Amazon Simple Storage Service) #Microservices #Data Storage #Scala #Data Processing #ETL (Extract, Transform, Load) #HDFS (Hadoop Distributed File System) #Data Access #SQL (Structured Query Language) #Java #Storage
Role description
Job Title: Data Engineering - Apache Spark, PySpark, Scala, ETL and SQL Expertise
Location: Pittsburgh, Pennsylvania (Hybrid – 4 days per week onsite)
Employment Type: 6-Month Contract-to-Hire
About Smart IT Frame: At Smart IT Frame, we connect top talent with leading organizations across the USA. With over a decade of staffing excellence, we specialize in IT, healthcare, and professional roles, empowering both clients and candidates to grow together.
Requirements:
• Senior Developer with 8+ years of experience and advanced hands-on skill in distributed data processing using PySpark and Scala.
• Experience with data processing and ETL transformation logic; strong SQL.
• Prior experience in the financial domain and entity/identity resolution is preferred.
Responsibilities:
• Design and develop scalable data pipelines using Apache Spark and PySpark, ensuring high performance and efficiency.
• Optimize Spark jobs and clusters for performance tuning, resource utilization, and cost efficiency.
• Implement and manage data storage solutions on the Hadoop Distributed File System (HDFS) and cloud storage services such as Amazon S3.
• Use Scala or Java to write and optimize core data processing logic, complex transformations, and highly performant backend services.
• Ensure strict data quality, governance, and validation checks are integrated into all data pipelines to maintain the accuracy and reliability of the data ecosystem.
• Deploy, manage, and scale data applications and services using various AWS compute resources, including Amazon EC2 instances, AWS EMR clusters, and containerized environments via Amazon ECS.
• Design and develop robust, high-performance RESTful APIs and microservices using Scala/Java to facilitate real-time data access and transactional services.
Mandatory Skills: ETL Concepts, Python, Scala