

Siri InfoSolutions, Inc.
Senior Data Engineer
⭐ - Featured Role | Apply directly with Data Freelance Hub
This role is for a Senior Data Engineer with 8-10 years of experience, focusing on Python and Hadoop. Contract length is unspecified, with a pay rate of "X". Key skills include Python, SQL, and big data technologies.
🌎 - Country
United States
💱 - Currency
$ USD
-
💰 - Day rate
Unknown
-
🗓️ - Date
April 2, 2026
🕒 - Duration
Unknown
-
🏝️ - Location
Unknown
-
📄 - Contract
Unknown
-
🔒 - Security
Unknown
-
📍 - Location detailed
Pittsburgh, PA
-
🧠 - Skills detailed
#Data Accuracy #Scala #Security #Data Science #PostgreSQL #Big Data #Data Management #Cloud #Datasets #ETL (Extract, Transform, Load) #AWS (Amazon Web Services) #Automation #Data Engineering #Distributed Computing #Database Performance #Data Pipeline #Kafka (Apache Kafka) #Databases #Data Architecture #Data Processing #Data Governance #Informatica BDM (Big Data Management) #Azure #Apache Spark #HDFS (Hadoop Distributed File System) #NoSQL #Python #YARN (Yet Another Resource Negotiator) #Data Quality #Hadoop #Programming #SQL (Structured Query Language) #GCP (Google Cloud Platform) #Spark (Apache Spark) #Deployment #Spark SQL #HBase #MySQL #PySpark #Impala #Java
Role description
Job Description
Experience: 8-10 years in the required skills
MUST HAVE: Strong proficiency in Python (including PySpark) and SQL is essential, with additional experience in Java or Scala being a plus.
Role Description:
A Senior Data Engineer focusing on Python and Hadoop is responsible for designing, building, and maintaining robust data pipelines and infrastructure using the Hadoop ecosystem and advanced Python programming.
This role involves leading technical projects, ensuring data quality and scalability, and collaborating with cross-functional teams.
Key Responsibilities:
Data Pipeline Development: Design, build, and maintain scalable ETL/ELT processes and data pipelines using Python, SQL, and big data technologies (Hadoop, Spark, Hive, Kafka); see the illustrative sketch after this list.
Big Data Management: Work within the Hadoop technology stack, including HDFS, Hive, YARN, Impala, and HBase, to manage and store large datasets.
Performance Optimization & Automation: Troubleshoot, tune, and optimize data processing jobs and database performance, while identifying opportunities for automation in testing and deployment processes (CI/CD).
Architecture Design: Lead the development of data solutions and the design of data service infrastructure, contributing to overall data architecture decisions.
Collaboration & Mentorship: Collaborate with data scientists, analysts, and business stakeholders to understand data requirements, and provide technical guidance and mentorship to junior team members.
Data Quality & Governance: Ensure data accuracy, integrity, and security by implementing validation checks and adhering to data governance standards.
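To make the pipeline and data-quality responsibilities above concrete, here is a minimal PySpark batch-ETL sketch: read a raw Hive table, standardize it, run a simple data-quality gate, and write it back partitioned. All table and column names (raw.orders, order_id, order_ts) are hypothetical placeholders, not details from this posting.

# Illustrative PySpark ETL sketch; table and column names are placeholders.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = (
    SparkSession.builder
    .appName("orders_daily_etl")   # hypothetical job name
    .enableHiveSupport()           # read/write Hive tables over HDFS
    .getOrCreate()
)

# Extract: read a raw Hive table.
raw = spark.table("raw.orders")

# Transform: normalize types, derive a partition column, dedupe on the key.
clean = (
    raw.withColumn("order_ts", F.to_timestamp("order_ts"))
       .withColumn("order_date", F.to_date("order_ts"))
       .dropDuplicates(["order_id"])
)

# Validate: a simple data-quality gate before loading.
null_keys = clean.filter(F.col("order_id").isNull()).count()
if null_keys > 0:
    raise ValueError(f"{null_keys} rows with null order_id; aborting load")

# Load: write back to the warehouse, partitioned by day.
clean.write.mode("overwrite").partitionBy("order_date").saveAsTable("curated.orders")

Failing fast on a broken key column, as here, is one common way to implement the "validation checks" responsibility; in practice a dedicated framework or a quarantine table may replace the hard abort.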
Experience:
Typically requires 5 years of experience in data engineering or a related role, with a proven track record of deploying and managing large-scale distributed systems.
Programming Languages:
Strong proficiency in Python (including PySpark) and SQL is essential, with additional experience in Java or Scala being a plus.
Big Data Technologies: Expertise in the Hadoop ecosystem and its components, along with distributed computing frameworks like Apache Spark and Kafka, is crucial; a streaming sketch follows the skills list below.
Databases & Cloud Platforms: Experience with relational (e.g., PostgreSQL, MySQL) and NoSQL databases, and familiarity with cloud services (AWS, GCP, or Azure).
Problem-Solving: High-level analytical and problem-solving skills to resolve complex technical data issues.
Required Skills: Python, PySpark, SQL, NoSQL, MySQL, AWS, GCP, Azure, PostgreSQL
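As an illustration of the Spark and Kafka expertise called out above, the sketch below consumes JSON events from a Kafka topic with Spark Structured Streaming and lands them on HDFS. The broker address, topic name, schema, and paths are all hypothetical, and the spark-sql-kafka connector package is assumed to be on the classpath.

# Illustrative Structured Streaming sketch; broker, topic, and paths are placeholders.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F
from pyspark.sql.types import StructType, StructField, StringType, TimestampType

spark = SparkSession.builder.appName("events_stream").getOrCreate()

# Assumed event schema for the hypothetical topic.
event_schema = StructType([
    StructField("event_id", StringType()),
    StructField("event_ts", TimestampType()),
    StructField("payload", StringType()),
])

# Read from Kafka; the value column arrives as bytes, so cast then parse JSON.
events = (
    spark.readStream.format("kafka")
    .option("kafka.bootstrap.servers", "broker:9092")   # placeholder broker
    .option("subscribe", "events")                      # placeholder topic
    .load()
    .select(F.from_json(F.col("value").cast("string"), event_schema).alias("e"))
    .select("e.*")
)

# Write micro-batches as Parquet; the checkpoint enables failure recovery.
query = (
    events.writeStream
    .format("parquet")
    .option("path", "hdfs:///warehouse/events")         # placeholder output path
    .option("checkpointLocation", "hdfs:///chk/events")
    .trigger(processingTime="1 minute")
    .start()
)
query.awaitTermination()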






