Siri InfoSolutions, Inc.

Senior Data Engineer

⭐ - Featured Role | Apply direct with Data Freelance Hub
This role is for a Senior Data Engineer with 8-10 years of experience, focusing on Python and Hadoop. Contract length is unspecified, with a pay rate of "X". Key skills include Python, SQL, and big data technologies.
🌎 - Country
United States
💱 - Currency
$ USD
-
💰 - Day rate
Unknown
-
🗓️ - Date
April 2, 2026
🕒 - Duration
Unknown
-
🏝️ - Location
Unknown
-
📄 - Contract
Unknown
-
🔒 - Security
Unknown
-
📍 - Location detailed
Pittsburgh, PA
-
🧠 - Skills detailed
#Data Accuracy #Scala #Security #Data Science #PostgreSQL #Big Data #Data Management #Cloud #Datasets #ETL (Extract, Transform, Load) #AWS (Amazon Web Services) #Automation #Data Engineering #Distributed Computing #Database Performance #Data Pipeline #Kafka (Apache Kafka) #Databases #Data Architecture #Data Processing #Data Governance #Informatica BDM (Big Data Management) #Azure #Apache Spark #HDFS (Hadoop Distributed File System) #NoSQL #Python #YARN (Yet Another Resource Negotiator) #Data Quality #Hadoop #Programming #SQL (Structured Query Language) #GCP (Google Cloud Platform) #Spark (Apache Spark) #Deployment #Spark SQL #HBase #MySQL #PySpark #Impala #Java
Role description
Job Description

Experience: 8-10 years in the required skills.

Must have: Strong proficiency in Python (including PySpark) and SQL is essential, with additional experience in Java or Scala a plus.

Role Description: A Senior Data Engineer focusing on Python and Hadoop is responsible for designing, building, and maintaining robust data pipelines and infrastructure using the Hadoop ecosystem and advanced Python programming. This role involves leading technical projects, ensuring data quality and scalability, and collaborating with cross-functional teams.

Key Responsibilities:
- Data Pipeline Development: Design, build, and maintain scalable ETL/ELT processes and data pipelines using Python, SQL, and big data technologies (Hadoop, Spark, Hive, Kafka).
- Big Data Management: Work within the Hadoop technology stack, including HDFS, Hive, YARN, Impala, and HBase, to manage and store large datasets.
- Performance Optimization and Automation: Troubleshoot, tune, and optimize data processing jobs and database performance, while identifying opportunities for automation in testing and deployment processes (CI/CD).
- Architecture and Design: Lead the development of data solutions and the design of data service infrastructure, contributing to overall data architecture decisions.
- Collaboration and Mentorship: Collaborate with data scientists, analysts, and business stakeholders to understand data requirements, and provide technical guidance and mentorship to junior team members.
- Data Quality and Governance: Ensure data accuracy, integrity, and security by implementing validation checks and adhering to data governance standards.

Experience: Typically requires 5 years of experience in data engineering or a related role, with a proven track record of deploying and managing large-scale distributed systems.

Programming Languages: Strong proficiency in Python (including PySpark) and SQL is essential, with additional experience in Java or Scala a plus.
Big Data Technologies: Expertise in the Hadoop ecosystem and its components, along with distributed computing frameworks like Apache Spark and Kafka, is crucial.

Databases and Cloud Platforms: Experience with relational (e.g., PostgreSQL, MySQL) and NoSQL databases, and familiarity with cloud services (AWS, GCP, or Azure).

Problem-Solving: Strong analytical and problem-solving skills to resolve complex technical data issues.

Required Skills: Python, PySpark, SQL, NoSQL, MySQL, AWS, GCP, Azure, PostgreSQL
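As a concrete illustration of the "validation checks" called out under Data Quality and Governance, a minimal Python sketch is shown below. The schema and field names are hypothetical (not from this posting); in a real pipeline the same per-record check would typically run inside a Spark job with rejects routed to a quarantine table.

```python
# Minimal sketch of a row-level validation step for an ETL pipeline.
# REQUIRED_FIELDS is a hypothetical schema used only for illustration.
REQUIRED_FIELDS = {"user_id": int, "event_ts": str, "amount": float}

def validate_row(row: dict) -> list:
    """Return a list of validation errors for one record (empty list = valid)."""
    errors = []
    for field, expected_type in REQUIRED_FIELDS.items():
        if field not in row or row[field] is None:
            errors.append(f"missing field: {field}")
        elif not isinstance(row[field], expected_type):
            errors.append(f"bad type for {field}: {type(row[field]).__name__}")
    return errors

def partition_rows(rows):
    """Split records into (valid, rejected) so bad data never reaches the sink.

    rejected entries carry their error list for downstream auditing.
    """
    valid, rejected = [], []
    for row in rows:
        errs = validate_row(row)
        if errs:
            rejected.append((row, errs))
        else:
            valid.append(row)
    return valid, rejected
```

In a PySpark job this kind of check is usually applied per partition (for example via `mapPartitions`) rather than row by row on the driver, so validation scales with the cluster instead of a single process.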