

Big Data Specialist
Featured Role | Apply directly with Data Freelance Hub
This role is for a Big Data Specialist in Pleasanton, CA, with a contract length of "unknown" and a pay rate of "unknown." Requires 4-5 years of experience in Python, Java, Scala, SQL, and building data pipelines using Hadoop components.
Country: United States
Currency: $ USD
Day rate: -
Date discovered: June 13, 2025
Project duration: Unknown
Location type: On-site
Contract type: Unknown
Security clearance: Unknown
Location detailed: Pleasanton, CA
Skills detailed: #API (Application Programming Interface) #REST (Representational State Transfer) #Impala #NLP (Natural Language Processing) #Java #Spark (Apache Spark) #HBase #Cloudera #GitLab #Programming #Supervised Learning #AI (Artificial Intelligence) #Jenkins #RDBMS (Relational Database Management System) #Regression #Flask #Django #Scala #Unix #Reinforcement Learning #Unsupervised Learning #Big Data #Classification #ML (Machine Learning) #Python #Monitoring #Data Pipeline #Jupyter #SQL (Structured Query Language) #REST API #Data Science #Hadoop #Linux #Spark SQL #Jira #NumPy #Clustering #Sqoop (Apache Sqoop) #Pandas #Kafka (Apache Kafka) #Cloud
Role description
Direct Client
Location: Pleasanton, CA
Role: Big Data Hadoop Engineer
Must Haves:
β’ Strong experience in Big Data, Cloudera Distribution 7.x, RDBMS development
• 4-5 years of programming experience in Python, Java, Scala, and SQL is a must.
• Strong experience building data pipelines using Hadoop components: Sqoop, Hive, Solr, MapReduce, Impala, Spark, Spark SQL, and HBase.
• Strong experience with REST API development using Python frameworks (Django, Flask, FastAPI, etc.) and the Java Spring Boot framework.
• Project experience in AI/Machine Learning and NLP development.
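As a rough illustration of the REST API work described above: the posting names Django, Flask, and FastAPI, any of which would be used in practice. The sketch below uses only the Python standard library so it is self-contained; the `/health` endpoint and its payload are made up for the example.

```python
# Minimal REST-style endpoint sketch using only the standard library.
# In a real project a framework (Django/Flask/FastAPI) would replace this;
# the point is just the GET -> JSON request/response shape.
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class HealthHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Respond to GET /health with a small JSON payload.
        if self.path == "/health":
            body = json.dumps({"status": "ok"}).encode()
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)
        else:
            self.send_error(404)

    def log_message(self, *args):
        # Silence per-request logging for the example.
        pass


def serve_once(port: int = 0) -> HTTPServer:
    """Start the server on a background thread; port 0 picks a free port."""
    server = HTTPServer(("127.0.0.1", port), HealthHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

A framework version would add routing, validation, and serialization on top of the same shape.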
Deliverables or Tasks:
1. Provide vision, gather requirements, and translate client user requirements into technical architecture.
2. Design and implement an integrated Big Data platform and analytics solution.
3. Design and implement data pipelines to collect and transport data to the Big Data platform.
4. Design, build, and scale AI/Machine Learning systems across multiple domains.
5. Implement monitoring solution(s) to monitor the health of the Big Data platform infrastructure.
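The data-pipeline deliverable above can be sketched as a chain of extract/transform/load stages. In practice Sqoop or Spark would move the data into the platform; this pure-Python version, with made-up records, only illustrates the stage boundaries.

```python
# Toy extract-transform-load pipeline: each stage is a generator so
# records stream through without being materialised between stages.
def extract(rows):
    """Yield raw records from a source (here: an in-memory list)."""
    yield from rows


def transform(records):
    """Normalise field types and drop records missing an 'id'."""
    for r in records:
        if r.get("id") is not None:
            yield {"id": int(r["id"]), "value": r.get("value", 0)}


def load(records):
    """Collect into the 'platform' (here: a plain list)."""
    return list(records)


raw = [{"id": "1", "value": 10}, {"value": 3}, {"id": "2"}]
loaded = load(transform(extract(raw)))
```

A Spark job would express the same chain as DataFrame transformations, with the cluster handling the streaming and parallelism.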
Technical Knowledge and Skills:
β’ Strong experience in Big Data, Cloudera Distribution 7.x, RDBMS
• 4-5 years of programming experience in Python, Java, Scala, and SQL is a must.
• Strong experience building data pipelines using Hadoop components: Sqoop, Hive, Solr, MapReduce, Impala, Spark, Spark SQL, and HBase.
• Strong experience with REST API development using Python frameworks (Django, Flask, FastAPI, etc.) and the Java Spring Boot framework.
• Microservices/web service development experience using the Spring framework.
• Experience with Dask, NumPy, pandas, and scikit-learn.
• Proficient in Machine Learning algorithms: supervised learning (regression, classification, SVMs, decision trees, etc.), unsupervised learning (clustering), and reinforcement learning.
• Strong experience with real-time analytics frameworks such as Spark, Kafka, and Storm.
• Experience with GitLab, Jenkins, and Jira.
• Expertise in Unix/Linux environments: writing scripts and scheduling/executing jobs.
• Strong experience with data science notebooks and IDEs such as Jupyter, Zeppelin, RStudio, PyCharm, etc.
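To make the supervised-learning requirement concrete: classification assigns a label to a point based on labelled training data. The sketch below is a 1-nearest-neighbour classifier in plain Python; in practice scikit-learn (e.g. an SVM or k-NN estimator) would be used, and the 2-D points and labels here are made up.

```python
# Toy supervised classification: 1-nearest-neighbour in plain Python.
import math


def predict_1nn(train, label_of, point):
    """Return the label of the training point closest to `point`."""
    nearest = min(train, key=lambda p: math.dist(p, point))
    return label_of[nearest]


# Hypothetical 2-D samples: two clusters labelled "a" and "b".
train = [(0.0, 0.0), (0.1, 0.2), (5.0, 5.0), (5.2, 4.9)]
label_of = {(0.0, 0.0): "a", (0.1, 0.2): "a",
            (5.0, 5.0): "b", (5.2, 4.9): "b"}
```

The same idea scales to real feature vectors, where a library handles distance indexing, model selection, and evaluation.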