MindQuest Technology Solutions LLC

Spark Developer

⭐ - Featured Role | Apply directly with Data Freelance Hub
This role is for a Spark Developer (Search Integration) on a contract basis in Pleasanton, CA. Requires expertise in Spark (Scala/PySpark), OpenSearch/Algolia, and big data ecosystems. Pay rate and contract length are unspecified. Hybrid work model with three days per week onsite.
🌎 - Country
United States
-
💱 - Currency
$ USD
-
💰 - Day rate
Unknown
-
🗓️ - Date
April 7, 2026
-
🕒 - Duration
Unknown
-
🏝️ - Location
Hybrid
-
📄 - Contract
Unknown
-
🔒 - Security
Unknown
-
📍 - Location detailed
Pleasanton, CA
-
🧠 - Skills detailed
#HDFS (Hadoop Distributed File System) #Kafka (Apache Kafka) #PostgreSQL #Big Data #Spark SQL #Databases #Apache Spark #Indexing #Elasticsearch #Batch #Datasets #Spark (Apache Spark) #Strategy #Scala #Data Pipeline #DynamoDB #S3 (Amazon Simple Storage Service) #ETL (Extract, Transform, Load) #NoSQL #OpenSearch #Python #JSON (JavaScript Object Notation) #PySpark #SQL (Structured Query Language)
Role description
Hi All,

Job Title: Spark Developer (Search Integration)
Position Type: Contract
Location: Pleasanton, CA – Hybrid; candidates must work onsite three days per week in Pleasanton, CA

Note:
• We are looking for a Spark Developer with OpenSearch/Algolia expertise who can design, build, and optimize scalable data pipelines to ingest, transform, and index large-scale datasets into search engines for fast retrieval.
• You will use Scala/Python and Spark SQL to process data from various sources (S3, Kafka) for real-time indexing in OpenSearch or Algolia.

Focus: Spark (ETL/Streaming) + search engines (OpenSearch/Algolia)
Objective: Power real-time, relevant, and fast search experiences.

Responsibilities
• Data Pipelines: Design, develop, and maintain high-performance Spark jobs (Scala or PySpark) to process, transform, and clean large datasets.
• Index Management: Ingest data into OpenSearch or Algolia, optimizing index strategy, mappings, and document structure for maximum search efficiency.
• Optimization: Tune Spark applications (data partitioning, caching, shuffle tuning) and search engines (query performance, indexing speed).
• Streaming/Batch: Implement both batch ETL jobs and real-time streaming solutions (Spark Streaming/Kafka) to keep search indexes up to date; both patterns are sketched below.
• Collaboration: Work with backend teams to integrate search functionality into applications and debug search relevance issues.

Required Skills and Qualifications
• Core Spark: Strong experience with the Apache Spark RDD/DataFrame APIs in Scala or Python (PySpark).
• Search Tech: Experience indexing, querying, and managing clusters in OpenSearch (a fork of Elasticsearch) or Algolia.
• Big Data Ecosystem: Proficiency with HDFS, S3, Kafka, and data warehousing solutions.
• Database Knowledge: Experience with SQL/NoSQL databases (PostgreSQL, Cassandra, DynamoDB).
• Performance Tuning: Expertise in optimizing distributed systems and troubleshooting latency issues.

Typical Projects
• Building a product catalog search engine using Spark to transform ERP data into JSON, indexed into OpenSearch.
• Implementing real-time update pipelines to sync user activity logs into Algolia for instant search results.
• Optimizing large-scale data re-indexing processes to reduce latency.

Please share your updated resume at srikanth@mqtechsolutions.com.
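To make the batch side of the role concrete, here is a minimal PySpark sketch of the first typical project: reading ERP exports from S3, shaping them into JSON-ready documents, and bulk-indexing them into OpenSearch with the opensearch-py client. Everything specific is assumed for illustration — the bucket path, endpoint, index name, and field names are hypothetical, and a real cluster would also need authentication configured.

```python
# Minimal sketch: batch ETL from S3 into OpenSearch with PySpark.
# All names (bucket, endpoint, index, fields) are hypothetical.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

INDEX = "product-catalog"       # hypothetical index name
OS_HOST = "search.example.com"  # hypothetical OpenSearch endpoint

def index_partition(rows):
    """Runs on each executor: bulk-index one partition of documents."""
    from opensearchpy import OpenSearch, helpers  # imported on the executor
    client = OpenSearch(hosts=[{"host": OS_HOST, "port": 443}], use_ssl=True)
    actions = (
        {"_index": INDEX, "_id": row["sku"], "_source": row.asDict()}
        for row in rows
    )
    helpers.bulk(client, actions)

spark = SparkSession.builder.appName("catalog-indexer").getOrCreate()

# Shape raw ERP exports into flat, search-friendly documents.
products = (
    spark.read.json("s3a://example-bucket/erp/products/")  # hypothetical path
    .select(
        "sku",
        "title",
        F.lower(F.col("category")).alias("category"),
        F.col("price").cast("double").alias("price"),
    )
    .dropna(subset=["sku"])
    .dropDuplicates(["sku"])
)

# Repartition to bound concurrent bulk writers, then index from the executors.
products.repartition(8).foreachPartition(index_partition)
```

Creating one client per partition (rather than per row) keeps connection overhead low, and repartitioning first bounds how many concurrent bulk writers hit the cluster at once.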
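The real-time side would typically pair Spark Structured Streaming with Kafka, using foreachBatch to push each micro-batch into the index. Again a sketch under stated assumptions: the topic name, event schema, broker address, and checkpoint path are invented, and the actual indexing call is stubbed out where the bulk-index logic from the batch sketch would go. Running it requires the spark-sql-kafka connector on the classpath.

```python
# Minimal sketch: streaming user-activity events from Kafka toward a search
# index with Spark Structured Streaming. Topic, schema, broker, and paths
# are hypothetical.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F
from pyspark.sql.types import StringType, StructField, StructType, TimestampType

EVENT_SCHEMA = StructType([
    StructField("user_id", StringType()),
    StructField("action", StringType()),
    StructField("ts", TimestampType()),
])

def index_batch(batch_df, batch_id):
    """Called once per micro-batch with a plain DataFrame."""
    # A real job would bulk-index here (opensearch-py or an Algolia client),
    # as in the batch sketch above; printing stands in for that call.
    batch_df.show(truncate=False)

spark = SparkSession.builder.appName("activity-indexer").getOrCreate()

events = (
    spark.readStream.format("kafka")
    .option("kafka.bootstrap.servers", "kafka.example.com:9092")  # hypothetical
    .option("subscribe", "user-activity")                         # hypothetical topic
    .load()
    # Kafka delivers raw bytes; decode and parse the JSON payload.
    .select(F.from_json(F.col("value").cast("string"), EVENT_SCHEMA).alias("e"))
    .select("e.*")
)

query = (
    events.writeStream
    .foreachBatch(index_batch)
    .option("checkpointLocation", "s3a://example-bucket/checkpoints/activity/")  # hypothetical
    .start()
)
query.awaitTermination()
```

The checkpoint location is what lets the job restart after failure without re-processing or dropping events, which is what keeps the search index consistent with the stream.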