Data Engineer

⭐ - Featured Role | Apply directly with Data Freelance Hub
This role is for a Data Engineer with a minimum of 12 years of experience, focusing on AI/ML data solutions on the Google Cloud Platform. The contract length is unspecified; the role is onsite in San Jose, CA, with a strong emphasis on Python, SQL, and data governance.
🌎 - Country
United States
💱 - Currency
$ USD
-
💰 - Day rate
-
🗓️ - Date discovered
May 31, 2025
🕒 - Project duration
Unknown
-
🏝️ - Location type
On-site
-
📄 - Contract type
Unknown
-
🔒 - Security clearance
Unknown
-
📍 - Location detailed
San Jose, CA
-
🧠 - Skills detailed
#Cloud #NoSQL #GCP (Google Cloud Platform) #Security #Hadoop #GIT #Dataflow #Computer Science #Compliance #Data Governance #Scala #Monitoring #Keras #Programming #Big Data #Data Processing #ML (Machine Learning) #Storage #Python #PyTorch #Datasets #BigQuery #AI (Artificial Intelligence) #Data Engineering #ETL (Extract, Transform, Load) #Data Science #Data Lifecycle #Java #TensorFlow #Airflow #Spark (Apache Spark) #Data Pipeline #Data Warehouse #Batch #Deployment #SQL (Structured Query Language) #Apache Airflow #Version Control #Automation #Data Accuracy #Data Quality
Role description
Position: Data Engineer
Location: San Jose, CA (Onsite)
Engagement: Contract (implementation partner)

Job Description:
We are seeking a highly skilled and experienced Data Engineer with a strong background in AI/ML to design, build, and optimize robust data pipelines and infrastructure on the Google Cloud Platform (GCP). The ideal candidate will be passionate about leveraging data to power machine learning initiatives, ensuring data quality, accessibility, and scalability for advanced analytics and AI applications. A minimum of 12 years of experience is required.

Responsibilities:
• Data Pipeline Development: Design, build, and maintain scalable, efficient, and reliable ETL/ELT data pipelines for batch and real-time processing using GCP services (e.g., Dataflow, Dataproc, Cloud Composer, Pub/Sub); a minimal illustrative sketch follows the Qualifications list below.
• AI/ML Data Preparation: Collaborate closely with Data Scientists and Machine Learning Engineers to understand data requirements for model training, evaluation, and serving. Prepare, transform, and curate large, diverse datasets (structured, unstructured, streaming) to optimize them for AI/ML workloads.
• GCP Ecosystem Expertise: Leverage a wide range of GCP data and AI/ML services, including:
  - Data Warehousing & Storage: BigQuery (for analytics and BigQuery ML), Cloud Storage, Cloud SQL, Cloud Bigtable.
  - Data Processing: Dataflow, Dataproc (Spark, Hadoop), Cloud Composer (Apache Airflow), Data Fusion.
  - AI/ML Services: Vertex AI (for model training, deployment, MLOps, Pipelines, Workbench, AutoML), AI Platform, TensorFlow Enterprise, Keras, PyTorch.
• Data Governance & Quality: Implement and enforce data quality, security, and governance standards throughout the data lifecycle, ensuring data accuracy, consistency, and compliance with regulations.
• Performance Optimization: Monitor, troubleshoot, and optimize the performance and cost-effectiveness of data pipelines and AI/ML infrastructure.
• Automation & MLOps: Automate data processes, develop CI/CD pipelines for data and ML models, and contribute to MLOps best practices for seamless deployment and monitoring of AI/ML solutions.
• Collaboration & Communication: Work effectively with cross-functional teams, including Data Scientists, Analysts, Software Engineers, and Product Managers, to understand data needs and deliver impactful solutions.
• Innovation & Research: Stay up to date with the latest advancements in data engineering, AI/ML, and GCP technologies, continuously exploring and recommending new tools and approaches.

Qualifications:
• Bachelor's or Master's degree in Computer Science, Data Engineering, or a related quantitative field.
• Proven experience as a Data Engineer, with a strong focus on building data solutions for AI/ML applications.
• In-depth knowledge and hands-on experience with Google Cloud Platform (GCP) data services (BigQuery, Dataflow, Dataproc, Cloud Storage, Cloud Composer, etc.).
• Strong proficiency in Python (essential); experience with Scala or Java is a plus.
• Expertise in SQL and experience with a variety of database technologies (relational, NoSQL, data warehouses).
• Familiarity with machine learning concepts, algorithms, and workflows (e.g., feature engineering, model training, evaluation, deployment).
• Experience with machine learning frameworks such as TensorFlow or PyTorch.
• Understanding of distributed systems, big data technologies, and real-time data processing.
• Experience with version control systems (e.g., Git) and CI/CD practices.
• Excellent problem-solving, analytical, and communication skills.
• Google Cloud Professional Data Engineer certification is a strong plus.
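To illustrate the kind of batch ETL step described under Data Pipeline Development, here is a minimal Python sketch that loads CSV files from Cloud Storage into BigQuery using the google-cloud-bigquery client. The project, bucket, dataset, and table names are hypothetical placeholders; in practice a step like this would typically be scheduled and orchestrated through Cloud Composer (Apache Airflow).

    # Minimal sketch: batch-load CSV files from Cloud Storage into BigQuery.
    # All resource names below are hypothetical placeholders.
    from google.cloud import bigquery

    client = bigquery.Client()  # authenticates via Application Default Credentials

    job_config = bigquery.LoadJobConfig(
        source_format=bigquery.SourceFormat.CSV,
        skip_leading_rows=1,  # skip the CSV header row
        autodetect=True,      # infer the schema; explicit schemas are safer in production
        write_disposition=bigquery.WriteDisposition.WRITE_APPEND,
    )

    load_job = client.load_table_from_uri(
        "gs://example-bucket/events/2025-05-31/*.csv",  # hypothetical source files
        "example-project.analytics.raw_events",         # hypothetical destination table
        job_config=job_config,
    )
    load_job.result()  # block until the load job finishes (raises on failure)

In a real pipeline, append-only loads like this are usually paired with data-quality checks and idempotent partitioning so reruns do not duplicate rows.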
--
Thanks,
Prashant Bansal
Raas Infotek Corporation
262 Chapman Road, Suite 105A, Newark, DE 19702
Phone: 302-565-0188 Ext: 144
Email: Prashant.bansal@raasinfotek.com
Website: raasinfotek.com