CloudRay

Data Engineer - Kafka Connect

⭐ - Featured Role | Apply directly with Data Freelance Hub
This role is for a Data Engineer - Kafka Connect with 4+ years of experience, focusing on data ingestion pipelines built with Kafka Connect and Debezium. The position is remote and requires proficiency in AWS, GCP, MySQL, PostgreSQL, and Python. Contract length and pay rate are unspecified.
🌎 - Country
United States
πŸ’± - Currency
Unknown
πŸ’° - Day rate
Unknown
πŸ—“οΈ - Date
March 4, 2026
πŸ•’ - Duration
Unknown
🏝️ - Location
Remote
πŸ“„ - Contract
Unknown
πŸ”’ - Security
Unknown
πŸ“ - Location detailed
Indiana
🧠 - Skills detailed
#Storage #Terraform #Scripting #"ETL (Extract, Transform, Load)" #Batch #Observability #AWS S3 (Amazon Simple Storage Service) #Cloud #RDBMS (Relational Database Management System) #GCP (Google Cloud Platform) #MySQL #Databases #Airflow #S3 (Amazon Simple Storage Service) #Deployment #BigQuery #Python #Automation #Docker #Kubernetes #AWS (Amazon Web Services) #PostgreSQL #Data Modeling #Scala #Data Ingestion #Monitoring #Data Lake #Apache Kafka #Data Engineering #Data Quality #Data Integration #Kafka (Apache Kafka)
Role description
Role: Data Engineer - Kafka Connect
Location: Remote (immediate joiner preferred)
Experience: 4+ years

About the Role
We are seeking a skilled Data Engineer with hands-on experience in Kafka Connect and related data ingestion tools to design and implement a Bronze Layer for our data platform. The ideal candidate will work closely with data platform and analytics teams to build scalable, reliable ingestion pipelines from various data sources into cloud-based storage systems.

Key Responsibilities
• Design, develop, and maintain data ingestion pipelines using Kafka Connect and Debezium for real-time and batch data integration (see the source-connector sketch after this description).
• Ingest data from MySQL and PostgreSQL databases into AWS S3, Google Cloud Storage (GCS), and BigQuery.
• Implement best practices for data modeling, schema evolution, and efficient partitioning in the Bronze Layer (see the sink-connector sketch below).
• Ensure reliability, scalability, and monitoring of Kafka Connect clusters and connectors (see the health-check sketch below).
• Collaborate with cross-functional teams to understand source systems and downstream data requirements.
• Optimize data ingestion processes for performance and cost efficiency.
• Contribute to automation and deployment scripts using Python and cloud-native tools.
• Stay current with emerging data lake technologies such as Apache Hudi and Apache Iceberg.

Required Skills and Qualifications
• 2+ years of hands-on experience as a Data Engineer or in a similar role.
• Strong experience with Apache Kafka and Kafka Connect (source and sink connectors).
• Experience with Debezium for change data capture (CDC) from relational databases (RDBMS).
• Proficiency in working with MySQL and PostgreSQL.
• Hands-on experience with AWS S3, GCP BigQuery, and GCS.
• Proficiency in Python for automation, data handling, and scripting.
• Understanding of data lake architectures and ingestion patterns.
• Solid understanding of ETL/ELT pipelines, data quality, and observability practices.

Good to Have
• Experience with containerization (Docker, Kubernetes).
• Familiarity with workflow orchestration tools (Airflow, Dagster, etc.).
• Exposure to infrastructure-as-code tools (Terraform, CloudFormation).
• Familiarity with data versioning and table formats such as Apache Hudi or Apache Iceberg.
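To make the CDC requirement concrete, here is a minimal sketch of registering a Debezium MySQL source connector through the Kafka Connect REST API from Python. Every hostname, credential, topic prefix, and table name below is a placeholder, and the property names assume Debezium 2.x (earlier releases used database.server.name and database.history.* instead); treat it as an illustration, not this team's actual configuration.

```python
import requests

CONNECT_URL = "http://localhost:8083"  # placeholder Kafka Connect REST endpoint

# Debezium MySQL source connector: streams row-level changes (CDC) from the
# listed tables into Kafka topics named "<topic.prefix>.<db>.<table>".
connector = {
    "name": "mysql-shop-cdc",  # hypothetical connector name
    "config": {
        "connector.class": "io.debezium.connector.mysql.MySqlConnector",
        "database.hostname": "mysql.example.internal",
        "database.port": "3306",
        "database.user": "cdc_user",
        "database.password": "change-me",
        "database.server.id": "184054",   # must be unique in the MySQL replica set
        "topic.prefix": "bronze.mysql",   # Debezium 2.x topic naming
        "table.include.list": "shop.orders,shop.customers",
        "schema.history.internal.kafka.bootstrap.servers": "kafka:9092",
        "schema.history.internal.kafka.topic": "schema-history.shop",
    },
}

# POST /connectors creates the connector; a 409 response means it already exists.
resp = requests.post(f"{CONNECT_URL}/connectors", json=connector, timeout=10)
resp.raise_for_status()
print("created connector:", resp.json()["name"])
```

A PostgreSQL source looks much the same, using io.debezium.connector.postgres.PostgresConnector and a logical replication slot in place of the MySQL server ID.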
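On the sink side, Bronze Layer partitioning is typically delegated to the sink connector's partitioner. The sketch below registers a Confluent S3 sink that lands the CDC topics above as hourly, event-time-partitioned Parquet objects; the bucket, region, and flush settings are placeholders, and ParquetFormat assumes schema-ful records (for example Avro with a Schema Registry).

```python
import requests

CONNECT_URL = "http://localhost:8083"  # placeholder Kafka Connect REST endpoint

# Confluent S3 sink connector: writes Debezium topics to S3 under
# dt=YYYY-MM-dd/hr=HH paths, one directory per hour of record time.
sink = {
    "name": "s3-bronze-sink",  # hypothetical connector name
    "config": {
        "connector.class": "io.confluent.connect.s3.S3SinkConnector",
        "tasks.max": "2",
        "topics.regex": "bronze\\.mysql\\..*",
        "s3.bucket.name": "example-bronze-layer",
        "s3.region": "us-east-1",
        "storage.class": "io.confluent.connect.s3.storage.S3Storage",
        "format.class": "io.confluent.connect.s3.format.parquet.ParquetFormat",
        "partitioner.class": "io.confluent.connect.storage.partitioner.TimeBasedPartitioner",
        "path.format": "'dt'=YYYY-MM-dd/'hr'=HH",
        "partition.duration.ms": "3600000",  # one partition directory per hour
        "locale": "en-US",
        "timezone": "UTC",
        "timestamp.extractor": "Record",     # partition by the record's timestamp
        "flush.size": "10000",               # records per written object
    },
}

requests.post(f"{CONNECT_URL}/connectors", json=sink, timeout=10).raise_for_status()
```

A GCS sink follows the same pattern with the GCS sink connector class, while BigQuery is usually fed either by a BigQuery sink connector or by batch loads from the partitioned object store.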
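Finally, connector monitoring can be scripted against the same REST API. Below is a sketch of a health check that lists deployed connectors, inspects task state, and restarts failed tasks; the endpoint is a placeholder, and in practice this logic would feed an alerting or observability system rather than print.

```python
import requests

CONNECT_URL = "http://localhost:8083"  # placeholder Kafka Connect REST endpoint


def restart_failed_tasks(name: str) -> None:
    """Inspect one connector's status and restart any FAILED tasks."""
    status = requests.get(f"{CONNECT_URL}/connectors/{name}/status", timeout=10).json()
    for task in status.get("tasks", []):
        if task["state"] == "FAILED":
            # POST .../tasks/<id>/restart restarts a single task in place.
            requests.post(
                f"{CONNECT_URL}/connectors/{name}/tasks/{task['id']}/restart",
                timeout=10,
            ).raise_for_status()
            print(f"restarted task {task['id']} of {name}")


# GET /connectors returns the names of all deployed connectors.
for name in requests.get(f"{CONNECT_URL}/connectors", timeout=10).json():
    restart_failed_tasks(name)
```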