

PL/SQL PySpark Developer
Featured Role | Apply directly with Data Freelance Hub
This role is for a Senior Data Engineer (PL/SQL PySpark Developer) with an unspecified contract length and pay rate. Located in Houston, TX, it requires 10+ years of back-end development experience, expert-level PL/SQL, and 7+ years of PySpark.
Country: United States
Currency: $ USD
Day rate: Unknown
Date discovered: August 28, 2025
Project duration: Unknown
Location type: Hybrid
Contract type: Unknown
Security clearance: Unknown
Location detailed: Houston, TX
Skills detailed: #Schema Design #Scala #Data Warehouse #Delta Lake #Debugging #Data Quality #Airflow #Azure #Data Lake #SQL (Structured Query Language) #Migration #Data Management #ETL (Extract, Transform, Load) #Spark (Apache Spark) #Azure cloud #Quality Assurance #Leadership #Oracle #PySpark #Databricks #Data Integration #Documentation #Data Modeling #MS SQL (Microsoft SQL Server) #Cloud #Data Engineering #EDW (Enterprise Data Warehouse) #GCP (Google Cloud Platform) #Base #AWS (Amazon Web Services) #SQL Server #BI (Business Intelligence) #Apache Airflow #Data Pipeline #Data Processing
Role description
We are seeking a Senior Data Engineer to lead a critical pilot project focused on modernizing our enterprise customer data consolidation from SQL Server to our Databricks-based data lake. This role combines traditional Oracle/PL/SQL expertise with modern PySpark development to support our East region initiative across three billing platforms.
## Key Responsibilities
Primary Project (Enterprise Customer Table Pilot)
• Design and implement data consolidation solutions moving from SQL Server to the Databricks data lake
• Work with business stakeholders and cross-functional teams to define enterprise customer table specifications
• Determine the optimal approach for data processing, either within existing Oracle systems or in the data lake environment
• Collaborate with the enterprise data lake team to leverage existing PySpark resources and infrastructure
• Produce modified data inputs for the new enterprise customer table consolidation process
• Ensure data quality and consistency across three different billing platform feeds
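To illustrate the shape of the consolidation task in the bullets above: the core of the pilot is mapping three differently-shaped billing feeds onto one enterprise customer schema. The sketch below uses plain Python dictionaries so it is self-contained; in the actual project this mapping would be expressed as PySpark DataFrame transformations on Databricks, and all feed names and field names here are hypothetical, not from the posting.

```python
# Sketch: harmonize three hypothetical billing-feed record shapes into one
# enterprise customer schema, then union and deduplicate. In PySpark this
# would be select/withColumn per feed, unionByName, and dropDuplicates.

def normalize_feed_a(rec):
    # Feed A (hypothetical): id under "cust_id", name split into two fields
    return {"customer_id": rec["cust_id"],
            "name": f'{rec["first"]} {rec["last"]}',
            "source_system": "billing_a"}

def normalize_feed_b(rec):
    # Feed B (hypothetical): id under "CustomerNumber", single "FullName"
    return {"customer_id": rec["CustomerNumber"],
            "name": rec["FullName"],
            "source_system": "billing_b"}

def normalize_feed_c(rec):
    # Feed C (hypothetical): id under "acct", name under "acct_name"
    return {"customer_id": rec["acct"],
            "name": rec["acct_name"],
            "source_system": "billing_c"}

def consolidate(feed_a, feed_b, feed_c):
    """Union the normalized feeds into one customer list, deduplicating on
    customer_id (first feed wins, mirroring a dropDuplicates step)."""
    seen, out = set(), []
    for rec in ([normalize_feed_a(r) for r in feed_a]
                + [normalize_feed_b(r) for r in feed_b]
                + [normalize_feed_c(r) for r in feed_c]):
        if rec["customer_id"] not in seen:
            seen.add(rec["customer_id"])
            out.append(rec)
    return out
```

The data-quality bullet then reduces to checks on the normalized output (key completeness, name consistency across feeds) rather than on three incompatible source schemas.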
Secondary Pilot (Code Conversion)
• Convert existing Oracle/PL/SQL code to PySpark for data lake processing
• Evaluate the feasibility of migrating current data warehouse operations to PySpark
• Provide a proof of concept for future large-scale migration initiatives
• Test and validate converted code performance in the data lake environment
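The typical shape of such a conversion is replacing a row-at-a-time PL/SQL cursor loop with a single set-based transformation. The sketch below uses plain Python for both sides so it runs without an Oracle instance or a Spark cluster; the table and column names are hypothetical, and the comments note the PL/SQL and PySpark forms each function stands in for.

```python
from itertools import groupby

# PL/SQL style (conceptually a cursor loop, one row per iteration):
#   FOR r IN (SELECT customer_id, amount FROM billing) LOOP
#     UPDATE customer_totals SET total = total + r.amount
#      WHERE customer_id = r.customer_id;
#   END LOOP;
def cursor_style_totals(billing_rows):
    totals = {}
    for r in billing_rows:  # row-at-a-time accumulation
        totals[r["customer_id"]] = totals.get(r["customer_id"], 0) + r["amount"]
    return totals

# PySpark style (conceptually one set-based aggregation):
#   billing_df.groupBy("customer_id").agg(F.sum("amount").alias("total"))
def set_based_totals(billing_rows):
    key = lambda r: r["customer_id"]
    rows = sorted(billing_rows, key=key)  # whole-dataset operation
    return {k: sum(r["amount"] for r in grp)
            for k, grp in groupby(rows, key=key)}
```

Validating a conversion then means confirming the two forms agree on the same inputs, which is exactly the test-and-validate bullet above.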
Team Development & Knowledge Transfer
• Train and mentor existing PL/SQL team members on PySpark technologies
• Work independently with minimal supervision while collaborating effectively with stakeholders
• Provide technical leadership and architectural guidance for data processing solutions
• Document best practices and create a knowledge base for future PySpark implementations
## Required Technical Skills
Core Requirements
• 10+ years of back-end development experience
• Expert-level PL/SQL and Oracle database development
• 7+ years of PySpark experience with data lake implementations
• Strong experience with the Databricks platform
• Proficiency in data modeling and schema design
• Experience with data pipeline development
• Custom data warehouse development experience
Preferred Technologies
• Delta Lake experience
• Apache Airflow for job scheduling and pipeline orchestration
• GCP (Google Cloud Platform) - our primary cloud environment
• AWS or Azure cloud experience (transferable)
• Data warehouse and ETL/ELT processes
• Experience with enterprise-scale data integration projects
## Technical Environment
• Current Stack: Oracle-based custom data warehouse, PL/SQL processing
• Target Stack: Databricks data lake, PySpark, Delta Lake, GCP
• Integration Points: Three separate billing systems, SQL Server consolidation layer
• Data Volume: Enterprise-scale customer data across multiple regions
## Business Context & Domain Knowledge
• Support for multiple regions: East, Texas (largest), and Panera
• Integration challenges across three separate billing systems with different data formats
• Enterprise-level customer data consolidation and reporting requirements
• Migration from a legacy SQL Server data warehouse to a modern data lake architecture
• Focus on sales reporting and sales count reporting
• Experience with acquired-company data integration challenges preferred
## Required Competencies
Technical Leadership
• Ability to analyze existing systems and recommend architectural improvements
• Experience designing scalable data processing solutions
• Strong debugging and troubleshooting skills across multiple platforms
• Code review and quality assurance capabilities
Business Acumen
• Understanding of enterprise data warehouse concepts
• Experience with customer data management and consolidation
• Knowledge of sales reporting and business intelligence requirements
• Familiarity with multi-system integration challenges
Communication & Collaboration
• Excellent stakeholder management skills
• Ability to translate technical concepts for business users
• Experience working with cross-functional teams
• Strong documentation and knowledge-sharing abilities
## Work Arrangement
• Hybrid role: 3 days on-site (Monday, Tuesday, Thursday) in Houston, TX
• 2 days remote work
• Candidates willing to relocate to Houston will be considered
• Office Location: Houston, Texas (specific location to be provided)