

Ocean Blue Solutions Inc
Data Scientist - Hybrid, VA
Featured Role | Apply direct with Data Freelance Hub
This role is for a Data Scientist in Virginia (Hybrid) with a contract length of "unknown" and a pay rate of "unknown." Requires 8-10 years of experience, proficiency in Spark, Scala, R, Python, and expertise in AWS and Azure ecosystems.
Country: United States
Currency: Unknown
Day rate: Unknown
Date: January 17, 2026
Duration: Unknown
Location: Hybrid
Contract: Unknown
Security: Unknown
Location detailed: Richmond, VA
Skills detailed
#Programming #Big Data #Python #Cloud #SQL (Structured Query Language) #Statistics #MySQL #AI (Artificial Intelligence) #AWS (Amazon Web Services) #Databricks #Athena #HBase #Pig #Spark (Apache Spark) #Eclipse #Maven #Impala #Database Systems #AWS EMR (Amazon Elastic MapReduce) #Data Science #Jenkins #SageMaker #Data Analysis #Datasets #Monitoring #Jira #Scala #Sqoop (Apache Sqoop) #S3 (Amazon Simple Storage Service) #Azure #Linux #ETL (Extract, Transform, Load) #AWS SageMaker #Data Governance #EC2 #Documentation #Data Lake #ML (Machine Learning) #Java #Spark SQL #HDFS (Hadoop Distributed File System) #R #Computer Science #AWS EC2 (Amazon Elastic Compute Cloud) #Aurora #BI (Business Intelligence) #Data Pipeline #GCP (Google Cloud Platform) #Data Processing #CLI (Command-Line Interface) #SQL Server #Unix #Synapse #GitHub #Agile #DynamoDB #Hadoop #Microsoft Azure #Scripting
Role description
Data Scientist - Hybrid, VA
Richmond, Virginia
Job Title: Data Scientist
Location: Virginia - Hybrid
Job Description:
Understand and prioritize business problems and identify ways to leverage data to recommend solutions. Organize and synthesize data into actionable business decisions, focused on insights. Provide insight into trends and into financial and business operations through data analysis and the development of business intelligence visuals.
Work with advanced business intelligence tools to complete complex calculations, table calculations, geographic mapping, data blending, and optimization of data extracts.
Apply all phases of the Software Development Life Cycle (Analysis, Design, Development, Testing, and Maintenance) using Waterfall and Agile methodologies
Proficient in working with Apache Hadoop ecosystem components such as MapReduce, Hive, Pig, Sqoop, Spark, Flume, HBase, and Oozie on AWS EC2/Azure VMs cloud computing
Expertise in using Hive for creating tables and distributing data by implementing partitioning and bucketing. Capable of developing, tuning, and optimizing HQL queries
Proficient in importing and exporting data with Sqoop between HDFS and relational database systems
Expert in Spark SQL and Spark DataFrames using Scala for distributed data processing
Develop DataFrames and RDDs (Resilient Distributed Datasets) to achieve unified transformations on the data load
Expertise in scripting languages such as Linux/Unix shell and Python
Develop, schedule, and monitor Oozie workflows for parallel execution of jobs
Experience working with cloud environments: AWS (EMR, EC2, S3, and Athena) and GCP BigQuery
Transfer data from different platforms into the AWS platform
Diverse experience working with a variety of databases such as SQL Server, MySQL, IBM DB2, and Netezza
Manage the source code in GitHub
Track and deliver requirements in Jira
Expertise in using IDEs and tools such as Eclipse, GitHub, Jenkins, Maven, and IntelliJ
Optimize Spark applications to improve performance and reduce run time on the Hadoop cluster
Proficient in executing Hive queries using the Hive CLI, the Hue web GUI, and Impala to read, write, and query data
Build distributed, scalable, and reliable data pipelines that ingest and process data at scale and in real time
Create metrics and apply business logic using Spark, Scala, R, Python, and/or Java
Model, design, develop, code, test, debug, document, and deploy applications to production through standard processes; in addition, build business models using data science skills
Harmonize, transform, and move data from a raw format to a consumable, curated view
Apply strong Data Governance principles, standards, and frameworks to promote data consistency and quality while effectively managing and protecting the integrity of corporate data
POSITION QUALIFICATIONS:
Education Required: Bachelor's and/or Master's degree in Computer Science, Analytics, Statistics, or similar field.
Required or Acceptable Job-Related Experience: 8-10 years of related experience
Technical/Other Skills Required:
Strong hands-on experience in Spark, Scala, R, Python, and/or Java.
Programming experience with the Hadoop ecosystem of applications and functional understanding of distributed data processing systems architecture (Data Lake / Big Data / Hadoop / Spark / Hive, etc.).
Amazon Big Data ecosystem (EMR, Kinesis, Aurora, DynamoDB, etc.) experience
Microsoft Azure Data ecosystem (Databricks, Stream Analytics, Purview, Synapse Analytics, etc.) experience
Proficient in working with AWS SageMaker and Azure ML for building AI/ML models.
Excellent communication and collaboration skills to work effectively with business teams, engineers and operational teams.
Must be able to convey key messages in technical terms and business terms.
Must be able to create technical documentation, such as specifications, design documents, and testing documents.
Familiarity with systems such as AVEVA PI, sensor networks, PLCs, and SCADA is a plus.
Oral: Ability to collaborate and communicate with a wide range of partners, including IT and business, across all levels of the organization. Must actively manage expectations with stakeholders.
Problem Solving: Must understand the business need and develop technical solutions to meet those needs. Innovation, creativity, and critical problem-solving skills are required to be successful in this role. Solutions need to be comprehensive, flexible for future changes, and delivered with a high degree of quality.