

Speech Scientist
Featured Role | Apply direct with Data Freelance Hub
This role is for a Speech Scientist (AI Engineer) on a remote contract, focusing on developing and optimizing voice models. Required skills include Python, machine learning libraries, and speech processing experience. A Bachelor's/Master's in a related field is essential.
Country
United States
Currency
$ USD
Day rate
-
Date discovered
September 23, 2025
Project duration
Unknown
Location type
Remote
Contract type
Unknown
Security clearance
Unknown
Location detailed
United States
Skills detailed
#Deep Learning #ML (Machine Learning) #Cloud #Computer Science #Automatic Speech Recognition (ASR) #Datasets #Quality Assurance #PyTorch #TensorFlow #AI (Artificial Intelligence) #Python #AWS (Amazon Web Services) #Libraries #ETL (Extract, Transform, Load) #GCP (Google Cloud Platform)
Role description
Role: AI Engineer
Location: Remote
Job Description:
We are looking for a skilled and innovative AI Engineer with hands-on experience in building and optimizing voice models to assist with the AI Voice project. In this role, you will develop, train, and refine AI models for voice synthesis, voice cloning, speech recognition, and/or voice transformation. Your work will contribute to cutting-edge applications in conversational AI, voice assistants, and generative audio.
An ideal candidate would be someone who has:
• Developed and optimized text-to-speech models that achieved human-like voice synthesis, maintaining the unique style of voice actors across multiple languages.
• Implemented real-time processing solutions that reduced inference time to under 1 second, enhancing user interaction and experience.
• Managed large-scale datasets for voice cloning projects, ensuring high performance and reliability while supporting multilingual transcriptions.
Key Responsibilities
• Design, develop, and fine-tune deep learning models for voice synthesis (e.g., TTS, voice cloning).
• Implement and optimize neural network architectures such as Tacotron, FastSpeech, WaveNet, or similar.
• Collect, preprocess, and augment speech datasets.
• Collaborate with product and engineering teams to integrate voice models into production systems.
• Perform evaluation and quality assurance of voice model outputs.
• Research and stay current on advancements in speech processing, audio generation, and machine learning.
Required Qualifications
• Bachelor's or Master's degree in Computer Science, Electrical Engineering, or related field.
• Strong experience with Python and machine learning libraries (e.g., PyTorch, TensorFlow).
• Hands-on experience with speech/audio processing and relevant toolkits (e.g., Librosa, ESPnet, Kaldi).
• Familiarity with voice model architectures (TTS, ASR, vocoders).
• Understanding of deep learning concepts and model training processes.
Preferred Qualifications
• Experience with deploying models to real-time applications or mobile devices.
• Knowledge of data labeling, voice dataset creation, and noise handling techniques.
• Experience with cloud-based AI/ML infrastructure (e.g., AWS, GCP).
• Contributions to open-source projects or published papers in speech/voice-related domains.