Speech Scientist

⭐ - Featured Role | Apply direct with Data Freelance Hub
This role is for a Speech Scientist (AI Engineer) on a remote contract, focusing on developing and optimizing voice models. Required skills include Python, machine learning libraries, and speech processing experience. A Bachelor's/Master's in a related field is essential.
🌎 - Country
United States
💱 - Currency
$ USD
-
💰 - Day rate
-
🗓️ - Date discovered
September 23, 2025
🕒 - Project duration
Unknown
-
🏝️ - Location type
Remote
-
📄 - Contract type
Unknown
-
🔒 - Security clearance
Unknown
-
📍 - Location detailed
United States
-
🧠 - Skills detailed
#Deep Learning #ML (Machine Learning) #Cloud #Computer Science #Automatic Speech Recognition (ASR) #Datasets #Quality Assurance #PyTorch #TensorFlow #AI (Artificial Intelligence) #Python #AWS (Amazon Web Services) #Libraries #ETL (Extract, Transform, Load) #GCP (Google Cloud Platform)
Role description
Role: AI Engineer
Location: Remote

Job Description
We are looking for a skilled and innovative AI Engineer with hands-on experience in building and optimizing voice models to assist with the AI Voice project. In this role, you will develop, train, and refine AI models for voice synthesis, voice cloning, speech recognition, and/or voice transformation. Your work will contribute to cutting-edge applications in conversational AI, voice assistants, and generative audio.

An ideal candidate would be someone who has:
• Developed and optimized text-to-speech models that achieved human-like voice synthesis, maintaining the unique style of voice actors across multiple languages.
• Implemented real-time processing solutions that reduced inference time to under 1 second, enhancing user interaction and experience.
• Managed large-scale datasets for voice cloning projects, ensuring high performance and reliability while supporting multilingual transcriptions.

Key Responsibilities
• Design, develop, and fine-tune deep learning models for voice synthesis (e.g., TTS, voice cloning).
• Implement and optimize neural network architectures such as Tacotron, FastSpeech, WaveNet, or similar.
• Collect, preprocess, and augment speech datasets.
• Collaborate with product and engineering teams to integrate voice models into production systems.
• Perform evaluation and quality assurance of voice model outputs.
• Research and stay current on advancements in speech processing, audio generation, and machine learning.

Required Qualifications
• Bachelor's or Master's degree in Computer Science, Electrical Engineering, or a related field.
• Strong experience with Python and machine learning libraries (e.g., PyTorch, TensorFlow).
• Hands-on experience with speech/audio processing and relevant toolkits (e.g., Librosa, ESPnet, Kaldi).
• Familiarity with voice model architectures (TTS, ASR, vocoders).
• Understanding of deep learning concepts and model training processes.

Preferred Qualifications
• Experience deploying models to real-time applications or mobile devices.
• Knowledge of data labeling, voice dataset creation, and noise handling techniques.
• Experience with cloud-based AI/ML infrastructure (e.g., AWS, GCP).
• Contributions to open-source projects or published papers in speech/voice-related domains.