OPEN POSITION
Machine Learning Engineer (TTS)
We are expanding our team and are looking for a skilled, goal-oriented MLE to join us.
About us
Aiphoria is an innovative start-up specializing in machine learning and AI solutions.
We are a one-stop shop for customers who want to enhance their business with proven, up-to-date solutions based on conversational AI, LLMs, natural language processing, and speech processing. We offer virtual employees (virtual support agents, sales managers, personal assistants, etc.), goal-oriented conversational AI, ASR/TTS solutions, and many others.
Our team brings a wealth of cutting-edge, specialized expertise in Machine Learning and Speech Technologies, which are used daily by hundreds of millions of people worldwide.
We already have several major projects underway and are looking to strengthen our team with an MLE (TTS)!
Responsibilities
Design and optimize TTS models to ensure our voice assistant sounds as natural and accurate as possible.
Collaborate closely with product managers and engineers to integrate TTS tech, making it seamless and intuitive for users.
Partner with data teams to build efficient audio data pipelines, from speaker recording/preprocessing to model training.
Regularly update and refine TTS models to adapt to various accents, dialects, and speech styles, enhancing user satisfaction and responsiveness.
Stay up to date with the latest TTS advancements, bringing in innovative techniques and tools to keep us at the forefront of voice-assisted banking.
Rigorously test and validate models to meet strict standards.
Requirements
Proficiency in Python and deep learning frameworks (especially PyTorch).
Strong understanding of speech synthesis processing techniques.
Experience with fast attention-based models (e.g., FastPitch, FastSpeech 2) and modern variational and flow-based approaches (e.g., VITS, Glow-TTS).
Strong understanding of techniques to control prosody, rhythm, and emotional tone for expressive speech synthesis.
Knowledge of text normalization techniques, including finite-state transducers (FSTs) and neural networks for normalization.
Familiarity with TTS evaluation techniques, including MOS and A/B testing.
Familiarity with vocoder models (e.g., Vocos, HiFi-GAN, Mimi).
Knowledge of signal processing, statistical modeling, and language structure.
What we offer
Rapid career progression, facilitated by our team of seasoned senior professionals who hail from prestigious, industry-leading companies.
Remote work opportunities from anywhere globally.
Prominent clients, with opportunities to work on a variety of projects and/or to help develop our own proprietary products.
Competitive compensation in EUR/USD, above market standards.