OPEN POSITION
Speech Data Engineer
The Speech Data Engineer is a key specialist bridging the data market with our technological needs. You will be responsible for identifying unique data sources, evaluating their quality, and building strong relationships with data providers.
Responsibilities
Collect and prepare speech datasets (ASR/TTS) across multiple languages when customer data is unavailable.
Process raw audio data, including speech segmentation, speaker separation, and basic preprocessing.
Run speech recognition and pseudo-labeling, and collaborate with crowdsourcing/labeling platforms to improve data quality.
Understand and apply differences between ASR data (noisy, real-world speech) and TTS data (clean, high-quality recordings).
Organize, version, and maintain speech datasets, ensuring teams always know what data exists and where it lives.
Support existing data infrastructure and pipelines (e.g. DVC).
Work with external data providers, evaluating dataset quality and contributing to make-vs-buy decisions.
Requirements
Hands-on experience with speech data processing and labeling tools, such as voice activity detection (VAD), pyannote, Whisper, and other segmentation or diarization frameworks.
Familiarity with quality assessment metrics, including SNR (Signal-to-Noise Ratio) and other acoustic analysis indicators.
Experience collecting, processing, and curating speech datasets, including audio recordings, transcripts, and metadata for multilingual ASR and TTS applications.
Ability to work closely with internal ASR/TTS development teams to align dataset specifications with model training needs.
Experience labeling and validating audio data, ensuring transcription accuracy, speaker diversity, and consistent metadata standards.
Why join us?
Experienced team. Aiphoria is made up of enthusiastic professionals who have created award-winning devices, voice assistants, and other AI-driven products for BigTech corporations.
Cutting-edge technologies. We build technology using our expertise in Computer Vision, Speech Technologies, Natural Language Understanding, and Generative AI, including LLMs and diffusion models.
Rapid career progression, facilitated by our team of seasoned senior professionals who hail from prestigious, industry-leading companies.
Remote work opportunities from Europe.
Prominent clients, giving you the opportunity to work on different projects and/or to help develop our own proprietary products.
Competitive compensation surpassing market standards.
A company with an entrepreneurial spirit. We offer a unique mix of a secure workplace, thanks to our big clients, and a true start-up culture!