Building Next-Gen Speech Synthesis for Bookmate Audiobook Service
Data Science Conference via YouTube
Overview
In this 22-minute conference talk from DSC EUROPE 24, Vladimir Platonov describes the development and evolution of text-to-speech (TTS) technology for Bookmate's audiobook service. He explains how the team built a high-quality TTS system from just 90 hours of training data and deployed it across more than 100,000 books, and covers key aspects of audiobook production: voice selection, dataset construction, and meeting user expectations for synthetic speech. The talk then follows the move to next-generation TTS: using tens of thousands of hours of data, managing large-scale datasets, implementing advanced neural network architectures, and assessing quality through crowdsourcing and user feedback. It is aimed at anyone interested in speech synthesis and its practical applications in digital content delivery.
Syllabus
Building next-gen speech synthesis for Bookmate audiobook service | Vladimir Platonov | DSC EUROPE 24
Taught by
Data Science Conference