Magic Data, a global AI data service provider, has released a collection of more than 200,000 hours of training datasets, including 140,000 hours of conversational AI training data and 60,000 hours of read speech data. The datasets cover Asian languages, English dialects, and European languages, and are intended to accelerate the development of human-computer interaction in artificial intelligence.
Why conversational AI datasets?
Experiments show that conversational data yields better results when training Automatic Speech Recognition (ASR) models. Magic Data R&D Center compared conversational speech data with read speech data: 3,000 hours of each were used to train ASR models for customer service, broadcasting, and navigation command scenarios. Compared with read speech data, conversational speech data improved word accuracy by up to 84% in relative terms.
In addition, Magic Data R&D Center conducted an experiment with 3,000 hours and 30,000 hours of conversational training data. The results show that the more conversational data is used, the higher the word accuracy.
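For readers unfamiliar with the metric, the following is a minimal sketch of how word accuracy and a relative improvement between two models might be computed. It assumes word accuracy is defined as one minus the word error rate and that the relative figure is the accuracy gain divided by the baseline accuracy. The announcement does not specify either definition, and the function names and sample sentences are hypothetical illustrations, not Magic Data's evaluation code.

```python
def word_errors(reference: str, hypothesis: str) -> int:
    """Word-level edit distance (substitutions, insertions, deletions)."""
    ref = reference.split()
    hyp = hypothesis.split()
    # prev[j] holds the distance between the reference prefix seen so far
    # and the first j words of the hypothesis.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def word_accuracy(reference: str, hypothesis: str) -> float:
    """Assumed definition: 1 - (word errors / reference word count)."""
    n_words = len(reference.split())
    if n_words == 0:
        raise ValueError("reference transcript must not be empty")
    return 1.0 - word_errors(reference, hypothesis) / n_words


def relative_improvement(baseline: float, improved: float) -> float:
    """Assumed definition: accuracy gain as a fraction of the baseline."""
    if baseline <= 0:
        raise ValueError("baseline accuracy must be positive")
    return (improved - baseline) / baseline


# Hypothetical transcripts, for illustration only.
reference = "turn left at the next light"
read_model_output = "turn left at next night"
conversational_model_output = "turn left at the next light"

baseline = word_accuracy(reference, read_model_output)
improved = word_accuracy(reference, conversational_model_output)
print(f"relative improvement: {relative_improvement(baseline, improved):.0%}")
```

Under this reading, a relative improvement is measured against the baseline model's accuracy, so it is not the same as a gain in absolute percentage points.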