How do I analyze a voice and store the results?
I am thinking of creating an application.
It would work as follows:
1. People upload previously recorded conversations to the server.
2. An application on the server detects the pitch, speed, emphasis, pronunciation, etc. of that voice and creates a personal profile.
3. When you call the server, the server application talks with you in the exact voice of that person (the one whose voice the server analyzed in step 2).
Please share any links, resources, or PDF presentations you think would be useful for this project.
Mainly I am stuck on step 2. I don't have a clear idea of how to break down a voice, analyze it, and extract information such as speed and pitch. Is there an existing API available for the voice analysis part?
Comments (1)
I was able to find this:
You might also look into this SO question:
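As for step 2 itself, pitch is probably the easiest of the listed features to start with. Below is a minimal sketch, not a production method: it estimates the fundamental frequency of a single short frame using plain autocorrelation. The function name `estimate_pitch_hz` and the `fmin`/`fmax` search range are assumptions made for illustration, and the input here is a synthetic 200 Hz tone rather than a real recording.

```python
import math


def estimate_pitch_hz(samples, sample_rate, fmin=60.0, fmax=400.0):
    """Estimate the pitch of one frame by autocorrelation.

    Hypothetical helper for illustration. fmin/fmax are an assumed
    search range for speech; returns None if no peak is found.
    """
    n = len(samples)
    mean = sum(samples) / n
    x = [s - mean for s in samples]  # remove DC offset

    # Only search lags that correspond to plausible pitch periods.
    min_lag = int(sample_rate / fmax)
    max_lag = min(int(sample_rate / fmin), n - 1)

    best_lag, best_score = 0, 0.0
    for lag in range(min_lag, max_lag + 1):
        # Similarity between the frame and a copy shifted by `lag` samples.
        score = sum(x[i] * x[i + lag] for i in range(n - lag))
        if score > best_score:
            best_lag, best_score = lag, score

    return sample_rate / best_lag if best_lag else None


# Synthetic test signal: 1000 samples of a 200 Hz sine at 8 kHz.
tone = [math.sin(2 * math.pi * 200.0 * i / 8000) for i in range(1000)]
print(estimate_pitch_hz(tone, 8000))  # should be close to 200 Hz
```

For real recordings you would read the audio (for example with Python's standard `wave` module), split it into short frames of a few tens of milliseconds, skip silent or unvoiced frames, and run an estimator like this per frame. The resulting pitch track (mean, range, variation) could then be stored as part of the per-speaker profile. Speed, emphasis, and pronunciation need different techniques and are not covered by this sketch.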