* [gentoo-user] speech recognition?
From: lee @ 2016-05-15 14:34 UTC
To: gentoo-user

Hi,

is there speech recognition software or the like that can listen in on a
phone call and put on screen, as text, what the other person is saying?

I'd like to connect that to a softphone so that someone who suffers from
very bad hearing can talk to people on the phone more easily. It must
work for German.

If there's a phone capable of this, I'd like to know about it.

Surely we should be able to achieve this with today's technology.

* Re: [gentoo-user] speech recognition?
From: wabe @ 2016-05-15 23:44 UTC
To: gentoo-user

lee <lee@yagibdah.de> wrote:

> Hi,
>
> is there a speech recognition software or the like which is capable to
> listen in on a phone call in order to put on screen as text what the
> other person is saying?
>
> I'd like to connect that to a softphone so that someone who suffers
> from very bad hearing can talk to people on the phone more easily.
> It must work for German.
>
> If there's a phone capable of this, I'd like to know about it.
>
> Surely we should be able with nowadays technology to achieve this.

Maybe this is helpful for you:

http://m.heise.de/developer/meldung/Google-oeffnet-seine-Spracherkennungs-API-fuer-Entwickler-3150287.html

https://cloud.google.com/speech/

--
Regards
wabe

* Re: [gentoo-user] speech recognition?
From: Andrew Savchenko @ 2016-05-17 21:24 UTC
To: gentoo-user

On Mon, 16 May 2016 01:44:40 +0200 wabe wrote:
> lee <lee@yagibdah.de> wrote:
>
> > Hi,
> >
> > is there a speech recognition software or the like which is capable to
> > listen in on a phone call in order to put on screen as text what the
> > other person is saying?
> >
> > I'd like to connect that to a softphone so that someone who suffers
> > from very bad hearing can talk to people on the phone more easily.
> > It must work for German.
> >
> > If there's a phone capable of this, I'd like to know about it.
> >
> > Surely we should be able with nowadays technology to achieve this.
>
> Maybe this is helpful for you:
>
> http://m.heise.de/developer/meldung/Google-oeffnet-seine-Spracherkennungs-API-fuer-Entwickler-3150287.html
>
> https://cloud.google.com/speech/

Sadly, there are no free software solutions available in this
area; at least I'm not aware of any. This is understandable: the
task is extraordinary in both the manpower and the computational
resources required.

And what Google offers here is not even a proprietary application,
but access to a proprietary cloud service, which is temporarily
free of charge as long as it is in the alpha-testing stage.

Best regards,
Andrew Savchenko

* Re: [gentoo-user] speech recognition?
From: wabe @ 2016-05-18 1:40 UTC
To: gentoo-user

Andrew Savchenko <bircoph@gentoo.org> wrote:

> > https://cloud.google.com/speech/
>
> Sad, but there are no free software solutions available in this
> area; at least I'm not aware of them completely. This is
> understandable: the task is extraordinary in both manpower and
> computational resources required.

Actually, there are some open source speech recognition programs.
https://en.wikipedia.org/wiki/List_of_speech_recognition_software

Sphinx2 and Sphinx3 are even available as Gentoo ebuilds
(app-accessibility/sphinx2 and app-accessibility/sphinx3).

Sphinx4, as well as a lot of information about it, is available here:
http://cmusphinx.sourceforge.net/

A German language model can be found here:
https://sourceforge.net/projects/cmusphinx/files/Acoustic%20and%20Language%20Models/German%20Voxforge/

However, I don't know how much effort will be necessary to get the
whole thing working for your purposes.

> And what Google offers here is not even a proprietary application,
> but an access to a proprietary cloud service which is temporarily
> free of charge as long as it is in alpha-testing stage.

That's right. Besides the fact that every spoken word will be
transferred to Google (bad enough), it will not be free of charge
forever.

--
Regards
wabe

* [gentoo-user] Re: speech recognition?
From: James @ 2016-05-18 0:32 UTC
To: gentoo-user

lee <lee <at> yagibdah.de> writes:

> is there a speech recognition software or the like which is capable to
> listen in on a phone call in order to put on screen as text what the
> other person is saying?

I'd say there are two main categories of effort here: one very doable
(a single voice), the other (an unlimited number of voices) plausibly
intractable at the moment.

> I'd like to connect that to a softphone so that someone who suffers from
> very bad hearing can talk to people on the phone more easily.

This is possible if only a few voices are involved. If their speech
patterns have been analyzed and stored, with ample resources, then what
you seek is possible; accuracy is the constraint.

> If there's a phone capable of this, I'd like to know about it.

If you are after a solution that can work with any voice, even limited
to a single language, then the answer is a long way away. Some would say
intractable. There is also the question of the accuracy required, the
complexity of the vocabulary and sentence structure, and how much
variation in the voice(s) is allowed.

> Surely we should be able with nowadays technology to achieve this.

With Google-sized resources, you can mask the problem with templates
for many different voices, but the underlying problems abound without
limit. What you actually do is 'train' the Google system to customize
its translation of a given voice, very accurately over time. Now say my
voice is altered by a throat infection, a depressed attitude, exuberance,
etc.; you can see the troubles.

In fact, the day after watching horrible English cinema, which is often
contagious (Monty Python's Life of Brian), I often develop a temporary
'Manchester slang' in my vernacular. There are endless, unlimited
gyrations should one want to have a bit-o-fun with language, particularly
when any number of 'hill_billy' contaminants manifest.

My mathematical belief is that the problem is intractable, certainly as
you approach a high level of required accuracy. In fact, folks routinely
joust with one another around the 'looseness of language' and the
various layered meanings. Truly intractable, but a 'dumbed down' version
will surely exist at some point.

Include IBM in your searches, as they did quite a bit of foundational
research in a variety of related areas (speech/sound/voice).

Still, Google's offering might prove acceptable for your needs.

hth,
James

* [gentoo-user] Re: speech recognition?
From: James @ 2016-05-18 17:11 UTC
To: gentoo-user

James <wireless <at> tampabay.rr.com> writes:

> lee <lee <at> yagibdah.de> writes:

You know, a solution just dawned on me that *may* fit your needs. If
folks were to first run speech-to-text software on their (originating)
end, then your friend would receive mostly accurate text communications.
Single-source (single-voice) solutions become more accurate over time as
one uses them. If this conversion occurred before the text was sent to
your friend, then it would be much easier to support and refine.

Sadly, it does nothing for new sources of contact until they refine
their own speech-to-text system. Sometimes a simultaneous video feed
helps folks comprehend, by reading lips whilst discerning audio signals,
but that requires low-latency connections.

hth,
James

* [gentoo-user] Re: speech recognition?
From: Hans @ 2016-05-26 14:31 UTC
To: gentoo-user

On 16/05/16 00:34, lee wrote:
> Hi,
>
> is there a speech recognition software or the like which is capable to
> listen in on a phone call in order to put on screen as text what the
> other person is saying?
>
> I'd like to connect that to a softphone so that someone who suffers from
> very bad hearing can talk to people on the phone more easily. It must
> work for German.
>
> If there's a phone capable of this, I'd like to know about it.
>
> Surely we should be able with nowadays technology to achieve this.

There is commercial dictation software for Windows and Mac. It may work
with Wine.

http://nuance.com/dragon/index.htm

* [gentoo-user] Re: speech recognition?
From: James @ 2016-06-03 15:58 UTC
To: gentoo-user

Hans <linux <at> c5ace.com> writes:

> > is there a speech recognition software or the like which is capable to
> > listen in on a phone call in order to put on screen as text what the
> > other person is saying?

> http://nuance.com/dragon/index.htm

I just ran across this article:

http://chrislord.net/index.php/2016/06/01/open-source-speech-recognition/

hth,
James