[gentoo-user] speech recognition?

* [gentoo-user] speech recognition?
@ 2016-05-15 14:34 lee
  2016-05-15 23:44 ` wabe
                   ` (2 more replies)
  0 siblings, 3 replies; 8+ messages in thread
From: lee @ 2016-05-15 14:34 UTC (permalink / raw
  To: gentoo-user

Hi,

is there speech recognition software or the like that can listen in on a
phone call and put on screen as text what the other person is saying?

I'd like to connect that to a softphone so that someone who suffers from
very bad hearing can talk to people on the phone more easily.  It must
work for German.

If there's a phone capable of this, I'd like to know about it.

Surely we should be able to achieve this with today's technology.

* Re: [gentoo-user] speech recognition?
  2016-05-15 14:34 [gentoo-user] speech recognition? lee
@ 2016-05-15 23:44 ` wabe
  2016-05-17 21:24   ` Andrew Savchenko
  2016-05-18  0:32 ` [gentoo-user] " James
  2016-05-26 14:31 ` Hans
  2 siblings, 1 reply; 8+ messages in thread
From: wabe @ 2016-05-15 23:44 UTC (permalink / raw
  To: gentoo-user

lee <lee@yagibdah.de> wrote:

> Hi,
> 
> is there speech recognition software or the like that can listen in on a
> phone call and put on screen as text what the other person is saying?
> 
> I'd like to connect that to a softphone so that someone who suffers
> from very bad hearing can talk to people on the phone more easily.
> It must work for German.
> 
> If there's a phone capable of this, I'd like to know about it.
> 
> Surely we should be able to achieve this with today's technology.

Maybe this is helpful for you:

http://m.heise.de/developer/meldung/Google-oeffnet-seine-Spracherkennungs-API-fuer-Entwickler-3150287.html

https://cloud.google.com/speech/

--
Regards
wabe


* Re: [gentoo-user] speech recognition?
  2016-05-15 23:44 ` wabe
@ 2016-05-17 21:24   ` Andrew Savchenko
  2016-05-18  1:40     ` wabe
  0 siblings, 1 reply; 8+ messages in thread
From: Andrew Savchenko @ 2016-05-17 21:24 UTC (permalink / raw
  To: gentoo-user

On Mon, 16 May 2016 01:44:40 +0200 wabe wrote:
> lee <lee@yagibdah.de> wrote:
> 
> > Hi,
> > 
> > is there speech recognition software or the like that can listen in on a
> > phone call and put on screen as text what the other person is saying?
> > 
> > I'd like to connect that to a softphone so that someone who suffers
> > from very bad hearing can talk to people on the phone more easily.
> > It must work for German.
> > 
> > If there's a phone capable of this, I'd like to know about it.
> > 
> > Surely we should be able to achieve this with today's technology.
> 
> Maybe this is helpful for you:
> 
> http://m.heise.de/developer/meldung/Google-oeffnet-seine-Spracherkennungs-API-fuer-Entwickler-3150287.html
> 
> https://cloud.google.com/speech/

Sadly, there are no free software solutions available in this
area; at least I'm not aware of any. This is understandable: the
task is extraordinary in both the manpower and the computational
resources required.

And what Google offers here is not even a proprietary application,
but access to a proprietary cloud service, which is free of charge
only while it is in the alpha-testing stage.

Best regards,
Andrew Savchenko


* [gentoo-user] Re: speech recognition?
  2016-05-15 14:34 [gentoo-user] speech recognition? lee
  2016-05-15 23:44 ` wabe
@ 2016-05-18  0:32 ` James
  2016-05-18 17:11   ` James
  2016-05-26 14:31 ` Hans
  2 siblings, 1 reply; 8+ messages in thread
From: James @ 2016-05-18  0:32 UTC (permalink / raw
  To: gentoo-user

lee <lee <at> yagibdah.de> writes:

> is there speech recognition software or the like that can listen in on a
> phone call and put on screen as text what the other person is saying?

I like to say that there are two main categories of effort here: one
very doable (a single voice), the other (arbitrarily many voices)
plausibly intractable at the moment.

> I'd like to connect that to a softphone so that someone who suffers from
> very bad hearing can talk to people on the phone more easily.  

This is possible if only a few voices are involved. Once their speech
patterns have been analyzed and stored, and given ample resources, what
you seek can be done; accuracy is the constraint.

> If there's a phone capable of this, I'd like to know about it.

If you are after a solution that can work with any voice, even limited
to a single language, then the answer is a long way away. Some would say
it is intractable. It depends on the accuracy required and on the
complexity of vocabulary, sentence structure, and the variation allowed
in the voice(s).
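
To make "accuracy" concrete: recognizers are commonly scored by word
error rate (WER), the word-level edit distance between what was said and
what was recognized, divided by the number of words said. A minimal
Python sketch follows; the function name and the German sample sentences
are made up for illustration and do not come from any particular
recognizer.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] is the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # reference word missing from hypothesis
                curr[j - 1] + 1,     # extra word in hypothesis
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    said = "ich rufe dich morgen an"       # hypothetical reference transcript
    recognized = "ich rufe dich an"        # hypothetical recognizer output
    print(word_error_rate(said, recognized))
```

In the sample, one word out of five is dropped, which gives a WER of
0.2. How low the figure has to be for a phone call to be followable is a
judgment call for the person reading the text.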

> Surely we should be able to achieve this with today's technology.

With Google-sized resources, you can mask the problem with templates
for many different voices, but the underlying problems abound without limit.
What you actually do is 'train' the Google system to customize its
transcription of a given voice, which becomes very accurate over time.

Now say my voice is altered by a throat infection, a depressed mood,
exuberance, etc.; you can see the trouble. In fact, the day after
watching horrible English cinema, which is often contagious (Monty
Python's Life of Brian), I often develop a temporary 'Manchester slang'.
The variations are endless should one want to have a bit-o-fun with
language, particularly when any number of 'hill_billy' contaminants
manifest.

My mathematical belief is that the problem is intractable, certainly as
you approach a high level of required accuracy. In fact, folks routinely
joust with one another over the 'looseness of language' and its many
layered meanings.

Truly intractable, but a 'dumbed down' approximation will surely exist at
some point. Include IBM in your searches, as they did quite a bit of
foundational research in a variety of related (speech/sound/voice) areas.

Still, Google's offering might prove acceptable for your needs.

hth,
James

* Re: [gentoo-user] speech recognition?
  2016-05-17 21:24   ` Andrew Savchenko
@ 2016-05-18  1:40     ` wabe
  0 siblings, 0 replies; 8+ messages in thread
From: wabe @ 2016-05-18  1:40 UTC (permalink / raw
  To: gentoo-user

Andrew Savchenko <bircoph@gentoo.org> wrote:

> > https://cloud.google.com/speech/  
> 
> Sadly, there are no free software solutions available in this
> area; at least I'm not aware of any. This is understandable: the
> task is extraordinary in both the manpower and the computational
> resources required.

Actually there are some open source speech recognition programs. 

https://en.wikipedia.org/wiki/List_of_speech_recognition_software

Sphinx2 and Sphinx3 are even available as Gentoo ebuilds
(app-accessibility/sphinx2 and app-accessibility/sphinx3).

Sphinx4, as well as lots of information about it, is available
here:

http://cmusphinx.sourceforge.net/

A German language model can be found here:

https://sourceforge.net/projects/cmusphinx/files/Acoustic%20and%20Language%20Models/German%20Voxforge/

However, I don't know how much effort will be necessary to get the
whole thing working for your purposes.
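
A first experiment could be to run a recorded call through a decoder
with the German model. The sketch below only assembles the command line
and assumes the older pocketsphinx_continuous tool from the same CMU
Sphinx project (not Sphinx2/3/4 themselves) with its -hmm, -lm, -dict
and -infile options. The directory layout and file names inside the
model directory are hypothetical and depend on how the downloaded
archive unpacks.

```python
import shlex


def pocketsphinx_command(model_dir: str, wav_file: str) -> list:
    """Build an argv list for decoding one WAV file with a German model.

    The paths below model_dir are placeholders (assumptions), not the
    actual names used by the Voxforge download.
    """
    return [
        "pocketsphinx_continuous",
        "-hmm", f"{model_dir}/acoustic",     # acoustic model directory (assumed name)
        "-lm", f"{model_dir}/german.lm",     # language model file (assumed name)
        "-dict", f"{model_dir}/german.dic",  # pronunciation dictionary (assumed name)
        "-infile", wav_file,                 # recorded audio to transcribe
    ]


if __name__ == "__main__":
    cmd = pocketsphinx_command("/opt/models/de", "call.wav")
    # Print a shell-quoted command for inspection. To actually run it,
    # pass the list to subprocess.run(cmd) once the tool and model are installed.
    print(" ".join(shlex.quote(part) for part in cmd))
```

Building the list separately from running it makes it easy to check the
paths before starting a decode, and the same list can later be pointed
at audio captured from the softphone.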

> And what Google offers here is not even a proprietary application,
> but access to a proprietary cloud service, which is free of charge
> only while it is in the alpha-testing stage.

That's right. Besides the fact that every spoken word will be
transferred to Google (bad enough), it will not be free of charge
forever.

--
Regards
wabe

* [gentoo-user] Re: speech recognition?
  2016-05-18  0:32 ` [gentoo-user] " James
@ 2016-05-18 17:11   ` James
  0 siblings, 0 replies; 8+ messages in thread
From: James @ 2016-05-18 17:11 UTC (permalink / raw
  To: gentoo-user

James <wireless <at> tampabay.rr.com> writes:

> lee <lee <at> yagibdah.de> writes:

You know, a solution that *may* fit your needs just dawned on me.

If callers were to run speech-to-text software on their own
(originating) end first, your friend would receive mostly accurate text.
A single-voice system becomes more accurate the more it is used. If the
conversion happened before the text was sent to your friend, it would be
much easier to support and refine.

Sadly, it does nothing for new contacts until they have trained their
own speech-to-text system.

Sometimes a simultaneous video feed helps people comprehend, by reading
lips while listening to the audio, but that requires low-latency
connections.

hth,
James

* [gentoo-user] Re: speech recognition?
  2016-05-15 14:34 [gentoo-user] speech recognition? lee
  2016-05-15 23:44 ` wabe
  2016-05-18  0:32 ` [gentoo-user] " James
@ 2016-05-26 14:31 ` Hans
  2016-06-03 15:58   ` James
  2 siblings, 1 reply; 8+ messages in thread
From: Hans @ 2016-05-26 14:31 UTC (permalink / raw
  To: gentoo-user

On 16/05/16 00:34, lee wrote:
> Hi,
>
> is there speech recognition software or the like that can listen in on a
> phone call and put on screen as text what the other person is saying?
>
> I'd like to connect that to a softphone so that someone who suffers from
> very bad hearing can talk to people on the phone more easily.  It must
> work for German.
>
> If there's a phone capable of this, I'd like to know about it.
>
> Surely we should be able to achieve this with today's technology.
>
>
There is commercial dictation software for Windows and Mac. It may
work with Wine.
http://nuance.com/dragon/index.htm



* [gentoo-user] Re: speech recognition?
  2016-05-26 14:31 ` Hans
@ 2016-06-03 15:58   ` James
  0 siblings, 0 replies; 8+ messages in thread
From: James @ 2016-06-03 15:58 UTC (permalink / raw
  To: gentoo-user

Hans <linux <at> c5ace.com> writes:


> > is there speech recognition software or the like that can listen in on a
> > phone call and put on screen as text what the other person is saying?

> http://nuance.com/dragon/index.htm

I just ran across this article:

http://chrislord.net/index.php/2016/06/01/open-source-speech-recognition/


hth,
James



