* [gentoo-dev] Using LINGUAS
@ 2014-07-21  4:23 Thomas Kahle
  2014-07-21 12:03 ` Jeroen Roovers
                   ` (3 more replies)
  0 siblings, 4 replies; 8+ messages in thread
From: Thomas Kahle @ 2014-07-21  4:23 UTC
  To: gentoo-dev

Hi,

the OCR software tesseract has many different plugins for
language packs used for OCR for different languages.  The ebuild
uses the LINGUAS variable to pass the choice of which packages to
install to the user.

A reverse dependency is app-text/pdfsandwich which roughly puts
OCR'ed text in a scanned pdf.  Since it uses tesseract it
supports exactly those languages that tesseract supports.

Should its ebuild have LINGUAS use flags and then depend on
tesseract with at least those flags set?

While it seems consistent to put the LINGUAS choice in the most
user facing package, in this case I would actually not put it
here.  It would introduce a point of failure and maintenance
work for each tesseract upgrade (since the language set
changes slightly from time to time).  A typical user would set
LINGUAS in her make.conf anyway, so the same choice
applies to both packages.  Maybe an einfo is sufficient to
inform the user about it?

Cheers,
Thomas



-- 
Thomas Kahle
http://dev.gentoo.org/~tomka/



* Re: [gentoo-dev] Using LINGUAS
  2014-07-21  4:23 [gentoo-dev] Using LINGUAS Thomas Kahle
@ 2014-07-21 12:03 ` Jeroen Roovers
  2014-07-21 12:26   ` Thomas Kahle
  2014-07-21 12:42 ` Michał Górny
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 8+ messages in thread
From: Jeroen Roovers @ 2014-07-21 12:03 UTC
  To: gentoo-dev

On Mon, 21 Jul 2014 13:23:46 +0900
Thomas Kahle <tomka@gentoo.org> wrote:

> the OCR software tesseract has many different plugins for
> language packs used for OCR for different languages.  The ebuild
> uses the LINGUAS variable to pass the choice of which packages to
> install to the user.

Every ebuild uses LINGUAS implicitly. What you mean is that you
expand LINGUAS to USE Flags...

> A reverse dependency is app-text/pdfsandwich which roughly puts
> OCR'ed text in a scanned pdf.  Since it uses tesseract it
> supports exactly those languages that tesseract supports.
> 
> Should its ebuild have LINGUAS use flags and then depend on
> tesseract with at least those flags set?

... in which case you can simply use USE dependencies since you have
USE-expanded flags matching LINGUAS. It should be easy to match one
ebuild's IUSE with another's.

linguas_tlh? ( app-text/tesseract[linguas_tlh] )


     jer



* Re: [gentoo-dev] Using LINGUAS
  2014-07-21 12:03 ` Jeroen Roovers
@ 2014-07-21 12:26   ` Thomas Kahle
  2014-07-21 12:31     ` Jeroen Roovers
  0 siblings, 1 reply; 8+ messages in thread
From: Thomas Kahle @ 2014-07-21 12:26 UTC
  To: gentoo-dev

On 21/07/14 21:03, Jeroen Roovers wrote:
> On Mon, 21 Jul 2014 13:23:46 +0900
> Thomas Kahle <tomka@gentoo.org> wrote:
> 
>> the OCR software tesseract has many different plugins for
>> language packs used for OCR for different languages.  The ebuild
>> uses the LINGUAS variable to pass the choice of which packages to
>> install to the user.
> 
> Every ebuild uses LINGUAS implicitly. What you mean is that you
> expand LINGUAS to USE Flags...

Yes, I did not use the correct terminology.

>> A reverse dependency is app-text/pdfsandwich which roughly puts
>> OCR'ed text in a scanned pdf.  Since it uses tesseract it
>> supports exactly those languages that tesseract supports.
>>
>> Should its ebuild have LINGUAS use flags and then depend on
>> tesseract with at least those flags set?
> 
> ... in which case you can simply use USE dependencies since you have
> USE-expanded flags matching LINGUAS. It should be easy to match one
> ebuild's IUSE with another's.
> 
> linguas_tlh? ( app-text/tesseract[linguas_tlh] )

I know how to specify USE dependencies.

Since you deleted it, let me ask my question again: If I follow
this method I will have 37 dependencies all of this form.  This
is pointless because

a) Every time tesseract gains or loses support for a language (it
does happen!), the pdfsandwich ebuild needs to be updated,

b) Nobody wants to set different linguas on tesseract and
pdfsandwich anyway.

Cheers,
Thomas


-- 
Thomas Kahle
http://dev.gentoo.org/~tomka/



* Re: [gentoo-dev] Using LINGUAS
  2014-07-21 12:26   ` Thomas Kahle
@ 2014-07-21 12:31     ` Jeroen Roovers
  0 siblings, 0 replies; 8+ messages in thread
From: Jeroen Roovers @ 2014-07-21 12:31 UTC
  To: gentoo-dev

On Mon, 21 Jul 2014 21:26:09 +0900
Thomas Kahle <tomka@gentoo.org> wrote:

> Since you deleted it

sorry

>, let me ask my question again: If I follow
> this method I will have 37 dependencies all of this form.  This
> is pointless because
> 
> a) Every time tesseract gains or loses support for a language (it
> does happen!), the pdfsandwich ebuild needs to be updated,
> 
> b) Nobody wants to set different linguas on tesseract and
> pdfsandwich anyway.

Sounds like it's useful, not pointless at all. And you can have the
ebuilds automatically generate those dependencies. It's a lot less work
if you keep all the linguas in a variable and loop over it to generate
IUSE+= and a dependency list.
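
As a rough sketch of that loop (the variable name TESS_LANGS and the
three language codes are placeholders for illustration, not the list
the real ebuild carries), the global scope of the ebuild could build
both lists like this:

```bash
# Hypothetical language list; a real ebuild would mirror tesseract's IUSE.
TESS_LANGS="de en ja"

IUSE=""
RDEPEND="app-text/tesseract"

for lang in ${TESS_LANGS}; do
    # Plain reassignment keeps the snippet runnable under POSIX sh;
    # in an ebuild (always bash) this is usually written IUSE+=" ...".
    IUSE="${IUSE} linguas_${lang}"
    RDEPEND="${RDEPEND} linguas_${lang}? ( app-text/tesseract[linguas_${lang}] )"
done

# Show the generated values.
printf 'IUSE=%s\n' "${IUSE}"
printf 'RDEPEND=%s\n' "${RDEPEND}"
```

With this layout, a language gained or lost by tesseract only means
editing the one list.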


     jer



* Re: [gentoo-dev] Using LINGUAS
  2014-07-21  4:23 [gentoo-dev] Using LINGUAS Thomas Kahle
  2014-07-21 12:03 ` Jeroen Roovers
@ 2014-07-21 12:42 ` Michał Górny
  2014-07-22  1:31   ` Thomas Kahle
  2014-07-22  1:43 ` Alex Xu
  2014-07-22 20:30 ` Mart Raudsepp
  3 siblings, 1 reply; 8+ messages in thread
From: Michał Górny @ 2014-07-21 12:42 UTC
  To: Thomas Kahle; +Cc: gentoo-dev

On 2014-07-21, at 13:23:46,
Thomas Kahle <tomka@gentoo.org> wrote:

> the OCR software tesseract has many different plugins for
> language packs used for OCR for different languages.  The ebuild
> uses the LINGUAS variable to pass the choice of which packages to
> install to the user.
> 
> A reverse dependency is app-text/pdfsandwich which roughly puts
> OCR'ed text in a scanned pdf.  Since it uses tesseract it
> supports exactly those languages that tesseract supports.

Do I understand correctly that pdfsandwich doesn't have any explicit
switches for language support? In other words, adding support for
another language requires rebuilding tesseract and not pdfsandwich?

> Should its ebuild have LINGUAS use flags and then depend on
> tesseract with at least those flags set?
> 
> While it seems consistent to put the LINGUAS choice in the most
> user facing package, in this case I would actually not put it
> here.  It would introduce a point of failure and maintenance
> work for each tesseract upgrade (since the language set
> changes slightly from time to time).  A typical user would set
> LINGUAS in her make.conf anyway, so the same choice
> applies to both packages.  Maybe an einfo is sufficient to
> inform the user about it?

I have no idea where you got the 'most user facing' idea from, but
this is not really true or useful. The whole idea of libraries like
imagemagick is to hide unnecessary dependencies behind a single
interface -- now imagine every package using imagemagick declaring
flags for all the formats supported by it...

If pdfsandwich itself doesn't do anything with LINGUAS, don't declare
it. The rule about USE flags not doing anything applies here. Moreover,
LINGUAS are usually set globally so scope is not really an issue here.

-- 
Best regards,
Michał Górny


* Re: [gentoo-dev] Using LINGUAS
  2014-07-21 12:42 ` Michał Górny
@ 2014-07-22  1:31   ` Thomas Kahle
  0 siblings, 0 replies; 8+ messages in thread
From: Thomas Kahle @ 2014-07-22  1:31 UTC
  To: gentoo-dev

Hi,

On 21/07/14 21:42, Michał Górny wrote:
> On 2014-07-21, at 13:23:46,
> Thomas Kahle <tomka@gentoo.org> wrote:
> 
>> the OCR software tesseract has many different plugins for
>> language packs used for OCR for different languages.  The ebuild
>> uses the LINGUAS variable to pass the choice of which packages to
>> install to the user.
>>
>> A reverse dependency is app-text/pdfsandwich which roughly puts
>> OCR'ed text in a scanned pdf.  Since it uses tesseract it
>> supports exactly those languages that tesseract supports.
> 
> Do I understand correctly that pdfsandwich doesn't have any explicit
> switches for language support? In other words, adding support for
> another language requires rebuilding tesseract and not pdfsandwich?

Exactly, pdfsandwich combines tesseract with some postprocessing
that is not language specific.

>> Should its ebuild have LINGUAS use flags and then depend on
>> tesseract with at least those flags set?
>>
>> While it seems consistent to put the LINGUAS choice in the most
>> user facing package, in this case I would actually not put it
>> here.  It would introduce a point of failure and maintenance
>> work for each tesseract upgrade (since the language set
>> changes slightly from time to time).  A typical user would set
>> LINGUAS in her make.conf anyway, so the same choice
>> applies to both packages.  Maybe an einfo is sufficient to
>> inform the user about it?
> 
> I have no idea where you got the 'most user facing' idea from, but
> this is not really true or useful. The whole idea of libraries like
> imagemagick is to hide unnecessary dependencies behind a single
> interface -- now imagine every package using imagemagick declaring
> flags for all the formats supported by it...

If I don't know anything about tesseract but only install
pdfsandwich and then try to scan Japanese, it won't work out of
the box.  How should the user know that she has to put Japanese
in her LINGUAS variable and rebuild tesseract afterwards?

Probably a simple einfo in pdfsandwich should do it.
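
A minimal sketch of such a message, assuming it goes into
pkg_postinst (the wording is illustrative, and the einfo stand-in on
the first lines exists only so the snippet runs outside Portage):

```bash
# Stand-in for testing outside Portage; in a real ebuild the package
# manager provides einfo and this line is not needed.
command -v einfo >/dev/null 2>&1 || einfo() { printf ' * %s\n' "$*"; }

pkg_postinst() {
    einfo "pdfsandwich relies on app-text/tesseract for OCR language support."
    einfo "To add a language, include it in LINGUAS (for example in make.conf)"
    einfo "and rebuild app-text/tesseract."
}

# Outside Portage the phase function has to be called by hand.
pkg_postinst
```

Since the message is meant to be read after the merge, elog might be
the better fit than einfo, as elog output is recorded by Portage's
logging while einfo output only scrolls by.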

> If pdfsandwich itself doesn't do anything with LINGUAS, don't declare
> it. The rule about USE flags not doing anything applies here.
> Moreover, LINGUAS are usually set globally so scope is not
> really an issue here.

I agree.

Cheers,
Thomas



-- 
Thomas Kahle
http://dev.gentoo.org/~tomka/



* Re: [gentoo-dev] Using LINGUAS
  2014-07-21  4:23 [gentoo-dev] Using LINGUAS Thomas Kahle
  2014-07-21 12:03 ` Jeroen Roovers
  2014-07-21 12:42 ` Michał Górny
@ 2014-07-22  1:43 ` Alex Xu
  2014-07-22 20:30 ` Mart Raudsepp
  3 siblings, 0 replies; 8+ messages in thread
From: Alex Xu @ 2014-07-22  1:43 UTC
  To: gentoo-dev

On 21/07/14 12:23 AM, Thomas Kahle wrote:
> Hi,
> 
> the OCR software tesseract has many different plugins for
> language packs used for OCR for different languages.  The ebuild
> uses the LINGUAS variable to pass the choice of which packages to
> install to the user.
> 
> A reverse dependency is app-text/pdfsandwich which roughly puts
> OCR'ed text in a scanned pdf.  Since it uses tesseract it
> supports exactly those languages that tesseract supports.
> 
> Should its ebuild have LINGUAS use flags and then depend on
> tesseract with at least those flags set?
> 
> While it seems consistent to put the LINGUAS choice in the most
> user facing package, in this case I would actually not put it
> here.  It would introduce a point of failure and maintenance
> work for each tesseract upgrade (since the language set
> changes slightly from time to time).  A typical user would set
> LINGUAS in her make.conf anyway, so the same choice
> applies to both packages.  Maybe an einfo is sufficient to
> inform the user about it?
> 
> Cheers,
> Thomas
> 

there are two possible scenarios here.

1. the dependency is COMPILE TIME (ABI, API, whatever). in this
scenario, the depender *must* have appropriate LINGUAS, even if that
means copying and pasting from the dependee. this is necessary for
correct rebuilding, and everything else associated with automagic deps.

2. the dependency is RUN TIME. in this scenario, the case is the same
with all other runtime USE dependencies; that is to say, the correct
solution is USE_RUNTIME or something along those lines. [0] here, I
would say that einfo is superior to copying IUSE, since these flags
should be set globally anyways to make sense.


[0] please no bikeshedding on whether to call it RUNTIME_USE or ǝsn‾ǝɯıʇunɹ.



* Re: [gentoo-dev] Using LINGUAS
  2014-07-21  4:23 [gentoo-dev] Using LINGUAS Thomas Kahle
                   ` (2 preceding siblings ...)
  2014-07-22  1:43 ` Alex Xu
@ 2014-07-22 20:30 ` Mart Raudsepp
  3 siblings, 0 replies; 8+ messages in thread
From: Mart Raudsepp @ 2014-07-22 20:30 UTC
  To: gentoo-dev

Hello,

LINGUAS is a concept in gettext tooling. I do not understand why we
overload it in package management in the first place.
It is an environment variable that we set up in make.conf, because
that's an easy way to get it into the build environment to have the
standard way of limiting translations work.

By overloading it for IUSE_EXPAND we effectively make it pretty much
impossible to have the choice of ALL translation files, except when it
means extra packages; without conditional LINGUAS setting, that is.


The standard LINGUAS variable acts as follows:

If unset: Build all translations
If set to an UNORDERED listing of language codes: Include translations
for listed languages (or dialects)
If set to an empty string or similar: Don't include any translations
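
The three cases above can be sketched as a small shell function. The
function name select_translations and the language codes are made up
for illustration, and matching is by exact code only, so dialect
handling is left out:

```bash
# select_translations "AVAILABLE": print the subset of the available
# translations that the LINGUAS environment variable selects.
select_translations() {
    available="$1"

    # ${LINGUAS+set} expands to "set" only when LINGUAS is set,
    # even if it is set to the empty string.
    if [ -z "${LINGUAS+set}" ]; then
        # Unset: build all translations.
        printf '%s\n' "${available}"
        return
    fi

    selected=""
    for lang in ${available}; do
        for want in ${LINGUAS}; do
            if [ "${lang}" = "${want}" ]; then
                selected="${selected:+${selected} }${lang}"
                break
            fi
        done
    done
    # Empty LINGUAS: the inner loop never runs, so nothing is selected.
    printf '%s\n' "${selected}"
}

unset LINGUAS;   select_translations "de fr ja"   # prints: de fr ja
LINGUAS="ja de"; select_translations "de fr ja"   # prints: de ja
LINGUAS="";      select_translations "de fr ja"   # prints an empty line
```

The order of the codes in LINGUAS does not affect the result, which
matches the UNORDERED wording above.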


We currently have the wrong behaviour when it's unset, as far as
IUSE_EXPAND is concerned - we don't have a default that includes all
available linguas as far as I know.


Though in the real world, I don't think it matters much, and it's
convenient for those that just build a gentoo machine for use within the
family, with known language capabilities within.


As a side note: LINGUAS does not only control which .mo files happen to
be installed (which you could get rid of later easily with localepurge)
- it also is used to filter out unwanted translations in files which
have all the translations in the same file; this includes, but is not
limited to .desktop files.
This used to be an intltool thing, but nowadays gettext has gained such
support directly as well.


Mart


