public inbox for gentoo-dev@lists.gentoo.org
* [gentoo-dev] [PATCH] profiles: add more language codes to desc/l10n.desc
From: Marek Szuba @ 2019-06-12 15:02 UTC
  To: gentoo-dev




All of these are supported by recent versions of app-text/tesseract.
Checked against ISO-639 using the code tables from
https://iso639-3.sil.org/ .

Signed-off-by: Marek Szuba <marecki@gentoo.org>
---
 profiles/desc/l10n.desc | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

diff --git a/profiles/desc/l10n.desc b/profiles/desc/l10n.desc
index 4d30aa57eb3..e5e21346174 100644
--- a/profiles/desc/l10n.desc
+++ b/profiles/desc/l10n.desc
@@ -41,8 +41,10 @@ bs - Bosnian
 ca - Catalan
 ca-valencia - Catalan (Valencian)
 cak - Kaqchikel
+ceb - Cebuano
 chr - Cherokee
 cnr - Montenegrin
+co - Corsican
 cs - Czech
 cy - Welsh
 da - Danish
@@ -53,6 +55,7 @@ de-DE - German (Germany)
 dgo - Dogri (individual language)
 doi - Dogri (macrolanguage)
 dsb - Lower Sorbian
+dv - Dhivehi
 dz - Dzongkha
 el - Modern Greek
 en - English
@@ -88,13 +91,16 @@ he - Hebrew
 hi - Hindi
 hr - Croatian
 hsb - Upper Sorbian
+ht - Haitian
 hu - Hungarian
 hy - Armenian
 ia - Interlingua
 id - Indonesian
 is - Icelandic
 it - Italian
+iu - Inuktitut (macrolanguage)
 ja - Japanese
+jv - Javanese
 ka - Georgian
 kab - Kabyle
 kk - Kazakh
@@ -120,6 +126,7 @@ mn - Mongolian
 mni - Manipuri
 mr - Marathi
 ms - Malay (macrolanguage)
+mt - Maltese
 my - Burmese
 nan - Min Nan Chinese
 nb - Norwegian Bokmål
@@ -134,9 +141,11 @@ om - Oromo
 or - Oriya (macrolanguage)
 pa - Punjabi
 pl - Polish
+ps - Pashto (macrolanguage)
 pt - Portuguese
 pt-BR - Portuguese (Brazil)
 pt-PT - Portuguese (Portugal)
+qu - Quechua (macrolanguage)
 rm - Romansh
 ro - Romanian
 ru - Russian
@@ -156,6 +165,7 @@ sr - Serbian
 sr-Latn - Serbian (Latin script)
 ss - Swati
 st - Southern Sotho
+su - Sundanese
 sv - Swedish
 sw - Swahili (macrolanguage)
 sw-TZ - Swahili (Tanzania)
@@ -165,9 +175,11 @@ ta-LK - Tamil (Sri Lanka)
 te - Telugu
 tg - Tajik
 th - Thai
+ti - Tigrinya
 tk - Turkmen
 tl - Tagalog
 tn - Tswana
+to-TO - Tonga (Tonga Islands)
 tr - Turkish
 ts - Tsonga
 tt - Tatar
@@ -178,6 +190,8 @@ uz - Uzbek
 ve - Venda
 vi - Vietnamese
 xh - Xhosa
+yi - Yiddish (macrolanguage)
+yo - Yoruba
 zh - Chinese
 zh-CN - Chinese (China)
 zh-TW - Chinese (Taiwan)
-- 
2.21.0





* Re: [gentoo-dev] [PATCH] profiles: add more language codes to desc/l10n.desc
From: Michał Górny @ 2019-06-12 15:12 UTC
  To: gentoo-dev


On Wed, 2019-06-12 at 16:02 +0100, Marek Szuba wrote:
> All of these are supported by recent versions of app-text/tesseract.
> Checked against ISO-639 using the code tables from
> https://iso639-3.sil.org/ .
> 
> Signed-off-by: Marek Szuba <marecki@gentoo.org>
> ---
>  profiles/desc/l10n.desc | 14 ++++++++++++++
>  1 file changed, 14 insertions(+)
> 
> diff --git a/profiles/desc/l10n.desc b/profiles/desc/l10n.desc
> index 4d30aa57eb3..e5e21346174 100644
> --- a/profiles/desc/l10n.desc
> +++ b/profiles/desc/l10n.desc
> @@ -41,8 +41,10 @@ bs - Bosnian
>  ca - Catalan
>  ca-valencia - Catalan (Valencian)
>  cak - Kaqchikel
> +ceb - Cebuano
>  chr - Cherokee
>  cnr - Montenegrin
> +co - Corsican
>  cs - Czech
>  cy - Welsh
>  da - Danish
> @@ -53,6 +55,7 @@ de-DE - German (Germany)
>  dgo - Dogri (individual language)
>  doi - Dogri (macrolanguage)
>  dsb - Lower Sorbian
> +dv - Dhivehi

IANA registry says:

| Subtag: dv
| Description: Dhivehi
| Description: Divehi
| Description: Maldivian

So maybe make it 'Dhivehi (Maldivian)'?

>  dz - Dzongkha
>  el - Modern Greek
>  en - English
> @@ -88,13 +91,16 @@ he - Hebrew
>  hi - Hindi
>  hr - Croatian
>  hsb - Upper Sorbian
> +ht - Haitian
>  hu - Hungarian
>  hy - Armenian
>  ia - Interlingua
>  id - Indonesian
>  is - Icelandic
>  it - Italian
> +iu - Inuktitut (macrolanguage)
>  ja - Japanese
> +jv - Javanese
>  ka - Georgian
>  kab - Kabyle
>  kk - Kazakh
> @@ -120,6 +126,7 @@ mn - Mongolian
>  mni - Manipuri
>  mr - Marathi
>  ms - Malay (macrolanguage)
> +mt - Maltese
>  my - Burmese
>  nan - Min Nan Chinese
>  nb - Norwegian Bokmål
> @@ -134,9 +141,11 @@ om - Oromo
>  or - Oriya (macrolanguage)
>  pa - Punjabi
>  pl - Polish
> +ps - Pashto (macrolanguage)
>  pt - Portuguese
>  pt-BR - Portuguese (Brazil)
>  pt-PT - Portuguese (Portugal)
> +qu - Quechua (macrolanguage)
>  rm - Romansh
>  ro - Romanian
>  ru - Russian
> @@ -156,6 +165,7 @@ sr - Serbian
>  sr-Latn - Serbian (Latin script)
>  ss - Swati
>  st - Southern Sotho
> +su - Sundanese
>  sv - Swedish
>  sw - Swahili (macrolanguage)
>  sw-TZ - Swahili (Tanzania)
> @@ -165,9 +175,11 @@ ta-LK - Tamil (Sri Lanka)
>  te - Telugu
>  tg - Tajik
>  th - Thai
> +ti - Tigrinya
>  tk - Turkmen
>  tl - Tagalog
>  tn - Tswana
> +to-TO - Tonga (Tonga Islands)

Why not just 'to'?

>  tr - Turkish
>  ts - Tsonga
>  tt - Tatar
> @@ -178,6 +190,8 @@ uz - Uzbek
>  ve - Venda
>  vi - Vietnamese
>  xh - Xhosa
> +yi - Yiddish (macrolanguage)
> +yo - Yoruba
>  zh - Chinese
>  zh-CN - Chinese (China)
>  zh-TW - Chinese (Taiwan)

Everything else looks correct according to IANA registry [1].

[1] http://www.iana.org/assignments/language-subtag-registry/language-subtag-registry

-- 
Best regards,
Michał Górny



* Re: [gentoo-dev] [PATCH] profiles: add more language codes to desc/l10n.desc
From: Marek Szuba @ 2019-06-12 15:21 UTC
  To: gentoo-dev



On 2019-06-12 16:12, Michał Górny wrote:

>> +dv - Dhivehi
> 
> IANA registry says:
> 
> | Subtag: dv
> | Description: Dhivehi
> | Description: Divehi
> | Description: Maldivian
> 
> So maybe make it 'Dhivehi (Maldivian)'?

Originally I had all three names here but then I noticed that for at
least some of the existing entries which in the SIL database are
described by multiple names, lb for instance, only the first one from
the list is used. Personally I have got no opinion either way.

>> +to-TO - Tonga (Tonga Islands)
> 
> Why not just 'to'?

Good point, "to" does specifically refer to Tonga Islands Tonga after
all. I think I've been overly cautious here.

-- 
MS



* Re: [gentoo-dev] [PATCH] profiles: add more language codes to desc/l10n.desc
From: Ulrich Mueller @ 2019-06-12 16:55 UTC
  To: Marek Szuba; +Cc: gentoo-dev


>>>>> On Wed, 12 Jun 2019, Marek Szuba wrote:

>>> +dv - Dhivehi
>> 
>> IANA registry says:
>> 
>> | Subtag: dv
>> | Description: Dhivehi
>> | Description: Divehi
>> | Description: Maldivian
>> 
>> So maybe make it 'Dhivehi (Maldivian)'?

> Originally I had all three names here but then I noticed that for at
> least some of the existing entries which in the SIL database are
> described by multiple names, lb for instance, only the first one from
> the list is used. Personally I have got no opinion either way.

The reference I had originally used is (warning, large text file):
https://www.iana.org/assignments/language-subtag-registry

If there are multiple descriptions for a language, use the first one.
So it's "Dhivehi" here.
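
As a rough illustration of the first-description rule, the following
Python sketch reads one registry record and keeps only its first
Description field. The record text is the excerpt quoted earlier in this
thread; the function name first_description and the simple
"Field: value" parsing are assumptions made for the example, not part of
any existing Gentoo tooling.

```python
# Minimal sketch: pick the first Description from one registry record.
# RECORD is copied from the excerpt quoted in this thread; everything
# else (names, parsing approach) is a hypothetical illustration.

RECORD = """\
Subtag: dv
Description: Dhivehi
Description: Divehi
Description: Maldivian
"""


def first_description(record: str) -> tuple:
    """Return (subtag, first description) for a single record."""
    subtag = None
    description = None
    for line in record.splitlines():
        field, sep, value = line.partition(":")
        if not sep:
            continue  # skip lines that are not "Field: value"
        field, value = field.strip(), value.strip()
        if field == "Subtag" and subtag is None:
            subtag = value
        elif field == "Description" and description is None:
            description = value  # later Description lines are ignored
    if subtag is None or description is None:
        raise ValueError("record lacks a Subtag or Description field")
    return subtag, description


subtag, description = first_description(RECORD)
# Format the pair the way l10n.desc lists its entries: "code - name".
print(f"{subtag} - {description}")
```

For the record quoted above this yields the entry as originally
proposed in the patch.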

>>> +to-TO - Tonga (Tonga Islands)
>> 
>> Why not just 'to'?

> Good point, "to" does specifically refer to Tonga Islands Tonga after
> all. I think I've been overly cautious here.

Right, this should be:
to - Tonga (Tonga Islands)

> +ps - Pashto (macrolanguage)

For this one, the first description in the IANA registry is "Pushto".

> +iu - Inuktitut (macrolanguage)
> +qu - Quechua (macrolanguage)
> +yi - Yiddish (macrolanguage)

Please omit "(macrolanguage)" from all of the above. It should be listed
only if it's part of the description.
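
The corrections requested here could be applied mechanically along the
following lines. This is only a sketch: the FIRST_DESCRIPTION table is
hypothetical and holds what this thread states ("Pushto") or implies
(bare names for iu, qu and yi); a real check would read the registry
itself rather than a hand-written table.

```python
# Sketch: rewrite proposed l10n.desc lines so that each description is
# exactly the first registry description, with nothing appended.

# Hypothetical lookup table, filled in from statements in this thread.
FIRST_DESCRIPTION = {
    "ps": "Pushto",
    "iu": "Inuktitut",
    "qu": "Quechua",
    "yi": "Yiddish",
}

# Lines as they appear in the patch under review.
PROPOSED = [
    "ps - Pashto (macrolanguage)",
    "iu - Inuktitut (macrolanguage)",
    "qu - Quechua (macrolanguage)",
    "yi - Yiddish (macrolanguage)",
]


def corrected(line: str, first_description: dict) -> str:
    """Keep the code from the proposed line, replace the description."""
    code, sep, _old_description = line.partition(" - ")
    if not sep:
        raise ValueError(f"not a 'code - description' line: {line!r}")
    return f"{code} - {first_description[code]}"


for line in PROPOSED:
    print(corrected(line, FIRST_DESCRIPTION))
```

A suffix such as "(macrolanguage)" would survive this rewrite only when
it is part of the first description stored in the table.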

Ulrich
