* [gentoo-dev] [PATCH] profiles: add more language codes to desc/l10n.desc
@ 2019-06-12 15:02 Marek Szuba
2019-06-12 15:12 ` Michał Górny
From: Marek Szuba @ 2019-06-12 15:02 UTC
To: gentoo-dev
All of these are supported by recent versions of app-text/tesseract.
Checked against ISO-639 using the code tables from
https://iso639-3.sil.org/ .
Signed-off-by: Marek Szuba <marecki@gentoo.org>
---
profiles/desc/l10n.desc | 14 ++++++++++++++
1 file changed, 14 insertions(+)
diff --git a/profiles/desc/l10n.desc b/profiles/desc/l10n.desc
index 4d30aa57eb3..e5e21346174 100644
--- a/profiles/desc/l10n.desc
+++ b/profiles/desc/l10n.desc
@@ -41,8 +41,10 @@ bs - Bosnian
ca - Catalan
ca-valencia - Catalan (Valencian)
cak - Kaqchikel
+ceb - Cebuano
chr - Cherokee
cnr - Montenegrin
+co - Corsican
cs - Czech
cy - Welsh
da - Danish
@@ -53,6 +55,7 @@ de-DE - German (Germany)
dgo - Dogri (individual language)
doi - Dogri (macrolanguage)
dsb - Lower Sorbian
+dv - Dhivehi
dz - Dzongkha
el - Modern Greek
en - English
@@ -88,13 +91,16 @@ he - Hebrew
hi - Hindi
hr - Croatian
hsb - Upper Sorbian
+ht - Haitian
hu - Hungarian
hy - Armenian
ia - Interlingua
id - Indonesian
is - Icelandic
it - Italian
+iu - Inuktitut (macrolanguage)
ja - Japanese
+jv - Javanese
ka - Georgian
kab - Kabyle
kk - Kazakh
@@ -120,6 +126,7 @@ mn - Mongolian
mni - Manipuri
mr - Marathi
ms - Malay (macrolanguage)
+mt - Maltese
my - Burmese
nan - Min Nan Chinese
nb - Norwegian Bokmål
@@ -134,9 +141,11 @@ om - Oromo
or - Oriya (macrolanguage)
pa - Punjabi
pl - Polish
+ps - Pashto (macrolanguage)
pt - Portuguese
pt-BR - Portuguese (Brazil)
pt-PT - Portuguese (Portugal)
+qu - Quechua (macrolanguage)
rm - Romansh
ro - Romanian
ru - Russian
@@ -156,6 +165,7 @@ sr - Serbian
sr-Latn - Serbian (Latin script)
ss - Swati
st - Southern Sotho
+su - Sundanese
sv - Swedish
sw - Swahili (macrolanguage)
sw-TZ - Swahili (Tanzania)
@@ -165,9 +175,11 @@ ta-LK - Tamil (Sri Lanka)
te - Telugu
tg - Tajik
th - Thai
+ti - Tigrinya
tk - Turkmen
tl - Tagalog
tn - Tswana
+to-TO - Tonga (Tonga Islands)
tr - Turkish
ts - Tsonga
tt - Tatar
@@ -178,6 +190,8 @@ uz - Uzbek
ve - Venda
vi - Vietnamese
xh - Xhosa
+yi - Yiddish (macrolanguage)
+yo - Yoruba
zh - Chinese
zh-CN - Chinese (China)
zh-TW - Chinese (Taiwan)
--
2.21.0
* Re: [gentoo-dev] [PATCH] profiles: add more language codes to desc/l10n.desc
2019-06-12 15:02 [gentoo-dev] [PATCH] profiles: add more language codes to desc/l10n.desc Marek Szuba
@ 2019-06-12 15:12 ` Michał Górny
2019-06-12 15:21 ` Marek Szuba
From: Michał Górny @ 2019-06-12 15:12 UTC
To: gentoo-dev
On Wed, 2019-06-12 at 16:02 +0100, Marek Szuba wrote:
> All of these are supported by recent versions of app-text/tesseract.
> Checked against ISO-639 using the code tables from
> https://iso639-3.sil.org/ .
>
> Signed-off-by: Marek Szuba <marecki@gentoo.org>
> ---
> profiles/desc/l10n.desc | 14 ++++++++++++++
> 1 file changed, 14 insertions(+)
>
> diff --git a/profiles/desc/l10n.desc b/profiles/desc/l10n.desc
> index 4d30aa57eb3..e5e21346174 100644
> --- a/profiles/desc/l10n.desc
> +++ b/profiles/desc/l10n.desc
> @@ -41,8 +41,10 @@ bs - Bosnian
> ca - Catalan
> ca-valencia - Catalan (Valencian)
> cak - Kaqchikel
> +ceb - Cebuano
> chr - Cherokee
> cnr - Montenegrin
> +co - Corsican
> cs - Czech
> cy - Welsh
> da - Danish
> @@ -53,6 +55,7 @@ de-DE - German (Germany)
> dgo - Dogri (individual language)
> doi - Dogri (macrolanguage)
> dsb - Lower Sorbian
> +dv - Dhivehi
IANA registry says:
| Subtag: dv
| Description: Dhivehi
| Description: Divehi
| Description: Maldivian
So maybe make it 'Dhivehi (Maldivian)'?
> dz - Dzongkha
> el - Modern Greek
> en - English
> @@ -88,13 +91,16 @@ he - Hebrew
> hi - Hindi
> hr - Croatian
> hsb - Upper Sorbian
> +ht - Haitian
> hu - Hungarian
> hy - Armenian
> ia - Interlingua
> id - Indonesian
> is - Icelandic
> it - Italian
> +iu - Inuktitut (macrolanguage)
> ja - Japanese
> +jv - Javanese
> ka - Georgian
> kab - Kabyle
> kk - Kazakh
> @@ -120,6 +126,7 @@ mn - Mongolian
> mni - Manipuri
> mr - Marathi
> ms - Malay (macrolanguage)
> +mt - Maltese
> my - Burmese
> nan - Min Nan Chinese
> nb - Norwegian Bokmål
> @@ -134,9 +141,11 @@ om - Oromo
> or - Oriya (macrolanguage)
> pa - Punjabi
> pl - Polish
> +ps - Pashto (macrolanguage)
> pt - Portuguese
> pt-BR - Portuguese (Brazil)
> pt-PT - Portuguese (Portugal)
> +qu - Quechua (macrolanguage)
> rm - Romansh
> ro - Romanian
> ru - Russian
> @@ -156,6 +165,7 @@ sr - Serbian
> sr-Latn - Serbian (Latin script)
> ss - Swati
> st - Southern Sotho
> +su - Sundanese
> sv - Swedish
> sw - Swahili (macrolanguage)
> sw-TZ - Swahili (Tanzania)
> @@ -165,9 +175,11 @@ ta-LK - Tamil (Sri Lanka)
> te - Telugu
> tg - Tajik
> th - Thai
> +ti - Tigrinya
> tk - Turkmen
> tl - Tagalog
> tn - Tswana
> +to-TO - Tonga (Tonga Islands)
Why not just 'to'?
> tr - Turkish
> ts - Tsonga
> tt - Tatar
> @@ -178,6 +190,8 @@ uz - Uzbek
> ve - Venda
> vi - Vietnamese
> xh - Xhosa
> +yi - Yiddish (macrolanguage)
> +yo - Yoruba
> zh - Chinese
> zh-CN - Chinese (China)
> zh-TW - Chinese (Taiwan)
Everything else looks correct according to the IANA registry [1].
[1] http://www.iana.org/assignments/language-subtag-registry/language-subtag-registry
--
Best regards,
Michał Górny
* Re: [gentoo-dev] [PATCH] profiles: add more language codes to desc/l10n.desc
2019-06-12 15:12 ` Michał Górny
@ 2019-06-12 15:21 ` Marek Szuba
2019-06-12 16:55 ` Ulrich Mueller
From: Marek Szuba @ 2019-06-12 15:21 UTC
To: gentoo-dev
On 2019-06-12 16:12, Michał Górny wrote:
>> +dv - Dhivehi
>
> IANA registry says:
>
> | Subtag: dv
> | Description: Dhivehi
> | Description: Divehi
> | Description: Maldivian
>
> So maybe make it 'Dhivehi (Maldivian)'?
Originally I had all three names here but then I noticed that for at
least some of the existing entries which in the SIL database are
described by multiple names, lb for instance, only the first one from
the list is used. Personally I have got no opinion either way.
>> +to-TO - Tonga (Tonga Islands)
>
> Why not just 'to'?
Good point, "to" does specifically refer to Tonga Islands Tonga after
all. I think I've been overly cautious here.
--
MS
* Re: [gentoo-dev] [PATCH] profiles: add more language codes to desc/l10n.desc
2019-06-12 15:21 ` Marek Szuba
@ 2019-06-12 16:55 ` Ulrich Mueller
From: Ulrich Mueller @ 2019-06-12 16:55 UTC
To: Marek Szuba; +Cc: gentoo-dev
>>>>> On Wed, 12 Jun 2019, Marek Szuba wrote:
>>> +dv - Dhivehi
>>
>> IANA registry says:
>>
>> | Subtag: dv
>> | Description: Dhivehi
>> | Description: Divehi
>> | Description: Maldivian
>>
>> So maybe make it 'Dhivehi (Maldivian)'?
> Originally I had all three names here but then I noticed that for at
> least some of the existing entries which in the SIL database are
> described by multiple names, lb for instance, only the first one from
> the list is used. Personally I have got no opinion either way.
The reference I had originally used is (warning, large text file):
https://www.iana.org/assignments/language-subtag-registry
If there are multiple descriptions for a language, use the first one.
So it's "Dhivehi" here.
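To illustrate the "first description wins" rule, here is a minimal Python sketch. It assumes a registry record is available as plain "Field: value" lines, as in the excerpt quoted above. The names parse_record, desc_line and DV_RECORD are made up for this example, and folded continuation lines in the real registry file are simply skipped rather than handled.

```python
def parse_record(text):
    """Parse one 'Field: value' record into a dict of field -> list of values.

    Repeated fields (such as Description) keep their original order.
    Lines without a colon are ignored in this sketch.
    """
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or ":" not in line:
            continue
        key, value = line.split(":", 1)
        fields.setdefault(key.strip(), []).append(value.strip())
    return fields


def desc_line(record_text):
    """Build an l10n.desc-style line using only the first Description."""
    fields = parse_record(record_text)
    subtag = fields["Subtag"][0]
    description = fields["Description"][0]
    return f"{subtag} - {description}"


# Record as quoted earlier in this thread.
DV_RECORD = """\
Subtag: dv
Description: Dhivehi
Description: Divehi
Description: Maldivian
"""

print(desc_line(DV_RECORD))  # dv - Dhivehi
```

The other two names (Divehi, Maldivian) are parsed but deliberately not used, which matches the existing entries in the file.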
>>> +to-TO - Tonga (Tonga Islands)
>>
>> Why not just 'to'?
> Good point, "to" does specifically refer to Tonga Islands Tonga after
> all. I think I've been overly cautious here.
Right, this should be:
to - Tonga (Tonga Islands)
> +ps - Pashto (macrolanguage)
For this one, the first description in the IANA registry is "Pushto".
> +iu - Inuktitut (macrolanguage)
> +qu - Quechua (macrolanguage)
> +yi - Yiddish (macrolanguage)
Please omit "(macrolanguage)" from all of the above. It should be listed
only if it's part of the description.
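The review comments above could be checked mechanically along these lines. This is only a rough sketch: FIRST_DESCRIPTION is a small hand-written sample containing just the first descriptions mentioned in this thread, not data read from the registry, and the names ADDED_LINE and review are hypothetical. Subtags missing from the sample table are skipped, not flagged.

```python
import re

# Sample table: first registry description per subtag, as stated in this
# thread. A real check would load these from the IANA registry file.
FIRST_DESCRIPTION = {
    "dv": "Dhivehi",
    "iu": "Inuktitut",
    "ps": "Pushto",
    "qu": "Quechua",
    "yi": "Yiddish",
}

# Matches added lines of the form "+<subtag> - <description>".
ADDED_LINE = re.compile(r"^\+(\S+) - (.+)$")


def review(patch_lines):
    """Return one message per added entry whose description differs
    from the first description in the sample table."""
    problems = []
    for line in patch_lines:
        match = ADDED_LINE.match(line)
        if not match:
            continue  # context lines, hunk headers, file headers
        subtag, description = match.groups()
        expected = FIRST_DESCRIPTION.get(subtag)
        if expected is None:
            continue  # not covered by the sample table
        if description != expected:
            problems.append(f"{subtag}: '{description}' should be '{expected}'")
    return problems


PATCH = [
    "+dv - Dhivehi",
    "+ps - Pashto (macrolanguage)",
    "+iu - Inuktitut (macrolanguage)",
    "+qu - Quechua (macrolanguage)",
    "+yi - Yiddish (macrolanguage)",
]

for problem in review(PATCH):
    print(problem)
```

Comparing against the full description string means "(macrolanguage)" is kept only where the registry description itself contains it, so no special case is needed for that suffix.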
Ulrich