* [gentoo-user] help renaming files
From: luis jure @ 2010-04-07 16:21 UTC
To: gentoo-user
hello list.
i have a bunch of files with accented characters in their names, both
upper- and lower case. i want to rename them using the non-accented
equivalent. i thought that would be easy to do using something like tr.
big mistake. confronted with accented characters, tr outputs garbage.
searching the web, i found this: "Although the tr command respects C
locale environment variables, don't expect it to do anything sensible
with UTF-8 documents, such as being able to replace lower-case accented
characters with appropriate upper-case characters. The tr command works
best with ASCII and the other standard C locales."
i'm using es_UY.UTF8 and i can't make tr do anything useful.
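A likely reason for the garbage, sketched here under the assumption that tr translates single bytes rather than characters: in UTF-8 an accented letter is two bytes, so a byte-wise mapping cannot replace it with one ASCII letter. The helper `bytewise_tr` below is hypothetical and only a rough model of such a tool, not tr's actual implementation.

```python
def bytewise_tr(text: str, set1: str, set2: str) -> bytes:
    """Rough model of a byte-oriented tr: map each byte of set1 to a byte of set2."""
    src = set1.encode("utf-8")
    dst = set2.encode("utf-8")
    # Assumption: a shorter target set is padded by repeating its last byte.
    dst = dst + dst[-1:] * (len(src) - len(dst))
    table = bytes.maketrans(src, dst[: len(src)])
    return text.encode("utf-8").translate(table)


# One accented character occupies two bytes in UTF-8.
print("\u00f3".encode("utf-8"))  # b'\xc3\xb3'

# Both bytes of the accented letter get mapped, so the word gains a letter.
print(bytewise_tr("canci\u00f3n", "\u00f3", "o"))  # b'cancioon'
```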
any ideas?
best,
lj
* Re: [gentoo-user] help renaming files
From: KH @ 2010-04-07 16:32 UTC
To: gentoo-user
On 07.04.2010 18:21, luis jure wrote:
> i'm using es_UY.UTF8 and i can't make tr do anything useful.
>
> any ideas?
Hi,
I'm not really familiar with this, but maybe something like this can help you:
http://www.gentoo-wiki.info/HOWTO_Create_an_Audio_CD#Clean_up_the_file_names
Regards
kh
* Re: [gentoo-user] help renaming files
From: Jonas de Buhr @ 2010-04-07 17:29 UTC
To: gentoo-user
>i'm using es_UY.UTF8 and i can't make tr do anything useful.
>
>any ideas?
script it. python for example works well with unicode. you may want
os.rename() and maybe a dictionary with your substitutions.
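A minimal sketch of that idea, assuming Python 3 and file names stored with precomposed accented characters. The table `SUBSTITUTIONS` and the helpers `plain_name` and `rename_in` are hypothetical names made up for illustration, and the table is deliberately partial.

```python
import os

# Hypothetical, partial table; extend it with the characters your names use.
SUBSTITUTIONS = {
    "\u00e1": "a", "\u00e9": "e", "\u00ed": "i", "\u00f3": "o",
    "\u00fa": "u", "\u00fc": "u", "\u00f1": "n",
    "\u00c1": "A", "\u00c9": "E", "\u00cd": "I", "\u00d3": "O",
    "\u00da": "U", "\u00dc": "U", "\u00d1": "N",
}
TABLE = str.maketrans(SUBSTITUTIONS)


def plain_name(name: str) -> str:
    """Return `name` with every character in the table replaced."""
    return name.translate(TABLE)


def rename_in(directory: str, dry_run: bool = True) -> list:
    """Rename entries of `directory`; return the (old, new) pairs considered."""
    changes = []
    for old in sorted(os.listdir(directory)):
        new = plain_name(old)
        if new == old:
            continue
        target = os.path.join(directory, new)
        if os.path.exists(target):
            # Never overwrite an existing file; leave this one untouched.
            continue
        changes.append((old, new))
        if not dry_run:
            os.rename(os.path.join(directory, old), target)
    return changes
```

Calling `rename_in(".")` only reports what would change; pass `dry_run=False` to actually rename.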
/jdb
* [gentoo-user] Re: help renaming files
From: Kerin Millar @ 2010-04-07 19:01 UTC
To: gentoo-user
On 07/04/2010 17:21, luis jure wrote:
> i'm using es_UY.UTF8 and i can't make tr do anything useful.
It can be done with Perl. For example:
$ echo "El castellano es la lengua española oficial del Estado. Las demás lenguas españolas serán también oficiales en las respectivas Comunidades Autónomas" |
    perl -M'encoding utf8' -MUnicode::Normalize -pe '$_=NFKD($_);s/\pM//og'
The following output should be seen:
El castellano es la lengua espanola oficial del Estado. Las demas
lenguas espanolas seran tambien oficiales en las respectivas Comunidades
Autonomas
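For comparison, the same two steps (decompose, then drop the combining marks) can be sketched in Python with the standard `unicodedata` module. This is an illustrative equivalent, not part of the Perl command above, and the function name `strip_marks` is made up here.

```python
import unicodedata


def strip_marks(text: str) -> str:
    """Decompose with NFKD, then drop every character in a Mark (M*) category."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(
        ch for ch in decomposed
        if not unicodedata.category(ch).startswith("M")
    )


print(strip_marks("dem\u00e1s lenguas espa\u00f1olas ser\u00e1n tambi\u00e9n"))
# demas lenguas espanolas seran tambien
```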
Cheers,
--Kerin
* Re: [gentoo-user] Re: help renaming files
From: luis jure @ 2010-04-07 19:47 UTC
To: gentoo-user
on 2010-04-07 at 20:01 Kerin Millar wrote:
>It can be done with Perl.
i was afraid someone was going to say that... :-)
> perl -M'encoding utf8' -MUnicode::Normalize -pe
> '$_=NFKD($_);s/\pM//og'
that works great, kerin, thank you! no idea though what $_=NFKD($_)
might mean... that's fine, i'll write it down in a script for future
use.
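For the record, `$_` is Perl's current input line and `NFKD` is Unicode's compatibility decomposition: it splits an accented letter into its base letter followed by a separate combining mark, which `s/\pM//g` then deletes. A small Python sketch of the decomposition step alone, with `describe` as a hypothetical helper:

```python
import unicodedata


def describe(text: str) -> list:
    """List each code point of `text` as 'U+XXXX NAME'."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch)}" for ch in text]


composed = "\u00f1"  # n with tilde, as a single code point
decomposed = unicodedata.normalize("NFKD", composed)

print(describe(composed))
# ['U+00F1 LATIN SMALL LETTER N WITH TILDE']
print(describe(decomposed))
# ['U+006E LATIN SMALL LETTER N', 'U+0303 COMBINING TILDE']
```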
best,
lj