public inbox for gentoo-user@lists.gentoo.org
* [gentoo-user] help renaming files
From: luis jure @ 2010-04-07 16:21 UTC
  To: gentoo-user


hello list.

i have a bunch of files with accented characters in their names, both
upper- and lower case. i want to rename them using the non-accented
equivalent. i thought that would be easy to do using something like tr.
big mistake. confronted with accented characters, tr outputs garbage. 

searching the web, i found this: "Although the tr command respects C
locale environment variables, don't expect it to do anything sensible
with UTF-8 documents, such as being able to replace lower-case accented
characters with appropriate upper-case characters. The tr command works
best with ASCII and the other standard C locales."

i'm using es_UY.UTF8 and i can't make tr do anything useful.

any ideas?

best,

lj




* Re: [gentoo-user] help renaming files
From: KH @ 2010-04-07 16:32 UTC
  To: gentoo-user

On 07.04.2010 18:21, luis jure wrote:
> i'm using es_UY.UTF8 and i can't make tr do anything useful.
>
> any ideas?


Hi,
I don't really know this area, but maybe something like this can help you:

http://www.gentoo-wiki.info/HOWTO_Create_an_Audio_CD#Clean_up_the_file_names

Regards
kh




* Re: [gentoo-user] help renaming files
From: Jonas de Buhr @ 2010-04-07 17:29 UTC
  To: gentoo-user

>i'm using es_UY.UTF8 and i can't make tr do anything useful.
>
>any ideas?

script it. python for example works well with unicode. you may want
os.rename() and maybe a dictionary with your substitutions.
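
A minimal Python sketch of the dictionary-plus-os.rename() approach suggested above. The table contents and the names SUBSTITUTIONS, plain_name and rename_all are illustrative assumptions, not something from this thread. It also assumes the file names use precomposed characters (one code point per accented letter); names stored in decomposed form would not match the table.

```python
import os

# Hypothetical substitution table; extend it for the characters you actually have.
SUBSTITUTIONS = {
    "á": "a", "é": "e", "í": "i", "ó": "o", "ú": "u", "ü": "u", "ñ": "n",
    "Á": "A", "É": "E", "Í": "I", "Ó": "O", "Ú": "U", "Ü": "U", "Ñ": "N",
}

def plain_name(name):
    """Return name with every character found in SUBSTITUTIONS replaced."""
    return "".join(SUBSTITUTIONS.get(ch, ch) for ch in name)

def rename_all(directory="."):
    """Rename entries in directory to their unaccented form.

    Returns the list of (old, new) pairs that were actually renamed.
    """
    done = []
    for old in sorted(os.listdir(directory)):
        new = plain_name(old)
        if new == old:
            continue  # nothing to substitute in this name
        src = os.path.join(directory, old)
        dst = os.path.join(directory, new)
        if os.path.exists(dst):
            # Never overwrite an existing file; leave the original in place.
            print(f"skipping {old!r}: {new!r} already exists")
            continue
        os.rename(src, dst)
        done.append((old, new))
    return done
```

Calling rename_all(".") would process the current directory. Names whose unaccented form collides with an existing entry are skipped rather than overwritten.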

/jdb




* [gentoo-user] Re: help renaming files
From: Kerin Millar @ 2010-04-07 19:01 UTC
  To: gentoo-user

On 07/04/2010 17:21, luis jure wrote:
> i'm using es_UY.UTF8 and i can't make tr do anything useful.

It can be done with Perl. For example:

$ echo "El castellano es la lengua española oficial del Estado. Las demás lenguas españolas serán también oficiales en las respectivas Comunidades Autónomas" |
    perl -M'encoding utf8' -MUnicode::Normalize -pe '$_=NFKD($_);s/\pM//og'

The following output should be seen:

El castellano es la lengua espanola oficial del Estado. Las demas 
lenguas espanolas seran tambien oficiales en las respectivas Comunidades 
Autonomas
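
For comparison, here is a rough Python sketch of the same two steps (NFKD normalization, then deleting every character in a Mark category), using only the standard unicodedata module. The function name strip_marks is made up for illustration. It may also be a useful fallback on newer Perl releases, where the encoding pragma has been deprecated since 5.18 and the one-liner may need adjusting.

```python
import unicodedata

def strip_marks(text):
    """Decompose text (NFKD), then drop every character in a Mark category."""
    decomposed = unicodedata.normalize("NFKD", text)
    # Categories Mn, Mc and Me together correspond to Perl's \pM.
    return "".join(ch for ch in decomposed
                   if not unicodedata.category(ch).startswith("M"))

print(strip_marks("Las demás lenguas españolas serán también oficiales"))
# prints: Las demas lenguas espanolas seran tambien oficiales
```

To rename files rather than filter text, the result of strip_marks(name) could be passed to os.rename() as the new name.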

Cheers,

--Kerin





* Re: [gentoo-user] Re: help renaming files
From: luis jure @ 2010-04-07 19:47 UTC
  To: gentoo-user

on 2010-04-07 at 20:01 Kerin Millar wrote:

>It can be done with Perl.

i was afraid someone was going to say that... :-)


> perl -M'encoding utf8' -MUnicode::Normalize -pe
> '$_=NFKD($_);s/\pM//og'

that works great, kerin, thank you! no idea though what $_=NFKD($_)
might mean... that's fine, i'll write it down in a script for future
use.
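
In the one-liner, $_ holds the current input line, and NFKD() from Unicode::Normalize returns its compatibility-decomposed form: each accented letter becomes a base letter followed by one or more combining marks, which the s/\pM//og substitution then deletes. A small Python sketch of the decomposition step alone, using the standard unicodedata module (the helper name describe is hypothetical):

```python
import unicodedata

def describe(text):
    """Return the Unicode name of each code point in the NFKD form of text."""
    return [unicodedata.name(ch) for ch in unicodedata.normalize("NFKD", text)]

# U+00F1 is the precomposed letter n with tilde.
print(describe("\u00f1"))
# prints: ['LATIN SMALL LETTER N', 'COMBINING TILDE']

# U+FB01 is the "fi" ligature, a compatibility character.
print(describe("\ufb01"))
# prints: ['LATIN SMALL LETTER F', 'LATIN SMALL LETTER I']
```

The second call shows the "K" (compatibility) part of NFKD: ligatures and similar presentation forms are also rewritten to plain letters, which is usually what is wanted for file names.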

best,

lj


