From: Hongyi Zhao on
Hi all,

Suppose I have some files whose names contain Chinese characters,
and all of these files reside in the same directory. Now, I want
to delete the Chinese characters from all of these filenames. What should
I do?

Thanks in advance.
--
..: Hongyi Zhao [ hongyi.zhao AT gmail.com ] Free as in Freedom :.
From: Ben Bacarisse on
Hongyi Zhao <hongyi.zhao(a)gmail.com> writes:

> Suppose I have some files whose names contain Chinese characters,
> and all of these files reside in the same directory. Now, I want
> to delete the Chinese characters from all of these filenames. What should
> I do?

A very useful tool for such things is iconv. It will convert between
one character encoding and another, but it can also be asked to drop
any characters that can't be encoded in the target character set (the
-c flag). Thus:

iconv -c --from=utf-8 --to=ascii

drops all UTF-8 encoded characters that are not in the 7-bit ASCII table.

for f in *; do
mv $f $(echo "$f" | iconv -c --from=utf-8 --to=ascii)
done

(untested -- check before using!!) should do what you want. Of
course, UTF-8 is only one possible encoding for Chinese characters, so
you might have to change that part of the example.

--
Ben.
From: Ben Finney on
Ben Bacarisse <ben.usenet(a)bsb.me.uk> writes:

> for f in *; do
> mv $f $(echo "$f" | iconv -c --from=utf-8 --to=ascii)
> done
>
> (untested -- check before using!!) should do what you want. Of course,
> UTF-8 is only one possible encoding for Chinese characters, so you
> might have to change that part of the example.

You will also need to consider the scenario when the removal of some
characters results in a collision in the names.
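
One way to spot such collisions before renaming anything is to compute
every target name first and look for duplicates. A minimal sketch,
assuming UTF-8 names with no embedded newlines; colliding_targets is a
made-up function name, and nothing is renamed by it:

```shell
# colliding_targets is a hypothetical helper: it prints each stripped
# name that two or more files in the current directory would map to.
colliding_targets() {
    for f in *; do
        [ -e "$f" ] || continue    # unmatched glob in an empty directory
        # "|| :" ignores iconv's status, which may be non-zero when
        # -c discards characters.
        printf '%s\n' "$f" | iconv -c -f UTF-8 -t ASCII || :
    done | sort | uniq -d
}

# Usage: cd into the directory of interest, then run
#   colliding_targets
```

Any name it prints would be claimed by more than one file. It does not
catch names that are already plain ASCII and happen to match another
file's stripped name only if that file is outside the directory.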

--
\ “The whole area of [treating source code as intellectual |
`\ property] is almost assuring a customer that you are not going |
_o__) to do any innovation in the future.” —Gary Barnett |
Ben Finney
From: Ben Bacarisse on
Ben Finney <ben+unix(a)benfinney.id.au> writes:

> Ben Bacarisse <ben.usenet(a)bsb.me.uk> writes:
>
>> for f in *; do
>> mv $f $(echo "$f" | iconv -c --from=utf-8 --to=ascii)

<sigh> missing quotes round both arguments to mv:

mv "$f" "$(echo "$f" | iconv -c --from=utf-8 --to=ascii)"

>> done
>>
>> (untested -- check before using!!) should do what you want. Of course,
>> UTF-8 is only one possible encoding for Chinese characters, so you
>> might have to change that part of the example.
>
> You will also need to consider the scenario when the removal of some
> characters results in a collision in the names.

Well, the OP will, yes.

To the OP: if you extend this idea to a more general move of a path
name, take care with other effects of dropping characters. Components
of the path can turn into . or .. (this can happen even with the loop
above) or might be dropped altogether.
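
A minimal sketch of a loop that guards against those cases, for the
single-directory situation only. The function names are made up for
illustration, and it assumes UTF-8 names and an iconv that supports -c;
it skips a file rather than guessing a replacement name.

```shell
# Hypothetical helper: drop everything outside 7-bit ASCII.
strip_non_ascii() {
    # iconv may exit non-zero when -c discards input; ignore that.
    printf '%s' "$1" | iconv -c -f UTF-8 -t ASCII || :
}

# Hypothetical function: rename files in the current directory,
# skipping any whose stripped name is unusable or already taken.
rename_stripped() {
    for f in *; do
        [ -e "$f" ] || continue            # unmatched glob
        new=$(strip_non_ascii "$f")
        case $new in
            "$f") continue ;;              # nothing was stripped
            ''|.|..)                       # empty, or turned into . or ..
                printf 'skipping %s: no usable name left\n' "$f" >&2
                continue ;;
        esac
        if [ -e "$new" ]; then             # would collide with another file
            printf 'skipping %s: %s already exists\n' "$f" "$new" >&2
            continue
        fi
        mv -- "$f" "$new"
    done
}

# Usage: cd into the directory of interest, then run
#   rename_stripped
```

Try it on a copy of the directory first; skipped files are reported on
standard error and left untouched.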

--
Ben.