From: Glen on
I am trying to learn the "tr" command, but I am confused by this
behaviour.

###################################################################

C-4547:~/rajshell# cat test.txt
Mon Nov 13 13:17:04 IST 2000
root pts/0 Nov 13 10:03 (192.168.29.157)
/root/raj
..:
total 196
1456508 d-wx--xr-- 3 root root 4096 Nov 13 13:16 ./
32705 drwxr-xr-x 31 root root 4096 Nov 11 12:30 ../
1456586 -rwxrwxrwx 1 root root 1786 Nov 11 15:22
1st.txt*
1457067 -rwxrwxrwx 1 root root 1786 Nov 11 15:23
2nd.txt*
1456512 -rwxrwxrwx 1 root root 376 Nov 11 14:26
directorycontents.txt*
1815773 drwxrwxrwx 3 root root 4096 Nov 11 15:17 login/
1457068 lrwxrwxrwx 2 root root 6 Nov 12 10:18
loginmodule -> logins*



C-4547:~/rajshell# cat test.txt | tr '/->*()' ' '
Mon Nov IST
root pts Nov . . .
root raj
..
total
d-wx--xr-- root root Nov .
drwxr-xr-x root root Nov ..
-rwxrwxrwx root root Nov
st.txt
-rwxrwxrwx root root Nov
nd.txt
-rwxrwxrwx root root Nov
directorycontents.txt
drwxrwxrwx root root Nov login
lrwxrwxrwx root root Nov
loginmodule - logins
C-4547:~/rajshell#

###################################################################

I did not include the digits [0-9] in set1, yet they are removed too.
I tried giving the characters of set1 '/->*()' one at a time, and that
works as expected. However, when the full set is used, the digits are
removed as well.

Please help.

Thanks,
Glen
------- Beginner in Shell & Unix -------
From: Laurianne Gardeux on
On Fri, 19 Dec 2008 00:53:46 -0800, Glen wrote:

> I am trying to learn the "tr" command. But am confused with this
> behaviour.
>
> ###################################################################
>
> C-4547:~/rajshell# cat test.txt
> Mon Nov 13 13:17:04 IST 2000
> root pts/0 Nov 13 10:03 (192.168.29.157) /root/raj
> .:
> total 196
> 1456508 d-wx--xr-- 3 root root 4096 Nov 13 13:16 ./
> 32705 drwxr-xr-x 31 root root 4096 Nov 11 12:30 ../
> 1456586 -rwxrwxrwx 1 root root 1786 Nov 11 15:22 1st.txt*
> 1457067 -rwxrwxrwx 1 root root 1786 Nov 11 15:23 2nd.txt*
> 1456512 -rwxrwxrwx 1 root root 376 Nov 11 14:26
> directorycontents.txt*
> 1815773 drwxrwxrwx 3 root root 4096 Nov 11 15:17 login/
> 1457068 lrwxrwxrwx 2 root root 6 Nov 12 10:18
> loginmodule -> logins*
>
>
>
> C-4547:~/rajshell# cat test.txt | tr '/->*()' ' '
> Mon Nov IST
> root pts Nov . . .
> root raj
> .
> total
> d-wx--xr-- root root Nov .
> drwxr-xr-x root root Nov ..
> -rwxrwxrwx root root Nov
> st.txt
> -rwxrwxrwx root root Nov
> nd.txt
> -rwxrwxrwx root root Nov
> directorycontents.txt
> drwxrwxrwx root root Nov login
> lrwxrwxrwx root root Nov
> loginmodule - logins
> C-4547:~/rajshell#
>
> ###################################################################
>
> I have not mentioned the removal of the [0-9] in set1. But it is
> removed. I have tried providing the set1 '/->*()' one character at a
> time and it works. However, when the full construction is performed,
> there is removal of numbers.


Try '/\->*()' instead.
That way the special character '-' is escaped.

LG
From: Dave B on
Glen wrote:

> C-4547:~/rajshell# cat test.txt | tr '/->*()' ' '

You are including a character range (like "a-z") in the set, namely
"/->", which happens to include the digits (as well as some other
characters: "/ 0 1 2 3 4 5 6 7 8 9 : ; < = >"). Try putting the dash
either at the beginning or at the end of the set, like '-/>*()' or
'/>*()-' (untested, but that's the usual way to do it, and should work).
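
A minimal sketch of the difference, using one line from the listing in
the original post. It assumes a tr that pads a short set2 with its last
character, as GNU tr does; the variable names are only for illustration.

```shell
# One line taken from the test.txt listing in the original post.
line='root pts/0 Nov 13 10:03 (192.168.29.157)'

# '/->' is read as the range from '/' to '>', which covers the digits,
# ':' and a few more characters, so they all become spaces.
ranged=$(printf '%s\n' "$line" | tr '/->*()' ' ')

# With the dash at the end it is a literal dash, so the digits survive.
literal=$(printf '%s\n' "$line" | tr '/>*()-' ' ')

printf '%s\n' "$ranged" "$literal"
```

The first line printed has lost every digit; the second keeps them and
only replaces '/', '(' and ')'.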

--
echo 0|sed 's909=mO#3u)o19;s0#0co*)].O0;s()(0bu}=(;s#}#m1$"?02{#;
s)")9v2@3%"9$);s[%[o]x(.$e#![;sz(z^+.z;su+ur!z"au;sxzxd?_{g)/x;:b;
s/\(\(.\).\)\(\(..\)*\)\(\(.\).\)\(\(..\)*#.*\6.*\2.*\)/\5\3\1\7/;
tb'|awk '{while((i+=2)<=length($1)-24)a=a substr($1,i,1);print a}'
From: Glen on
On Dec 19, 4:10 pm, Dave B <da...(a)addr.invalid> wrote:
> Glen wrote:
> > C-4547:~/rajshell# cat test.txt | tr '/->*()' ' '
>
> You are including a character range (like "a-z") in the set, namely
> "/->", which happens to include the digits (as well as some other
> characters: "/ 0 1 2 3 4 5 6 7 8 9 : ; < = >"). Try putting the dash
> either at the beginning or at the end of the set, like '-/>*()' or
> '/>*()-' (untested, but that's the usual way to do it, and should work).

I understand that now.

Are character sets formed according to any specific rules? For example,
[a-z], [A-Z] and [0-9] are character sets:
1) They are homogeneous in type.
2) They are in progressive order.

Can character sets also be constructed from the ASCII values (or any
other character encoding that the shell understands)?

E.g. [!-z] would run from ASCII value 33 to 122.

Then, instead of using many character sets, we could use one
comprehensive set. Please clarify.

Thanks for the replies.
From: Maxwell Lol on
Glen <rajagopalan.vk(a)gmail.com> writes:

> Are character sets formed according to any specific rules? For example,
> [a-z], [A-Z] and [0-9] are character sets:
> 1) They are homogeneous in type.
> 2) They are in progressive order.


Yes. Consider EBCDIC, ASCII, Unicode, UTF-8, Mac-Roman, Latin, Cyrillic, etc.

They all have different rules and a different ordering of characters.

That's why some utilities support named character classes (see "man regex"), such as

alnum digit punct
alpha graph space
blank lower upper
cntrl print xdigit

as in

# echo 123abc | tr '[:digit:]' '.'
...abc

This is safer and more portable.
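
To put the two approaches side by side, here is a small sketch. It
assumes the C locale is forced with LC_ALL=C, so that a range follows
ASCII code order, and a tr that pads a short set2 with its last
character (GNU tr does). The sample strings and variable names are made
up for illustration.

```shell
# Force the C locale so that ranges follow ASCII code order (an
# assumption; in other locales the order of a range may differ).
LC_ALL=C
export LC_ALL

# A character class names the digits without relying on their encoding.
digits_masked=$(printf '%s\n' '123abc' | tr '[:digit:]' '.')

# '!-z' spans ASCII 33..122, so '{' (123) and '~' (126) are left alone.
range_masked=$(printf '%s\n' 'a1!{~' | tr '!-z' '.')

printf '%s\n' "$digits_masked" "$range_masked"
```

Under those assumptions this prints "...abc" and "...{~". So a range
such as '!-z' can be built from code values, but what it covers depends
on the encoding and locale, whereas the class name does not.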