From: sam billabong on 18 May 2006 04:53

Hi all,
the syntax of the SORT command is baffling me :-((

I'm trying to sort this flat file by two "absolute" positions.
That is: I would like to sort it by the 9-th column (the one containing
'9' '5' '5' '5') and THEN by the 13-th column ('3' '7' '4' '1')

1234567890123456789
zzzzzz 5a 6781....
zzz zz 5z 7481....
z b a r 5b 1181....

I've tried:

sort +0.8 -0.8 +0.12 -0.12 o.txt
and variations like: sort +0.9 -0.9 +0.13 -0.13 o.txt ...

and also
sort -k 1.9,1.9 -k 1.13,1.13 o.txt ... and variations

but I'm not even able to understand the output (I know: I'm definitely stupid).

I've googled and MAN'ned and re-googled and re-MAN'ned to no avail.
Is there anyone willing to help me?

The problem is:
I want to sort a file:
first based on the content of byte 9, and then on the content of byte 13.
Every "line" of the file is a record; the record does NOT contain fields.

THANK YOU

If that matters, here's the output of "uname": Linux 2.6.5-7.202.7 ... i386GNU/Linux

From: Rainer Temme on 18 May 2006 05:34

sam billabong wrote:

Hi sam,

> I'm trying to sort this flat file by two "absolute" positions.
> That is: I would like to sort it by the 9-th column (the one containing
> '9' '5' '5' '5') and THEN by the 13-th column ('3' '7' '4' '1')
>
> 1234567890123456789
> zzzzzz 5a 6781....
> zzz zz 5z 7481....
> z b a r 5b 1181....

> sort -k 1.9,1.9 -k 1.13,1.13 o.txt ... and variations

The output of this on my system (SuSE Linux 9.3) is

z b a r 5b 1181....
zzz zz 5z 7481....
zzzzzz 5a 6781....
1234567890123456789

What's wrong with that?

> The problem is:
> I want to sort a file:
> first based on the content of byte 9, and then on the content of byte
> 13.
> Every "line" of the file is a record; the record does NOT contain
> fields.

Well, maybe your sort is a bit overpicky with fields ... how about

sort --field-separator=# -k 1.9,1.9 -k 1.13,1.13 o.txt

This might help.

Rainer

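A minimal sketch of the suggestion above, assuming GNU sort. The sample rows are padded here so that byte 9 and byte 13 hold the digits Sam describes; the exact spacing of the second row is a guess, and the helper name `sample` is purely illustrative. Setting `-t` to a character that never occurs in the data makes each line a single field, so `-k 1.9,1.9` and `-k 1.13,1.13` address bytes 9 and 13 of the record. `LC_ALL=C` is added so that the comparison is bytewise.

```shell
# Print the sample records. The padding is an assumption chosen so that
# column 9 reads 9,5,5,5 and column 13 reads 3,7,4,1 as in the question.
sample() {
  printf '%s\n' \
    '1234567890123456789' \
    'zzzzzz  5a 6781....' \
    'zzz  zz 5z 7481....' \
    'z b a r 5b 1181....'
}

# '#' never appears in the data, so every line is one field and the
# character offsets count from the start of the record.
sorted=$(sample | LC_ALL=C sort -t '#' -k 1.9,1.9 -k 1.13,1.13)
printf '%s\n' "$sorted"
```

With these assumed rows, the three lines whose byte 9 is '5' come first, ordered by byte 13 ('1', '4', '7'), and the ruler line with '9' in byte 9 comes last.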
From: Daniel P. Valentine on 18 May 2006 05:56

In article <1147942403.337028.298130(a)j33g2000cwa.googlegroups.com>,
"sam billabong" <rfacco(a)email.it> wrote:

> Hi all,
> the syntax of the SORT command is baffling me :-((
>
> I'm trying to sort this flat file by two "absolute" positions.
> That is: I would like to sort it by the 9-th column (the one containing
> '9' '5' '5' '5') and THEN by the 13-th column ('3' '7' '4' '1')
>
> 1234567890123456789
> zzzzzz 5a 6781....
> zzz zz 5z 7481....
> z b a r 5b 1181....
>
> I've tried:
>
> sort +0.8 -0.8 +0.12 -0.12 o.txt
> and variations like: sort +0.9 -0.9 +0.13 -0.13 o.txt ...
>
> and also
> sort -k 1.9,1.9 -k 1.13,1.13 o.txt ... and variations

The biggest problem you are having is that sort is splitting your record
into fields based on the spaces in the record. To make this last one
work, all I had to do was to specify a field delimiter that forced it to
consider (at least) the first 13 bytes as a single field. You can use any
character you "know" will not appear in the first 13 bytes--all you need
is to ensure that the first 13 bytes don't get split up. For the data you
show here, a semicolon does the trick:

% sort -t ";" +0.9 -0.9 +0.13 -0.13 testsort.txt
1234567890123456789
z b a r 5b 1181....
zzz zz 5z 7481....
zzzzzz 5a 6781....

If you can't make a prediction on what may or may not appear in the first
13 characters, you're probably safest using the newline character (or
whatever separates records from each other) as the field delimiter.

% sort -t "\
" +0.9 -0.9 +0.13 -0.13 testsort.txt
1234567890123456789
z b a r 5b 1181....
zzz zz 5z 7481....
zzzzzz 5a 6781....

To make that happen, I typed \, ^V, and ^M between the quotes. The
backslash told the shell that the newline character was not the end of
the command line, and the control-V told the shell to take the next
character verbatim. The control-M is a newline.

> [...]
> if that matters, here's the output of "uname": Linux 2.6.5-7.202.7 ...
> i386GNU/Linux

If it had mattered, the version of sort would be more important in this
case:

% sort --version
sort (GNU coreutils) 5.93
Copyright (C) 2005 Free Software Foundation, Inc.
This is free software. You may redistribute copies of it under the terms of
the GNU General Public License <http://www.gnu.org/licenses/gpl.html>.
There is NO WARRANTY, to the extent permitted by law.

Written by Mike Haertel and Paul Eggert.

--
dpv

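The newline-as-delimiter idea can also be written without typing control characters. The sketch below assumes a POSIX-style shell (not the csh-style `%` prompt shown above) and GNU sort, and it uses the `-k` form, where `-k 1.9,1.9` means byte 9 only. The names `nl` and `sample` and the padded sample rows are illustrative assumptions, not part of the original post.

```shell
# A literal newline stored in a variable. It cannot occur inside a record,
# so sort treats each line as a single field.
nl='
'

# Sample records; the padding is assumed so that byte 9 and byte 13
# line up with the ruler in the first row.
sample() {
  printf '%s\n' \
    '1234567890123456789' \
    'zzzzzz  5a 6781....' \
    'zzz  zz 5z 7481....' \
    'z b a r 5b 1181....'
}

# Sort on byte 9 first, then on byte 13, comparing bytes in the C locale.
by_bytes=$(sample | LC_ALL=C sort -t "$nl" -k 1.9,1.9 -k 1.13,1.13)
printf '%s\n' "$by_bytes"
```

Keeping the delimiter in a variable avoids depending on how a particular shell or terminal handles ^V and ^M.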
From: sam billabong on 18 May 2006 06:10

Daniel P. Valentine wrote:

> % sort -t ";" +0.9 -0.9 +0.13 -0.13 testsort.txt
> 1234567890123456789
> z b a r 5b 1181....
> zzz zz 5z 7481....
> zzzzzz 5a 6781....

Dan: thank you for your help!
On my system I have the same output as you ... unfortunately I don't
understand if and how it is sorted; for sure it is sorted NEITHER by the
9-th NOR by the 13-th column ... or am I completely out to lunch?

The 9-th column in the output is '9' '5' '5' '5' (reverse order ??!)
and the 13-th is '3' '1' '4' '7'

Thank you for your patience.

------------------------------------------
~ # sort --version
sort (coreutils) 5.2.1
Written by Mike Haertel and Paul Eggert.

Copyright (C) 2004 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

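One possible explanation for the order Sam sees, offered as a sketch rather than a diagnosis: in the obsolete `+POS -POS` form the start offset counts characters to skip, so byte 9 is `+0.8 -0.9`, which the GNU sort manual gives as equivalent to `-k 1.9,1.9`. By the same rule `+0.9 -0.9` corresponds to `-k 1.10,1.9`, a key that ends before it starts and is therefore empty, leaving sort to fall back on comparing whole lines. The example assumes GNU sort and the C locale; the helper `sample`, the variable names, and the padding of the sample rows are assumptions made for illustration.

```shell
# Sample records; spacing is assumed so that byte 9 and byte 13
# match the ruler in the first row.
sample() {
  printf '%s\n' \
    '1234567890123456789' \
    'zzzzzz  5a 6781....' \
    'zzz  zz 5z 7481....' \
    'z b a r 5b 1181....'
}

# -k form of "+0.9 -0.9 +0.13 -0.13": each key ends before it starts,
# so both keys are empty and the last-resort whole-line comparison decides.
empty_keys=$(sample | LC_ALL=C sort -t ';' -k 1.10,1.9 -k 1.14,1.13)

# -k form of "+0.8 -0.9 +0.12 -0.13": keys cover exactly byte 9 and byte 13.
byte_keys=$(sample | LC_ALL=C sort -t ';' -k 1.9,1.9 -k 1.13,1.13)

printf 'empty keys:\n%s\n\nbyte keys:\n%s\n' "$empty_keys" "$byte_keys"
```

Under these assumptions the first result matches a plain `sort` of the file, with the ruler line first, and only the second is ordered by byte 9 and then byte 13.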
From: Rainer Temme on 18 May 2006 07:02

sam billabong wrote:
> Daniel P. Valentine wrote:
>> % sort -t ";" +0.9 -0.9 +0.13 -0.13 testsort.txt
>> 1234567890123456789
>> z b a r 5b 1181....
>> zzz zz 5z 7481....
>> zzzzzz 5a 6781....
>
> Dan: thank you for your help!
> On my system I have the same output as you ... unfortunately
> I don't understand if and how it is sorted ...

The start position is the very first character.

> ~ # sort --version
> sort (coreutils) 5.2.1

$ cat in
1234567890123456789
zzzzzz 5a 6781....
zzz zz 5z 7481....
z b a r 5b 1181....
$ sort --version
sort (GNU coreutils) 5.3.0
$ cat in | sort -t '#' -k 1.9,1.9 -k 1.13,1.13
z b a r 5b 1181....
zzz zz 5z 7481....
zzzzzz 5a 6781....
1234567890123456789

Maybe you should try to upgrade to version 5.3.0

Rainer