From: Giacomo on
I need to extract a substring of n adjacent digits from every single
line of a file. The position of the n digits differs from line to
line.

For example:

asdasd 123 asd 191991 1234
lijoioi 4567 asdi 67567 iojoii

For n=4 the results for the two lines must be 1234 and 4567.

Thanks in advance,
Giacomo.
From: Janis Papanagnou on
Giacomo wrote:
> I need to extract a substring of n adjacent digits from every single
> line of a file. The position of the n digits differs from line to
> line.

What type of shell or programs do you have to use?

What have you tried to program thus far?

General outline, for example:
Depending on whether the shell/tool/program supports extended regular
expressions, you either define a regexp like [0-9]{n} or construct one
from n copies of [0-9]. This regexp must be embedded within white space
[ \t] or non-numerical patterns [^0-9], depending on your requirements.
Take care of the line boundaries: you'll likely have to consider start
of line ^ for the left and end of line $ for the right boundary.
Finally, extract the substring from the matching part. Consider adding
spaces to the front and rear of the input line to simplify the matching
and extraction of the substring pattern.
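
The outline above could be sketched roughly as follows. This is only one
possible reading of it: the function name extract_n_digits is made up for
illustration, the pattern is built from n copies of [0-9] so that no {n}
support is needed, and only the first qualifying run on each line is
printed (an assumption; the requirements may differ).

```shell
# Hypothetical helper: print the first run of exactly n digits on each line.
extract_n_digits() {
    awk -v n="$1" '
    BEGIN {
        # Build the pattern from n copies of [0-9] instead of using {n}.
        for (i = 1; i <= n; i++) digits = digits "[0-9]"
        # Require a non-digit on both sides so longer runs are rejected.
        re = "[^0-9]" digits "[^0-9]"
    }
    {
        # Pad with spaces so a run at the start or end of the line
        # still has a non-digit neighbour on both sides.
        line = " " $0 " "
        if (match(line, re))
            print substr(line, RSTART + 1, n)
    }'
}

printf '%s\n' 'asdasd 123 asd 191991 1234' 'lijoioi 4567 asdi 67567 iojoii' |
    extract_n_digits 4
```

With the two sample lines this prints 1234 and 4567. The padding trick
means the same pattern handles runs at either end of the line, so ^ and $
never have to appear in the regexp.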

> For example:
>
> asdasd 123 asd 191991 1234
> lijoioi 4567 asdi 67567 iojoii
>
> For n=4 the results for the two lines must be 1234 and 4567.

Janis
From: William James on

Giacomo wrote:
> I need to extract a substring of n adjacent digits from every single
> line of a file. The position of the n digits differs from line to
> line.
>
> For example:
>
> asdasd 123 asd 191991 1234
> lijoioi 4567 asdi 67567 iojoii
>
> For n=4 the results for the two lines must be 1234 and 4567.
>
> Thanks in advance,
> Giacomo.

ruby -ne 'puts $1 if /(?:^|\D)(\d{4})(?!\d)/'

From: William Park on
Giacomo <a(a)b.cde> wrote:
> I need to extract a substring of n adjacent digits from every single
> line of a file. The position of the n digits differs from line to
> line.
>
> For example:
>
> asdasd 123 asd 191991 1234
> lijoioi 4567 asdi 67567 iojoii
>
> For n=4 the results for the two lines must be 1234 and 4567.

a='asdasd 123 asd 191991 1234 lijoioi 4567 asdi 67567 iojoii'
RE='\<[0-9]{4}\>'
echo "${a|+$RE}"

Ref:
http://home.eol.ca/~parkw/index.html#parameter_expansion
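
The ${a|+$RE} expansion above is not standard shell syntax; it appears to
depend on the extended shell described at the link. As a rough
approximation with stock tools, the sketch below splits the string into
whitespace-separated words and keeps those that are exactly four digits.
The function name four_digit_words is made up for illustration, and
splitting on whitespace is an assumption that is narrower than the \< \>
word boundaries in the original pattern.

```shell
a='asdasd 123 asd 191991 1234 lijoioi 4567 asdi 67567 iojoii'

# Hypothetical helper: list the whitespace-separated words of $1 that
# consist of exactly four digits.
# tr puts one word per line; grep -x keeps lines that match as a whole.
four_digit_words() {
    printf '%s\n' "$1" |
        tr -s ' \t' '\n\n' |
        grep -x '[0-9][0-9][0-9][0-9]'
}

four_digit_words "$a"
```

For the sample string this prints 1234 and 4567, one per line.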

--
William Park <opengeometry(a)yahoo.ca>, Toronto, Canada
ThinFlash: Linux thin-client on USB key (flash) drive
http://home.eol.ca/~parkw/thinflash.html
BashDiff: Super Bash shell
http://freshmeat.net/projects/bashdiff/
From: Ed Morton on
Giacomo wrote:

> I need to extract a substring of n adjacent digits from every single
> line of a file. The position of the n digits differs from line to
> line.
>
> For example:
>
> asdasd 123 asd 191991 1234
> lijoioi 4567 asdi 67567 iojoii
>
> For n=4 the results for the two lines must be 1234 and 4567.
>
> Thanks in advance,
> Giacomo.

Using a POSIX awk:

awk '{for (i=1;i<=NF;i++) if ($i ~ /^[0-9]{4}$/) print $i}'

To get older versions of GNU awk (gawk) to behave like that, use
gawk --posix ... or gawk --re-interval ...; gawk 4.0 and later accept
interval expressions such as {4} by default.

There are cuter ways to get the same result in awk, but this is the
simplest and most obvious.
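
For an awk without interval expression support, one variation on the loop
above avoids {4} entirely by checking the field length and rejecting any
field that contains a non-digit. This is only a sketch: the function name
fields_of_n_digits and the n parameter are made up for illustration.

```shell
# Hypothetical helper: print every whitespace-separated field that is
# made of exactly n digits.
fields_of_n_digits() {
    awk -v n="$1" '{
        for (i = 1; i <= NF; i++)
            # n characters long and no non-digit => exactly n digits
            if (length($i) == n && $i !~ /[^0-9]/)
                print $i
    }'
}

printf '%s\n' 'asdasd 123 asd 191991 1234' 'lijoioi 4567 asdi 67567 iojoii' |
    fields_of_n_digits 4
```

On the sample input this prints 1234 and 4567. Like the one-liner above,
it only considers whole fields, so digits glued to other characters
(for example abc1234) are not reported.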

Regards,

Ed.