From: Hongyi Zhao on
Hi all,

I want to write a script that annotates specific IP addresses by
appending the corresponding location information. I describe my issue
in detail below:

Suppose I have two files: the first file stores the specific IP
addresses which I want to annotate, and the second file stores the IP
database along with the corresponding location information.

The first file has one IP address per line in dotted decimal format,
e.g.:

0.125.125.125
4.19.79.28
4.36.124.150
....

The second file has four fields per line, delimited by CHARACTER
TABULATION (U+0009). The four fields are StartIP, EndIP, Country,
and Local, e.g.:

StartIP EndIP Country Local
0.0.0.0 0.255.255.255 IANA CZ88.NET
4.19.79.0 4.19.79.63 American Armed Forces Radio/Television
4.36.124.128 4.36.124.255 American Technical Resource Connections Inc
....

Based on the second file, I want to reformat the first file by
appending the corresponding location information to each IP address
in it, i.e., for the above example, I want to obtain the following
result:

0.125.125.125#IANA CZ88.NET
4.19.79.28#American Armed Forces Radio/Television
4.36.124.150#American Technical Resource Connections
....

Any hints on this issue will be highly appreciated.
Thanks in advance.
--
..: Hongyi Zhao [ hongyi.zhao AT gmail.com ] Free as in Freedom :.
From: Ed Morton on
$ cat file1
0.125.125.125
4.19.79.28
4.36.124.150
$
$ cat file2
StartIP EndIP Country Local
0.0.0.0 0.255.255.255 IANA CZ88.NET
4.19.79.0 4.19.79.63 American Armed Forces Radio/Television
4.36.124.128 4.36.124.255 American Technical Resource Connections Inc
$
$ cat tst.awk
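BEGIN { FS="\t"; OFS="#" }

# Map a dotted-quad address aaa.bbb.ccc.ddd to a single number that
# sorts in address order (each octet is at most 255, so base 1000 is
# enough). nr and ipA are local variables.
function ip2nr(ip,    nr,ipA) {
    split(ip,ipA,".")
    nr = ipA[1] * 1000000000 + ipA[2] * 1000000 + ipA[3] * 1000 + ipA[4]
    return nr
}

# file1: remember each address together with its numeric form.
NR==FNR { addrs[$0] = ip2nr($0); next }

# file2: skip the header line, then print every stored address that
# falls within the current range.
FNR>1 {
    start = ip2nr($1)
    end = ip2nr($2)
    for (ip in addrs) {
        if (addrs[ip] >= start && addrs[ip] <= end) {
            print ip,$3" "$4
        }
    }
}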
$
$ awk -f tst.awk file1 file2
0.125.125.125#IANA CZ88.NET
4.19.79.28#American Armed Forces Radio/Television
4.36.124.150#American Technical Resource Connections Inc
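
A side note on why the range test works: ip2nr packs the four octets
into one number using base 1000, so comparing two of those numbers is
the same as comparing the addresses octet by octet. The sketch below
just prints the keys for a few addresses so that the ordering is
visible. The wrapper name ip2nr_demo is made up for this illustration,
and %.0f is used for printing because the keys exceed the 32-bit
integer range.

```shell
# ip2nr_demo ADDR...: print each dotted-quad address followed by the
# numeric key that ip2nr assigns to it, one pair per line.
# (ip2nr_demo is a hypothetical helper, not part of the script above.)
ip2nr_demo() {
    printf '%s\n' "$@" | awk '
        function ip2nr(ip,    ipA) {
            split(ip, ipA, ".")
            return ipA[1] * 1000000000 + ipA[2] * 1000000 + ipA[3] * 1000 + ipA[4]
        }
        # %.0f prints the key as a whole number without integer overflow.
        { printf "%s %.0f\n", $0, ip2nr($0) }
    '
}

ip2nr_demo 4.19.79.0 4.19.79.28 4.19.79.63
# 4.19.79.0 4019079000
# 4.19.79.28 4019079028
# 4.19.79.63 4019079063
```

The key for 4.19.79.28 lies between the keys for the start and end of
its range, which is exactly the condition the script checks.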

Regards,

Ed.
From: Ed Morton on
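Adding a "delete" and a "next" would make the script more efficient if
you have a large list of IP addresses in file1 and each range in file2
is distinct. Note that the "next" stops scanning after the first match,
so it is only safe when no more than one address in file1 falls within
any given range; otherwise drop the "next" and keep the "delete":

BEGIN { FS="\t"; OFS="#" }

# Map aaa.bbb.ccc.ddd to a number that sorts in address order.
function ip2nr(ip,    nr,ipA) {
    split(ip,ipA,".")
    nr = ipA[1] * 1000000000 + ipA[2] * 1000000 + ipA[3] * 1000 + ipA[4]
    return nr
}

# file1: remember each address together with its numeric form.
NR==FNR { addrs[$0] = ip2nr($0); next }

# file2: skip the header line, then look for a stored address that
# falls within the current range.
FNR>1 {
    start = ip2nr($1)
    end = ip2nr($2)
    for (ip in addrs) {
        if (addrs[ip] >= start && addrs[ip] <= end) {
            print ip,$3" "$4
            delete addrs[ip]    # matched: never test this address again
            next                # go to the next range (one match per range)
        }
    }
}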
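
For readers who need the output in file1 order, or who may have several
addresses inside the same range, here is a rough sketch of a different
arrangement, not the approach posted above: it reads the database
first and then scans the stored ranges once per address. The function
name annotate_ips and the array names lo, hi, and loc are invented for
this example. It assumes a non-empty, tab-delimited database with a
header line, and like the scripts above it silently drops addresses
that match no range.

```shell
# annotate_ips FILE1 FILE2: print each address of FILE1, in its original
# order, followed by "#" and the Country and Local fields of the first
# range in FILE2 that contains it.
# (annotate_ips is a hypothetical name used only for this sketch.)
annotate_ips() {
    awk '
        BEGIN { FS = "\t" }
        function ip2nr(ip,    ipA) {
            split(ip, ipA, ".")
            return ipA[1] * 1000000000 + ipA[2] * 1000000 + ipA[3] * 1000 + ipA[4]
        }
        # First input (the database): store every range, skipping the header.
        NR == FNR {
            if (FNR > 1) {
                n++
                lo[n] = ip2nr($1); hi[n] = ip2nr($2); loc[n] = $3 " " $4
            }
            next
        }
        # Second input (the address list): linear scan of the stored ranges.
        {
            key = ip2nr($0)
            for (i = 1; i <= n; i++) {
                if (key >= lo[i] && key <= hi[i]) {
                    print $0 "#" loc[i]
                    break
                }
            }
        }
    ' "$2" "$1"
}
```

It would be called as "annotate_ips file1 file2". The linear scan costs
one pass over the ranges per address, so for a very large database a
sorted table with a binary search may be worth the extra code.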

Ed.
From: Sidney Lambe on
Why is it that Ed Morton, who is supposed to be the
Great Awk Educator, doesn't even comment his scripts,
which is basic to good scripting and obviously necessary
for educating people on the use of awk?



Sid


From: Ed Morton on
Sidney Lambe wrote:
> Why is it that Ed Morton, who is supposed to be the
> Great Awk Educator, doesn't even comment his scripts,
> which is basic to good scripting and obviously necessary
> for educating people on the use of awk?
>
>
>
> Sid
>

Sid - what part of that code did you find confusing and I'll be happy to
explain it to you.

Ed.