From: Hongyi Zhao on
On Mon, 29 Mar 2010 10:55:37 +0800, Hongyi Zhao
<hongyi.zhao(a)gmail.com> wrote:

>3- As the output, I only want to obtain the IP_ADDRESS rather than the
>entire records.

This is the output form I always want to obtain.
--
.: Hongyi Zhao [ hongyi.zhao AT gmail.com ] Free as in Freedom :.
From: Arcege on
On Mar 27, 4:20 am, Hongyi Zhao <hongyi.z...(a)gmail.com> wrote:
> Hi all,
>
> I have the following file, which contains three fields on each line:
>
> "IP_ADDRESS" "ISP_NAME" "DOMAIN_NAME"
> "109.86.226.38" "-" "-"
> "117.18.75.235" "SUNNYVISION LIMITED" "SUNNYVISIONDATACENTRE.COM"
> "119.11.13.169" "-" "-"
> "119.11.42.164" "-" "-"
> "121.44.240.31" "INTERNET SERVICE PROVIDER" "ON.NET"
> "122.155.3.145" "CAT TELECOM PUBLIC COMPANY LTD" "-"
> "140.109.17.180" "T-SINICA.EDU.TW-NET" "-"
> "145.100.100.190" "UVA-MASTER-SNE-NET" "-"
> "149.9.0.57" "PSI" "BNA.COM"
> "149.9.0.58" "PSI" "BNA.COM"
> "149.9.0.59" "PSI" "BNA.COM"
> "151.15.8.46" "ITALIA ONLINE S.P.A" "15-151.IOL.IT"
> "151.16.191.218" "IUNET-BNET" "38-151.NET24.IT"
> "151.21.86.208" "FREE INTERNET DIAL-UP SERVICES" "21-151.LIBERO.IT"
> "151.23.7.196" "ITALIA ONLINE S.P.A" "PPP-POOL-23-0-10.IOL.IT"
> "151.48.43.174" "IUNET-BNET" "48-151.NET24.IT"
> "151.53.80.237" "IUNET-BNET" "38-151.NET24.IT"
> "151.54.214.97" "IUNET-BNET" "38-151.NET24.IT"
>
> Now, I want to delete some records from this file based on "ISP_NAME"
> or "DOMAIN_NAME".  I describe the details of my requirements as
> follows:
>
> 1- If a record's "ISP_NAME" and "DOMAIN_NAME" fields are both "-",
> delete it from the file.
>
> 2- Based on the given IP_ADDRESS, say, 151.48.43.174, delete the
> records which have the same "ISP_NAME" or "DOMAIN_NAME" as it has.  In
> this case, the following records should be deleted from the file:
>
> "151.48.43.174" "IUNET-BNET" "48-151.NET24.IT"
> "151.53.80.237" "IUNET-BNET" "38-151.NET24.IT"
> "151.54.214.97" "IUNET-BNET" "38-151.NET24.IT"
>
> Any hints on this issue?
>
> BR.
> --
> .: Hongyi Zhao [ hongyi.zhao AT gmail.com ] Free as in Freedom :.

Hi there,

The first step I would take is to normalize the data to make it
more manageable.

$ sed 's/" "/!/g;s/"//g' your_file |
  awk -v ip=151.48.43.174 -F\! '
    $1==ip {isp=$2; dom=$3}
    $2!="-" && $2!=isp && $3!=dom' |
  sed 's/.*/"&"/;s/!/" "/g'
"IP_ADDRESS" "ISP_NAME" "DOMAIN_NAME"
"117.18.75.235" "SUNNYVISION LIMITED" "SUNNYVISIONDATACENTRE.COM"
"121.44.240.31" "INTERNET SERVICE PROVIDER" "ON.NET"
"122.155.3.145" "CAT TELECOM PUBLIC COMPANY LTD" "-"
"140.109.17.180" "T-SINICA.EDU.TW-NET" "-"
"145.100.100.190" "UVA-MASTER-SNE-NET" "-"
"149.9.0.57" "PSI" "BNA.COM"
"149.9.0.58" "PSI" "BNA.COM"
"149.9.0.59" "PSI" "BNA.COM"
"151.15.8.46" "ITALIA ONLINE S.P.A" "15-151.IOL.IT"
"151.16.191.218" "IUNET-BNET" "38-151.NET24.IT"
"151.21.86.208" "FREE INTERNET DIAL-UP SERVICES" "21-151.LIBERO.IT"
"151.23.7.196" "ITALIA ONLINE S.P.A" "PPP-POOL-23-0-10.IOL.IT"

To see which records were removed, append this to the pipeline: | diff your_file -
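
Note that a one-pass filter like the one above only learns the ISP and
domain once it reaches the reference IP, so a matching record that
appears earlier in the file (151.16.191.218 here) is kept. Below is a
minimal two-pass sketch, assuming the rule should apply to the whole
file and that only the IP addresses are wanted as output. The function
name filter_ips, the header skip, and the guard that ignores a "-" ISP
or domain on the reference record are assumptions of mine, not part of
the original request.

```shell
# Hypothetical helper: filter_ips REFERENCE_IP FILE
# Splitting on the double quote puts the IP in $2, ISP in $4, domain in $6.
filter_ips() {
    awk -F'"' -v ip="$1" '
        # Pass 1: remember the ISP and domain of the reference IP.
        NR == FNR {
            if ($2 == ip) { isp = $4; dom = $6 }
            next
        }
        # Pass 2: skip the header line (assumed to be line 1).
        FNR == 1 { next }
        # Requirement 1: drop records whose ISP and domain are both "-".
        $4 == "-" && $6 == "-" { next }
        # Requirement 2: drop records sharing the ISP or the domain
        # (a "-" value on the reference record is ignored; assumption).
        (isp != "-" && $4 == isp) || (dom != "-" && $6 == dom) { next }
        # Requirement 3: print only the IP address.
        { print $2 }
    ' "$2" "$2"
}
```

Usage: filter_ips 151.48.43.174 your_file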

Normalizing the data first often makes this kind of filtering much easier.

-Arcege
From: Glenn Jackman on
At 2010-03-27 05:46AM, "Janis Papanagnou" wrote:
> awk -v ipaddr="151.48.43.174" '$1 ~ ipaddr {print $2}' your_file
[...]
> [*] This is, strictly speaking, not correct since the dots match any
> character in the first field, but your data seems to allow for that
> simplification.

Then don't use regular expressions: '$1 == ipaddr {print $2}'
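
One caveat, sketched under the assumption that the file still carries
the double quotes shown in the original post: with default whitespace
splitting, $1 includes the quotes and an ISP name containing spaces is
split across several fields, so the exact comparison is easier if awk
splits on the quote character instead. The helper name lookup_isp is
hypothetical.

```shell
# Hypothetical helper: lookup_isp IP FILE
# With -F'"' the IP address is $2 and the ISP name is $4.
lookup_isp() {
    awk -F'"' -v ipaddr="$1" '$2 == ipaddr { print $4 }' "$2"
}
```

An exact comparison also avoids substring matches: the regex form with
ipaddr=149.9.0.5 would match 149.9.0.57 as well, whereas
lookup_isp 149.9.0.5 your_file prints nothing.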

--
Glenn Jackman
Write a wise saying and your name will live forever. -- Anonymous