From: srikanth on
On May 28, 4:21 am, Thomas 'PointedEars' Lahn <PointedE...(a)web.de>
wrote:
> srikanth wrote:
> > Thomas 'PointedEars' Lahn wrote:
> >> You do not need a browser as you only want to make an HTTP request.
> >> libwww-perl contains `HEAD' (an alias for lwp-request(1p)) whose option
> >> -d does something very similar to what you describe.
>
> >> Web site content can vary based on the request headers; you can use
> >> HEAD's -H option to send specific HTTP headers, like User-Agent,
> >> Accept-Language, and so on.  You can even log in to HTTP Auth-protected sites
> >> with the -C option.
>
> >> [examples]
>
> >> Reading the URI from a file and doing that in a loop is left as an
> >> exercise to the reader.
>
> > Thanks for the detailed info and also about the libwww-perl.
>
> You're welcome.
>
> > This is the first time I have seen these utilities. It would be great if
> > you could give me more useful info on command line browsers like the ones
> > you have just described.
>
> I do not understand your question.  I have not referred you to "command line
> browsers"; I have said that you do not need *any* Web browser as you only
> want to make an HTTP request.  All you need is an HTTP client implementation
> (which is also part of or used by Web browsers).
>
> In fact, using a plain-text browser like lynx(1), links(1) or w3m(1) might
> result in quite different response status codes, as I indicated -- obviously
> Google is filtering either in favor of the substring "Firefox" or against
> the substring "lwp-request" in their Groups application, BTW a misguided
> approach:
>
> $ nc -lp 1337 &
> [1] 21574
> $ HEAD http://localhost:1337
> HEAD / HTTP/1.1
> TE: deflate,gzip;q=0.3
> Connection: TE, close
> Host: localhost:1337
> User-Agent: lwp-request/5.810
>
> ^C
> [1]+  Done                    nc -lp 1337
>
> > Is there any way to get the response by using  wget, curl?
>
> Yes.
>
> Please trim your quotes to the relevant minimum, usually do not quote
> signatures.
>
> --
> PointedEars

How can I print each URL from the file along with its HTTP status? If I
do it like this:
for i in `cat $1`
do
HEAD -d $i
done

it shows only the HTTP status. Instead I want the HTTP status along
with the URL it corresponds to.
Ex: http://www.google.com - 200 OK
From: srikanth on
On May 30, 12:21 pm, srikanth <srikanth0...(a)gmail.com> wrote:
> How can I print each URL from the file along with its HTTP status? If I
> do it like this:
> for i in `cat $1`
> do
> HEAD -d $i
> done
>
> it shows only the HTTP status. Instead I want the HTTP status along
> with the URL it corresponds to.
> Ex: http://www.google.com - 200 OK

I tried it like this and was able to get what I want:

#!/bin/bash
# Print "URL - status" for every URL listed in the file given as $1
for i in $(cat "$1")
do
    echo "$i - $(HEAD -d "$i")"
done
printf "\n"
printf "Total number of URLs processed --> %s\n" "$(wc -l < "$1")"
exit 0

output:
http://yahoo.com - 200 OK

One more thing I need to know: how do I redirect standard output to a
file? What do I put in the script to log its standard output to a
file?

From: Bit Twister on
On Sun, 30 May 2010 02:48:39 -0700 (PDT), srikanth wrote:


> One more thing I need to know: how do I redirect standard output to a
> file? What do I put in the script to log its standard output to a
> file?

You may want to bookmark this URL:
http://tldp.org/LDP/abs/html/index.html

Go to Part 2, Chapter 3, "Special Characters", and search for
"redirection" to find an example. Click the redirection link for a
detailed explanation.
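
As a rough sketch of the idea (not taken from that guide): a script can send everything it prints to a file by using `exec` with a redirection near the top. The LOGFILE variable and the default log path are assumptions made up for this illustration, and the echo line only stands in for the real loop output.

```shell
#!/bin/bash
# Sketch: redirect this script's standard output to a log file.
# LOGFILE and the default path are hypothetical, chosen for illustration.
logfile=${LOGFILE:-/tmp/browse.log}

exec 3>&1            # keep a copy of the original stdout on fd 3
exec > "$logfile"    # from here on, stdout goes to the log file

echo "http://yahoo.com - 200 OK"   # stand-in for the loop's output

exec >&3 3>&-        # restore the original stdout and close fd 3
echo "Output logged to $logfile"
```

Without touching the script at all, running it as `./browse.sh /tmp/tobrowse.lst > /tmp/browse.log` should have the same effect; `>>` appends instead of overwriting.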


FYI: I would change the "for loop" using cat to a while loop:

while read -r i ; do
    echo "$i - $(HEAD -d "$i")"
done < "$1"
From: srikanth on
On May 30, 6:32 pm, Bit Twister <BitTwis...(a)mouse-potato.com> wrote:
> On Sun, 30 May 2010 02:48:39 -0700 (PDT), srikanth wrote:
> > One more thing I need to know: how do I redirect standard output to a
> > file? What do I put in the script to log its standard output to a
> > file?
>
> You may want to bookmark this url,
>    http://tldp.org/LDP/abs/html/index.html
>
> Part 2, Chapter 3 Special Characters and search for "redirection" will
> give an example. Click the redirection link for a detailed explanation.
>
> FYI: I would change the "for loop" using cat to a while loop.
>
> while read -r i ; do
>
> done < $1

Thanks, Twister, for sharing this valuable information. It will be very
helpful to me.
BTW, when I change to a while loop, it shows the HEAD usage message at
the end of the script execution.

sh$ ./browse.sh /tmp/tobrowse.lst
http://yahoo.com - 200 OK
http://orkut.com - 200 OK
http://google.co.in - 200 OK
http://google.com - 200 OK
http://rediff.com - 200 OK
http://msn.com - 200 OK
http://youtube.com - 200 OK
http://naukri.com - 200 OK
http://blogger.com - 405 Method Not Allowed
http://live.com - 404 Not Found
http://rapidshare.com - 200 OK
http://wikipedia.org - 403 Forbidden
http://indiatimes.com - 404 Not Found
Usage: HEAD [-options] <url>...
    -m <method>   use method for the request (default is 'HEAD')
    -f            make request even if HEAD believes method is illegal
    -b <base>     Use the specified URL as base
    -t <timeout>  Set timeout value
    -i <time>     Set the If-Modified-Since header on the request
    -c <conttype> use this content-type for POST, PUT, CHECKIN
    -a            Use text mode for content I/O
    -p <proxyurl> use this as a proxy
    -P            don't load proxy settings from environment
    -H <header>   send this HTTP header (you can specify several)
    -C <username>:<password>
                  provide credentials for basic authentication

    -u            Display method and URL before any response
    -U            Display request headers (implies -u)
    -s            Display response status code
    -S            Display response status chain
    -e            Display response headers
    -d            Do not display content
    -o <format>   Process HTML content in various ways

    -v            Show program version
    -h            Print this message

    -x            Extra debugging output
-

Am I using HEAD the wrong way?
From: Bit Twister on
On Sun, 30 May 2010 19:08:01 -0700 (PDT), srikanth wrote:

> BTW, when i change to a while loop it is showing HEAD usage at the end
> of the script execution.
> Am i using HEAD in a wrong way?

I can only guess that the last line(s) of the input file are not what
HEAD expected.
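
If that guess is right, an empty line at the end of the list would make `read` hand back an empty string, so HEAD would be run without a URL and print its usage. A `for` loop over `cat` output never sees such a line because word splitting drops it. Below is a minimal sketch that skips empty lines; get_status and check_urls are hypothetical names, with get_status only wrapping the HEAD -d call used earlier in the thread.

```shell
#!/bin/bash
# Sketch: print "URL - status" for each non-empty line of a URL list.
# get_status and check_urls are hypothetical helper names.

get_status() {
    # Wraps the call from the thread; -d suppresses the response content.
    HEAD -d "$1"
}

check_urls() {
    # $1 = file with one URL per line.
    # "|| [ -n "$url" ]" also handles a last line with no trailing newline.
    while read -r url || [ -n "$url" ]; do
        # Skip empty (or whitespace-only) lines instead of calling HEAD with no URL.
        [ -z "$url" ] && continue
        echo "$url - $(get_status "$url")"
    done < "$1"
}
```

It could then be called as `check_urls /tmp/tobrowse.lst`, optionally with `> /tmp/browse.log` appended to log the result.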