From: Rahul on 16 Apr 2008 00:29

Is there a way to convert a html snippet "sensibly" to ascii plain-text.
I just want to display a no-frills version of this google translate
query quickly from the command-line:

curl -s 'http://translate.google.com/translate_dict?q=cat&hl=en&langpair=en%7Cde'

"cat" could be replaced by "dog" "beer" whatever and lo and behold I've
a German translation on the command line (I wish!). This snippet throws
a load of html at me. Is there an easy way to convert it to a
"displayable" format? Basically just column-formatting or at most using
bold etc. that my xterm-color console can support. html has all this
info embedded in its tags, right? So looks possible in theory; just
wondering what's the best tool for the job.

I have no intention of browsing further from that page so lynx seems
like overkill.

--
Rahul

From: Barry Margolin on 16 Apr 2008 01:03

In article <Xns9A81EEFBFD8FAnospamnospamcom(a)85.214.90.236>,
Rahul <nospam(a)nospam.invalid> wrote:

> Is there a way to convert a html snippet "sensibly" to ascii plain-text.
> I just want to display a no-frills version of this google translate
> query quickly from the command-line:
>
> curl -s 'http://translate.google.com/translate_dict?q=cat&hl=en&langpair=en%7Cde'
>
> "cat" could be replaced by "dog" "beer" whatever and lo and behold I've
> a German translation on the command line (I wish!). This snippet throws
> a load of html at me. Is there an easy way to convert it to a
> "displayable" format? Basically just column-formatting or at most using
> bold etc. that my xterm-color console can support. html has all this
> info embedded in its tags, right? So looks possible in theory; just
> wondering what's the best tool for the job.
>
> I have no intention of browsing further from that page so lynx seems
> like overkill.

How about the -dump option to lynx? It just displays the result,
without going into an interactive browser.

--
Barry Margolin, barmar(a)alum.mit.edu
Arlington, MA
*** PLEASE post questions in newsgroups, not directly to me ***
*** PLEASE don't copy me on replies, I'll read them in the group ***
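
The -dump suggestion can be wrapped so that only the word changes between calls. What follows is a minimal sketch rather than anything posted in the thread: the function names translate_url and translate are made up for illustration, and it assumes lynx is installed and that the URL from the original post still answers.

```shell
# Build the dictionary URL from the original post for a given word.
# Only the q= parameter changes; "en%7Cde" is "en|de" URL-encoded.
# Note: the word itself is not URL-encoded here, so stick to plain words.
translate_url() {
    printf 'http://translate.google.com/translate_dict?q=%s&hl=en&langpair=en%%7Cde\n' "$1"
}

# Fetch the page and print it as plain text without starting an
# interactive session. Hypothetical helper; nothing runs until it is called.
translate() {
    lynx -dump "$(translate_url "$1")"
}
```

After sourcing this in a shell, `translate cat` or `translate beer` would print the rendered page to the terminal.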

From: Rahul on 16 Apr 2008 12:43

Stephane CHAZELAS <this.address(a)is.invalid> wrote in
news:slrng0bd0d.8cn.stephane.chazelas(a)spam.is.invalid:

> See elinks or w3m. In the old ages, you would have used lynx,
> but it's quite bad on tables and frames.
>
> Compare:
>
> elinks -no-references -no-numbering -dump \
> 'http://translate.google.com/translate_dict?q=cat&hl=en&langpair=en%7Cde'
>
> w3m -dump \
> 'http://translate.google.com/translate_dict?q=cat&hl=en&langpair=en%7Cde'
>
> lynx -dump -nolist \
> 'http://translate.google.com/translate_dict?q=cat&hl=en&langpair=en%7Cde'

I like these options much better. Thanks Stephane!

I only have to solve some font issues now. There seems to be a problem
with all three:

d�nne Eisschicht --> dünne Eisschicht
K�tzin --> Kätzin
Hühner --> Hühner

Seems like something to do with umlaut rendering in my font set. Any
ideas?

--
Rahul
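
One common cause of output like this, though nothing in the thread confirms it, is a charset mismatch rather than a font problem: the dump is written in one encoding (say ISO-8859-1) while the terminal expects another (say UTF-8). A rough sketch of recoding the text with iconv is below; recode is a hypothetical helper name, and both charset names are assumptions to be replaced with whatever the page and terminal actually use.

```shell
# Convert text on stdin from one named charset to another.
# Thin wrapper around iconv; $1 is the source charset, $2 the target.
recode() {
    iconv -f "$1" -t "$2"
}

# Example: "dünne" with the ü stored as the single ISO-8859-1 byte 0xFC
# (octal 374), recoded to UTF-8 so a UTF-8 terminal can display it.
printf 'd\374nne Eisschicht\n' | recode ISO-8859-1 UTF-8
```

Running `locale charmap` shows which charset the terminal session expects. The browsers also have their own charset switches (lynx -display_charset=, w3m -O, elinks -dump-charset), which are worth checking in the respective man pages before adding a separate conversion step.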

From: Allodoxaphobia on 16 Apr 2008 14:12

On Wed, 16 Apr 2008 10:08:45 +0200 (CEST), Stephane CHAZELAS wrote:
> 2008-04-16, 04:29(+00), Rahul:
>> Is there a way to convert a html snippet "sensibly" to ascii plain-text.
>> I just want to display a no-frills version of this google translate
>> query quickly from the command-line:
>>
>> curl -s 'http://translate.google.com/translate_dict?q=cat&hl=en&langpair=en%7Cde'
> [...]
>
> See elinks or w3m. In the old ages, you would have used lynx,
> but it's quite bad on tables and frames.
>
> Compare:
>
> lynx -dump -nolist \
> 'http://translate.google.com/translate_dict?q=cat&hl=en&langpair=en%7Cde'

lynx won't work unless you spoof the useragent.

$ lynx -dump -nolist 'http://google.com/'
Error
Bad Request

Your client has issued a malformed or illegal request.
Please see Google's Terms of Service posted at
http://www.google.com/terms_of_service.html

......

They have no compunction about crawling all over your web site,
indexing all your images, and enabling email and usenet spam.
But, gawd forbid that you might try to use a text-only browser
to visit their website(s).

Jonesy
--
Marvin L Jones   | jonz         | W3DHJ  | linux
38.24N 104.55W   | @ config.com | Jonesy | OS/2
*** Killfiling google posts: <http://jonz.net/ng.htm>

From: Allodoxaphobia on 16 Apr 2008 14:23

On 16 Apr 2008 18:12:43 GMT, Allodoxaphobia wrote:
> On Wed, 16 Apr 2008 10:08:45 +0200 (CEST), Stephane CHAZELAS wrote:
>> 2008-04-16, 04:29(+00), Rahul:
>>> Is there a way to convert a html snippet "sensibly" to ascii plain-text.
>>> I just want to display a no-frills version of this google translate
>>> query quickly from the command-line:
>>>
>>> curl -s 'http://translate.google.com/translate_dict?q=cat&hl=en&langpair=en%7Cde'
>> [...]
>>
>> See elinks or w3m. In the old ages, you would have used lynx,
>> but it's quite bad on tables and frames.
>>
>> Compare:
>>
>> lynx -dump -nolist \
>> 'http://translate.google.com/translate_dict?q=cat&hl=en&langpair=en%7Cde'
>
> lynx won't work unless you spoof the useragent.
>
> $ lynx -dump -nolist 'http://google.com/'
> Error
> Bad Request
>
> Your client has issued a malformed or illegal request.
> Please see Google's Terms of Service posted at
> http://www.google.com/terms_of_service.html
>
> ......

mea culpa. I now see this is b0rk3d on only one of the machines here.
And, Murphy's Law required that it be _my_ workstation. sigh...

OK, now to slink off and find out what the problem is on this box.

Jonesy
--
Marvin L Jones   | jonz         | W3DHJ  | linux
38.24N 104.55W   | @ config.com | Jonesy | OS/2
*** Killfiling google posts: <http://jonz.net/ng.htm>
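
In this thread the failure turned out to be local to one machine, but for a site that really does reject a client by its User-Agent header, the fetch and the rendering can be split so the header is set explicitly. This is only a sketch under assumptions: fetch_as is a made-up helper name, the agent string is whatever the caller supplies, and both curl and lynx are assumed to be installed.

```shell
# Fetch a URL while sending a caller-chosen User-Agent header, then render
# the returned HTML as plain text.
#   $1 = User-Agent string to send, $2 = URL to fetch
fetch_as() {
    # curl: -s hides the progress meter, -A sets the User-Agent header.
    # lynx: -stdin reads the HTML from the pipe, -dump prints it and exits,
    #       -nolist omits the list of link references.
    curl -s -A "$1" "$2" | lynx -dump -nolist -stdin
}
```

A call would look like `fetch_as 'Mozilla/5.0' 'http://google.com/'`. Letting curl do the fetch keeps the header handling in one place, and the text browser is used only as an HTML-to-text filter.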