From: Rahul on 16 Apr 2008 19:32

ebenZEROONE(a)verizon.net (Hactar) wrote in
news:4maid5-02j.ln1(a)royalty.mine.nu:

> Try setting LANG to en.US or C.

Thanks eben! Partial success.

Tried both en.US and C: lynx and w3m still are not cured.

But elinks now does not spit out the strange characters anymore:

It uses the "compromise" sort of a "poor-man's-umlaut": u-umlaut = ue, o-umlaut = oe, etc. (common convention).

Could live with that; unless someone has any other ideas to get my console to print "real" umlauts! :)

-- 
Rahul
From: Hactar on 17 Apr 2008 00:07

In article <Xns9A82BC9BB4C4Anospamnospamcom(a)85.214.90.236>,
Rahul <nospam(a)nospam.invalid> wrote:
> ebenZEROONE(a)verizon.net (Hactar) wrote in
> news:4maid5-02j.ln1(a)royalty.mine.nu:
>
> > Try setting LANG to en.US or C.
>
> Thanks eben! Partial success.
>
> Tried both en.US and C: lynx and w3m still are not cured.
>
> But elinks now does not spit out the strange characters anymore:
>
> It uses the "compromise" sort of a "poor-man's-umlaut": u-umlaut = ue,
> o-umlaut = oe, etc. (common convention).
>
> Could live with that; unless someone has any other ideas to get my console to
> print "real" umlauts! :)

LANG=de.DE?

-- 
-eben QebWenE01R(a)vTerYizUonI.nOetP royalty.mine.nu:81
Your pretended fear lest error might step in is like the man who would
keep all wine out of the country lest men should be drunk.
-- Oliver Cromwell
From: Rahul on 17 Apr 2008 12:16

ebenZEROONE(a)verizon.net (Hactar) wrote in
news:49sid5-kj5.ln1(a)royalty.mine.nu:

> LANG=de.DE?

Thanks eben! Almost... LANG=de_DE seems to be the option that works for me. I don't know why, but only that one works! Now all my umlauts are perfect. Thanks for all those helpful leads, guys.

Although, there's still a problem with apostrophes. It was there with EN and persists now. All my apostrophes seem to be rendered as '*' by elinks.

elinks -no-references -no-numbering -dump 'http://translate.google.com/translate_dict?q=dog&hl=en&langpair=en%7Cde'

the dog doesn't bite ==> the dog doesn*t bite etc.

This is a funny one. Tried looking at the curl output for the raw html.

curl -s 'http://translate.google.com/translate_dict?q=dog&hl=en&langpair=en%7Cde'

the <span class="highlight">dog</span> doesn’t bite </span><br>

On my screen I again see a space (probably an unprintable character). But when I copy-paste it into Xnews here, my apostrophe is visible again!

Sorry guys, I seem to have a really messed up terminal!

-- 
Rahul
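A likely reason only LANG=de_DE worked: on most Unix systems locale names are spelled language_TERRITORY with an underscore, optionally followed by a codeset (for example de_DE.UTF-8), so en.US and de.DE probably do not name any installed locale. The following is a minimal sketch, not something posted in the thread. The helper name with_lang is hypothetical, and the sketch assumes a de_DE locale is actually installed on the machine.

```shell
# List installed locales whose names start with de_ (prints a note if none).
locale -a 2>/dev/null | grep -i '^de_' || echo "no de_* locale installed"

# Hypothetical helper: run one command with LANG set for that command only,
# leaving the current shell's environment untouched.
# Note: if LC_ALL is set in the environment, it overrides LANG.
with_lang() {
    l=$1
    shift
    env LANG="$l" "$@"
}

# Example use (the URL variable is a placeholder):
#   with_lang de_DE elinks -no-references -no-numbering -dump "$url"
with_lang de_DE sh -c 'echo "LANG is $LANG"'
```

Setting the variable per command this way makes it easy to compare how the same browser renders the page under different locales without editing any profile files.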
From: Hactar on 17 Apr 2008 17:07

In article <Xns9A83724BEDA9nospamnospamcom(a)85.214.90.236>,
Rahul <nospam(a)nospam.invalid> wrote:
> ebenZEROONE(a)verizon.net (Hactar) wrote in
> news:49sid5-kj5.ln1(a)royalty.mine.nu:
>
> > LANG=de.DE?
>
> Thanks eben! Almost... LANG=de_DE seems to be the option that works for
> me. I don't know why, but only that one works! Now all my umlauts are
> perfect. Thanks for all those helpful leads, guys.
>
> Although, there's still a problem with apostrophes. It was there with EN and
> persists now. All my apostrophes seem to be rendered as '*' by elinks.
>
> elinks -no-references -no-numbering -dump
> 'http://translate.google.com/translate_dict?q=dog&hl=en&langpair=en%7Cde'
>
> the dog doesn't bite ==> the dog doesn*t bite etc.
>
> This is a funny one. Tried looking at the curl output for the raw html.
>
> curl -s 'http://translate.google.com/translate_dict?q=dog&hl=en&langpair=en%7Cde'
>
> the <span class="highlight">dog</span> doesn’t bite </span><br>

That's not a quote (0x27), it's an 0x92, some sort of "smart quote", I presume. Your browser may render it as a quote, but that's not relevant here. "pr" may fix that, "tr" or "sed" definitely will. "curl" may have a relevant option.

Just for kicks, "od" wouldn't show that:

0003460 64 6f 65 73 6e 27 74 20 62 69 74 65 20 3d 3d 3e >doesn't bite ==><
                       ^^
so I had to use less:

the <span class="highlight">dog</span> doesn<92>t bite </span><br>
                                             ^^
Anyone know a more reliable method?

-- 
-eben
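The "tr" route mentioned above can be sketched as follows. This is an illustration under stated assumptions rather than anything posted in the thread: sample and fix_quote are hypothetical helper names, and the sample line merely stands in for the curl output. The hex dump step is one way to check for the 0x92 byte without depending on how the terminal renders it.

```shell
# Sample line containing the byte 0x92 (octal 222), standing in for the
# raw HTML that curl returned.
sample() { printf 'doesn\222t bite\n'; }

# Print each byte in hex, so the smart quote shows up as "92" no matter
# how the terminal would display it.
sample | LC_ALL=C od -An -tx1

# Replace every 0x92 byte with a plain ASCII apostrophe (0x27).
# LC_ALL=C makes tr treat the input strictly as single bytes.
fix_quote() { LC_ALL=C tr '\222' "'"; }

sample | fix_quote    # prints: doesn't bite
```

In practice the filter would sit between the fetch and the display, for example curl -s "$url" | fix_quote, where $url is a placeholder.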
From: Bill Marcum on 17 Apr 2008 17:18
["Followup-To:" header set to comp.unix.shell.]
On 2008-04-17, Hactar <ebenZEROONE(a)verizon.net> wrote:
>
> That's not a quote (0x27), it's an 0x92, some sort of "smart quote", I
> presume. Your browser may render it as a quote, but that's not relevant
> here. "pr" may fix that, "tr" or "sed" definitely will. "curl" may have
> a relevant option.
>
> Just for kicks, "od" wouldn't show that:
>
> 0003460 64 6f 65 73 6e 27 74 20 62 69 74 65 20 3d 3d 3e >doesn't bite ==><
>                        ^^
> so I had to use less:
>
> the <span class="highlight">dog</span> doesn<92>t bite </span><br>
>                                              ^^
> Anyone know a more reliable method?
>

It's strange that od and less show different characters. Were you using the contents of a file or a pipe? Try "LANG=C od" or pipe the text through "recode cp1252..iso-8859-15" or "recode cp1252..utf-8".
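For systems without recode, the cp1252-to-UTF-8 conversion suggested above can be approximated with iconv. This is a sketch that swaps in iconv for recode, assuming an iconv that knows the CP1252 charset name; cp1252_to_utf8 is a hypothetical helper name. In cp1252 the byte 0x92 is the right single quotation mark, which becomes the three-byte sequence e2 80 99 in UTF-8.

```shell
# Convert cp1252 input on stdin to UTF-8 on stdout.
# iconv is used here as a stand-in for "recode cp1252..utf-8".
cp1252_to_utf8() { iconv -f CP1252 -t UTF-8; }

# Feed in a line containing 0x92 (octal 222) and dump the result in hex;
# the smart quote should now appear as the bytes e2 80 99.
printf 'doesn\222t bite\n' | cp1252_to_utf8 | LC_ALL=C od -An -tx1
```

Whether the converted character then displays correctly still depends on the terminal and locale using UTF-8; on a Latin-1 terminal, replacing the byte with a plain apostrophe may be the more practical choice.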