From: MRAB on
Νίκος wrote:
>> On 3 Αύγ, 21:00, Dave Angel <da...(a)ieee.org> wrote:
>
>> A string is an object containing characters. A string literal is one of
>> the ways you create such an object. When you create it that way, you
>> need to make sure the compiler knows the correct encoding, by using the
>> encoding: line at beginning of file.
>
[snip]
> Tell me something. What encoding should I pick for my scripts, knowing
> that they only contain English + Greek chars?
> ISO-8859-7 or UTF-8, and why?
>
This is easy to answer: UTF-8 with the:

# -*- coding: UTF-8 -*-

comment to tell Python that your script file is encoded in UTF-8.

I was once given a file in a language I don't know (translations for
display messages). Some of the text didn't look quite right. It took me
a while to figure out that it was written on a machine which used CP1250
and my machine used CP1252. If everybody used the same encoding then
such problems wouldn't occur, and UTF-8 can handle any characters which
are in Unicode: Latin, Greek, Cyrillic, Arabic, etc.
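
To illustrate the point, here is a minimal sketch in Python 3 syntax (the thread itself uses Python 2, so treat this as an assumption about a newer interpreter). The sample words are made up for the example. It shows that the same text becomes different bytes under ISO-8859-7 and UTF-8, and that decoding with the wrong code page gives different characters instead of an error:

```python
# -*- coding: UTF-8 -*-
text = "καλημέρα"  # Greek for "good morning"; sample text for illustration

greek_bytes = text.encode("iso-8859-7")  # one byte per Greek letter
utf8_bytes = text.encode("utf-8")        # two bytes per Greek letter

# Decoding with the encoding that produced the bytes round-trips cleanly.
assert greek_bytes.decode("iso-8859-7") == text
assert utf8_bytes.decode("utf-8") == text

# Decoding with a different code page does not fail here; it quietly
# yields other characters, like the CP1250/CP1252 mix-up described above.
polish = "żółć"  # hypothetical sample written on a CP1250 machine
mangled = polish.encode("cp1250").decode("cp1252")

print(len(greek_bytes), len(utf8_bytes))
print(mangled == polish)
```

The byte counts differ while the text is identical, which is why the reader of a file has to know which encoding the writer used.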
From: Νίκος on
For the cookie problem, I have been trying for hours now and even this isn't
working:

========================================
cookie = Cookie.SimpleCookie()

if os.environ.get('HTTP_COOKIE') and cookie.has_key('visitor') == 'nikos':  # if visitor cookie exists
    print "Cookie Unset"
    cookie['visitor'] = 'nikos'
    cookie['visitor']['expires'] = -1  # this cookie will expire now
else:
    print "Cookie is set!"
    cookie['visitor'] = 'nikos'
    cookie['visitor']['expires'] = 1000  # this cookie will expire now
========================================

I tried in the IDLE environment as well, and for some reason, even with a
single number instead of the time() function, the cookie is never set,
because the print of

>>> print os.environ.get('HTTP_COOKIE')

results in

None

:(
From: Dotan Cohen on
2010/8/4 Νίκος <nikos.the.gr33k(a)gmail.com>:
> Encodings still give me headaches. I try to understand them as
> different ways to store data on a medium.
>
> Tell me something. What encoding should I pick for my scripts, knowing
> that they only contain English + Greek chars?
> ISO-8859-7 or UTF-8, and why?
>

Always use UTF-8. Every modern system supports it, and it will let you
use any arbitrary character that you need, such as a smiley or a
Euro sign. You will avoid headaches with databases, files, and all
sorts of other things that you don't yet expect. Declare it in the
HTTP header and in the HTML meta tag.
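
As a rough sketch of what declaring it in both places can look like in a CGI script (Python 3 syntax; the function name `build_response` and the page content are made up for this example):

```python
def build_response(body_text, charset="utf-8"):
    """Return a full CGI response as bytes, declaring the charset twice."""
    html = (
        "<html><head>"
        # Declaration inside the document, for when the page is saved to disk.
        '<meta http-equiv="Content-Type" '
        'content="text/html; charset=' + charset + '">'
        "</head><body>" + body_text + "</body></html>"
    )
    # Declaration in the HTTP header; a blank line ends the header section.
    header = "Content-Type: text/html; charset=" + charset + "\r\n\r\n"
    # The body must actually be encoded with the charset that was declared.
    return header.encode("ascii") + html.encode(charset)


response = build_response("Καλημέρα")
# In a real CGI script the bytes would be written out, for example with
# sys.stdout.buffer.write(response).
print(len(response))
```

The important part is that the declared charset and the encoding used for the bytes agree.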

Trust me, I maintain gibberish.co.il which specializes in encoding
problems. Just use UTF-8 everywhere and you will save a lot of
headaches.


> Can I save the string, let's say "Νίκος", in different encodings and still
> print it out correctly in the browser?
>

No.


> ascii = the standard english character set only, right?
>

Pretty much, plus the numbers, some symbols, and a few nonprinting
characters. Read here:
http://en.wikipedia.org/wiki/Ascii


--
Dotan Cohen

http://gibberish.co.il
http://what-is-what.com
From: Steven D'Aprano on
On Tue, 03 Aug 2010 20:08:46 -0700, Νίκος wrote:

> I tried in the IDLE environment as well, and for some reason, even with a
> single number instead of the time() function, the cookie is never set,
> because the print of
>
>>>>print os.environ.get('HTTP_COOKIE')
>
> results in
>
> None


What happens if you open up a NEW xterm and do this?

echo $HTTP_COOKIE


Or, to put it another way... are you sure that the environment variable
is actually being set?
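
For what it's worth, HTTP_COOKIE is normally placed in the environment by the web server for the CGI process, and only when the browser sent a Cookie header, so a None in IDLE or a plain shell is expected. A minimal sketch of reading it, in Python 3 syntax (where the Python 2 `Cookie` module is `http.cookies`); the helper name and the fake environment are made up for the example:

```python
from http.cookies import SimpleCookie


def visitor_from_environ(environ):
    """Return the 'visitor' cookie value, or None if no such cookie was sent."""
    raw = environ.get("HTTP_COOKIE")
    if not raw:
        return None  # not running under a web server, or no cookie sent
    cookie = SimpleCookie()
    cookie.load(raw)  # parse "name=value; name2=value2"
    morsel = cookie.get("visitor")
    return morsel.value if morsel is not None else None


# Simulated CGI environment, as a web server might provide it.
fake_cgi_environ = {"HTTP_COOKIE": "visitor=nikos; theme=dark"}
print(visitor_from_environ(fake_cgi_environ))
# An empty environment stands in for IDLE or a plain shell.
print(visitor_from_environ({}))
```

Passing the environment in as a parameter makes the function testable without a web server; in a real script one would pass `os.environ`.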


--
Steven
From: Dave Angel on


Νίκος wrote:
>> On 3 Αύγ, 21:00, Dave Angel <da...(a)ieee.org> wrote:
>>
>
>
>> A string is an object containing characters. A string literal is one of
>> the ways you create such an object. When you create it that way, you
>> need to make sure the compiler knows the correct encoding, by using the
>> encoding: line at beginning of file.
>>
>
>
> mymessage = "καλημέρα" <==== string
> mymessage = u"καλημέρα" <==== string literal?
>
> So, a string literal is one of the encodings i use to create a string
> object?
>
>
No, both lines take a string literal, create an object, and bind a name
to that object. In the first case, the object is a string, and in the
second it's a unicode-string. But the literal is the stuff after the
equals sign in both these cases.

Think about numbers for a moment. When you say
salary = 4.1

you've got a numeric literal that's three characters long, and a name
that's six characters long. When the interpreter encounters this line,
it builds an object of type float, whose value approximates 4.1,
according to the language rules. It then binds the name salary to this
object.

> Can the encoding of a Python script file be iso-8859-7, which means
> the file contents are saved to the HDD as greek-iso, but the part with
> this variable value mymessage = "καλημέρα" is saved as utf-8, or the
> opposite?
>
>
A given file needs to have a single encoding, or you're in big trouble.
So a script file is encoded by the text editor in a single encoding
method, which is not saved to the file (except indirectly if you specify
BOM). It's up to you to add a line to the beginning to tell Python how
to decode the file. One decoding for one file.
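
A small sketch of the "one decoding for one file" idea, using Python 3's `tokenize.detect_encoding`, which reads the coding line (or a BOM) the way the interpreter does. The helper name and the sample sources are made up for the example:

```python
import codecs
import io
import tokenize


def declared_encoding(source_bytes):
    """Return the encoding Python would use to decode this source file."""
    readline = io.BytesIO(source_bytes).readline
    encoding, _lines_read = tokenize.detect_encoding(readline)
    return encoding


line = 'mymessage = "καλημέρα"\n'

# The same script saved by an editor in two different encodings, each with
# a matching coding line at the top.
greek_source = b"# -*- coding: iso-8859-7 -*-\n" + line.encode("iso-8859-7")
utf8_source = b"# -*- coding: utf-8 -*-\n" + line.encode("utf-8")
# No coding line at all, but the editor wrote a UTF-8 BOM.
bom_source = codecs.BOM_UTF8 + line.encode("utf-8")

for source in (greek_source, utf8_source, bom_source):
    encoding = declared_encoding(source)
    # Decoding the whole file with the declared encoding recovers the text.
    assert "καλημέρα" in source.decode(encoding)
    print(encoding)
```

Each file decodes correctly because the whole file uses the one encoding it declares; mixing two encodings in one file would leave no single declaration that works.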
> have the file saved as utf-8 but one variable value as greek
> encoding?
>
>
Variables are not saved to source (script) files. Literals are in the file.
> Encodings still give me headaches. I try to understand them as
> different ways to store data on a medium.
>
> Tell me something. What encoding should I pick for my scripts, knowing
> that they only contain English + Greek chars?
> ISO-8859-7 or UTF-8, and why?
>
>
Depends on how sure you are that your program will never need characters
outside your greek character set. Remember Y2K?

> Can I save the string, let's say "Νίκος", in different encodings and still
> print it out correctly in the browser?
>
> ascii = the standard English character set only, right?
>
>
>> The web server wraps a few characters before and after your html stream,
>> but it shouldn't touch the stream itself.
>>
>
> So the Python compiler using the cgi module is the one that
> produces the HTML output that is immediately sent to the web
> server, right?
>
>
>
>>> For example, if I say mymessage = "καλημέρα" and then I say
>>> mymessage = u"καλημέρα", then the 1st one is a Greek-encoded variable
>>> while the 2nd is a utf-8 one?
>>>
>> No, the first is an 8 bit copy of whatever bytes your editor happened to
>> save.
>>
>
> But since mymessage = "καλημέρα" is a string containing Greek
> characters, why doesn't the editor save it as such?
>
>
Because the editor is editing text, not python objects. Its job is
solely to represent all your keystrokes in some consistent manner so
that they can be interpreted later by some other program, possibly a
compiler.
> It reminds me of variables and values, where if you say
>
> a = 5, a var becomes instantly an integer variable
> while
> a = 'hello' , become instantly a string variable
>
>
>
>> mymessage = u"καλημέρα"
>>
>> creates an object that is *not* encoded.
>>
>
> Because it isn't saved by the editor yet? In what state is this object
> before it gets encoded?
> And does it get encoded the minute I tell the editor to save the file?
>
>
You're confusing timeframes here. Notepad++ doesn't know Python, and
it's long gone by the time the compiler deals with that line. In
Notepad++, there are no python objects, encoded or not.
>> Encoding is taking the unicode
>> stream and representing it as a stream of bytes, which may or may not have
>> more bytes than the original has characters.
>>
>
>
> So what this line mymessage = u"καλημέρα" does is tell the browser
> that when it's time to save the whole file, it should save this string as
> utf-8?
>
>
No idea what you mean. The browser isn't saving anything; it doesn't
even get involved till after the python code has completed.
> If yes, then if I were to save the above string in a Greek encoding, how
> was I supposed to write it?
>
> Also, if you use the 'coding line' at the beginning of the file, is there
> a need for using the u literal?
>
>
If you don't use the u literal, then don't even try to use utf-8. You'll
find that strings have the wrong lengths, and therefore subscripts and
formatting will sometimes fail in strange ways.
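
A short sketch of the length and subscript problem. It is written in Python 3 syntax, where the Python 2 unicode-string corresponds to `str` and the Python 2 byte string corresponds to `bytes`; that mapping is the assumption behind the example:

```python
text = "καλημέρα"            # text: like u"καλημέρα" in Python 2
raw = text.encode("utf-8")   # bytes: like a plain "καλημέρα" in a UTF-8 file

print(len(text))  # counts characters
print(len(raw))   # counts bytes; each Greek letter takes two in UTF-8

# Slicing text works on characters.
first_four_chars = text[:4]

# Slicing bytes works on bytes, so four bytes hold only two letters.
first_four_bytes = raw[:4].decode("utf-8")

# An odd-sized slice cuts a letter in half and cannot be decoded.
try:
    raw[:3].decode("utf-8")
    cut_is_valid = True
except UnicodeDecodeError:
    cut_is_valid = False

print(first_four_chars == first_four_bytes, cut_is_valid)
```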
>> I personally haven't done any cookie code. If I were debugging this, I'd
>> factor out the multiple parts of that if statement, and find out which
>> one isn't true. From here I can't guess.
>>
>
> I did what you said and found out that both parts of the if condition
> were always false; that's why the if code block never got executed.
>
> And it is always false because the cookie never gets set.
>
> So can you please tell me why this line
>
> cookie['visitor'] = 'nikos', time() + 60*60*24*365 ) #this cookie
> will expire in an year
>
> never created a cookie?
>
>
As I said, I've never coded with cookies. But to create a cookie, you
have to communicate with a browser, and that takes lots more than just
adding an item to a map. Further, your getenv() will normally give you
the state of the environment at the time your program was launched, so I
wouldn't expect it to change.

If I had to guess how cookies are done in CGI, I'd say that you probably
have to talk to the CGI server and terminate, and that afterwards
there'd be a new launch of your code, from which you could check that
environment variable to see the results of the cookie.
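
That guess matches how cookies are usually handled: the script emits a Set-Cookie header as part of its response, and the browser sends the cookie back on a later request. A minimal sketch in Python 3 syntax (`http.cookies` replaces the Python 2 `Cookie` module); the helper name is made up, and the second request is only simulated here:

```python
from http.cookies import SimpleCookie


def visitor_cookie_header(value, max_age_seconds):
    """Build the Set-Cookie header line a CGI script prints before its body."""
    cookie = SimpleCookie()
    cookie["visitor"] = value
    cookie["visitor"]["max-age"] = max_age_seconds  # lifetime in seconds
    return cookie.output()  # e.g. a line starting with "Set-Cookie: "


# First launch: the script asks the browser to store the cookie for a year.
header = visitor_cookie_header("nikos", 60 * 60 * 24 * 365)
print(header)

# A later launch: the browser sends the cookie back and the server exposes
# it as HTTP_COOKIE. The value below is simulated for the example.
returned = SimpleCookie()
returned.load("visitor=nikos")
print(returned["visitor"].value)
```

Assigning to the cookie object alone changes nothing in the browser; the header has to be printed with the other response headers, before the blank line that ends them.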


DaveA