From: T on
All I'm looking to do is to download messages from a POP account and
retrieve the sender and subject from their headers. Right now I'm 95%
of the way there, except I can't seem to figure out how to *just* get
the headers. Problem is, certain email clients also include headers
in the message body (e.g. if you're replying to a message), and these
are all picked up as additional senders/subjects. So, I want to avoid
processing anything from the message body.

Here's a sample of what I have:

    # For each line in message
    for j in M.retr(i+1)[1]:
        # Create email message object from returned string
        emailMessage = email.message_from_string(j)
        # Get fields
        fields = emailMessage.keys()
        # If email contains "From" field
        if emailMessage.has_key("From"):
            # Get contents of From field
            from_field = emailMessage.__getitem__("From")

I also tried using the following, but got the same results:
    emailMessage = email.Parser.HeaderParser().parsestr(j, headersonly=True)

Any help would be appreciated!
From: MRAB on
T wrote:
> All I'm looking to do is to download messages from a POP account and
> retrieve the sender and subject from their headers. Right now I'm 95%
> of the way there, except I can't seem to figure out how to *just* get
> the headers. Problem is, certain email clients also include headers
> in the message body (i.e. if you're replying to a message), and these
> are all picked up as additional senders/subjects. So, I want to avoid
> processing anything from the message body.
>
> Here's a sample of what I have:
>
>     # For each line in message
>     for j in M.retr(i+1)[1]:
>         # Create email message object from returned string
>         emailMessage = email.message_from_string(j)
>         # Get fields
>         fields = emailMessage.keys()
>         # If email contains "From" field
>         if emailMessage.has_key("From"):
>             # Get contents of From field
>             from_field = emailMessage.__getitem__("From")
>
> I also tried using the following, but got the same results:
>     emailMessage = email.Parser.HeaderParser().parsestr(j, headersonly=True)
>
> Any help would be appreciated!

If you're using poplib then use ".top" instead of ".retr".
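
A minimal sketch of this suggestion, assuming Python 3 (where poplib returns each line as bytes) rather than the Python 2 used in the thread. The function name fetch_headers and the parameter conn are illustrative, not from the original post. The key point is that the lines returned by .top are joined back together and parsed once as a single header block, instead of being parsed one line at a time.

```python
from email.parser import BytesHeaderParser


def fetch_headers(conn, msg_num):
    """Return the headers of message msg_num as an email Message object.

    conn is assumed to be a logged-in poplib.POP3 or poplib.POP3_SSL
    instance (hypothetical parameter name).
    """
    # top(which, 0) asks the server for the headers plus zero body lines.
    _response, lines, _octets = conn.top(msg_num, 0)
    # The lines come back without line terminators, so rejoin them
    # before handing the whole header block to the parser in one go.
    return BytesHeaderParser().parsebytes(b"\r\n".join(lines))
```

With a real connection this would be called as fetch_headers(conn, i + 1), after which msg["From"] and msg["Subject"] give the sender and subject.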
From: Grant Edwards on
On 2010-03-11, T <misceverything(a)gmail.com> wrote:
> All I'm looking to do is to download messages from a POP account and
> retrieve the sender and subject from their headers. Right now I'm 95%
> of the way there, except I can't seem to figure out how to *just* get
> the headers.

The headers are separated from the body by a blank line.

> Problem is, certain email clients also include headers in the message
> body (i.e. if you're replying to a message), and these are all picked
> up as additional senders/subjects. So, I want to avoid processing
> anything from the message body.

Then stop when you see a blank line.

Or retrieve just the headers.
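
The "stop at the blank line" approach could look roughly like the sketch below. It assumes Python 3 and a list of bytes lines such as the one .retr returns. The helper names header_lines and parse_header_lines are made up for illustration.

```python
from email.parser import BytesHeaderParser


def header_lines(lines):
    """Return only the lines that come before the first blank line."""
    head = []
    for line in lines:
        # An empty line marks the end of the headers; everything after
        # it is message body and is ignored.
        if not line:
            break
        head.append(line)
    return head


def parse_header_lines(lines):
    """Parse the header portion of a message given as a list of lines."""
    return BytesHeaderParser().parsebytes(b"\r\n".join(header_lines(lines)))
```

Any "From:" or "Subject:" lines that a mail client copied into the body fall after the blank line, so they never reach the parser.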

--
Grant Edwards               grant.b.edwards at gmail.com
Yow! My life is a patio of fun!
From: T on
On Mar 11, 3:13 pm, MRAB <pyt...(a)mrabarnett.plus.com> wrote:
> T wrote:
> > All I'm looking to do is to download messages from a POP account and
> > retrieve the sender and subject from their headers.  Right now I'm 95%
> > of the way there, except I can't seem to figure out how to *just* get
> > the headers.  Problem is, certain email clients also include headers
> > in the message body (i.e. if you're replying to a message), and these
> > are all picked up as additional senders/subjects.  So, I want to avoid
> > processing anything from the message body.
>
> > Here's a sample of what I have:
>
> >                 # For each line in message
> >                 for j in M.retr(i+1)[1]:
> >                     # Create email message object from returned string
> >                     emailMessage = email.message_from_string(j)
> >                     # Get fields
> >                     fields = emailMessage.keys()
> >                     # If email contains "From" field
> >                     if emailMessage.has_key("From"):
> >                         # Get contents of From field
> >                         from_field = emailMessage.__getitem__("From")
>
> > I also tried using the following, but got the same results:
> >                 emailMessage = email.Parser.HeaderParser().parsestr(j, headersonly=True)
>
> > Any help would be appreciated!
>
> If you're using poplib then use ".top" instead of ".retr".

I'm still having the same issue, even with .top. Am I missing
something?

    for j in M.top(i+1, 0)[1]:
        emailMessage = email.message_from_string(j)
        #emailMessage = email.Parser.HeaderParser().parsestr(j, headersonly=True)
        # Get fields
        fields = emailMessage.keys()
        # If email contains "From" field
        if emailMessage.has_key("From"):
            # Get contents of From field
            from_field = emailMessage.__getitem__("From")

Is there another way I should be using to retrieve only the headers
(not those in the body)?
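
One thing that may be contributing here: the loop above hands every individual line to the parser as if it were a complete message, so any line that looks like a header is reported as one. The sketch below (Python 3 assumed; the sample data and function names are invented for illustration) contrasts that with joining the lines and parsing the message once, which lets the parser stop treating lines as headers at the first blank line.

```python
import email

# Invented sample: a reply whose body quotes the headers of an older message.
sample_lines = [
    b"From: alice@example.com",
    b"Subject: Re: lunch",
    b"",
    b"On Monday Bob wrote:",
    b"From: bob@example.com",
    b"Subject: lunch",
]


def senders_per_line(lines):
    """Parse each line as its own message, as the loop in the post does."""
    found = []
    for line in lines:
        msg = email.message_from_bytes(line)
        if "From" in msg:
            found.append(msg["From"])
    return found


def senders_whole_message(lines):
    """Join the lines and parse once; only real headers are reported."""
    msg = email.message_from_bytes(b"\r\n".join(lines))
    return msg.get_all("From", [])
```

On the sample data, senders_per_line returns both addresses, while senders_whole_message returns only alice@example.com.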
From: MRAB on
T wrote:
> On Mar 11, 3:13 pm, MRAB <pyt...(a)mrabarnett.plus.com> wrote:
>> T wrote:
>>> All I'm looking to do is to download messages from a POP account and
>>> retrieve the sender and subject from their headers. Right now I'm 95%
>>> of the way there, except I can't seem to figure out how to *just* get
>>> the headers. Problem is, certain email clients also include headers
>>> in the message body (i.e. if you're replying to a message), and these
>>> are all picked up as additional senders/subjects. So, I want to avoid
>>> processing anything from the message body.
>>> Here's a sample of what I have:
>>>     # For each line in message
>>>     for j in M.retr(i+1)[1]:
>>>         # Create email message object from returned string
>>>         emailMessage = email.message_from_string(j)
>>>         # Get fields
>>>         fields = emailMessage.keys()
>>>         # If email contains "From" field
>>>         if emailMessage.has_key("From"):
>>>             # Get contents of From field
>>>             from_field = emailMessage.__getitem__("From")
>>> I also tried using the following, but got the same results:
>>>     emailMessage = email.Parser.HeaderParser().parsestr(j, headersonly=True)
>>> Any help would be appreciated!
>> If you're using poplib then use ".top" instead of ".retr".
>
> I'm still having the same issue, even with .top. Am I missing
> something?
>
>     for j in M.top(i+1, 0)[1]:
>         emailMessage = email.message_from_string(j)
>         #emailMessage = email.Parser.HeaderParser().parsestr(j, headersonly=True)
>         # Get fields
>         fields = emailMessage.keys()
>         # If email contains "From" field
>         if emailMessage.has_key("From"):
>             # Get contents of From field
>             from_field = emailMessage.__getitem__("From")
>
> Is there another way I should be using to retrieve only the headers
> (not those in the body)?

The documentation does say:

"""unfortunately, TOP is poorly specified in the RFCs and is
frequently broken in off-brand servers."""

All I can say is that it works for me with my ISP! :-)
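
For servers where TOP cannot be trusted, one possible defensive pattern is sketched below, assuming Python 3. The function name headers_with_fallback is hypothetical. It tries .top first, falls back to .retr if the server rejects the command, and in both cases discards anything after the first blank line. A broken server might misbehave in other ways that this does not catch.

```python
import poplib
from email.parser import BytesHeaderParser


def headers_with_fallback(conn, msg_num):
    """Fetch headers with TOP, falling back to RETR if TOP is rejected.

    conn is assumed to be a logged-in poplib.POP3 or POP3_SSL instance.
    """
    try:
        _response, lines, _octets = conn.top(msg_num, 0)
    except poplib.error_proto:
        # The server refused TOP, so download the full message instead.
        _response, lines, _octets = conn.retr(msg_num)
    # Keep only what precedes the first blank line, in case body lines
    # were returned as well.
    if b"" in lines:
        lines = lines[:lines.index(b"")]
    return BytesHeaderParser().parsebytes(b"\r\n".join(lines))
```

The fallback costs a full download for each affected message, so it only makes sense when header-only retrieval genuinely fails.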