From: News123 on
Hi,

I'd like to perform huge file uploads via https.
I'd like to make sure,
- that I can obtain upload progress info (sometimes the nw is very slow)
- that (if the file exceeds a certain size) I don't have to
read the entire file into RAM.

I found ActiveState's recipe 146306, which constructs the whole
multipart message first in RAM and then sends it in one chunk.


I found a server-side solution that will write out the data file
chunkwise ( http://webpython.codepoint.net/mod_python_publisher_big_file_upload )



If I just wanted progress info, then I could probably
just split line 16 of ActiveState's recipe ( h.send(body) )
into multiple sends, right?

chunksize = 1024
for i in range(0, len(body), chunksize):
    h.send(body[i:i+chunksize])
    show_progressinfo()


But how could I create the body step by step?
I wouldn't know the content-length up front.
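
Would HTTP/1.1 chunked transfer encoding sidestep that? A rough sketch of
what I have in mind (just a guess that the server accepts
Transfer-Encoding: chunked; the chunk framing is written by hand through
httplib's send()):

import httplib

def post_chunked(host, path, fileobj, chunksize=1024):
    conn = httplib.HTTPSConnection(host)
    conn.putrequest('POST', path)
    conn.putheader('Transfer-Encoding', 'chunked')
    conn.endheaders()
    while True:
        data = fileobj.read(chunksize)
        if not data:
            break
        # each chunk: hex length, CRLF, payload, CRLF
        conn.send('%x\r\n%s\r\n' % (len(data), data))
        show_progressinfo()      # same progress callback as above
    conn.send('0\r\n\r\n')       # terminating zero-length chunk
    return conn.getresponse()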

thanks in advance



N





From: Aahz on
In article <4bea6b50$0$8925$426a74cc(a)news.free.fr>,
News123 <news1234(a)free.fr> wrote:
>
>I'd like to perform huge file uploads via https.
>I'd like to make sure,
>- that I can obtain upload progress info (sometimes the nw is very slow)
>- that (if the file exceeds a certain size) I don't have to
> read the entire file into RAM.

Based on my experience with this, you really need to send multiple
requests (i.e. "chunking"). There are ways around this (you can look
into curl's resumable uploads), but you will need to maintain state no
matter what, and I think that chunking is the best/simplest.
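
A minimal sketch of the multi-request approach (the upload URL and the
"name"/"offset" parameters are a made-up protocol; the point is that the
client keeps the offset as its state and can resume from it):

import os, urllib, urllib2

def upload_in_chunks(url, filename, chunksize=256 * 1024):
    total = os.path.getsize(filename)
    f = open(filename, 'rb')
    try:
        offset = 0
        while offset < total:
            chunk = f.read(chunksize)
            # one POST per chunk; the server appends at the given offset
            req = urllib2.Request(
                '%s?%s' % (url, urllib.urlencode(
                    {'name': os.path.basename(filename),
                     'offset': str(offset)})),
                data=chunk,
                headers={'Content-Type': 'application/octet-stream'})
            urllib2.urlopen(req).read()
            offset += len(chunk)
            print '%d/%d bytes sent' % (offset, total)
    finally:
        f.close()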
--
Aahz (aahz(a)pythoncraft.com) <*> http://www.pythoncraft.com/

f u cn rd ths, u cn gt a gd jb n nx prgrmmng.
From: James Mills on
On Wed, May 12, 2010 at 6:48 PM, News123 <news1234(a)free.fr> wrote:
> Hi,
>
> I'd like to perform huge file uploads via https.
> I'd like to make sure,
> - that I can obtain upload progress info (sometimes the nw is very slow)
> - that (if the file exceeds a certain size) I don't have to
>  read the entire file into RAM.
>
> I found Active states recipe 146306, which constructs the whole
> multipart message first in RAM and sends it then in one chunk.
>
>
> I found a server side solutions, that will write out the data file chunk
> wise ( http://webpython.codepoint.net/mod_python_publisher_big_file_upload
> )
>
>
>
> If I just wanted to have progress info, then I could probably
> just split line 16 of Active State's recipe ( h.send(body) )
> into multiple send, right?
>
> chunksize = 1024
> for i in range(0,len(body),chunksize):
>    h.send(body[i:i+chunksize])
>    show_progressinfo()
>
>
> But how could I create body step by step?
> I wouldn't know the content-length up front?
>
> thanks in advance

My suggestion is to find some tools that can
send multiple chunks of data. A non-blocking
I/O library/tool might be useful here (e.g. Twisted or similar).
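
A rough sketch with Twisted's Agent and FileBodyProducer (the URL is made
up, and wrapping the file object is just one way to get progress info):

from twisted.internet import reactor
from twisted.web.client import Agent, FileBodyProducer
from twisted.web.http_headers import Headers

class ProgressFile(object):
    # file wrapper that reports how many bytes have been read so far
    def __init__(self, f):
        self._f = f
        self.bytes_read = 0
    def read(self, size=-1):
        data = self._f.read(size)
        self.bytes_read += len(data)
        print '%d bytes read' % self.bytes_read
        return data
    def seek(self, *args):       # FileBodyProducer uses seek/tell
        return self._f.seek(*args)
    def tell(self):
        return self._f.tell()
    def close(self):
        return self._f.close()

body = FileBodyProducer(ProgressFile(open('huge.bin', 'rb')))
agent = Agent(reactor)
d = agent.request('POST', 'https://example.com/upload',
                  Headers({'Content-Type': ['application/octet-stream']}),
                  body)
d.addBoth(lambda _: reactor.stop())
reactor.run()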

cheers
James
From: News123 on
Hi Aahz,

Aahz wrote:
> In article <4bea6b50$0$8925$426a74cc(a)news.free.fr>,
> News123 <news1234(a)free.fr> wrote:
>> I'd like to perform huge file uploads via https.
>> I'd like to make sure,
>> - that I can obtain upload progress info (sometimes the nw is very slow)
>> - that (if the file exceeds a certain size) I don't have to
>> read the entire file into RAM.
>
> Based on my experience with this, you really need to send multiple
> requests (i.e. "chunking"). There are ways around this (you can look
> into curl's resumable uploads), but you will need to maintain state no
> matter what, and I think that chunking is the best/simplest.
I agree I need chunking (the question is just at which level of the
protocol).

I just don't know how to do a chunkwise file upload or which library is
best.

Can you recommend any libraries or do you have a link to an example?


I'd like to avoid making separate HTTPS POST requests for the chunks
(at least if the underlying module does not support keep-alive connections).


I made some tests with high-level chunking (separate sequential HTTPS
POST requests) and noticed a rather high penalty in data throughput.
The reason is probably that each request opens its own HTTPS connection
and that either the network driver or the TCP/IP stack doesn't allocate
enough bandwidth to my request.

Therefore I'd like to do the chunking at a 'lower' level.
One option would be an HTTPS module which supports keep-alive;
the other would be a library which creates the HTTP POST body
chunk by chunk.
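
For the second option, it turns out the content-length CAN be known up
front: the multipart preamble/epilogue lengths and the file size are all
known before sending. A rough sketch with httplib (host, path and the
form field name are made up):

import httplib, os

def upload_streaming(host, path, filename, chunksize=65536):
    boundary = '----------ThIs_Is_tHe_bouNdaRY'
    preamble = ('--%s\r\n'
                'Content-Disposition: form-data; name="file"; filename="%s"\r\n'
                'Content-Type: application/octet-stream\r\n'
                '\r\n' % (boundary, os.path.basename(filename)))
    epilogue = '\r\n--%s--\r\n' % boundary
    total = len(preamble) + os.path.getsize(filename) + len(epilogue)

    conn = httplib.HTTPSConnection(host)
    conn.putrequest('POST', path)
    conn.putheader('Content-Type',
                   'multipart/form-data; boundary=%s' % boundary)
    conn.putheader('Content-Length', str(total))
    conn.endheaders()

    conn.send(preamble)
    sent = len(preamble)
    f = open(filename, 'rb')
    try:
        while True:
            chunk = f.read(chunksize)   # never more than chunksize in RAM
            if not chunk:
                break
            conn.send(chunk)
            sent += len(chunk)
            print '%d/%d bytes sent' % (sent, total)
    finally:
        f.close()
    conn.send(epilogue)
    return conn.getresponse()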


What do others do for huge file uploads?
(The uploader might be connected via Ethernet, WLAN, UMTS, EDGE, or GPRS.)

N
From: Sean DiZazzo on
On May 13, 9:39 am, News123 <news1...(a)free.fr> wrote:
> Hi Aaaz,
>
> Aahz wrote:
> > In article <4bea6b50$0$8925$426a7...(a)news.free.fr>,
> > News123  <news1...(a)free.fr> wrote:
> >> I'd like to perform huge file uploads via https.
> >> I'd like to make sure,
> >> - that I can obtain upload progress info (sometimes the nw is very slow)
> >> - that (if the file exceeds a certain size) I don't have to
> >>  read the entire file into RAM.
>
> > Based on my experience with this, you really need to send multiple
> > requests (i.e. "chunking").  There are ways around this (you can look
> > into curl's resumable uploads), but you will need to maintain state no
> > matter what, and I think that chunking is the best/simplest.
>
> I agree I need  chunking. (the question is just on which level of the
> protocol)
>
> I just don't know how to make a chunkwise file upload or what library is
> best.
>
> Can you recommend any libraries or do you have a link to an example?
>
> I'd like to avoid to make separate https post requests for the chunks
> (at least if the underlying module does NOT support keep-alive connections)
>
> I made some tests with high level chunking (separate sequential https
> post requests).
> What I noticed is a rather high penalty in data throughput.
> The reason is probably, that each request makes its own https connection
> and that either the NW driver or the TCP/IP stack doesn't allocate
> enough band width to my request.
>
> Therefore I'd like to do the chunking on a 'lower' level.
> One option would be to have a https module, which supports keep-alive,
>
> the other would be  to have a library, which creates a http post body
> chunk by chunk.
>
> What do others do for huge file uploads
> The uploader might be connected via ethernet, WLAN, UMTS, EDGE, GPRS. )
>
> N

You could also just send the file in one big chunk and give yourself
another avenue to read the size of the file on the server: maybe a
webservice that you call with the name of the file and that returns its
percent complete, or one that just returns bytes on disk so you do the
math on the client side. Then you just forget about the transfer and
query the file size whenever you want to know... or on a schedule.
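
Roughly, the client side of that polling could look like this (the status
URL is made up; it's assumed to return the uploaded file's current size
in bytes as plain text):

import time, urllib, urllib2

def watch_progress(status_url, name, total, interval=5):
    while True:
        query = urllib.urlencode({'name': name})
        size = int(urllib2.urlopen('%s?%s' % (status_url, query)).read())
        print '%.0f%% complete' % (100.0 * size / total)
        if size >= total:
            break
        time.sleep(interval)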

~Sean