From: Steve Howell on
The v2.6.4 version of the tutorial says this:

'''
It is also possible to use a list as a queue, where the first element
added is the first element retrieved (“first-in, first-out”); however,
lists are not efficient for this purpose. While appends and pops from
the end of list are fast, doing inserts or pops from the beginning of
a list is slow (because all of the other elements have to be shifted
by one).
'''

Is that really true in CPython? It seems like you could advance the
pointer instead of shifting all the elements. It would create some
nuances with respect to reclaiming the memory, but it seems like an
easy way to make lists perform better under a pretty reasonable use
case.

Does anybody know off the top of their head if the "have-to-be-shifted-
by-one" warning is actually valid?


http://docs.python.org/tutorial/datastructures.html







From: Chris Rebert on
On Fri, Jan 22, 2010 at 11:14 AM, Steve Howell <showell30(a)yahoo.com> wrote:
> The v2.6.4 version of the tutorial says this:
>
> '''
> It is also possible to use a list as a queue, where the first element
> added is the first element retrieved (“first-in, first-out”); however,
> lists are not efficient for this purpose. While appends and pops from
> the end of list are fast, doing inserts or pops from the beginning of
> a list is slow (because all of the other elements have to be shifted
> by one).
> '''
>
> Is that really true in CPython?  It seems like you could advance the
> pointer instead of shifting all the elements.  It would create some
> nuances with respect to reclaiming the memory, but it seems like an
> easy way to make lists perform better under a pretty reasonable use
> case.
>
> Does anybody know off the top of their head if the "have-to-be-shifted-
> by-one" warning is actually valid?

Judging by the "Sorted dictionary" thread responses: Yes.

Cheers,
Chris
--
http://blog.rebertia.com
From: Steve Howell on
On Jan 22, 12:14 pm, Chris Rebert <c...(a)rebertia.com> wrote:
> On Fri, Jan 22, 2010 at 11:14 AM, Steve Howell <showel...(a)yahoo.com> wrote:
> > The v2.6.4 version of the tutorial says this:
>
> > '''
> > It is also possible to use a list as a queue, where the first element
> > added is the first element retrieved (“first-in, first-out”); however,
> > lists are not efficient for this purpose. While appends and pops from
> > the end of list are fast, doing inserts or pops from the beginning of
> > a list is slow (because all of the other elements have to be shifted
> > by one).
> > '''
>
> > Is that really true in CPython?  It seems like you could advance the
> > pointer instead of shifting all the elements.  It would create some
> > nuances with respect to reclaiming the memory, but it seems like an
> > easy way to make lists perform better under a pretty reasonable use
> > case.
>
> > Does anybody know off the top of their head if the "have-to-be-shifted-
> > by-one" warning is actually valid?
>
> Judging by the "Sorted dictionary" thread responses: Yes.
>

I think you are referring to this comment:

'''
Insertion and deletion are still O(n) as all items to the right of the
inserted/deleted one have to be shifted by one place.
'''

http://groups.google.com/group/comp.lang.python/browse_thread/thread/d3699724d94d5b5a

I can certainly see why most reasonable implementations of a list
would have insertion/deletion in the middle of the list be O(N), but I
don't think that limitation has to apply for the special cases of the
beginning and end of the list.


From: Christian Heimes on
Steve Howell wrote:
> Is that really true in CPython? It seems like you could advance the
> pointer instead of shifting all the elements. It would create some
> nuances with respect to reclaiming the memory, but it seems like an
> easy way to make lists perform better under a pretty reasonable use
> case.
>
> Does anybody know off the top of their head if the "have-to-be-shifted-
> by-one" warning is actually valid?

Why do you think the documentation has such obvious errors?

CPython's list type uses an array of pointers to store its members. The
type is optimized for the most common list operations in Python:
iteration and appending. Python code rarely changes the head or middle
of a list. The dequeue type is an optimized data structure for popping
and inserting data at both ends.

When you list.pop() or list.insert() the pointers in the internal array
must be shifted. appending is much faster because the internal array is
overallocated meaning it contains free slots at the tail of the data
structure. A linked list of pointers requires more memory and iteration
is slower since since it can't utilize the CPU's cache as good as an array.

Christian
From: Steve Howell on
On Jan 22, 12:40 pm, Christian Heimes <li...(a)cheimes.de> wrote:
> Steve Howell wrote:
> > Is that really true in CPython?  It seems like you could advance the
> > pointer instead of shifting all the elements.  It would create some
> > nuances with respect to reclaiming the memory, but it seems like an
> > easy way to make lists perform better under a pretty reasonable use
> > case.
>
> > Does anybody know off the top of their head if the "have-to-be-shifted-
> > by-one" warning is actually valid?
>
> Why do you think the documentation has such obvious errors?

I wasn't making any assumptions, hence the question mark. The Python
docs are very good, but even the best of projects make advances that
aren't reflected in the docs.

> CPython's list type uses an array of pointers to store its members. The
> type is optimized for the most common list operations in Python:
> iteration and appending. Python code rarely changes the head or middle
> of a list. The dequeue type is an optimized data structure for popping
> and inserting data at both ends.
>

I disagree that Python code rarely pops elements off the top of a
list. There are perfectly valid use cases for wanting a list over a
dequeue without having to pay O(N) for pop(0). Maybe we are just
quibbling over the meaning of "rarely."

> When you list.pop() or list.insert() the pointers in the internal array
> must be shifted. appending is much faster because the internal array is
> overallocated meaning it contains free slots at the tail of the data
> structure. A linked list of pointers requires more memory and iteration
> is slower since since it can't utilize the CPU's cache as good as an array.
>

I am not proposing a linked list of pointers. I am wondering about
something like this:

p = &p[1];
(and then reclaim p[0] as free memory, I already said I understood
that was the tricky bit)

The pointer arithmetic for accessing each element would still work in O
(1), right?