From: Warren on
Maciej Sobczak expounded in news:7794a413-34e9-4340-abcc-a6568246fc38
@h18g2000yqo.googlegroups.com:

> On 24 Mar, 17:40, Warren <ve3...(a)gmail.com> wrote:
>
>> Another barrier I see to this is the high cost of
>> starting a new thread and stack space allocation.
>
>> Somehow you gotta make thread startup and shutdown
>> cheaper.
>
> Why?
>
> The problem of startup/shutdown cost and how many cores you have are
> completely orthogonal.
> I see no problem in starting N threads at the initialization time, use
> them throughout the application lifetime and then shut down at the end
> (or never).

Yes, I am aware of that option.

> If your favorite programming model involves lots of short-running
> threads that have to be created and torn down repeatedly, then it has
> no relation to multicore. It is just a bad resource usage pattern.
> Maciej Sobczak * http://www.inspirel.com

That's a rather sweeping statement to make ("bad resource usage
pattern"). Unless there are leaps in language design, I believe
that is what you will mostly get in automatic parallel thread
generation.

As humans we tend to think in sequential steps, and consequently
we code that way. The media seems to suggest that we shouldn't have
to change our mindset to do parallelism (i.e. the compilers should
arrange it for us). Certainly that would make a wish list item.

I don't know much about Intel's hyper-threads, but I believe it
was one approach to doing this (presumably largely without
compiler help).

So I can't buy into your conclusion on that.

Warren
From: Warren on
Maciej Sobczak expounded in news:7794a413-34e9-4340-abcc-a6568246fc38
@h18g2000yqo.googlegroups.com:

> On 24 Mar, 17:40, Warren <ve3...(a)gmail.com> wrote:
>
>> Another barrier I see to this is the high cost of
>> starting a new thread and stack space allocation.
>
>> Somehow you gotta make thread startup and shutdown
>> cheaper.
>
> Why?
>
> The problem of startup/shutdown cost and how many cores you have are
> completely orthogonal.
> I see no problem in starting N threads at the initialization time, use
> them throughout the application lifetime and then shut down at the end
> (or never)...

I forgot to mention that the disadvantage of this approach is that
you have to "pre-allocate" stack space for each thread (whether
by default amount or by a specific designed amount).

If you used a true cactus stack, this is not an issue. But with a
traditional thread, you could choose stack requirements at the
point of thread creation. Not so, if you create them all up front.

So there are downsides to this approach.
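
To make the stack-size point concrete, here is a minimal sketch in Python
(not from the original posts). It assumes CPython on a platform that
allows changing the thread stack size. The helper run_with_stack, the
job and depth functions, and the sizes are all hypothetical names and
values chosen for illustration. It shows a thread getting a stack sized
for its workload at the moment it is created, which is the flexibility
that is lost when every thread is created up front with one fixed size.

```python
import threading

def run_with_stack(target, stack_bytes, *args):
    """Start one thread whose stack size is chosen at creation time."""
    # stack_size() applies to threads started after the call and
    # returns the previous setting (0 means the platform default).
    previous = threading.stack_size(stack_bytes)
    try:
        worker = threading.Thread(target=target, args=args)
        worker.start()  # the OS thread, and its stack, is created here
    finally:
        threading.stack_size(previous)  # restore the earlier setting
    return worker

def depth(n):
    """Recursive stand-in for work whose stack needs grow with n."""
    return 0 if n == 0 else 1 + depth(n - 1)

results = {}

def job(name, n):
    results[name] = depth(n)

# Illustrative sizes only: a small stack for shallow work and a large
# one for deep work, each decided when the thread is created.
small = run_with_stack(job, 256 * 1024, "small", 100)
large = run_with_stack(job, 16 * 1024 * 1024, "large", 500)
small.join()
large.join()
```

With a pool created at startup, every worker would instead get whatever
single size was in effect when the pool was built.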

Warren
From: Dmitry A. Kazakov on
On Thu, 25 Mar 2010 17:30:05 +0000 (UTC), Warren wrote:

> Maciej Sobczak expounded in news:7794a413-34e9-4340-abcc-a6568246fc38
> @h18g2000yqo.googlegroups.com:
>
>> On 24 Mar, 17:40, Warren <ve3...(a)gmail.com> wrote:
>>
>>> Another barrier I see to this is the high cost of
>>> starting a new thread and stack space allocation.
>>
>>> Somehow you gotta make thread startup and shutdown
>>> cheaper.
>>
>> Why?
>>
>> The problem of startup/shutdown cost and how many cores you have are
>> completely orthogonal.
>> I see no problem in starting N threads at the initialization time, use
>> them throughout the application lifetime and then shut down at the end
>> (or never)...
>
> I forgot to mention that the disadvantage of this approach is that
> you have to "pre-allocate" stack space for each thread (whether
> by default amount or by a specific designed amount).

BTW, if this approach worked for an application, it should also do for the
OS, e.g. why not start all threads for all not yet running processes
upon booting? If that worked, the effective observed startup time of a
thread would be 0, and thus there would be nothing to care about.

--
Regards,
Dmitry A. Kazakov
http://www.dmitry-kazakov.de
From: Maciej Sobczak on
On 26 Mar, 09:19, "Dmitry A. Kazakov" <mail...(a)dmitry-kazakov.de>
wrote:

> BTW, if this approach worked for an application, it should also do for the
> OS,

It is true that obtaining resources up-front requires more careful
analysis of the problem that is being solved, and it is not always
possible.
The difference between application and OS is in the amount of
knowledge about what the software will do and applications tend to
know more than OS in this aspect.
That is why it is more realistic to have applications allocating their
resources during initialization phase than to see that at the OS
level.

I'm not a big fan of programs that allocate and deallocate the same
resource repeatedly - this is an obvious candidate for caching and
object reuse, where the cost of allocation is amortized. Fortunately,
it is not even necessary for a user code to do that - think about a
caching memory allocator, there are analogies. And the language
standard does not prevent implementations from reusing physical
threads, if they are used as implementation foundations for tasks.
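
As a rough illustration of the reuse idea (a sketch, not anything from
the original posts), the Python standard library's ThreadPoolExecutor
keeps a small set of worker threads alive and hands them many short
tasks, so the cost of creating the threads is paid once and amortized.
The task function short_task, the pool size of 4 and the task count of
1000 are arbitrary assumptions.

```python
from concurrent.futures import ThreadPoolExecutor
import threading

def short_task(n):
    """A short-lived unit of work; also reports which thread ran it."""
    return n * n, threading.get_ident()

# The pool creates at most 4 worker threads and reuses them for all
# 1000 tasks instead of starting and tearing down a thread per task.
with ThreadPoolExecutor(max_workers=4) as pool:
    outcomes = list(pool.map(short_task, range(1000)))

squares = [value for value, _ in outcomes]
distinct_threads = {ident for _, ident in outcomes}
print(len(squares), "tasks ran on", len(distinct_threads), "threads")
```

The calling code still looks like "one task per piece of work"; the
reuse happens underneath, much like a caching memory allocator.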

--
Maciej Sobczak * http://www.inspirel.com

YAMI4 - Messaging Solution for Distributed Systems
http://www.inspirel.com/yami4
From: Warren on
Maciej Sobczak expounded in
news:7b059d0f-791b-4ac9-bf64-c50448ec99f7(a)b30g2000yqd.googlegroups.com:
...
> The difference between application and OS is in the amount of
> knowledge about what the software will do and applications tend to
> know more than OS in this aspect.

Yes.

> That is why it is more realistic to have applications allocating their
> resources during initialization phase than to see that at the OS
> level.

I would generally agree with that, unless the cost of resource
management was cleverly reduced.

> I'm not a big fan of programs that allocate and deallocate the same
> resource repeatedly - this is an obvious candidate for caching and
> object reuse, where the cost of allocation is amortized.

As a general principle this is right. But memory is another
resource that sometimes needs careful management. With only
1 thread, you have a heap growing up to the stack and a stack
that grows towards the heap. Either stack or heap can be huge
(potentially at least), as long as both are not at the same
time (overlapping).

The moment you add 1 [additional] thread, you've now drawn
the line in the sand for the lowest existing stack, and
put a smaller limit on it.

This disadvantage is ok for probably most threaded programs,
but perhaps not for a video rendering program that might
hog resources on both heap and stack sides at differing
times.

In the end, the application programmer must plan this out,
but this is a limitation that I dislike about our current
execution environments. I suppose just increasing the
size of your VM address space postpones the problem
until we hit limits again. ;-)

> Fortunately,
> it is not even necessary for a user code to do that - think about a
> caching memory allocator, there are analogies. And the language
> standard does not prevent implementations from reusing physical
> threads, if they are used as implementation foundations for tasks.
> Maciej Sobczak * http://www.inspirel.com

From an efficiency pov, this is all well and good. But
if you want maximum dynamic allocation of heap+stack,
then you might prefer fewer (if any) pre-allocated threads
(implying additional stacks).
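
A back-of-the-envelope sketch of that trade-off, in Python. The numbers
are assumptions for illustration only: a 3 GiB user address space (a
common 32-bit layout), 8 MiB reserved per thread stack, and 64
pre-allocated threads. The function name address_space_left is
hypothetical.

```python
GIB = 1024 ** 3
MIB = 1024 ** 2

def address_space_left(total_bytes, thread_count, stack_bytes):
    """Address space remaining after reserving one stack per thread."""
    reserved = thread_count * stack_bytes
    return total_bytes - reserved

# Assumed figures: 3 GiB of user address space, 64 threads, 8 MiB each.
left = address_space_left(3 * GIB, 64, 8 * MIB)
print(left // MIB, "MiB left for the heap and the main stack")
```

Under these assumptions the pre-allocated stacks reserve 512 MiB, and
that region is no longer available for the heap or the main stack to
grow into, whether or not the threads ever use it.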

Warren