From: Jack Steiner on
On Thu, Apr 22, 2010 at 10:28:52AM +0100, Alan Cox wrote:
> > Distros don't want to take a patch that adds a new boot param that is
> > not accepted upstream, otherwise they will be stuck forward porting it
> > from now until, well, forever :)
>
> So for an obscure IA64 specific problem you want the upstream kernel to
> port it forward forever instead ?

FWIW, the problem is occurring on systems that use x86 processors - not
IA64.


> >
> > As this solves a problem that people are having today, on the kernel.org
> > kernel, on a known machine, and we really don't know when the "reduce
> > the number of processes per cpu" work will be done, or if it really will
> > solve this issue, then why can't we take it now? If the work does solve
> > the problem in the future, then we can take the command line option out,
> > and everyone is happy.
> >
> > Sound reasonable?
>
> No - to start with it would be far saner for everything involved if the
> 4096 processor minority fixed it for the moment in their arch code by
> doing something like
>
> 	if (max_pids < PIDS_PER_CPU * num_cpus) {
> 		max_pids = ...
> 		printk(something informative)
> 	}
>
> in their __init marked code.
>
> Because when Tejun's stuff is in, the patch can go away, and also if it's
> not sufficient then the patch above should keep it sane when they go to
> 32000 cpus or whatever is next.
>
> Alan
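
For reference, the check quoted above can be sketched in plain userspace C.
This is only an illustration under stated assumptions: the value of
PIDS_PER_CPU, the PID_MAX_CEILING cap and the name scale_max_pids() are all
hypothetical, the quoted pseudocode leaves the new value as "...", and
fprintf() stands in for printk().

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical values: the thread gives neither PIDS_PER_CPU nor a ceiling. */
#define PIDS_PER_CPU    1024
#define PID_MAX_CEILING (4 * 1024 * 1024)

/*
 * Return the pid limit to use on a machine with num_cpus processors.
 * The current limit is kept unless it is below PIDS_PER_CPU * num_cpus,
 * in which case it is raised to that product, capped at PID_MAX_CEILING.
 * Choosing the product as the new value is an assumption of this sketch.
 */
int scale_max_pids(int max_pids, int num_cpus)
{
	long wanted;

	assert(num_cpus > 0);
	wanted = (long)PIDS_PER_CPU * num_cpus;
	if (wanted > PID_MAX_CEILING)
		wanted = PID_MAX_CEILING;

	if (max_pids < wanted) {
		/* Stands in for the "printk(something informative)" line. */
		fprintf(stderr, "max_pids raised from %d to %ld for %d cpus\n",
			max_pids, wanted, num_cpus);
		max_pids = (int)wanted;
	}
	return max_pids;
}
```

With these assumed constants a 16 CPU machine keeps the 32768 default, while
a 1664 CPU machine would be raised to 1703936.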
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Hedi Berriche on
On Wed, Apr 21, 2010 at 10:20 Alan Cox wrote:
| > of 32k will not be enough. A system with 1664 CPU's, there are 25163 processes
| > started before the login prompt. It's estimated that with 2048 CPU's we will pass
|
| Is that perhaps the bug not the 32K limit?

Doubt it: I just checked on an *idle* 1664 CPUs system and I can see 26844
tasks, all but a few being kernel threads.

Worst case scenario, i.e. a 4096 CPUs system (+ typically thousands of disks), will
most certainly be a pain to boot, if it ever manages to, when pid_max is set to 32K.

Cheers,
Hedi.
--
Be careful of reading health books, you might die of a misprint.
-- Mark Twain
From: Hedi Berriche on
On Wed, Apr 21, 2010 at 18:54 Alan Cox wrote:
| Hedi Berriche <hedi(a)sgi.com> wrote:
|
| > I just checked on an *idle* 1664 CPUs system and I can see 26844 tasks, all
| > but few being kernel threads.
|
| So why have we got 26844 tasks. Isn't that a rather more relevant
| question.

OK, here's a rough breakdown of the tasks

104 kswapd
1664 aio
1664 ata
1664 crypto
1664 events
1664 ib_cm
1664 kintegrityd
1664 kondemand
1664 ksoftirqd
1664 kstop
1664 migration
1664 rpciod
1664 scsi_tgtd
1664 xfsconvertd
1664 xfsdatad
1664 xfslogd

that's 25064, omitting the rest as its contribution to the overall total is
negligible.
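
The arithmetic behind that total, and a rough extrapolation to larger
machines, can be sketched as below. Assumptions, none of them stated in the
thread: the 15 per-CPU thread types scale linearly with the CPU count, the
104 kswapd threads are treated as a fixed count, "32K" is taken as 32768, and
the function names are made up for illustration.

```c
#include <assert.h>

#define DEFAULT_PID_MAX 32768	/* the "32K" limit, taken as 32768 */

/*
 * Estimate the task count from the breakdown: per_cpu_threads thread
 * types with one instance per CPU, plus other_tasks that do not scale.
 */
long estimate_tasks(int num_cpus, int per_cpu_threads, long other_tasks)
{
	assert(num_cpus > 0 && per_cpu_threads >= 0 && other_tasks >= 0);
	return (long)num_cpus * per_cpu_threads + other_tasks;
}

/*
 * Scale an observed task count linearly from one CPU count to another
 * (integer division, so the result is rounded down).
 */
long scale_observed(long observed, int from_cpus, int to_cpus)
{
	assert(observed >= 0 && from_cpus > 0 && to_cpus > 0);
	return observed * to_cpus / from_cpus;
}

/* Non-zero when the task count no longer fits under the default limit. */
int exceeds_default_pid_max(long tasks)
{
	return tasks > DEFAULT_PID_MAX;
}
```

estimate_tasks(1664, 15, 104) reproduces the 25064 above. Under the same
assumptions 4096 CPUs would need 61544 tasks from these threads alone, and
scaling the observed 26844 tasks linearly to 2048 CPUs gives 33038, which is
already over 32768.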

[[

Let's also not forget all those ephemeral user space tasks (udev and the like)
that will be spawned at boot time on even larger systems with even more
thousands of disks. Arguably one might consider hacking the initrd and similar to
work around the problem and set pid_max as soon as /proc becomes available, but
it's a bit of a PITA.

]]
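
For illustration, the initrd workaround mentioned in the brackets boils down
to writing a larger value to /proc/sys/kernel/pid_max once /proc is mounted.
A minimal userspace sketch, assuming a hypothetical helper name
write_sysctl_int() and leaving the value to the caller:

```c
#include <assert.h>
#include <stdio.h>

/*
 * Write value, followed by a newline, to a sysctl file such as
 * /proc/sys/kernel/pid_max. Returns 0 on success and -1 on failure
 * (file cannot be opened, or the write or close fails).
 */
int write_sysctl_int(const char *path, long value)
{
	FILE *f;
	int ok;

	assert(path != NULL);
	f = fopen(path, "w");
	if (!f)
		return -1;

	ok = fprintf(f, "%ld\n", value) > 0;
	/* Buffered write errors may only show up when the file is closed. */
	if (fclose(f) != 0)
		ok = 0;
	return ok ? 0 : -1;
}
```

An early boot helper would call something like
write_sysctl_int("/proc/sys/kernel/pid_max", 65536) as root, where 65536 is
only an example value. It does nothing for tasks created before /proc is
available, which is the limitation described above.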

| And as I asked before - how does Tejun's work on sanitizing work queues
| affect this ?

I'm not familiar with the work in question, so I (we) will have to look it up
and see whether it's relevant to what we're seeing here. It does sound like it
might help, to a certain extent at least.

That said, while I am genuinely interested in spending time on this and digging
further to see whether something has been or can be done to keep under control the
number of tasks required to comfortably boot a system of this size, I think that
in the meantime the boot parameter approach is useful in the sense that it addresses
the immediate problem of being able to boot such systems *without* any risk of
breaking the code or altering the default behaviour.

Cheers,
Hedi.
From: Hedi Berriche on
On Wed, Apr 21, 2010 at 20:15 John Stoffel wrote:
| >>>>> "Rik" == Rik van Riel <riel(a)redhat.com> writes:
|
| Rik> That is 15 kernel threads per CPU.
|
| Rik> Reducing the number of kernel threads sounds like a
| Rik> useful thing to do.
|
| Isn't that already a project?

Yes, thanks to Alan's probing I looked it up

http://git.kernel.org/pub/scm/linux/kernel/git/tj/wq.git

but we're definitely talking long term solution vs. something that can ease
pain now.

Cheers,
Hedi.
From: John Stoffel on
>>>>> "Hedi" == Hedi Berriche <hedi(a)sgi.com> writes:

Hedi> On Wed, Apr 21, 2010 at 20:15 John Stoffel wrote:
Hedi> | >>>>> "Rik" == Rik van Riel <riel(a)redhat.com> writes:
Hedi> |
Hedi> | Rik> That is 15 kernel threads per CPU.
Hedi> |
Hedi> | Rik> Reducing the number of kernel threads sounds like a
Hedi> | Rik> useful thing to do.
Hedi> |
Hedi> | Isn't that already a project?

Hedi> Yes, thanks to Alan's probing I looked it up

Hedi> http://git.kernel.org/pub/scm/linux/kernel/git/tj/wq.git

Hedi> but we're definitely talking long term solution vs. something
Hedi> that can ease pain now.

It seems to me that running Linux on such a large machine is such a
specialized niche that putting your change into the regular kernel
isn't a near term need either. And from the sounds of it, Tejun's
work has better long term potential.

But hey, I'm generally clueless, so take what I say with a grain of
salt. :]

John