From: Rafael J. Wysocki on
On Saturday 27 February 2010, Ingo Molnar wrote:
>
> * Stephen Rothwell <sfr(a)canb.auug.org.au> wrote:
>
> > [I have removed linux-tip-commits from the cc list]
> >
> > Hi Ingo,
> >
> > On Tue, 23 Feb 2010 09:45:52 +0100 Ingo Molnar <mingo(a)elte.hu> wrote:
> > >
> > > Developers simply cannot be expected to build for 22 architectures, and
> > > they shouldnt be.
> >
> > I have agreed with this point of yours several times. Why do you keep
> > stating it?
>
> If you agree with me then why do you put so much focus on cross-arch build
> failures, versus other, more relevant forms of testing?

I don't really know what this is all about. Stephen does what he can and
that's generally appreciated very much. It helps to make sure the code builds
correctly on the architectures it's supposed to build on and there's nothing
wrong with that IMO.

> > > The thing is, last i checked you didnt even _test_ x86 as the first step
> > > in your linux-next build tests. Most of your generic build bug reports are
> > > against PowerPC. They create the appearance that x86 is a second class
> > > citizen in linux-next.
> >
> > Lets see. Over the last 60 days, I have reported 37 build errors. Of
> > these, 16 were reported against x86, 14 against ppc, 7 against other archs.
>
> So only 43% of them were even relevant on the platform that 95+% of the Linux
> testers use? Seems to support the points i made.

Well, I hope you don't mean that because the majority of bug reporters (vs
testers, the number of whom is unknown to me at least) use x86, we are free
to break the other architectures. ;-)

> > Of the ppc reports, 10 would not affect x86 builds (due to being ppc
> > specific problems or dependencies on implicit includes that do happen on
> > x86). None of the reports against other arches would affect x86 builds.
> >
> > I also reported 31 warnings. 15 against x86, 15 against ppc and 1 against
> > both. Of those only reported against ppc, 13 did not affect x86.
> >
> > So of my "generic" reports, 4 errors and 2 warnings were reported against
> > ppc, 16 errors and 15 warnings against x86.
> >
> > Also, I am not sure how reports of 37 build errors and 32 warnings over 60
> > days can tax the resources of our developer base. [...]
>
> Note that out of those 37 build errors only a small minority were caused by
> any tree i co-maintain. (i dont have the precise numbers but it's below 5)
>
> Why? Because i cross-build before pushing to linux-next. I bug people about
> cross-arch build failures, and about the patch flow delays and hiccups this
> causes. Without that you'd see twice that many cross-build failures.
>
> Which in itself is not bad of course (any fix is a good fix) - except the
> forced prioritization and its place in the workflow: it sends the wrong
> testing message.
>
> It sends the message that building on N architectures is more important than
> for the code to work for real people. I've had good developers waste their
> time trying to set up cross-build testing environments and complain to me how
> this complicates their testing.

That's the kind of task linux-next is really good at AFAICT. Before linux-next
I used to have a cross-build testing environment like this, but I don't need it
any more, because I know linux-next will catch the cross-build problems for
me and I appreciate that very much, because it saves a lot of my time.

Rafael
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Ingo Molnar on

* Rafael J. Wysocki <rjw(a)sisk.pl> wrote:

> > > Lets see. Over the last 60 days, I have reported 37 build errors. Of
> > > these, 16 were reported against x86, 14 against ppc, 7 against other
> > > archs.
> >
> > So only 43% of them were even relevant on the platform that 95+% of the
> > Linux testers use? Seems to support the points i made.
>
> Well, I hope you don't mean that because the majority of bug reporters (vs
> testers, the number of whom is unknown to me at least) use x86, we are free
> to break the other architectures. ;-)

It means exactly that: just like we 'can' break compilation with gcc296,
ancient versions of binutils, odd bootloaders, can break the boot via odd
hardware, etc. When someone uses those architectures then the 'easy' bugfixes
will actually flow in very quickly and without much fuss - and without
burdening developers to consider cases they have no good ways to test. Why
should rare architectures be more important than those other rare forms of
Linux usage?

In fact those rare ways of building and booting the kernel i mentioned are
probably used _more_ than half of the architectures that linux-next
build-tests ...

So yes, of course _all_ bugs need fixing if there's enough capacity, but the
process in general should be healthy, low-overhead and shouldnt concentrate on
an irrelevant portion of Linux usage in such a prominent way.

Or, if it does, it should _first_ cover the other, much more burning areas of
testing interest. All the while our _real_ bugreports are often rotting on
bugzilla.kernel.org ...

Thanks,

Ingo
From: Rafael J. Wysocki on
On Saturday 27 February 2010, Ingo Molnar wrote:
>
> * Rafael J. Wysocki <rjw(a)sisk.pl> wrote:
>
> > > > Lets see. Over the last 60 days, I have reported 37 build errors. Of
> > > > these, 16 were reported against x86, 14 against ppc, 7 against other
> > > > archs.
> > >
> > > So only 43% of them were even relevant on the platform that 95+% of the
> > > Linux testers use? Seems to support the points i made.
> >
> > Well, I hope you don't mean that because the majority of bug reporters (vs
> > testers, the number of whom is unknown to me at least) use x86, we are free
> > to break the other architectures. ;-)
>
> It means exactly that: just like we 'can' break compilation with gcc296,
> ancient versions of binutils, odd bootloaders, can break the boot via odd
> hardware, etc. When someone uses those architectures then the 'easy' bugfixes
> will actually flow in very quickly and without much fuss

Then I don't understand what the problem with getting them in at the linux-next
stage is. They are necessary anyway, so we'll need to add them sooner or
later and IMO the sooner the better.

Apart from this, cross-build issues aren't always "easy" and sometimes
they take quite some time and engineering effort to resolve. IMO that's better
done at the linux-next stage than during a merge window.

> - and without burdening developers to consider cases they have no good ways
> to test. Why should rare architectures be more important than those other
> rare forms of Linux usage?

Because Linus' tree is supposed to build on those architectures. As long
as that's the case, linux-next should build on them too.

> In fact those rare ways of building and booting the kernel i mentioned are
> probably used _more_ than half of the architectures that linux-next
> build-tests ...

I don't know and you don't know either. That's just pure speculation and
therefore meaningless.

> So yes, of course _all_ bugs need fixing if there's enough capacity, but the
> process in general should be healthy, low-overhead and shouldnt concentrate on
> an irrelevant portion of Linux usage in such a prominent way.
>
> Or, if it does, it should _first_ cover the other, much more burning areas of
> testing interest. All the while our _real_ bugreports are often rotting on
> bugzilla.kernel.org ...

All right. There are two _separate_ questions to ask IMO:

(1) Do we need the kind of community service that Stephen has been doing?

(2) Do we need more testing of linux-next and if so, whose task should that be?

I think you agree that the answer to (1) is "yes, we do". So _someone_ has to
do it and I'm very grateful to Stephen for taking care of it.

[Thanks, Stephen!]

Now, part of this service is to check that the resulting tree will actually
build in all conditions it's supposed to build in, if possible, or the whole
merging exercise wouldn't have much practical meaning. Stephen has been
doing just that and IMO to a good result.

To some extent, though, that's a matter of defining what conditions the
kernel is supposed to build in, but I think for linux-next these conditions
should be the same as for Linus' tree, for the simple reason that
linux-next is supposed to be a "future snapshot" of it. So linux-next should
build on all architectures that the future Linus' tree is supposed to build on.
Even on "exotic" ones.

[IMO that's actually important, because such corner cases tend to reveal
runtime bugs we wouldn't have been aware of otherwise. Now, in the majority
of cases a casual tester will be discouraged by the kernel not compiling for
him while he might have found a "real" bug otherwise.]

Now, as far as (2) is concerned, I think the answer here is also "yes, we
do", but that's not a part of Stephen's job.

Thanks,
Rafael
From: Geert Uytterhoeven on
On Sat, Feb 27, 2010 at 20:07, Rafael J. Wysocki <rjw(a)sisk.pl> wrote:
> On Saturday 27 February 2010, Ingo Molnar wrote:
>> * Rafael J. Wysocki <rjw(a)sisk.pl> wrote:
>>
>> > > > Lets see.  Over the last 60 days, I have reported 37 build errors.  Of
>> > > > these, 16 were reported against x86, 14 against ppc, 7 against other
>> > > > archs.
>> > >
>> > > So only 43% of them were even relevant on the platform that 95+% of the
>> > > Linux testers use? Seems to support the points i made.
>> >
>> > Well, I hope you don't mean that because the majority of bug reporters (vs
>> > testers, the number of whom is unknown to me at least) use x86, we are free
>> > to break the other architectures. ;-)
>>
>> It means exactly that: just like we 'can' break compilation with gcc296,
>> ancient versions of binutils, odd bootloaders, can break the boot via odd
>> hardware, etc. When someone uses those architectures then the 'easy' bugfixes
>> will actually flow in very quickly and without much fuss
>
> Then I don't understand what the problem with getting them in at the linux-next
> stage is.  They are necessary anyway, so we'll need to add them sooner or
> later and IMO the sooner the better.
>
> Apart from this, cross-build issues aren't always "easy" and sometimes
> they take quite some time and engineering effort to resolve.  IMO that's better
> done at the linux-next stage than during a merge window.
>
>> - and without burdening developers to consider cases they have no good ways
>> to test.  Why should rare architectures be more important than those other
>> rare forms of Linux usage?
>
> Because Linus' tree is supposed to build on those architectures.  As long
> as that's the case, linux-next should build on them too.
>
>> In fact those rare ways of building and booting the kernel i mentioned are
>> probably used _more_ than half of the architectures that linux-next
>> build-tests ...
>
> I don't know and you don't know either.  That's just pure speculation and
> therefore meaningless.

If only the CE Linux Forum member companies would publish figures about the
number of Linux devices they push onto the world population...

Yes I know, this still excludes `obsolete' architectures like parisc and
alpha, but it would change the balance towards x86 (and powerpc?) drastically.

>> So yes, of course _all_ bugs need fixing if there's enough capacity, but the
>> process in general should be healthy, low-overhead and shouldnt concentrate on
>> an irrelevant portion of Linux usage in such a prominent way.
>>
>> Or, if it does, it should _first_ cover the other, much more burning areas of
>> testing interest. All the while our _real_ bugreports are often rotting on
>> bugzilla.kernel.org ...
>
> All right.  There are two _separate_ questions to ask IMO:
>
> (1) Do we need the kind of community service that Stephen has been doing?
>
> (2) Do we need more testing of linux-next and if so, whose task should that be?
>
> I think you agree that the answer to (1) is "yes, we do".  So _someone_ has to
> do it and I'm very grateful to Stephen for taking care of it.
>
> [Thanks, Stephen!]
>
> Now, part of this service is to check that the resulting tree will actually
> build in all conditions it's supposed to build in, if possible, or the whole
> merging exercise wouldn't have much practical meaning.  Stephen has been
> doing just that and IMO to a good result.
>
> To some extent, though, that's a matter of defining what conditions the
> kernel is supposed to build in, but I think for linux-next these conditions
> should be the same as for Linus' tree, for the simple reason that
> linux-next is supposed to be a "future snapshot" of it.  So linux-next should
> build on all architectures that the future Linus' tree is supposed to build on.
> Even on "exotic" ones.
>
> [IMO that's actually important, because such corner cases tend to reveal
> runtime bugs we wouldn't have been aware of otherwise.  Now, in the majority
> of cases a casual tester will be discouraged by the kernel not compiling for
> him while he might have found a "real" bug otherwise.]
>
> Now, as far as (2) is concerned, I think the answer here is also "yes, we
> do", but that's not a part of Stephen's job.

While wearing my m68k hat, I can say that I suffer an order of magnitude more
from build failures than from boot failures. So I'm inclined to agree with
Linus when he says `if it compiles, it's great; if it boots, it's perfect' :-)

Or perhaps this says more about our review process: we're quite good at catching
logical errors in our code, and worse at catching syntax and dependency errors.
Fortunately we have tools (and linux-next) to catch those...

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert(a)linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds
From: Rafael J. Wysocki on
On Saturday 27 February 2010, Geert Uytterhoeven wrote:
> On Sat, Feb 27, 2010 at 20:07, Rafael J. Wysocki <rjw(a)sisk.pl> wrote:
> > On Saturday 27 February 2010, Ingo Molnar wrote:
> >> * Rafael J. Wysocki <rjw(a)sisk.pl> wrote:
> >>
> >> > > > Lets see. Over the last 60 days, I have reported 37 build errors. Of
> >> > > > these, 16 were reported against x86, 14 against ppc, 7 against other
> >> > > > archs.
> >> > >
> >> > > So only 43% of them were even relevant on the platform that 95+% of the
> >> > > Linux testers use? Seems to support the points i made.
> >> >
> >> > Well, I hope you don't mean that because the majority of bug reporters (vs
> >> > testers, the number of whom is unknown to me at least) use x86, we are free
> >> > to break the other architectures. ;-)
> >>
> >> It means exactly that: just like we 'can' break compilation with gcc296,
> >> ancient versions of binutils, odd bootloaders, can break the boot via odd
> >> hardware, etc. When someone uses those architectures then the 'easy' bugfixes
> >> will actually flow in very quickly and without much fuss
> >
> > Then I don't understand what the problem with getting them in at the linux-next
> > stage is. They are necessary anyway, so we'll need to add them sooner or
> > later and IMO the sooner the better.
> >
> > Apart from this, cross-build issues aren't always "easy" and sometimes
> > they take quite some time and engineering effort to resolve. IMO that's better
> > done at the linux-next stage than during a merge window.
> >
> >> - and without burdening developers to consider cases they have no good ways
> >> to test. Why should rare architectures be more important than those other
> >> rare forms of Linux usage?
> >
> > Because Linus' tree is supposed to build on those architectures. As long
> > as that's the case, linux-next should build on them too.
> >
> >> In fact those rare ways of building and booting the kernel i mentioned are
> >> probably used _more_ than half of the architectures that linux-next
> >> build-tests ...
> >
> > I don't know and you don't know either. That's just pure speculation and
> > therefore meaningless.
>
> If only the CE Linux Forum member companies would publish figures about the
> number of Linux devices they push onto the world population...
>
> Yes I know, this still excludes `obsolete' architectures like parisc and
> alpha, but it would change the balance towards x86 (and powerpc?) drastically.

You apparently forgot about ARM.

Rafael