From: Stephane Marchesin on
On Thu, Mar 4, 2010 at 23:44, Ingo Molnar <mingo(a)elte.hu> wrote:
>
> * Pekka Enberg <penberg(a)cs.helsinki.fi> wrote:
>
>> On Fri, Mar 5, 2010 at 8:49 AM, Ingo Molnar <mingo(a)elte.hu> wrote:
>> > The conclusion is crystal clear, breaking an ABI via a "flag day"
>> > cleanup/feature/etc is:
>> >
>> >  - wrong
>> >
>> >  - harmful
>> >
>> >  - limits the developer base
>> >
>> >  - limits the tester base
>> >
>> >  - wastes time and effort. (fewer developers/testers means that while _this_
>> >    feature was easier to add, all your _future_ features will be a bit harder
>> >    to do. It compounds up.)
>> >
>> >  - so it hurts even the very developer who is most convinced that this was the
>> >    right thing to do
>> >
>> > It's a bad technical decision throughout. It's masochistic and often suicidal
>> > to just about any project in essence. I've seen projects that did it once and
>> > died just due to that single act of stupidity. I've seen projects that have
>> > done it a few times and took the usage hit, limped along with the wounds and
>> > never grew to the size they could have achieved. I've seen projects that did
>> > it once, took the hit, learned from it and never did it again.
>>
>> Agreed. What bothers me in this discussion is that people keep bringing up
>> the fact that nouveau is mostly developed by volunteers and thus it doesn't
>> make sense to make sure it's backwards (or forwards) compatible. But the way
>> I see it, it's the complete opposite. It's _more_ important to support ABIs
>> for community-driven efforts because you're relying on people who by
>> definition don't have time to waste. While the nouveau people might have
>> good intentions, I'm afraid they might be severely limiting their developer
>> and tester base because they're not focused on real world problems (like the
>> ones Linus is seeing).
>
> Yeah. I've seen a few other bad arguments as well:
>
>   'exploding test matrix'
>
> This is often the result of _another_ bad technical decision:
> over-modularization.
>
> Xorg, mesa/libdrm and the kernel DRM drivers pretty much share this signature:
>
>  - it's developed by the same tightly knit developer base who often cross
>    between these packages. Features often need changes in each component.
>
>  - a developer to be able to do real work has to have the latest sources
>    of all these components.
>
>  - a user just uses whatever horizontal version cut the distro did and never
>    truly 'mixes' these components as a conscious decision.
>
>  - distros just try to get the latest and most capable but still stable
>    version. Desperately so. Often they will create a version mix that was
>    never tested by developers in that form. They'll expose users to ABI
>    combinations that were never really intended. They have trouble
>    bootstrapping and stabilizing those essentially random combinations and
>    then have trouble applying stability and security fixes.
>
> The thing is, if development has such characteristics then it's pretty clearly
> not 3-4 separate projects but _one_ abstract project. [*]
>
> So the 'exploding test matrix' is simply the result of: creating ABIs between
> 3-4 _artificial components of the same project_ and then going through
> developer hell living with that mistake. [**]
>
> It's a bit as if we split up the kernel into 'microkernel' components, did a
> VFS ABI, MM ABI, drivers ABI, scheduler ABI, networking ABI and arch ABIs, and
> then tried to develop them as separate components.
>
> If we did, then Linux kernel development would slow down massively while
> in reality everyone would _still_ have to have the latest and greatest source
> checked out to do some real development work and to be able to implement
> features that affect the whole kernel ...
>
> Linux would become an epic fail of historic proportions if we ever did that.
>

Yes, that is exactly the problem we are facing. And you know what? All
graphics driver devs agree on that, but there is no obvious solution.

Here are the interfaces which are part of this problem:
- drm interface (drm wrappers as seen from the driver, drm ioctls from
the user space)
- X.Org acceleration interface (EXA and friends as seen from the
driver, XRender and friends as seen from the apps)
- Mesa interface (Gallium or mesa driver interface from the driver,
OpenGL seen from the app)

Any solution will involve merging two or more components together to
remove interfaces, so let's observe pairwise what could be merged and
the drawbacks:
- Merge DRM and Mesa drivers. Technically we could do this, but then
what happens when a new OpenGL version/feature comes around? Yes, we
get a new mesa interface. So we're exchanging one interface for
another here. No gain.
- Merge DDX and DRM driver. Same problem as before: whenever the 2D
interface changes, we have to update the DDX anyway. Again, no gain
in sight.
- Merge Mesa and DDX drivers. This makes sense, and this is where
gallium is going by providing 2D and GL acceleration on top of a
single, common gallium driver. So yes, I have hopes that this one will
happen eventually, at least on non-intel hardware.
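
To put rough numbers on the 'exploding test matrix' and on what a merge
buys, here is a minimal sketch. The component names follow the list
above; the version labels and the count of three releases per component
are made up purely for illustration.

```python
from math import prod


def count_combinations(versions_per_component):
    """Count the distinct version mixes that could end up installed together."""
    return prod(len(versions) for versions in versions_per_component.values())


# Hypothetical release labels, for illustration only.
separate = {
    "drm": ["r1", "r2", "r3"],
    "ddx": ["r1", "r2", "r3"],
    "mesa": ["r1", "r2", "r3"],
}

# Same labels, but with the Mesa and DDX drivers released as one unit.
merged = {
    "drm": ["r1", "r2", "r3"],
    "mesa+ddx": ["r1", "r2", "r3"],
}

print(count_combinations(separate))  # 3 * 3 * 3 = 27
print(count_combinations(merged))    # 3 * 3 = 9
```

Each interface that is removed by a merge drops one factor from the
product, which is why merging only helps when it does not introduce a
replacement interface somewhere else.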

In a far away future, I can only hope that all acceleration (2D and
3D) will be done on top of GL only. That'll mean we can remove the DDX
entirely. We've been talking about this for 6 years or so. But as you
know, it's far from the case yet.

Stephane
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Jeff Garzik on
On 03/04/2010 05:59 PM, Adam Jackson wrote:
> On Thu, 2010-03-04 at 17:21 -0500, Jeff Garzik wrote:
>
>>> # sed -i 's/\<kernel\>.*/& nouveau.modeset=0/g' /etc/grub.conf
>>
>> Never tried this part.
>
> The bug I'm assuming you're referring to is
>
> https://bugzilla.redhat.com/show_bug.cgi?id=519298
>
> in which you merely remove the nouveau userspace component, and in which
> I can't tell if you built nouveau into the kernel or not, but I assume
> you didn't based on your previous post. The X server does only try the
> one driver before falling back to vesa, which is a bug in the fallback
> logic I suppose. I've (blindly) fixed that for F13 now.

Thanks. Can this be put into F12 too?


> However, the log in that bug only shows you using the built-in
> autoconfig logic, and not an xorg.conf file. So, given you were talking
> about a kernel without nouveau, I am left to assume one of:
>
> - you didn't try writing an xorg.conf fragment
> - you did, and it didn't work anyway
>
> The latter case is entirely plausible, as nv is not the sort of driver
> that gets a lot of love, but I'm not aware of any open bugs about gf9800
> in particular in nv.

The latter... would modeset in grub interfere, perhaps?

Jeff


From: Alan Cox on
On Thu, 04 Mar 2010 14:32:02 -0500
Jeff Garzik <jeff(a)garzik.org> wrote:

> On 03/04/2010 02:04 PM, Matthew Garrett wrote:
> > "Please note that these drivers are under heavy development, may or may
> > not work, and may contain userspace interfaces that most likely will be
> > changed in the near future."
>
> Shipping it as the default Fedora driver for NVIDIA hardware makes that
> text largely irrelevant.

Why ? Fedora isn't special, Fedora is just a distribution that uses the
Linux kernel. If Fedora turns on staging drivers then Fedora has to
accept that stuff breaks and manage that expectation with its users.
Staging is not and has not been API stable. If staging is going to be API
stable then it is useless and may as well be deleted.

In this case Linus is just a random Fedora user having a distro problem.
I don't even see what it has to do with linux-kernel. The libdrm problem
and difficulty using Fedora libdrm with current upstream kernel is a
Fedora problem not a kernel problem.

The kernel staging tree is unstable for API. Whether that's the Nouveau
guys breaking Fedora, or submissions to network drivers breaking/removing
bogus APIs in stuff being cleaned up - whatever, then that's how the cookie
crumbles. DRM has just made it all horribly more visible because the
libdrm/kernel stuff has such a complex and closely tied interface.

Serious discussion point perhaps should be: is the libdrm so close to the
kernel it ought to be in the same git tree ? Alternatively does it need
to be easier to have multiple Nouveau libdrms autoselected according to
the kernel side versioning. ELF library versioning is not rocket science
and both the old and new libraries exist and can be installed so all the
bits are present except for the wrapper to load the right sublibrary yes ?
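
A rough sketch of what such a wrapper could look like, assuming the
kernel-side interface version is already known as a (major, minor) pair
(for example from whatever the DRM version query reports). The table
entries, version thresholds and library file names below are
hypothetical and do not correspond to real libdrm sonames.

```python
import ctypes

# Hypothetical table: minimum kernel-side interface version -> sublibrary.
# Names and thresholds are illustrative assumptions, not real libdrm sonames.
SUBLIBRARIES = [
    ((0, 0), "libdrm_nouveau_old.so.1"),
    ((1, 0), "libdrm_nouveau_new.so.1"),
]


def pick_soname(kernel_version, table=SUBLIBRARIES):
    """Pick the sublibrary with the highest minimum version <= kernel_version."""
    candidates = [entry for entry in table if entry[0] <= kernel_version]
    if not candidates:
        raise LookupError(f"no sublibrary supports interface {kernel_version}")
    return max(candidates, key=lambda entry: entry[0])[1]


def load_sublibrary(kernel_version):
    """Load the matching sublibrary; raises OSError if it is not installed."""
    return ctypes.CDLL(pick_soname(kernel_version))


print(pick_soname((0, 16)))  # libdrm_nouveau_old.so.1
print(pick_soname((1, 2)))   # libdrm_nouveau_new.so.1
```

The selection is kept separate from the actual load so the mapping can
be checked without either library being installed.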

Alan
From: Alan Cox on
> So man up, guys. Face the problem, rather than say "well, it's staging",
> or "well, we can revert it". Neither of those really solve anything in the
> short run _or_ the long run.

Linus, stop and think for a minute instead. Maybe a timeline would help.


Nouveau development starts
People ship highly experimental stuff for testers
Code gets to the "works but really needs a clean up" point

Linus demands it is merged
Linus gets it merged as staging

Developers start doing the cleanup

Linus throws a tantrum because they did the cleanup after
merge


*YOU* forced the early merge (rightly or wrongly)
*YOU* effectively created the API break problem

So blaming other people is quite out of order.

Alan
From: Alan Cox on
> The conclusion is crystal clear, breaking an ABI via a "flag day"
> cleanup/feature/etc is:

Ingo, go read the staging Kconfig. It's crystal clear, and for lots of the
vendor junk in there that is being cleaned up it would be *insane* to keep
the old APIs.

See, there's a bigger offence than breaking an ABI - it's called not RTFM.

Alan