From: Andi Kleen on
On Wed, Feb 17, 2010 at 01:16:48PM +0300, Nikita V. Youshchenko wrote:
> > "Nikita V. Youshchenko" <yoush(a)cs.msu.su> writes:
> > > I'm developing a device driver that, in its ioctl()s, accepts a
> > > complex data structure. Before doing its operation, it performs a large
> > > number of checks on whether the data is valid. If one of those checks
> > > fails, the driver returns -EINVAL.
> > >
> > > Unfortunately this -EINVAL is not really useful. E.g. a developer
> > > sitting in his IDE and debugging his code will see ioctl()
> > > returning -EINVAL and will have a hard time finding out what exactly
> > > is wrong.
> > >
> > > Before inventing driver-specific extended error reporting, I'd like to
> > > ask if there is anything more or less generic for this.
> > > I believe the situation where -Exxx is too weak an interface for error
> > > reporting is common.
> >
> > This is a very common problem in Linux unfortunately. I always
> > describe that as the "ed approach to error handling". Instead
> > of giving an error message you just give ?. Just ? happens
> > to be EINVAL in Linux.
> >
> > My favourite example of this is the configuration of the networking
> > queueing disciplines, which configure complicated data structures and
> > algorithms and in many cases have tens of different error conditions
> > based on the input parameters -- and they all just report EINVAL.
> >
> > The standard way (standard kludge or standard workaround would be a
> > better description) is to use printk; often guarded by a special
> > kernel tunable or ifdef to avoid flooding the log in the normal case.
> >
> > IMHO it would be best to simply add a way to return strings directly
> > in this case (a la plan9). This would be probably not too hard to
> > implement. It's not there unfortunately.
> >
> > This could be done with one of the message oriented protocols,
> > e.g. netlink or read/write on a special minor.
>
> Why not create a generic solution for this, if one does not exist yet?

Someone would need to do it. Yes I think it would be a worthy project.

The trick is also to get around the objections of the "but we always
did it this way" Unix traditionalists.

>
> For example, have a "last error" string associated with task_struct, that:
> - will be cleared on each syscall entry,
> - while syscall is running, may be filled with printf-style routines,
> - may be accessible from userspace with additional syscall [that obviously
> should not reset error]?
>
> This will give driver writers a common interface for extended error
> reporting...

You would need a way to save/restore that string too (like it works
with errno) otherwise libraries cannot use it safely. Also
it would be good to have something that does not impact the system
call fast path for a non error call.
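
The "last error" idea, together with the save/restore requirement, can be modelled in user space. What follows is only a sketch under stated assumptions: last_err_set(), last_err_get(), last_err_save(), last_err_restore() and fake_ioctl() are hypothetical names, a thread-local buffer stands in for a field in task_struct, and no such interface exists in the kernel.

```c
#include <assert.h>
#include <errno.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

#define LAST_ERR_MAX 128

/* Hypothetical per-task buffer; the proposal would put it in task_struct. */
static _Thread_local char last_err[LAST_ERR_MAX];

/* Simulated syscall entry: a single byte store, so the success path stays cheap. */
static void last_err_clear(void)
{
	last_err[0] = '\0';
}

/* printf-style setter, called on the error path only. */
static void last_err_set(const char *fmt, ...)
{
	va_list ap;

	assert(fmt != NULL);
	va_start(ap, fmt);
	vsnprintf(last_err, sizeof(last_err), fmt, ap);
	va_end(ap);
}

/* Stand-in for the "additional syscall": copies the text out, does not reset it. */
static size_t last_err_get(char *buf, size_t len)
{
	if (len == 0)
		return 0;
	snprintf(buf, len, "%s", last_err);
	return strlen(buf);
}

/* Save/restore, so a library can make calls of its own without
 * clobbering the message its caller has not looked at yet. */
struct last_err_saved {
	char text[LAST_ERR_MAX];
};

static void last_err_save(struct last_err_saved *s)
{
	memcpy(s->text, last_err, sizeof(last_err));
}

static void last_err_restore(const struct last_err_saved *s)
{
	memcpy(last_err, s->text, sizeof(last_err));
}

/* Hypothetical argument block and handler, for illustration only. */
struct wombat_cfg {
	int wombats;
};

static int fake_ioctl(const struct wombat_cfg *cfg)
{
	last_err_clear();
	if (cfg->wombats < 5) {
		last_err_set("wombats=%d is below the minimum of %d",
			     cfg->wombats, 5);
		return -EINVAL;
	}
	return 0;
}
```

A caller would check the return value first and only fetch the text with last_err_get() when it sees -EINVAL, so a successful call pays for nothing beyond the clear on entry.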

From the basic semantics I think I would prefer a way
associated with each syscall. It could probably be fit into
many syscall ABIs, but that would need architecture-specific
changes, which are difficult to coordinate (Linux has too many
architectures, many of them with inactive maintainers).

One way to do that would be an "extended ioctl" syscall that supports
this in a generic way (and perhaps could fix some of the other problems
of ioctl too, like better type safety).

Designing such a thing might end up being a rat-hole (and you would
probably need to be very careful to avoid the second-system effect).

Of course the qdiscs and other code that uses netlink instead would also
need something equivalent.

Also I expect someone would come up with localization issues, although
the classical "translation database" approach would probably work
anyway.

-Andi

--
ak(a)linux.intel.com -- Speaking for myself only.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Alan Cox on
> For example, have a "last error" string associated with task_struct, that:
> - will be cleared on each syscall entry,
> - while syscall is running, may be filled with printf-style routines,
> - may be accessible from userspace with additional syscall [that obviously
> should not reset error]?
>
> This will give driver writers a common interface for extended error
> reporting...

That's probably overkill. For almost any ioctl type interface the only
thing you *need* to make more sense is the address of the field that was
deemed invalid.

So in your ioctl handler you'd do something like

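	/* get_user() returns non-zero (-EFAULT) if the user pointer is bad */
	if (get_user(v, &foo->wombats))
		return -EFAULT;
	if (v < 5) {
		/* error_addr() is the proposed helper, it does not exist yet */
		error_addr(&foo->wombats);
		return -EINVAL;
	}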
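
A user-space model may make the field-address idea easier to follow. This is a sketch under assumptions, not a kernel interface: struct foo_args, the bad_field variable and foo_field_name() are invented for illustration, and error_addr() is the helper proposed above, here reduced to storing a pointer.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical ioctl argument block. */
struct foo_args {
	int wombats;
	int burrows;
};

/* Where the model keeps the address of the rejected field. */
static const void *bad_field;

static void error_addr(const void *field)
{
	bad_field = field;
}

/* Model of the handler: every check names the field it rejected. */
static int foo_ioctl(const struct foo_args *foo)
{
	assert(foo != NULL);
	bad_field = NULL;
	if (foo->wombats < 5) {
		error_addr(&foo->wombats);
		return -EINVAL;
	}
	if (foo->burrows < 1) {
		error_addr(&foo->burrows);
		return -EINVAL;
	}
	return 0;
}

/* Caller side: map the reported address back to a field name
 * by its offset inside the structure that was passed in. */
static const char *foo_field_name(const struct foo_args *foo, const void *addr)
{
	if (addr == NULL)
		return "(none)";
	switch ((uintptr_t)addr - (uintptr_t)foo) {
	case offsetof(struct foo_args, wombats):
		return "wombats";
	case offsetof(struct foo_args, burrows):
		return "burrows";
	default:
		return "(unknown)";
	}
}
```

The point of the design is that the caller gets something machine-readable: a debugger or an interpreter can act on an offset without parsing any text.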

Returning text is all very well, and printk can help debug, but neither
actually helps application code or particularly helps interpreters to dig
into the detail and act themselves to fix a problem or understand it. It
also costs material amounts of unswappable memory and also disk storage
for the kernel image on embedded devices.

Two other problems text returns bring up are ambiguity and translations -
it's almost impossible to keep them unique even within a big module. It's
also possible to get things like typos in the returned text or
mis-spellings that you then can't fix because some other app now has

if (strcmp(returned_err, "No such wombat evalueted")==0) {
...
}

in it. (HTTP 'referer' being a dark warning from history ...)

A lot of other systems keep message catalogues, often indexed by
module:error. Text lookups are done in userspace (easy to do with existing
interfaces), with the OS providing generic, specific, and identifying
module info.

I guess the Linux extension to that would end up as

extended_error(&foo->wombats, E_NOT_A_VALID_BREEDING_POPULATION);

and internally expand to include THIS_MODULE and extract the module name.
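
As a rough illustration of the module:error scheme, here is a user-space sketch. Everything in it is an assumption made for the example: the record layout, the THIS_MODULE_NAME string standing in for THIS_MODULE, the second error code and the catalogue contents. Only the extended_error(field, code) call shape comes from the text above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical per-module error codes. */
enum {
	E_NOT_A_VALID_BREEDING_POPULATION = 1,
	E_TOO_MANY_BURROWS = 2,
};

/* What the kernel side would record: who failed, why, and on which field. */
struct ext_error {
	const char *module;
	int code;
	const void *field;
};

static struct ext_error last_ext_error;

/* Stand-in for the module name that THIS_MODULE would provide. */
#define THIS_MODULE_NAME "wombat_drv"

/* The macro adds the module name so that callers do not have to. */
#define extended_error(field_ptr, err_code) \
	ext_error_record(THIS_MODULE_NAME, (field_ptr), (err_code))

static void ext_error_record(const char *module, const void *field, int code)
{
	assert(module != NULL);
	last_ext_error.module = module;
	last_ext_error.code = code;
	last_ext_error.field = field;
}

/* Userspace message catalogue, indexed by module:error. */
struct catalogue_entry {
	const char *module;
	int code;
	const char *text;
};

static const struct catalogue_entry catalogue[] = {
	{ "wombat_drv", E_NOT_A_VALID_BREEDING_POPULATION,
	  "not a valid breeding population" },
	{ "wombat_drv", E_TOO_MANY_BURROWS, "too many burrows" },
};

/* Returns the message text, or NULL if the pair is not in the catalogue. */
static const char *catalogue_lookup(const char *module, int code)
{
	size_t i;

	for (i = 0; i < sizeof(catalogue) / sizeof(catalogue[0]); i++) {
		if (catalogue[i].code == code &&
		    strcmp(catalogue[i].module, module) == 0)
			return catalogue[i].text;
	}
	return NULL;
}
```

Because the text lives only in the userspace catalogue, a typo in a message can be fixed, or the message translated, without touching the numeric code that applications compare against.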

There's another related problem here too - Unix style errors lack the
ability of some OS systems to report "It worked but ....." which leads to
interface oddities like termios where it reports "Ok" but you have to
inspect the returned structure to see if you got what you requested.
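
The termios case can be shown with the standard POSIX calls: tcsetattr() is specified to succeed if it could carry out any of the requested changes, so a careful caller reads the settings back with tcgetattr() and compares. The helper name and its three-way return convention below are assumptions for this sketch, and the comparison deliberately skips the c_cc control characters to stay short.

```c
#include <assert.h>
#include <termios.h>
#include <unistd.h>

/*
 * Apply *want to fd and verify the result.
 * Returns  0 if the terminal now has exactly the requested flags and speeds,
 *          1 if tcsetattr() said "Ok" but something was not applied,
 *         -1 if either call failed (errno is set by the failing call).
 */
static int set_termios_checked(int fd, const struct termios *want)
{
	struct termios got;

	assert(want != NULL);
	if (tcsetattr(fd, TCSANOW, want) < 0)
		return -1;
	if (tcgetattr(fd, &got) < 0)
		return -1;

	/* "It worked but ...": success alone does not mean all was applied. */
	if (got.c_iflag != want->c_iflag ||
	    got.c_oflag != want->c_oflag ||
	    got.c_cflag != want->c_cflag ||
	    got.c_lflag != want->c_lflag ||
	    cfgetispeed(&got) != cfgetispeed(want) ||
	    cfgetospeed(&got) != cfgetospeed(want))
		return 1;
	return 0;
}
```

The return value 1 is exactly the "It worked but ....." status that a plain errno cannot express.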

Doesn't look too hard to add some of this or something similar as you
suggest and while it would take a long time to get coverage you have to
start somewhere.

Alan
From: Andi Kleen on
Hi Alan,

> That's probably overkill. For almost any ioctl type interface the only
> thing you *need* to make more sense is the address of the field that was
> deemed invalid.

Take a look at all the return -EINVALs in net/sched/sch_cbq.c
and then tell me if you really still believe just knowing the field
is enough to diagnose those. A common issue for example
is if it depends on the current state somehow.

> actually help application code or particularly help interpreters to dig
> into the detail and act themselves to fix a problem or understand it. It
> also costs material amounts of unswappable memory and also disk storage
> for the kernel image on embedded devices.

Trading developer time for a few bytes saved is exactly the wrong
tradeoff, even on a small system. In principle it could be CONFIGed
of course, but I suspect it wouldn't be worth it (especially
compared to all the other bloat)

>
> Two other problems text returns bring up are ambiguity and translations -
> it's almost impossible to keep them unique even within a big module. It's

For translations the "pragmatic text database" works reasonably well
I think. Also you don't necessarily need them to be unique
(if the English string is not unique, why would the translation need to be?)

Sure text won't solve all problems either, but it's infinitely
better than EINVAL.


> also possible to get things like typos in the returned text or
> mis-spellings that you then can't fix because some other app now has
>
> if (strcmp(returned_err, "No such wombat evalueted")==0) {
> ...
> }
>
> in it. (HTTP 'referer' being a dark warning from history ...)

You could get numbers wrong too. There's really no cure against
that.

But yes it's a good point -- would need to make sure that the spelling
police would direct their efforts elsewhere as much as possible.


>
> A lot of other systems keep message catalogues often indexed by
> module:error. Text lookups in userspace (easy to do with existing
> interfaces), and the OS providing generic, specific, and identifying
> module info.

That's the IBM approach. I have some doubts it would really work
for a distributed environment like Linux. I believe it has even
been tried already (e.g. there's a Japanese project for such
a catalog). I don't think it works that well.

I think I would prefer just text strings. In principle one
could still develop a convention inside them though.

>
> I guess the Linux extension to that would end up as
>
> extended_error(&foo->wombats, E_NOT_A_VALID_BREEDING_POPULATION);
>
> and internally expand to include THIS_MODULE and extract the module name.

Hmm, yes including the module might be reasonable.

> There's another related problem here too - Unix style errors lack the
> ability of some OS systems to report "It worked but ....." which leads to
> interface oddities like termios where it reports "Ok" but you have to
> inspect the returned structure to see if you got what you requested.

I suspect that's better solved in some way specific to that call.
I don't think it's all that common anyways.

-Andi

--
ak(a)linux.intel.com -- Speaking for myself only.
From: Dr. David Alan Gilbert on
* Andi Kleen (andi(a)firstfloor.org) wrote:
> On Wed, Feb 17, 2010 at 01:16:48PM +0300, Nikita V. Youshchenko wrote:
> > Why not create a generic solution for this, if one does not exist yet?
>
> Someone would need to do it. Yes I think it would be a worthy project.
>
> The trick is also to get around the objections of the "but we always
> did it this way" Unix traditionalists.

I'd wondered about some form of halfway house where the error
value is expanded but could be truncated for compatibility - i.e.
if at the moment we had:

return -EINVAL;

it would become:

return ERRORNUM(EINVAL, BADLENGTH);

and that would expand to something like:
return -(EINVAL + (BADLENGTH << ESHIFT));

existing syscall handlers could mask the extended error bits out on the way
back, and a new entry could pass the whole error value back where user space
could separate out the other part of the error.

This still feels quite like stretching the traditional way, and it comes
at the cost of still having the same problems (e.g. having to define a
list of error values).
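
The packing and the compatibility truncation can be sketched as follows. The value of ESHIFT, the extension codes and the helper names are assumptions made for illustration; only the ERRORNUM(base, ext) shape is taken from the text above. Note the parentheses around the shift: in C, + binds tighter than <<, so EINVAL + BADLENGTH << ESHIFT would shift the sum.

```c
#include <assert.h>
#include <errno.h>

/* Assumed layout: classic errno in the low ESHIFT bits, extension above. */
#define ESHIFT     8
#define EBASE_MASK ((1 << ESHIFT) - 1)

/* Hypothetical extension codes. */
enum {
	BADLENGTH = 1,
	BADALIGN = 2,
};

/* The base errno must fit below the extension bits. */
static_assert(EINVAL <= EBASE_MASK, "base errno does not fit in ESHIFT bits");

#define ERRORNUM(base, ext) (-((base) + ((ext) << ESHIFT)))

/* Classic errno part of a negative return value. */
static int err_base(int ret)
{
	return (-ret) & EBASE_MASK;
}

/* Extension part; 0 for a plain old -Exxx return. */
static int err_ext(int ret)
{
	return (-ret) >> ESHIFT;
}

/* What an existing syscall path would hand back: extension masked out. */
static int err_truncate(int ret)
{
	return -err_base(ret);
}
```

A plain `return -EINVAL;` decodes to extension 0, so old code and new code can share the same decoding helpers.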

One hard problem is that often the thing that actually returns the error
has just got a failure from something it called which didn't
return any diagnostics, so to do this properly errors have to be passed
around in a lot of places; you'll also have to figure out just how far
down you want to pass it - if a read() fails due to a SCSI error there
is a whole load of different levels of information from which you have
to choose what to return.

<snip>

Dave (who has stared at mmap's that have returned EINVAL for way too long)

--
-----Open up your eyes, open up your mind, open up your code -------
/ Dr. David Alan Gilbert | Running GNU/Linux on Alpha,68K| Happy \
\ gro.gilbert @ treblig.org | MIPS,x86,ARM,SPARC,PPC & HPPA | In Hex /
\ _________________________|_____ http://www.treblig.org |_______/
From: Andi Kleen on
> I'd wondered about some form of halfway house where the error
> value is expanded but could be truncated for compatibility - i.e.

Who would do the truncation?

> if at the moment we had:
>
> return -EINVAL;
>
> it would become:
>
> return ERRORNUM(EINVAL, BADLENGTH);

x86 only has about 12 bits in the current ABI btw.
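
To put a number on that: assuming the usual convention that only return values in [-4095, -1] are treated as errors (MAX_ERRNO is 4095 in include/linux/err.h), the room left for an extension code is small. The helper names and the idea of splitting the window at a shift are assumptions for this sketch.

```c
#include <assert.h>

/* Error window assumed from include/linux/err.h. */
#define MAX_ERRNO 4095

static_assert(MAX_ERRNO + 1 == (1 << 12), "error window is 12 bits wide");

/* A raw syscall return counts as an error only inside the window. */
static int is_error_return(long ret)
{
	return ret < 0 && ret >= -MAX_ERRNO;
}

/* Largest extension value that still fits if the base errno
 * occupies the low eshift bits of the window. */
static int max_ext_value(int eshift)
{
	return MAX_ERRNO >> eshift;
}
```

With 8 bits reserved for the classic errno, max_ext_value(8) leaves only 15 distinct extension codes, and anything larger would fall outside the window and look like a successful return.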

-Andi