Newsgroups: fa.linux.kernel
From: Alexander Viro <viro@math.psu.edu>
Subject: [RFC] devfs API
Original-Message-ID: <Pine.GSO.4.21.0211101348350.24061-100000@steklov.math.psu.edu>
Date: Sun, 10 Nov 2002 22:20:41 GMT
Message-ID: <fa.mkm1jov.1f1u1qb@ifi.uio.no>

	During the last couple of weeks I'd done a lot of digging in
devfs-related code.  Results are interesting, and not in a good sense.

	1) a _lot_ of functions exported by devfs are never used.  At
all.
	2) a bunch of functions are used only by the SGI hwgraph "port".
Moreover, a lot of codepaths in functions that *are* used outside of
said port are only exercised by hwgraph.  (More on that below)
	3) gratuitous arguments (read: all callers pass the same value).
	4) semantics of devfs_register() and devfs_unregister() is, er,
suboptimal and leads to rather messy cleanup paths in drivers.  (More on that
below).
	5) instead of using dev_t, devfs insists on keeping and passing
majors/minors separately, which makes both callers _and_ devfs itself
messier than necessary.
	6) devfs_entry is a nightmare.  It's a structure that contains
union (by node type), one of the fields of said union is a structure
that contains void *ops, flags, and a union of dev_t (stored as major/minor
pair) and size_t.  The reason for that (and charming expressions like
de->u.fcb.u.device.major) is an attempt to use one field for regular files,
character devices and block devices.  The *only* thing really common to all of
them is set of flags.
	7) idea of "slave" entries (== unregistered when master is
unregistered) hadn't worked out - it's easier to do explicit unregister
in 3 or 4 places that use these animals.

Overall, I would expect ~50% size reduction in devfs/base.c simply from
dropping unused code.  The rest would be much easier to debug.

Note that devfs is *seriously* burdened by hwgraph.  To the point where
it would be better to give SGI folks a private copy (they want slightly
different semantics anyway) and merge it with hcl.c and friends.  And
drop the unused stuff from devfs proper.

Situation when one obscure caller is responsible for ~30% of exported API
and pretty much gets unrestricted access to internals is Not Good(tm).
Situation when ~40% of exported API is either not used at all or used only
by devfs itself is also not pretty.

Note that there's a large part of devfs that is never used by hwgraph
code.  IOW, after such split *both* devfs and hwgraph copy would shrink
a lot.  Shared part is actually rather small and both sides would be
better off from clear separation - e.g. locking and refcounting mechanisms
should be different, judging by the tricks hcl.c tries to pull off.

Another problem is the semantics of devfs_register/devfs_unregister.
rm -rf and install(1) do not match each other, even if they were
suitable as primitives (which is a dubious idea, to start with).

First of all, devfs_unregister(devfs_register(...)) is _not_ a no-op.
It may leave you with a bunch of new directories that have to be removed
later.  Moreover, you don't even know how many of them were there before
devfs_register() and need to be removed.

What's more, after devfs_register() we are allowed to create objects in
the intermediate directories that might appear (we can call mkdir(2) in
there, to start with).  However, devfs_unregister() wipes these out, which
is arguably a wrong thing to do - they were not created by driver, so
driver has no business to decide when they should be gone.

The following scheme would give saner behaviour (and deal with devfs_register()
failing in the middle of the way, etc. more gracefully):
	* all entries get two new fields: integer R and boolean V.
	* VFS creation methods (mknod, symlink, etc.) set R on the new node
to 0 and set V to true for that node and all its ancestors.  Refcount of
new node is set to 1.
	* register creates nodes with V set to false and increments R on the
object we had created and all its ancestors.  Refcount of created nodes is
set to 1.  If we fail to create a node (out of memory) we undo all increments
of R we had done so far.
	* VFS removal methods (unlink and rmdir) fail if R is positive or
V is false.  Otherwise they set V to false.
	* unregister decrements R on the victim and all its ancestors.
	* node is detached from parent whenever (R == 0 && !V) becomes true.
After that refcount of node is decremented.
	* node is freed when refcount reaches 0.
	* root is originally created with R set to 0 and V set to true.
	* places that currently grab/drop refcount still do that.

That will guarantee that
	* once userland creates an object, only userland can remove it or
any of its ancestors.
	* object can't be removed when kernel holds it or some of its
descendents registered.
	* unregister(register(...)) is a no-op
	* register failing mid-way cleans up after itself
	* objects are always removed by those who had created them.
	* as long as driver unregisters all objects it had registered,
it doesn't have to worry about intermediate directories, etc.
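
A minimal userspace sketch of the R/V scheme listed above, not devfs code.
Only the fields R and V and the rules come from the proposal; struct node,
vfs_create(), vfs_remove(), do_register(), do_unregister() and the
live_nodes counter are hypothetical names made up for illustration.  It
leaves out names, locking and the usual rmdir emptiness check, and
do_register() simply creates a chain of `depth` new nodes to stand in for
intermediate directories plus the leaf.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical node; only R and V are named in the proposal. */
struct node {
	struct node *parent;
	int R;		/* registrations on this node or below it */
	bool V;		/* set by VFS creation, cleared by VFS removal */
	int refcount;
	bool attached;	/* still linked into the parent directory */
};

static int live_nodes;	/* demo-only allocation counter */

static struct node *node_alloc(struct node *parent)
{
	struct node *n = calloc(1, sizeof(*n));	/* R = 0, V = false */
	if (!n)
		return NULL;
	n->parent = parent;
	n->refcount = 1;
	n->attached = true;
	live_nodes++;
	return n;
}

static void node_put(struct node *n)
{
	if (--n->refcount == 0) {
		free(n);
		live_nodes--;
	}
}

/* Walk from n to the root, detaching every node with (R == 0 && !V). */
static void sweep_up(struct node *n)
{
	while (n) {
		struct node *parent = n->parent;	/* n may be freed below */
		if (parent && n->attached && n->R == 0 && !n->V) {
			n->attached = false;
			node_put(n);
		}
		n = parent;
	}
}

/* mknod/symlink/mkdir: V becomes true on the new node and all ancestors.
 * Passing parent == NULL creates the root (R = 0, V = true). */
static struct node *vfs_create(struct node *parent)
{
	struct node *n = node_alloc(parent);
	for (struct node *p = n; p; p = p->parent)
		p->V = true;
	return n;
}

/* unlink/rmdir: refused for the root, while R > 0, or while V is false. */
static int vfs_remove(struct node *n)
{
	if (!n->parent || n->R > 0 || !n->V)
		return -1;
	n->V = false;
	sweep_up(n);
	return 0;
}

/* Create `depth` nested nodes under dir (depth >= 1), then bump R upward. */
static struct node *do_register(struct node *dir, int depth)
{
	struct node *n = dir;
	for (int i = 0; i < depth; i++) {
		struct node *child = node_alloc(n);
		if (!child) {		/* undo the partial chain */
			sweep_up(n);
			return NULL;
		}
		n = child;
	}
	for (struct node *p = n; p; p = p->parent)
		p->R++;
	return n;
}

static void do_unregister(struct node *victim)
{
	for (struct node *p = victim; p; p = p->parent) {
		assert(p->R > 0);
		p->R--;
	}
	sweep_up(victim);
}
```

In this sketch, registering a three-level path under a fresh root and
unregistering the leaf leaves only the root behind, while a node created
through vfs_create() in an intermediate directory keeps that directory
alive until userland removes both.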

The price of switching to that scheme is that we will need to switch
drivers to explicit cleanups (i.e. instead of devfs_mk_dir "loop" +
register <n> in that directory + remove "loop" upon the exit we would
register loop/<n> when we initialize struct loop_device and unregister
it when we clean struct loop_device - actually, that could be done as
side effects of add_disk()/del_gendisk()).

Transition to explicit cleanups can be done before any changes in devfs
proper - the sequence is
	* add explicit devfs_unregister() (or devfs_find_and_unregister())
where needed in drivers; everything keeps working.
	* add aforementioned fields to devfs_entry and modify devfs_register()
and friends (see above).  No changes in drivers.
	* drop a shitload of devfs_mk_dir()/corresponding directory removals
from drivers; everything still works.
	* shift most of remaining calls in block device drivers into
add_disk()/del_gendisk(), etc.

I'd estimate that sequence as about a week of work - devfs changes in it can
be kept fairly local.  And IMNSHO it is needed, since it will make devfs
users much cleaner.

	Aside of that, there is a bunch of obvious cleanups - e.g. the 6th
argument of devfs_find_and_unregister() is (and should be) always 0; 3rd,
4th and 5th arguments are never looked at; the first one is NULL in almost
all cases and getting 3 or 4 exceptions into that form is absolutely trivial.
IOW, 4 arguments out of 6 are completely gratuitous and reducing the thing
to devfs_remove(pathname) is a matter of several one-liners.
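
As a rough illustration of that reduction, here is a self-contained sketch
in which devfs_remove(pathname) forwards to a six-argument function with
the constant values described above.  The devfs_find_and_unregister()
below is a recording stub, not the kernel function: its parameter names
and types, the devfs_handle_t typedef and the last_call record are
assumptions made so the example compiles on its own.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct devfs_entry *devfs_handle_t;	/* opaque stand-in */

/* Demo-only record of what the stub was last called with. */
static struct {
	devfs_handle_t dir;
	char name[64];
	int traverse_symlinks;
	int calls;
} last_call;

/* Stub with six arguments, standing in for the existing interface. */
static void devfs_find_and_unregister(devfs_handle_t dir, const char *name,
				      unsigned int major, unsigned int minor,
				      char type, int traverse_symlinks)
{
	(void)major; (void)minor; (void)type;	/* never looked at */
	assert(name != NULL);
	last_call.dir = dir;
	strncpy(last_call.name, name, sizeof(last_call.name) - 1);
	last_call.name[sizeof(last_call.name) - 1] = '\0';
	last_call.traverse_symlinks = traverse_symlinks;
	last_call.calls++;
}

/* The reduced form: the pathname is the only argument that varies. */
static void devfs_remove(const char *pathname)
{
	devfs_find_and_unregister(NULL, pathname, 0, 0, 0, 0);
}
```

A caller would then write devfs_remove("loop/0") instead of spelling out
five constant arguments at every call site.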

	There's a lot of such cases, but they definitely fall into "obvious
cleanups" category.  Really critical issues are getting sane model for
register/unregister (doable in small steps and I'm ready to do the entire
series) and separation of hwgraph - preferably giving it a filesystem of
its own with the interface hwgraph wants.



Newsgroups: fa.linux.kernel
From: Alexander Viro <viro@math.psu.edu>
Subject: Re: [RFC] devfs API
Original-Message-ID: <Pine.GSO.4.21.0211112039430.29617-100000@steklov.math.psu.edu>
Date: Tue, 12 Nov 2002 01:50:12 GMT
Message-ID: <fa.mom1l0v.1e1e1ic@ifi.uio.no>

On Mon, 11 Nov 2002, Ryan Anderson wrote:

> On Sun, Nov 10, 2002 at 05:19:42PM -0500, Alexander Viro wrote:
> > 	During the last couple of weeks I'd done a lot of digging in
> > devfs-related code.  Results are interesting, and not in a good sense.
>
> From an outsider point of view (and because nobody else responded), I
> think the big question here would be: Would you use it after this
> cleanup?
>
> If you say yes, I'd say that's a good sign in its favor.

The only way I'll use devfs is
	* on a separate testbox devoid of network interfaces
	* with no users
	* with no data - disk periodically populated from image on CD.

And that's regardless of that cleanup - fixing the interface doesn't solve
the internal races, so...



Newsgroups: fa.linux.kernel
From: "Theodore Ts'o" <tytso@mit.edu>
Subject: Re: [RFC] devfs API
Original-Message-ID: <20021112080417.GA11660@think.thunk.org>
Date: Tue, 12 Nov 2002 08:06:03 GMT
Message-ID: <fa.ha5fbqv.n4o08f@ifi.uio.no>

On Mon, Nov 11, 2002 at 08:49:22PM -0500, Alexander Viro wrote:
> The only way I'll use devfs is
> 	* on a separate testbox devoid of network interfaces
> 	* with no users
> 	* with no data - disk periodically populated from image on CD.
>
> And that's regardless of that cleanup - fixing the interface doesn't solve
> the internal races, so...

Hi Al,

It's good that you're trying to clean up the devfs code, but...

How many people are actually using devfs these days?  I don't like it
myself, and I've had to add a fair amount of hair to fsck's
mount-by-label/uuid code to deal with interesting cases such as
kernels where devfs is configured, but not actually mounted (it
changes what /proc/partitions exports).  So I'm one of those who have
never looked all that kindly on devfs, which shouldn't come as a
surprise to most folks.

In any case, if there aren't all that many people using devfs, I can
think of a really easy way in which we could simplify and clean up its
API by slimming it down by 100%......

						- Ted


Newsgroups: fa.linux.kernel
From: Alexander Viro <viro@math.psu.edu>
Subject: Re: [RFC] devfs API
Original-Message-ID: <Pine.GSO.4.21.0211120410130.29617-100000@steklov.math.psu.edu>
Date: Tue, 12 Nov 2002 09:43:58 GMT
Message-ID: <fa.mm69kfv.1dh6126@ifi.uio.no>

On Tue, 12 Nov 2002, Theodore Ts'o wrote:

> In any case, if there aren't all that many people using devfs, I can
> think of a really easy way in which we could simplify and clean up its
> API by slimming it down by 100%......

Well.  If Linus decides to remove devfs, I certainly won't weep for it.
However, I don't see any signs of that happening right now, and cleaned
interface is less PITA than what we have in the current tree.  Right now
I'm mostly interested in making the glue in drivers simpler and less
intrusive.  The fact that it leads to less/simpler code in devfs proper
is also a Good Thing(tm)...



Newsgroups: fa.linux.kernel
From: Alexander Viro <viro@math.psu.edu>
Subject: Re: devfs
Original-Message-ID: <Pine.GSO.4.21.0211120445570.29617-100000@steklov.math.psu.edu>
Date: Tue, 12 Nov 2002 10:05:32 GMT
Message-ID: <fa.mo6hl8v.1fhe0a3@ifi.uio.no>

On 12 Nov 2002, Xavier Bestel wrote:

> I'm wondering if a totally userspace solution could replace devfs?
> Something using hotplug + sysfs and creating directories/nodes as they
> appear on the system. This way, the policy (how do I name what) could be
> moved out of the kernel.

	Guys, may I remind you that Oct 31 had been more than a week ago?
Devfs *is* a race-ridden pile of crap, but we are in a goddamn feature
freeze, so let's get real.

	Interfaces can and should be cleaned up.  Ditto for semantics of
registering/unregistering - that allows making the glue in drivers more
straightforward.  Majestic flamewars about removing the thing completely/
moving it to userland/etc. are exercises in masturbation by that point.

	Again, WE ARE IN FEATURE FREEZE.

	Now, does somebody have technical comments on the proposed changes?



Newsgroups: fa.linux.kernel
From: Alexander Viro <viro@math.psu.edu>
Subject: Re: devfs
Original-Message-ID: <Pine.GSO.4.21.0211120807430.3700-100000@steklov.math.psu.edu>
Date: Tue, 12 Nov 2002 13:32:39 GMT
Message-ID: <fa.kjd02lv.17hgkpu@ifi.uio.no>

On Tue, 12 Nov 2002, Rando Christensen wrote:

> Rather than saying "Devfs sucks, and we can't do anything about it other
> than fix its more minor problems because we're in feature freeze", we
> should be saying "devfs sucks; we're a little late for feature freeze,
> so let's clean up what we can and work on something much better for the
> next time around."

Whatever is going to happen with devfs, believe me, the first thing
you'll need is stable glue in drivers - as simple and natural from the
driver POV as possible.  Complexity of doing development in 2.6 will
directly depend on the amount of code in drivers touched by patches.
BTDT - one can carry (and gradually merge) deep rewrites of core code
during -STABLE if it's done carefully.  But as soon as your patchset
hits the drivers - you are in for a world of pain just porting it to
next versions.

_That_ is critical - get interfaces right in -CURRENT, so that further
work would not cross these boundaries; then work in the resulting areas
becomes independent.

And in situations like that of devfs, simple rules for callers are pretty
much the main criterion - if users of the interface have to jump through
some hoops, it's a sign that interface needs changes...



Newsgroups: fa.linux.kernel
From: "Theodore Ts'o" <tytso@mit.edu>
Subject: Re: devfs + PCI serial card = no extra serial ports
Original-Message-ID: <20030307233839.GB24572@think.thunk.org>
Date: Fri, 7 Mar 2003 23:39:52 GMT
Message-ID: <fa.hcljd2c.mkk1g4@ifi.uio.no>

On Fri, Mar 07, 2003 at 02:51:45PM -0800, Bryan Whitehead wrote:
> It seems devfsd has an annoying "feature". I bought a PCI card to get a
> couple (2) more serial ports. The kernel doesn't seem to set up the
> serial ports at boot, so devfs never creates an entry. However, post
> boot, since there is no entries, I cannot configure the serial ports
> with setserial. So basically devfsd = no PCI based serial add on?

Yep.  I pointed this out as a flaw in devfs a long, long time
(years!) ago, but Richard chose not to listen to me.  Personally, I
solve this (and other) problems by simply refusing to use devfs.

						- Ted


Newsgroups: fa.linux.kernel
From: "Theodore Ts'o" <tytso@mit.edu>
Subject: Re: devfs + PCI serial card = no extra serial ports
Original-Message-ID: <20030311090703.GA13389@think.thunk.org>
Date: Tue, 11 Mar 2003 17:09:11 GMT
Message-ID: <fa.ha5vca3.g440ob@ifi.uio.no>

On Fri, Mar 07, 2003 at 03:57:56PM -0800, Ed Vance wrote:
>
> Will Bryan get the proper devfs entries if he patches serial.c to
> recognize his card at kernel initialization, or is there more
> weirdness expected?

The point is that with devfs, it requires a kernel patch.  And if you
have an ISA card, where you can't do this kind of autoconfiguration,
and you're using devfs, you're *toast*.  Without devfs, you just use
setserial to configure the necessary ports, and you're done.

(Granted, these days, the last point may not matter since ISA is
getting killed off pretty effectively by Microsoft refusing to
certify systems against recent Windows OS's if they contain ISA buses
--- one of the good things Microsoft has done for the computer
industry.  :-)

						- Ted


Newsgroups: fa.linux.kernel
From: viro@parcelfarce.linux.theplanet.co.uk
Subject: Re: devfs vs. udev
Original-Message-ID: <20031007182435.GW7665@parcelfarce.linux.theplanet.co.uk>
Date: Tue, 7 Oct 2003 18:25:42 GMT
Message-ID: <fa.noboest.1on022l@ifi.uio.no>

On Tue, Oct 07, 2003 at 10:53:10AM -0700, Greg KH wrote:

> A few things happened:
> 	- the devfs maintainer/author disappeared and stopped maintaining
> 	  the code.
> 	- devfs was found to have unfixable bugs
> 	- it was determined that the same thing could be done in
> 	  userspace (like udev.)

It went more like

	- it was determined that the same thing could be done in
	  userspace
	- devfs had been shoved into the tree in hope that its quality will
	  catch up
	- devfs was found to have fixable and unfixable bugs
	- the former had stayed around for many months with maintainer claiming
	  that everything works fine
	- the latter had stayed, period.
	- the devfs maintainer/author disappeared and stopped maintaining
	  the code.


Newsgroups: fa.linux.kernel
From: viro@parcelfarce.linux.theplanet.co.uk
Subject: Re: DEVFS is very good compared to UDEV
Original-Message-ID: <20031223230555.GF4176@parcelfarce.linux.theplanet.co.uk>
Date: Tue, 23 Dec 2003 23:11:39 GMT
Message-ID: <fa.nhrcdkr.1j7k3qh@ifi.uio.no>

On Tue, Dec 23, 2003 at 02:21:03PM -0800, Hua Zhong wrote:

> But I do have something fair to say about this "unmaintained" part.
>
> From my memory, at some point in time, somebody (Al Viro?) reviewed
> devfs code and flamed the author in public (lkml), throwing lots of bad
> impolite words to him, which I think was the biggest reason that the
> author stopped maintaining it.

Oh, really?  That "flame in public" was after _many_ months of pointing
to the same problems in private - with zero effect.

If maintainer sits on exploitable holes for ~18 months and does not care to do
anything, his code is unmaintained.  If same maintainer keeps pretending in
public that everything is fine, he can expect to have the truthfulness of his
statements challenged.  Also in public.  If the situation persists even after
that, then yes, there will be rather unflattering things to say.

Don't delude yourself - critical parts of devfs had not been maintained for
quite a while before Richard had disappeared.  It's not the effect of flames
- it's their cause and it predates them by _far_.


Newsgroups: fa.linux.kernel
From: "Theodore Ts'o" <tytso@mit.edu>
Subject: Re: DEVFS is very good compared to UDEV
Original-Message-ID: <20031224184027.GA5836@thunk.org>
Date: Wed, 24 Dec 2003 18:41:57 GMT
Message-ID: <fa.e7she8u.h6a2i2@ifi.uio.no>

On Tue, Dec 23, 2003 at 10:33:15PM -0500, Albert Cahalan wrote:
> How quickly we forget where those names came from!
> Richard Gooch originally used the traditional names.
> Linus ordered the names changed as a condition for
> acceptance into the kernel. Of course, that led to
> devfsd being a requirement, which kind of took away
> the whole point of using devfs.
>
> The Linus-approved names made devfs a pain to use,
> so few people used devfs and fewer helped out.
>

And this is **precisely** why putting the device names in the kernel
via devfs was such a mistake.  Naming is policy, and should not be in
the kernel.  Yes, the new style names were Linus's idea, but the
problem was that while he has extremely good taste with respect to
code, unfortunately he had exquisitely bad taste with respect to devfs
device names.  And when one person's taste (even if that person is
Linus) about names can cause such grief, it should be an object lesson
about why putting that kind of user-visible naming policy in the
kernel is such a bad idea.

> Richard is only to blame for his inability to spell
> /dev/disk correctly. For that, he belongs in "jail"
> with a "j". It was enough of an eyesore to make me
> give up on devfs.

Shouldn't that be "jail" with a "g"?  (As in "gaol"?  :-)

					- Ted



From: Theodore Ts'o <tytso@mit.edu>
Newsgroups: fa.linux.kernel
Subject: Re: [GIT PATCH] Remove devfs from 2.6.12-git
Date: Thu, 23 Jun 2005 13:00:11 UTC
Message-ID: <fa.d78hg6k.1tluvb0@ifi.uio.no>
Original-Message-ID: <20050623125814.GA29398@thunk.org>

On Thu, Jun 23, 2005 at 01:29:00AM -0700, Mike Bell wrote:
> On Wed, Jun 22, 2005 at 11:48:47PM -0700, Greg KH wrote:
> > No, plan for it.  Speak up.  Complain sometime a while ago instead of
> > right when it happens.
>
> Did so. Raised the same points I'm doing now. Said at the time I was
> going to wait and see if udev evolved to meet my needs before devfs got
> removed. That time has arrived, and it hasn't, hence this. Seems
> sensible to me.

So send patches to udev.  You seem to be complaining, but not doing
any of the work to help the situation.  Views expressed in that
fashion generally don't get much consideration.  (Some, but not much.)

> Debian? Their installer even relies on devfs instead of udev.

Debian's installer relies on devfs for historical reasons (it was too
hard to change without having sarge slip again for another year or
two); the system which is installed definitely does _not_ have to use
devfs, and _does_ ship with udev.  And the next version of Debian
(whenever it ships) will use something else.  See this URL for a
discussion regarding a _bug_ that was filed complaining that the
Debian installer was using devfs.

	http://lists.debian.org/debian-cd/2004/10/msg00012.html

> > But it was unmaintained clutter and mess.
>
> Which people have attempted to maintain. Presumably, had it not been
> marked OBSOLETE and thus useless to bother maintaining in the eyes of
> most, said patches would have gone in just like any other cleanup, and
> just like devfs cleanups had been doing until that point.

Not many.  If someone had been willing to put in the _concerted_,
long-term effort that has been put in by people like Greg K-H to
develop an alternative system, including submitting a patch to remove
the racy bits of devfs (not clear if anything would be left :-), then
we would be in a different place.  But we haven't, so we're not.  One
or two proposals to do work, without followup patches (and fixes to
the patches after people complain/criticize them) means that whoever
claimed they were willing to do maintenance work wasn't serious.

> Wouldn't disagree with this. In fact, I'll come right out and say that
> given the complete stagnation of devfs (which, I would argue, is not
> entirely its own fault as you claim) and udev's rapid advances (based to
> some degree I would argue on the attention it gained from things like
> the marking of devfs obsolete) udev is the better solution today
> (ignoring the migration headache and pretending both were completely new
> systems introduced today, the features it offers that devfs doesn't are
> worth much more than those devfs offers and udev doesn't, and it has a
> much stronger community behind it).
> ...
> So in closing, while I disagree with the whole way this has gone about,
> in terms of things that can be done /now/ all I'm asking is that kernel
> developers reevaluate the assumption that devfs is truly obsolete rather
> than merely deprecated, based on the fact that even after all this time
> and energy udev is still not seen as a complete replacement by everyone.

Is waiting another year really going to help?  If devfs is doomed,
then might as well make a clean break now.  Otherwise we'll wait
another year, and then more people will come out of the woodwork,
whining and whining for another stay of execution....

						- Ted



From: Theodore Ts'o <tytso@mit.edu>
Newsgroups: fa.linux.kernel
Subject: Re: [GIT PATCH] Remove devfs from 2.6.13
Date: Sun, 11 Sep 2005 11:02:51 UTC
Message-ID: <fa.d2ojdud.1p50t3f@ifi.uio.no>
Original-Message-ID: <20050911110214.GA16408@thunk.org>

On Sun, Sep 11, 2005 at 12:20:06AM -0700, David Lang wrote:
> >I'll bite - what distros are shipping a kernel 2.6.10 or later and still
> >using devfs?
> >
> I'll admit I don't keep track of the distros and what kernels and features
> they are using. I think I've heard people mention gentoo, but I
> haven't verified this.

Nope, not Gentoo --- Greg KH fixed gentoo a while ago.  :-)

> however with the thousands of linux distros out there I'd lay odds that
> _someone_ is doing it ;-)

Yes, but if none of the major distributions are doing it, then past a
certain point we should just pull the trigger and be done with it.
C'mon, devfs's impending removal has been announced for a year.  It's
not like anyone can complain that they didn't get enough warning.....

							- Ted
