[Openmcl-devel] Another Directory Bug

Gary Byers gb at clozure.com
Sun Apr 4 00:07:34 UTC 2010



On Sat, 3 Apr 2010, wws2 new wrote:

>
> The Clozure version of directory is a real barrier to porting my code from MCL, as a Common-Lisp function it should just work apart from a few tweaks to pathname protocols.  The latest problem is that the Clozure version of the Common-Lisp Function Directory fails to return all volumes on my system.  In particular it fails to return the one where the operating system is located.  (In my case CCL is located on a partition.)  Terminal shows all my partitions of course.
>
> Welcome to Clozure Common Lisp Version 1.4-r13119  (DarwinX8664)!
> ? (directory "Volumes/*" :directories t :files nil)
> (#P"/Volumes/work/" #P"/Volumes/DOS/" #P"/Volumes/play/" #P"/Volumes/public/" #P"/Volumes/plans/" #P"/Volumes/private/" #P"/Volumes/xSpare/")
> ?
>

I have two volumes/disk partitions mounted on this system, named "Macintosh HD"
and "HD2".  The boot/root volume is the one named "HD2".  In the shell:

[~] gb@antinomial> ls -l /Volumes/
total 8
lrwxr-xr-x   1 root  admin     1 Mar 30 19:33 HD2 -> /
drwxrwxr-t  34 root  admin  1224 Sep 10  2009 Macintosh HD


The name of the root volume is stored as a symbolic link (to #p"/",
the root of the file system) in #p"/Volumes"; other disk volumes are
actually mounted at mount points that are subdirectories of
#p"/Volumes".  You're asking DIRECTORY to include only the actual
directories that match #p"/Volumes/*" (well, actually you're using a
relative pathname; GUI programs like the CCL IDE run with the current
directory set to #p"/" for some reason ...) and that's what it's doing.

? (directory #p"/Volumes/*" :directories t :files nil)
(#P"/Volumes/Macintosh HD/")  ; the file/link isn't included in the output
? (directory #p"/Volumes/*" :directories t :files t :follow-links nil)
(#P"/Volumes/HD2" #P"/Volumes/Macintosh HD/") ; the link is included but not resolved
? (directory #p"/Volumes/*" :directories t :files t :follow-links t)
(#P"/" #P"/Volumes/Macintosh HD/") ; link is included and resolved, but of course
                                    ; the resolved pathname doesn't match

So, if a symbolic link whose name matches DIRECTORY's first argument
resolves to a directory, and :DIRECTORIES is true, should the link or its
target be automatically included in DIRECTORY's output (and if so, which ?)
I can imagine wanting both behaviors; a symbolic link that refers to a
directory is one or more of:

   - a file
   - a directory
   - both
   - neither

There's no guarantee that a symbolic link resolves to anything, or that it
doesn't resolve to itself.  A link (symbolic or otherwise) is first and
foremost a file.
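The point about links can be checked at the filesystem level, independently of any Lisp.  What follows is a minimal sketch in Python, assuming a Unix-like system where unprivileged symlink creation works; the helper name classify and the entry names are made up for illustration.  It looks at each entry twice: once as the entry itself (is it a link ?) and once through the link (what, if anything, does it resolve to ?).

```python
import os
import tempfile

def classify(path):
    """Describe an entry itself, then what it resolves to (hypothetical helper)."""
    is_link = os.path.islink(path)        # lstat-based: inspects the entry itself
    if not os.path.exists(path):          # stat-based: follows links
        target = "nothing (dangling or looping link)" if is_link else "missing"
    elif os.path.isdir(path):
        target = "directory"
    else:
        target = "file"
    return ("link" if is_link else "not a link", target)

with tempfile.TemporaryDirectory() as root:
    os.mkdir(os.path.join(root, "real-dir"))
    # A link that resolves to a directory.
    os.symlink(os.path.join(root, "real-dir"), os.path.join(root, "to-dir"))
    # A link that resolves to nothing at all.
    os.symlink(os.path.join(root, "no-such-thing"), os.path.join(root, "dangling"))
    # A link that resolves to itself.
    os.symlink(os.path.join(root, "loop"), os.path.join(root, "loop"))

    results = {name: classify(os.path.join(root, name))
               for name in sorted(os.listdir(root))}

for name, description in results.items():
    print(name, description)
```

Only "to-dir" is both a link and something that resolves to a directory; whether a listing should report it as a file, a directory, or its target is exactly the policy question raised above.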

> Doesn't anyone use Directory?

No, no one uses DIRECTORY in CCL.  Everyone has exactly the same
understanding of the issues and set of expectations that you do, and
no one ever tries to understand any more about the filesystem that
they use.  How could it be otherwise ?

Actually, that's not quite true ...  As far as I know, many people use
it; the most common complaint that I've heard is that it's necessary
to say :DIRECTORIES T in order to get directories included in the
output.  That's leftover MCL behavior, and I can't remember why we
ever thought that the :DIRECTORIES and :FILES arguments were good
ideas (or perhaps the issue is more one of why :DIRECTORIES should
default to NIL.)  People coming to CCL from MCL might find the
:DIRECTORIES/:FILES arguments and their defaults reasonable; people
coming from other environments have different expectations.

There's essentially one practical way (#_readdir or the related
#_readdir_r) to enumerate the contents of a directory on a Unix
filesystem (and that enumeration is a large part of what the DIRECTORY
function does.)  As soon as one traverses a directory, one encounters
magic "files" named "." and "..".  Do they match DIRECTORY's
"pathspec" argument ?  (What -precisely- does it mean for pathnames to
"match", anyway ?)  Are they files or directories ?  It's probably the
best policy for DIRECTORY to simply ignore these entries (and
certainly best to not naively traverse them), but that's ultimately a
matter of policy; if an implementation of DIRECTORY included entries
for "."  and ".." files in its output, that's not totally unreasonable
(though I'm not sure if it would match my expectations or anyone
else's).

Likewise, one might find other entries whose name starts with a dot;
common Unix convention is to treat these files as being hidden and to
exclude them from directory listings.  (The #p"/Volumes/" directory
typically contains an entry called ".DS_Store".)  Should DIRECTORY
include/traverse these entries (which, after all, could be said to
"match" a pathspec argument like #p"/Volumes/*") ?  Again (and perhaps
more clearly than in the case of . and ..), that's a matter of policy
and interpretation, which is another way of saying that it's a matter
of "guessing what most users would expect", along with "trying to
behave consistently or predictably."  CCL provides an :ALL keyword
argument (which defaults to NIL), which is vaguely analogous to "ls -a".
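The policy choices described in the last two paragraphs can be sketched as a small filter over the raw names that a readdir-style enumeration might hand back.  This is an illustration in Python under stated assumptions, not CCL's implementation: the function visible_entries and its include_hidden flag are hypothetical, and the flag is only loosely analogous to :ALL.  The raw list mirrors the #p"/Volumes/" directory described in this message.

```python
from fnmatch import fnmatchcase

def visible_entries(raw_names, pattern="*", include_hidden=False):
    """Apply two policy decisions to raw directory entry names (hypothetical helper)."""
    kept = []
    for name in raw_names:
        if name in (".", ".."):
            continue                      # policy 1: never report or traverse these
        if name.startswith(".") and not include_hidden:
            continue                      # policy 2: hide dotfiles unless asked
        if fnmatchcase(name, pattern):    # plain wildcard match on the name
            kept.append(name)
    return kept

# Names as a raw enumeration might return them, "." and ".." included.
raw = [".", "..", ".DS_Store", "HD2", "Macintosh HD"]

default_listing = visible_entries(raw)
all_listing = visible_entries(raw, include_hidden=True)

print(default_listing)
print(all_listing)
```

Note that the wildcard match by itself accepts ".DS_Store" for the pattern "*"; the entry only disappears from the default listing because of the separate "hidden by convention" rule.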

In the case that's causing problems - where a directory contains a
symbolic link that refers to another directory - I actually think that
treating the link as a file leads to more consistent and predictable
behavior than interpreting the link as a directory would ("it's not
a directory, but it happens to resolve to one.")  (This case seems
to indicate that the whole :DIRECTORIES/:FILES distinction makes less
sense than it may have on Classic MacOS 20+ years ago.)

DIRECTORY's supposed to determine what files have names matching its
pathspec argument and return a list of the TRUENAMEs of those files.
One might assume that that's clear and unambiguous, but a little bit
of exposure to the real world tends to be rather dissuasive ...  Recall
that my #p"/Volumes/" directory contains a directory named "Macintosh HD"
(on which a secondary disk partition happens to be mounted), a symbolic
link named "HD2" that refers to the root of the filesystem, and a file
named ".DS_Store" (whose reason for existing I can never remember.)

In CCL:
? (directory #p"/Volumes/*")
(#P"/") ; ignores the directory and "hidden by convention" file, returns
         ; the truename of #p"/Volumes/HD2"

In LispWorks 5.1.1:
;;; The "hidden" file, the truename of the link, and the directory.
(#P"/Volumes/.DS_Store" #P"/" #P"/Volumes/Macintosh HD/")

In Allegro CL 8.1:
;;; The hidden file, the link itself, and the directory
(#P"/Volumes/.DS_Store" #P"/Volumes/HD2" #P"/Volumes/Macintosh HD")

In CLISP 2.48:
(#P"/Volumes/.DS_Store") ; just the hidden file

In MCL/RMCL 5.2.1:
? (directory #p"HD2:Volumes:*")
(#2P"HD2:Volumes:.DS_Store") ; just the hidden file, and the non-standard #nP syntax

(I don't have copies of CMUCL, SBCL, or GCL installed on this system and don't
know what those implementations return in this case.  I didn't check to see
whether any of the implementations I tried offer keyword extensions which
change their behavior in relevant ways, or whether more recent versions of
these implementations exist and/or behave differently.)
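To make the differences concrete, here is a rough Python sketch that rebuilds a directory shaped like the #p"/Volumes/" described above inside a temporary directory and lists it under three policy switches.  Everything in it is an assumption made for illustration (the function list_matching, its parameters, and the temporary directory standing in for "/"); it does not describe how any of the implementations above actually work, and it does not model the CLISP or MCL results.

```python
import os
import tempfile

def list_matching(directory, include_hidden, include_dirs, resolve_links):
    """List entries of `directory` under three illustrative policy switches."""
    listing = []
    for name in sorted(os.listdir(directory)):   # listdir already omits "." and ".."
        if name.startswith(".") and not include_hidden:
            continue
        path = os.path.join(directory, name)
        if os.path.islink(path):
            # Treat the link as a file; optionally report what it resolves to.
            listing.append(os.path.realpath(path) if resolve_links else path)
        elif os.path.isdir(path):
            if include_dirs:
                listing.append(path + os.sep)
        else:
            listing.append(path)
    return listing

with tempfile.TemporaryDirectory() as tmp:
    root = os.path.realpath(tmp)                  # stands in for "/"
    volumes = os.path.join(root, "Volumes")
    os.mkdir(volumes)
    os.mkdir(os.path.join(volumes, "Macintosh HD"))
    open(os.path.join(volumes, ".DS_Store"), "w").close()
    os.symlink(root, os.path.join(volumes, "HD2"))  # the "root volume" link

    links_only_resolved = list_matching(volumes, False, False, True)
    everything_resolved = list_matching(volumes, True, True, True)
    everything_unresolved = list_matching(volumes, True, True, False)

print(links_only_resolved)
print(everything_resolved)
print(everything_unresolved)
```

The first listing has the same shape as the CCL result shown above (only the resolved link), the second resembles the LispWorks result, and the third resembles the Allegro result apart from the trailing separator on the directory.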

With the possible exception of Allegro's failure to return the TRUENAME of the
link, I think that all of these interpretations are compliant (the spec says
"files", not "directories" or "links") and I would guess that these
implementations are all trying to be reasonable and consistent and predictable.
(In fact, it would require an unpleasant combination of arrogance and ignorance
to assume otherwise.)  I'd further assume that users of these implementations
use DIRECTORY in those implementations, and if they try to use DIRECTORY portably,
they're aware of these differences in behavior.  Peter Seibel's book has a 
whole chapter on this sort of thing.

(If I had a vote, I think that I'd pick LispWorks' result as being the most
compliant/useful; I don't really think that :DIRECTORIES in CCL should default
to NIL, mostly because it's hard to justify on any grounds beyond "20 years
ago, we seemed to think that it was a useful default.")

You seem to expect the link to be included in DIRECTORY's output, but
(despite the fact that the spec requires TRUENAMEs to be returned)
don't expect the link to be resolved.  I don't know why you expected
this (perhaps you thought that the root volume was somehow mounted in
a subdirectory of itself, just like other volumes ...), but that
expectation doesn't seem to match reality.

I suppose that it's possible to conclude that everything under
/Volumes whose name matches a volume name is a directory, that CCL's
DIRECTORY function arbitrarily decides to omit some directories from
its output for some mysterious reason, and that all users and
implementors of CCL are somehow too stupid to notice this (but you
aren't). If you reached that conclusion, which parts of this sounded
plausible to you ?




