[Openmcl-devel] Another Directory Bug

wws2 new ww.s2 at ukonline.co.uk
Sat Apr 3 22:55:43 PDT 2010


As always, your comments help me to understand what choices and problems arise when designing these functions.  It seems, however, that a large part of the issue revolves around user expectations, and how they differ between a UNIX, Objective-C or Lisp-compiler perspective and an application or end-user perspective.  Mine is more the application/end-user perspective, and from that angle I do not think the behaviour of (directory ...) is complete at the moment.

From Terminal, if I type "ls -al /Volumes/" I get a list of volumes including the startup volume, with some sort of pointer to "/", which I take to mean that it resolves to the same thing.  At any rate, I get a list of the names of all "partitions", which is what I want.  The same applies if I go to the Finder, or to Disk Utility.

In CCL I cannot duplicate that, i.e. I cannot find a way to list all partition names, and I think I should be able to, since CCL is aware of the name of the startup partition.  (directory "/Volumes/*" :directories t :files t :follow-links t) gives me a list which now includes #P"/".  (directory "/*" :directories t :files nil :follow-links t) lists all directories on the startup partition; the list is the same if I include :all t.  However, the following shows that at some level CCL knows the name of the startup partition: (directory "/Volumes/Startup/*" :directories t :files t :follow-links t).

How do I get the name of the Startup partition?  I would like directory to provide it (at least with some set of parameters) so that I can use it in a less Unix, more Macintosh end-user way.  At the same time, with some other set of parameters, I would like it to give me everything in a Unix way.  In that way we have both.
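Going by the transcript quoted further down, where :follow-links nil returns the link under /Volumes unresolved, something along these lines might recover the name.  It is only a sketch that I have not verified: the function name is my own invention, the keyword arguments are CCL-specific, and it assumes that TRUENAME resolves the link to the root.  I would still prefer that directory offered this directly.

```lisp
;; Sketch only.  Assumes CCL's DIRECTORY keywords (:DIRECTORIES, :FILES,
;; :FOLLOW-LINKS) and that the startup volume appears under /Volumes as a
;; symbolic link to "/".  STARTUP-VOLUME-NAME is a hypothetical name.
(defun startup-volume-name ()
  "Return the name of the /Volumes entry whose truename is the root, or NIL."
  (let ((root (namestring (truename "/"))))
    (dolist (entry (directory "/Volumes/*"
                              :directories t :files t :follow-links nil))
      ;; A dangling link has no truename; IGNORE-ERRORS turns that into NIL.
      (let ((target (ignore-errors (truename entry))))
        (when (and target (string= (namestring target) root))
          (return (file-namestring entry)))))))
```

If no entry under /Volumes resolves to the root, the function returns NIL.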

The way the name of the Startup partition is hidden at the moment leads on to the problem of symbolic links, which is causing me other problems.  So far, with my limited low-level understanding, I cannot find a way of testing whether a path is an alias or not, or of resolving it, something that MCL eventually provided.  My application uses that functionality a lot in order to traverse and manipulate directories of data, and to avoid operating on things (aliases) that would fail or muck things up if attempted.  For example, I can open an alias file named "a", which points to a folder, when the open should somehow fail:

? (with-open-file (ert (probe-file (full-pathname "root:a"))) (print ert))

#<BASIC-FILE-CHARACTER-INPUT-STREAM ("/Volumes/Arbeit/Allegro/a"/8 ISO-8859-1) #x300045CF725D> 
#<BASIC-FILE-CHARACTER-INPUT-STREAM ("/Volumes/Arbeit/Allegro/a"/:closed #x300045CF725D>

I take your point that Unix has blurred the distinction between directories and files (logically they are equivalent), but there should be a way of hiding that from some users.  I don't mind knowing about this blurred world, or working with it to some extent, but at the end of the day my model of how my data and programs are spread across my partitions should match what I see and do as a user in the Finder.

Let me try to boil my posts and your reply down into a suggestion and some requests.  I think it makes sense for (directory ...) to be capable of resolving aliases and of getting past directories or filenames with "*" in them, so that I can obtain a list of the things on my partitions that corresponds with what I see in the Finder.  It can even throw in ".DS_Store" (IIRC it stores information about the file colors and positions users have applied in the Finder...) and other hidden stuff, as long as in each case there is a test function (like is-this-an-alias-p) that allows me to weed it out.  What I do at the moment, which works for files like ".DS_Store", is to always test the file type, something I control for my own data files but not in general, of course.
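To make the kind of test function I mean concrete, here is a minimal sketch for the dot-file case only.  It uses nothing beyond standard Common Lisp pathname functions; the names last-component and hidden-path-p are my own, and it says nothing about aliases or symbolic links, which need information from the file system itself.

```lisp
;; Sketch only: standard Common Lisp; the function names are hypothetical.

(defun last-component (path)
  "Return the final component of PATH as a string: the file name for a
file pathname, or the last directory name for a directory pathname."
  (let ((file (file-namestring path))
        (dir (pathname-directory path)))
    (cond ((plusp (length file)) file)
          ((and (consp dir) (stringp (car (last dir))))
           (car (last dir)))
          (t ""))))

(defun hidden-path-p (path)
  "True if the final component of PATH starts with a dot."
  (let ((name (last-component path)))
    (and (plusp (length name))
         (char= (char name 0) #\.))))
```

With the CCL keywords discussed in this thread, the weeding out would then look something like (remove-if #'hidden-path-p (directory "/Volumes/*" :directories t :files t :all t)).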

I would also like to get rid of the delete-duplicates and the alphabetical sorting inside directory, as they take too long when manipulating hundreds of thousands of data files.

Can we have a set of routines that tell me file attributes such as create-date and modify-date, so that I can sort the files myself if I want to?  Apologies in advance if they already exist, but in general this layer seems to be missing at the moment.
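For the modification date at least, the standard FILE-WRITE-DATE function may already cover part of this; as far as I know the standard has no equivalent for the creation date.  A minimal sketch of sorting on it follows, using only standard Common Lisp.  The names files-by-write-date and format-universal-time are my own.

```lisp
;; Sketch only: standard Common Lisp (DIRECTORY, FILE-WRITE-DATE,
;; DECODE-UNIVERSAL-TIME).  The function names are hypothetical.

(defun files-by-write-date (pathspec)
  "Return the pathnames matching PATHSPEC, most recently modified first.
Files whose write date cannot be determined sort last."
  (let ((dated (mapcar (lambda (path)
                         (cons (or (ignore-errors (file-write-date path)) 0)
                               path))
                       (directory pathspec))))
    ;; Each date is looked up once; the sort works on the cached values.
    (mapcar #'cdr (sort dated #'> :key #'car))))

(defun format-universal-time (time &optional time-zone)
  "Render the universal time TIME as \"YYYY-MM-DD HH:MM:SS\".
With no TIME-ZONE, the local time zone is used."
  (multiple-value-bind (second minute hour day month year)
      (if time-zone
          (decode-universal-time time time-zone)
          (decode-universal-time time))
    (format nil "~4,'0D-~2,'0D-~2,'0D ~2,'0D:~2,'0D:~2,'0D"
            year month day hour minute second)))
```

Looking up each date once before sorting matters with hundreds of thousands of files, since the sort would otherwise query the file system on every comparison.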

On Apr 4, 2010, at 1:07 AM, Gary Byers wrote:

> On Sat, 3 Apr 2010, wws2 new wrote:
>> The Clozure version of directory is a real barrier to porting my code from MCL; as a Common Lisp function it should just work, apart from a few tweaks to pathname protocols.  The latest problem is that the Clozure version of the Common Lisp function DIRECTORY fails to return all volumes on my system.  In particular, it fails to return the one where the operating system is located.  (In my case CCL is located on a partition.)  Terminal shows all my partitions, of course.
>> Welcome to Clozure Common Lisp Version 1.4-r13119  (DarwinX8664)!
>> ? (directory "Volumes/*" :directories t :files nil)
>> (#P"/Volumes/work/" #P"/Volumes/DOS/" #P"/Volumes/play/" #P"/Volumes/public/" #P"/Volumes/plans/" #P"/Volumes/private/" #P"/Volumes/xSpare/")
>> ?
> I have two volumes/disk partitions mounted on this system, named "Macintosh HD"
> and "HD2".  The boot/root volume is the one named "HD2".  In the shell:
> [~] gb at antinomial> ls -l /Volumes/
> total 8
> lrwxr-xr-x   1 root  admin     1 Mar 30 19:33 HD2 -> /
> drwxrwxr-t  34 root  admin  1224 Sep 10  2009 Macintosh HD
> The name of the root volume is stored as a symbolic link (to #p"/",
> the root of the file system) in #p"/Volumes"; other disk volumes are
> actually mounted at mount points that're subdirectories of
> #p"/Volumes".  You're asking DIRECTORY to only include the actual
> directories that match #p"/Volumes/*" (well, actually you're using a
> relative pathname for some reason; GUI programs like the CCL IDE run
> with the current directory set to #p"/" for some reason ...) and
> that's what it's doing.
> ? (directory #p"/Volumes/*" :directories t :files nil)
> (#P"/Volumes/Macintosh HD/")  ; the file/link isn't included in the output
> ? (directory #p"/Volumes/*" :directories t :files t :follow-links nil)
> (#P"/Volumes/HD2" #P"/Volumes/Macintosh HD/") ; the link is included but not resolved
> ? (directory #p"/Volumes/*" :directories t :files t :follow-links t)
> (#P"/" #P"/Volumes/Macintosh HD/") ; link is included and resolved, but of course
>                                   ; the resolved pathname doesn't match
> So, if a symbolic link resolves to a directory that matches DIRECTORY's first
> argument and :DIRECTORIES is true, should the link or its target be automatically
> included in DIRECTORY's output (and if so, which ?)  I can imagine wanting both
> behaviors; a symbolic link that refers to a directory is one or more of:
>  - a file
>  - a directory
>  - both
>  - neither
> There's no guarantee that a symbolic link resolves to anything, or that it
> doesn't resolve to itself.  A link (symbolic or otherwise) is first and
> foremost a file.
>> Doesn't anyone use Directory?
> No, no one uses DIRECTORY in CCL.  Everyone has exactly the same
> understanding of the issues and set of expectations that you do, and
> no one ever tries to understand any more about the filesystem that
> they use.  How could it be otherwise ?
> Actually, that's not quite true ...  As far as I know, many people use
> it; the most common complaint that I've heard is that it's necessary
> to say :DIRECTORIES T in order to get directories included in the
> output.  That's leftover MCL behavior, and I can't remember why we
> ever thought that the :DIRECTORIES and :FILES arguments were good
> ideas (or perhaps the issue is more one of why :DIRECTORIES should
> default to NIL.)  People coming to CCL from MCL might find the
> :DIRECTORIES/:FILES arguments and their defaults reasonable; people
> coming from other environments have different expectations.
> There's essentially one practical way (#_readdir or the related
> #_readdir_r) to enumerate the contents of a directory on a Unix
> filesystem (and that enumeration is a large part of what the DIRECTORY
> function does.)  As soon as one traverses a directory, one encounters
> magic "files" named "." and "..".  Do they match DIRECTORY's
> "pathspec" argument ?  (What -precisely- does it mean for pathnames to
> "match", anyway ?)  Are they files or directories ?  It's probably the
> best policy for DIRECTORY to simply ignore these entries (and
> certainly best to not naively traverse them), but that's ultimately a
> matter of policy; if an implementation of DIRECTORY included entries
> for "."  and ".." files in its output, that's not totally unreasonable
> (though I'm not sure if it would match my expectations or anyone
> else's).
> Likewise, one might find other entries whose name starts with a dot;
> common Unix convention is to treat these files as being hidden and to
> exclude them from directory listings.  (The #p"/Volumes/" directory
> typically contains an entry called ".DS_Store".)  Should DIRECTORY
> include/traverse these entries (which, after all, could be said to
> "match" a pathspec argument like #p"/Volumes/*" ?  Again (and perhaps
> more clearly than in the case of . and ..), that's a matter of policy
> and interpretation, which is another way of saying that it's a matter
> of "guessing what most users would expect", along with "trying to
> behave consistently or predictably."  CCL provides an :ALL keyword
> argument (which defaults to NIL), which is vaguely analogous to "ls -a".
> In the case that's causing problems - where a directory contains a
> symbolic link that refers to another directory - I actually think that
> treating the link as a file leads to more consistent and predictable
> behavior than interpreting the link as a directory would ("it's not
> a directory, but it happens to resolve to one.")  (This case seems
> to indicate that the whole :DIRECTORIES/:FILES distinction makes less
> sense than it may have on Classic MacOS 20+ years ago.)
> DIRECTORY's supposed to determine what files have names matching its
> pathspec argument and return a list of the TRUENAMEs of those files.
> One might assume that that's clear and unambiguous, but a little bit
> of exposure to the real world tends to be rather dissuasive ...  Recall
> that my #p"/Volumes/" directory contains a directory named "Macintosh HD"
> (on which a secondary disk partition happens to be mounted), a symbolic
> link named "HD2" that refers to the root of the filesystem, and a file
> named ".DS_Store" (whose reason for existing I can never remember.)

IIRC it stores information about the file colors and positions users have applied in the Finder...

> In CCL:
> ? (directory #p"/Volumes/*")
> (#P"/") ; ignores the directory and "hidden by convention" file, returns
>        ; the truename of #p"/Volumes/HD2"
> In LispWorks 5.1.1:
> ;;; The "hidden" file, the truename of the link, and the directory.
> (#P"/Volumes/.DS_Store" #P"/" #P"/Volumes/Macintosh HD/")
> In Allegro CL 8.1:
> ;;; The hidden file, the link itself, and the directory
> (#P"/Volumes/.DS_Store" #P"/Volumes/HD2" #P"/Volumes/Macintosh HD")
> In CLISP 2.48:
> (#P"/Volumes/.DS_Store") ; just the hidden file
> In MCL/RMCL 5.2.1:
> ? (directory #p"HD2:Volumes:*")
> (#2P"HD2:Volumes:.DS_Store") ; just the hidden file, and the non-standard #nP syntax
> (I don't have copies of CMUCL, SBCL, or GCL installed on this system and don't
> know what those implementations return in this case.  I didn't check to see
> whether any of the implementations I tried offer keyword extensions which
> change their behavior in relevant ways, or whether more recent versions of
> these implementations exist and/or behave differently.)
> With the possible exception of Allegro's failure to return the TRUENAME of the
> link, I think that all of these interpretations are compliant (the spec says
> "files", not "directories" or "links") and I would guess that all of these
> implementations are trying to be reasonable and consistent and predictable.
> (In fact, it would require an unpleasant combination of arrogance and ignorance
> to assume otherwise.)  I'd further assume that users of these implementations
> use DIRECTORY in those implementations, and if they try to use DIRECTORY portably,
> they're aware of these differences in behavior.  Peter Seibel's book has a whole
> chapter on this sort of thing.
> (If I had a vote, I think that I'd pick LispWorks' result as being the most
> compliant/useful; I don't really think that :DIRECTORIES in CCL should default
> to NIL, mostly because it's hard to justify on any grounds beyond "20 years
> ago, we seemed to think that it was a useful default.")
> You seem to expect the link to be included in DIRECTORY's output, but
> (despite the fact that the spec requires TRUENAMEs to be returned)
> don't expect the link to be resolved.  I don't know why you expected
> this (perhaps you thought that the root volume was somehow mounted in
> a subdirectory of itself, just like other volumes ...), but that
> expectation doesn't seem to match reality.
> I suppose that it's possible to conclude that everything under
> /Volumes whose name matches a volume name is a directory, that CCL's
> DIRECTORY function arbitrarily decides to omit some directories from
> its output for some mysterious reason, and that all users and
> implementors of CCL are somehow too stupid to notice this (but you
> aren't). If you reached that conclusion, which parts of this sounded
> plausible to you ?
