[Openmcl-devel] Solaris x86-64?
Chris Curtis
enderx12 at mac.com
Mon Jan 7 09:18:44 PST 2008
Well, after a fair bit of digging I've made a little bit of headway on
this issue. It feels like 2 steps forward, (- 2 epsilon) steps back.
The good news is that the post you found suggests we probably don't
have quite the same problem as on Darwin. DARWIN_GS_HACK swaps the
pthread %gs data with the CCL tcr, and on Solaris x86-64 libpthread
uses %fs instead, as per the ABI (supposedly leaving %gs alone).
Interestingly, I can do "mov %0,%%gs" with inline assembly, but only
as long as the value I'm trying to set is [0-3]. Any other value
segfaults. (Same with %fs, BTW.)
FWIW, the SBCL runtime (on Solaris x86) sets %fs directly, but only
after installing a new LDT entry via sysi86(SI86DSCR). It then saves
the LDT selector in its pthread-specific data.
I'm continuing to dig... any thoughts or suggestions would be greatly
appreciated. :-)
--chris
On Jan 3, 2008, at 3:07 PM, R. Matthew Emerson wrote:
> I stumbled across this:
> http://blogs.sun.com/tpm/entry/solaris_10_on_x64_processors3
>
> [begin excerpt]
>
> Threads and Selectors
>
> In previous releases of Solaris, the 32-bit threads library used the
> %gs selector to allow each LWP in a process to refer to a private
> LDT entry to provide the per-thread state manipulated by the
> internals of the thread library. Each LWP gets a different %gs
> value that selects a different LDT entry; each LDT entry is
> initialized to point at per-thread state. On LWP context switch,
> the kernel loads the per-process LDT register to virtualize all this
> data to the process. Workable, yes, but the obvious inefficiency
> here was requiring every process to have at least one extra
> locked-down page to contain a minimal LDT. More serious was the
> implied upper bound of 8192 LWPs per process (derived from the
> hardware limit on LDT entries).
>
> For the amd64 port, following the draft ABI document, we needed to
> use the %fs selector for the analogous purpose in 64-bit processes
> too. On the 64-bit kernel, we wanted to use the FSBASE and GSBASE
> MSRs to virtualize the addresses that a specific magic %fs and magic
> %gs select, and we obviously wanted to use a similar technique on 32-
> bit applications, and on the 32-bit kernel too. We did this by
> defining specific %fs and %gs values that point into the GDT, and
> arranged that context switches update the corresponding underlying
> base address from predefined lwp-private values - either explicitly
> by rewriting the relevant GDT entries on the 32-bit kernel, or
> implicitly via the FSBASE and GSBASE MSRs on the 64-bit kernel. The
> result of all this work makes the code simpler, it scales cleanly,
> and the resulting upper bound on the number of LWPs is derived only
> from available memory (modulo resource controls, obviously).
>
> [end excerpt]
>
> So it sounds like the functionality is there, it's just a question
> of whether/how it's exposed to user processes.
>