[Openmcl-devel] Fwd: freshly built wx86cl64.exe crashes on start

Bharat Shetty bshetty at gmail.com
Fri Dec 30 09:42:21 PST 2022


Hi Tim,

Thanks to all the good people who gave us ccl. And thank you and Carl for
responding; it makes me feel less lost :D

Luckily for me, I have an additional old laptop with Windows 7. I will be
back at times with a question or an update.

Regards,
Bharat

On Fri, Dec 30, 2022 at 7:29 PM Tim McNerney <mc at media.mit.edu> wrote:

> Bharat,
>
> I should first thank you for taking on this CCL compilation project. It is
> important to keep up with OS changes. Your efforts are much appreciated.
>
> My partial understanding is that ASLR is for protecting *code* from
> attack. The other Lisp (Franz) was intolerant to address space
> *fragmentation* caused by ASLR, but I suspect there may be other ways
> ASLR can potentially violate assumptions made by a garbage collected
> language. That said, something *else* may be going on here.
>
> I feel like a working *baseline* would help you. By that I mean, start by
> finding an OS environment where the compilation *does* work, and compare
> to the broken results, striving to reduce variables.
>
> If you don’t have enough hardware to, say, run both Windows 10 and Windows
> 7 simultaneously, use a free or cheap VM (e.g. VMWare Player). I know AWS
> charges by the hour of usage (so not free), but they have a large selection
> of prebuilt images that will save you time. Or, since you are focusing on
> Windows, maybe Azure offers the same benefit. Maybe they offer promotional
> prices for developers.
>
> It is possible you have found an incompatibility with the gnu/cygwin tool
> chain (maybe little to do with Windows). Explore which *version* works
> for CCL. Again, establishing a baseline that gives you working code, and
> inching forward from there.
>
> Good luck, or in Gerry Sussman’s words “good skill!”
>
> --Tim
>
> On Dec 30, 2022, at 07:55, Bharat Shetty <bshetty at gmail.com> wrote:
>
> 
> Hi,
>
> I had not previously mentioned the reason I added -no-pie. The
> executable used to wperror out after calling VirtualProtect in the
> pmcl-kernel.c:remap_spjump() function. The error was
> ERROR_INVALID_ADDRESS 487 (0x1E7). After I added -no-pie (strangely,
> -Wl,--no-pie and -fno-pie don't work) this got sorted. Some people say this
> disables ASLR, but the GNU ld documentation is not very clear on this.
> Either way, VirtualProtect only works with it, so the flag is needed.
>
> Today I added the following flags:
>
>    - -Wl,--disable-high-entropy-va ;; --high-entropy-va - Image is
>    compatible with 64-bit address space layout randomization (ASLR). This
>    option is enabled by default for 64-bit PE images.
>    - -Wl,--disable-dynamicbase ;; --dynamicbase - The image base address
>    may be relocated using address space layout randomization (ASLR). This
>    feature was introduced with MS Windows Vista for i386 PE targets.
>    - -Wl,--disable-nxcompat ;; --nxcompat - The image is compatible with
>    the Data Execution Prevention.
>
> However, these do not change anything, and wx86cl64 crashes at
> exactly the same position (calculate_relocation () at ../x86-gc.c:1571).
> When I debug the executable, the image base address (0x12000) and the text
> start address (0x21000), amongst other addresses, are the same across
> multiple executions. Would these not change if ASLR were active?
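
One way to check whether the three linker flags above actually reached the
image is to read the DllCharacteristics field of the PE optional header
directly. The following is a minimal sketch, not part of CCL or its build:
the function names pe_aslr_info and make_fake_pe are made up for
illustration, and the demo runs on a synthetic header rather than on a real
wx86cl64.exe. The bit values and field offsets are those of the PE/COFF
format.

```python
import struct

# Bit values of the DllCharacteristics field in the PE optional header.
HIGH_ENTROPY_VA = 0x0020  # the bit that --disable-high-entropy-va clears
DYNAMIC_BASE = 0x0040     # the bit that --disable-dynamicbase clears
NX_COMPAT = 0x0100        # the bit that --disable-nxcompat clears


def pe_aslr_info(data):
    """Return the image base and ASLR/DEP flags recorded in a PE image."""
    if data[:2] != b"MZ":
        raise ValueError("not an MZ/PE image")
    # Offset 0x3C of the DOS header holds the file offset of the PE signature.
    (pe_off,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_off:pe_off + 4] != b"PE\0\0":
        raise ValueError("PE signature not found")
    opt = pe_off + 4 + 20  # skip the signature and the 20-byte COFF header
    (magic,) = struct.unpack_from("<H", data, opt)
    if magic == 0x20B:    # PE32+ (64-bit): 8-byte ImageBase at offset 24
        (image_base,) = struct.unpack_from("<Q", data, opt + 24)
    elif magic == 0x10B:  # PE32 (32-bit): 4-byte ImageBase at offset 28
        (image_base,) = struct.unpack_from("<I", data, opt + 28)
    else:
        raise ValueError("unexpected optional header magic")
    # DllCharacteristics sits at offset 70 in both PE32 and PE32+.
    (flags,) = struct.unpack_from("<H", data, opt + 70)
    return {
        "image_base": image_base,
        "high_entropy_va": bool(flags & HIGH_ENTROPY_VA),
        "dynamic_base": bool(flags & DYNAMIC_BASE),
        "nx_compat": bool(flags & NX_COMPAT),
    }


def make_fake_pe(image_base, flags):
    """Build a minimal synthetic PE32+ header, for demonstration only."""
    buf = bytearray(0x200)
    buf[0:2] = b"MZ"
    struct.pack_into("<I", buf, 0x3C, 0x80)  # PE signature lives at 0x80
    buf[0x80:0x84] = b"PE\0\0"
    opt = 0x80 + 4 + 20
    struct.pack_into("<H", buf, opt, 0x20B)
    struct.pack_into("<Q", buf, opt + 24, image_base)
    struct.pack_into("<H", buf, opt + 70, flags)
    return bytes(buf)


# Synthetic example: an image with only the NX bit set.
print(pe_aslr_info(make_fake_pe(0x12000, NX_COMPAT)))
```

For a real check, the bytes would come from open("wx86cl64.exe", "rb").read()
instead of make_fake_pe. If dynamic_base is already False, the loader is not
expected to move the image, which would be consistent with seeing the same
addresses on every run.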
>
> This may not be related to ASLR, as ASLR changes the image base and section
> start addresses at load time. As far as I understand, ASLR does not change
> the base during execution. Whatever is causing this is changed behaviour in
> gcc/ld. Microsoft's site clearly mentions that ASLR has to be opted into by
> the developer. If this were an OS change, the downloaded bits should have
> the same issue. Another question: if ASLR in Windows were to cause
> re-initialisation of globals, who could use it?
>
> Regards,
> Bharat
>
>
> On Fri, Dec 30, 2022 at 3:05 AM Carl Shapiro <carl.shapiro at gmail.com>
> wrote:
>
>> The feature is known as ASLR in both Windows and Linux
>>
>>
>> https://learn.microsoft.com/en-us/windows/security/threat-protection/overview-of-threat-mitigations-in-windows-10#address-space-layout-randomization
>>
>> On Thu, Dec 29, 2022 at 12:43 PM Tim McNerney <mc at media.mit.edu> wrote:
>>
>>> I wonder if you need to *turn off* Windows 10’s new-ish,
>>> nondeterministic memory allocation policy. We have run into this with other
>>> Lisps. I don’t remember the correct terminology for this malware
>>> countermeasure or the name of the configuration flag. Sorry. Can someone
>>> else chime in?
>>>
>>> --Tim
>>>
>>> On Dec 29, 2022, at 13:49, Bharat Shetty <bshetty at gmail.com> wrote:
>>>
>>> 
>>> Hi,
>>>
>>> I built ccl (downloaded the 1.12.1 zip file from GitHub) on
>>>
>>>    - Windows 10 (cygwin)
>>>    - gcc version 11.3.0
>>>    - ld/binutils version 2.39
>>>    - debug flag changed to -g3 in Makefile
>>>    - code optimisation level set to -O0 (zero) also in Makefile
>>>
>>>
>>> When the exe was built I got a message that sections /1, /8 and /32 are
>>> before text. I altered pei-x86-64.x to include the new debug
>>> sections (up to DWARF 5). For this I generated the default script by
>>> running ld --verbose, copied .spfoo from the original, and removed KEEP()
>>> and most of the SORT() uses unless they were present in the original
>>> file. Besides this, I had to add *-no-pie and
>>> -Wl,--allow-multiple-definition* to the build rule for the wx86cl64.exe
>>> target. I have not made any changes to the source code. This got an exe
>>> that starts.
>>>
>>> However, every time I run this, it crashes at calculate_relocation in
>>> x86-gc.c. The backtrace is as follows:
>>> #0  0x0000000000031d56 in calculate_relocation () at ../x86-gc.c:1571
>>> #1  0x000000000002ea15 in gc (tcr=0x5acebc0, param=0) at
>>> ../gc-common.c:1821
>>> #2  0x000000000003a170 in gc_from_tcr (tcr=0x5acebc0, param=0) at
>>> ../x86-exceptions.c:3014
>>> #3  0x000000000003a06b in gc_like_from_xp (xp=0x25f6f570, fun=0x3a126
>>> <gc_from_tcr>, param=0) at ../x86-exceptions.c:2970
>>> #4  0x000000000003a1ce in gc_from_xp (xp=0x25f6f570, param=0) at
>>> ../x86-exceptions.c:3026
>>> #5  0x0000000000035c0f in allocate_object (xp=0x25f6f570,
>>> bytes_needed=16, disp_from_allocptr=13, tcr=0x5acebc0,
>>> crossed_threshold=0x25f6f1ec) at ../x86-exceptions.c:171
>>> #6  0x0000000000036f26 in handle_alloc_trap (xp=0x25f6f570,
>>> tcr=0x5acebc0, notify=0x25f6f1ec) at ../x86-exceptions.c:665
>>> #7  0x0000000000037f15 in handle_exception (signum=11, info=0x25f6f4d8,
>>> context=0x25f6f570, tcr=0x5acebc0, old_valence=0) at
>>> ../x86-exceptions.c:1215
>>> #8  0x0000000000038cf3 in windows_exception_handler
>>> (exception_pointers=0x25f6f4c0, tcr=0x5acebc0, signal_number=11) at
>>> ../x86-exceptions.c:2150
>>> #9  0x00000000000438dd in windows_switch_to_foreign_stack () at
>>> ../x86-asmutils64.s:263
>>> #10 0x0000000025f6f4c0 in ?? ()
>>>
>>> This happens after start_lisp is called. By the time the code reaches
>>> gc.c, global_reloctab has been reset (it is set to 0x7cfe000000 before
>>> entering start_lisp). Because of this, GCrelocptr is also set to 0x0 in
>>> gc-common.c, which results in relocptr being set to 0x0 in
>>> calculate_relocation. GCndynamic_dnodes_in_area is also 0 at this point
>>> (in the downloaded version it is 2048).
>>>
>>> Does anyone know why these global variables are reset? And how can I
>>> fix this? I suspect this is because of the newer versions of gcc and ld.
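
One thing that may be worth ruling out, given that
-Wl,--allow-multiple-definition was needed to link: GCC 10 and later default
to -fno-common, so a global that several object files each define is
reported as a multiple definition instead of being merged. Whether that
applies to global_reloctab or GCrelocptr in this build is a guess, not a
diagnosis. The sketch below scans `nm -A` output for data symbols defined in
more than one object file. The function name duplicate_data_symbols is
hypothetical, and the sample lines are invented to show the expected input
format; they are not real output from the CCL build.

```python
from collections import defaultdict

# nm type letters that mark a symbol as defined data in an object file:
# B/b = uninitialised data (.bss), D/d = initialised data, C = common.
DATA_TYPES = set("BbDdC")


def duplicate_data_symbols(nm_output):
    """Map each data symbol defined in more than one object to those objects.

    Expects text in the format printed by `nm -A`, for example
    "gc-common.o:0000000000000000 B GCrelocptr".
    """
    owners = defaultdict(set)
    for line in nm_output.splitlines():
        parts = line.split()
        if len(parts) != 3:
            continue  # skip blank or unexpected lines
        location, sym_type, name = parts
        # "file.o:ADDRESS" for defined symbols, "file.o:" for undefined ones.
        obj = location.rpartition(":")[0]
        if obj and sym_type in DATA_TYPES:
            owners[name].add(obj)
    return {name: sorted(objs) for name, objs in owners.items() if len(objs) > 1}


# Invented sample input, only to illustrate the format.
sample = """\
gc-common.o:0000000000000000 B GCrelocptr
x86-gc.o:0000000000000000 B GCrelocptr
x86-gc.o:0000000000000040 T calculate_relocation
pmcl-kernel.o:0000000000000008 B global_reloctab
x86-gc.o:                 U global_reloctab
"""

print(duplicate_data_symbols(sample))
```

In practice the input would be the captured output of running nm -A over the
kernel's object files. An empty result would suggest looking elsewhere.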
>>>
>>> Regards,
>>> Bharat
>>>
>>>