[Openmcl-devel] Using suspend and resume (out of necessity)

Sun Jul 7 07:47:11 PDT 2013

I have read your suggestions and thought about possible solutions to the 
problem.
Can you find any flaws in the following idea:

-Each of the commands suspend, resume and kill requires first getting 
all locks that the target process might hold, to ensure it isn't holding 
any locks when it is interrupted.
I am using a debugging macro I wrote to ensure that the locks are always 
acquired in the same, hierarchical order, top to bottom. This should 
make deadlocks impossible.
-Every critical section is protected by without-interrupts and by locks 
on all relevant objects. These critical sections are guaranteed not to 
take too long to execute.
-Suspension is implemented as an interrupt command that waits on a 
semaphore. Resuming just signals that semaphore. It does /not/ use a 
second semaphore to tell the manager process when the interruption has 
succeeded, because that could lead to a deadlock if the interruption is 
deferred until after a critical section, but that section can't be 
entered because the manager thread still holds some locks.
Could this have some other negative side-effects? Could the 
without-interrupts be used inside the locks instead of the other way 
around? That would allow me to add a second semaphore to signal the 
manager thread without fear of deadlocks, since the process to be 
interrupted won't even get to the without-interrupts while the manager 
is holding the locks. Can a process that is waiting on a lock be 
interrupted so that it first processes the interruption and then tries 
to get the lock again? If it can't, I think this second option won't work.
-Killing a process just uses process-kill, as before.
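
To make the lock-ordering idea concrete, here is a minimal sketch. It is 
written in Python rather than Lisp purely for illustration, and it is not 
my actual macro: the names RankedLock, holding_all, world, agent and inbox 
are all hypothetical, and the helper sorts the locks by rank instead of 
asserting that the caller already passed them in order.

```python
import threading
from contextlib import ExitStack, contextmanager


class RankedLock:
    """A lock with a fixed position in the hierarchy (hypothetical helper)."""

    def __init__(self, rank, name):
        self.rank = rank
        self.name = name
        self._lock = threading.Lock()

    def __enter__(self):
        self._lock.acquire()
        return self

    def __exit__(self, *exc):
        self._lock.release()
        return False


@contextmanager
def holding_all(locks):
    """Acquire every lock in ascending rank order; release in reverse order."""
    ordered = sorted(locks, key=lambda lock: lock.rank)
    with ExitStack() as stack:
        for lock in ordered:
            stack.enter_context(lock)
        # Yield the acquisition order so callers can inspect it.
        yield [lock.name for lock in ordered]


# Hypothetical hierarchy, top to bottom.
world = RankedLock(0, "world")
agent = RankedLock(1, "agent")
inbox = RankedLock(2, "inbox")


def manager_command(log):
    # The caller lists the locks in arbitrary order; the helper sorts them,
    # so every thread takes them in the same sequence.
    with holding_all([inbox, world, agent]) as names:
        log.append(names)


log = []
threads = [threading.Thread(target=manager_command, args=(log,))
           for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Because every thread takes the locks in the same global order, no cycle 
of waiting threads can form among these locks. This says nothing about 
the separate question of how interrupts interact with a thread that is 
blocked on one of them.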

Do you see any problems with this setup?

If not, can you think of any other problems that might arise?
The problem is that the target process may be running absolutely any 
code that can be built from the available building blocks. Critical 
sections should all be bounded in processing time, but otherwise the 
process might do anything. For instance, it is able to throw and catch errors 
and to allocate a lot of resources by building enormous vectors. Can 
such actions be interrupted or are the interrupts delayed? Additionally, 
is there anything a process can do that is such a tremendous mistake 
that it circumvents all error-handlers? Like going into an infinite 
recursion or allocating infinite amounts of memory?