[Openmcl-devel] Using suspend and resume (out of necessity)
Florian Dietz
Florian.Dietz2 at gmx.de
Sun Jul 7 07:47:11 PDT 2013
I have read your suggestions and thought about possible solutions to the
problem.
Can you find any flaws in the following idea?
-Each of the commands suspend, resume and kill first acquires every lock
that the target process might hold, so that the target cannot be holding
a lock when it is interrupted.
I am using a debugging macro I wrote to ensure that the locks are always
acquired in the same, hierarchical order, top to bottom. This should
make deadlocks impossible.
-Every critical section is protected by without-interrupts and by locks
on all relevant objects. I make sure that these critical sections cannot
take long to execute.
Suspension is implemented as an interrupt whose function waits on a
semaphore. Resuming just signals that semaphore. It does /not/ use a
second semaphore to tell the manager process when the interrupt has
actually run, because that could lead to a deadlock: if the interrupt is
deferred until after a critical section, but the target can't enter that
section because the manager thread still holds some of the locks, the
manager would wait forever.
Could this have some other negative side-effects? Could the
without-interrupts be used inside the locks instead of the other way
around? That would allow me to add a second semaphore to signal the
manager thread without fear of deadlocks, since the process to be
interrupted won't even get to the without-interrupts while the manager
is holding the locks. Can a process that is waiting on a lock be
interrupted so that it first processes the interruption and then tries
to get the lock again? If it can't, I think this second option won't work.
-Killing a process just uses process-kill, as before.
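A minimal sketch of the scheme described above, assuming Clozure CL. The names ranked-lock, call-with-ranked-locks, worker, suspend-worker and resume-worker are hypothetical and are not the debugging macro mentioned earlier; only the ccl: operators are real CCL API. The rank numbers stand in for the "top to bottom" hierarchy.

```lisp
;;; Sketch only; requires Clozure CL. All non-CCL names are hypothetical.

(defstruct (ranked-lock (:constructor %make-ranked-lock (rank lock)))
  rank   ; smaller rank = higher in the hierarchy, grabbed first
  lock)

(defun make-ranked-lock (rank)
  (%make-ranked-lock rank (ccl:make-lock)))

(defun call-with-ranked-locks (ranked-locks thunk)
  "Grab RANKED-LOCKS in ascending rank order, call THUNK, then release
them in reverse order as the nested WITH-LOCK-GRABBED forms unwind."
  (labels ((grab (remaining)
             (if (null remaining)
                 (funcall thunk)
                 (ccl:with-lock-grabbed ((ranked-lock-lock (first remaining)))
                   (grab (rest remaining))))))
    ;; COPY-LIST because SORT is destructive.
    (grab (sort (copy-list ranked-locks) #'< :key #'ranked-lock-rank))))

(defstruct worker
  process
  (resume-semaphore (ccl:make-semaphore)))

(defun suspend-worker (worker ranked-locks)
  "Queue an interrupt that parks WORKER's process on its semaphore."
  (call-with-ranked-locks ranked-locks
    (lambda ()
      (let ((sem (worker-resume-semaphore worker)))
        ;; PROCESS-INTERRUPT only arranges for the function to run in the
        ;; target; it returns without waiting for it to run. There is
        ;; deliberately no second semaphore for acknowledgement here.
        (ccl:process-interrupt (worker-process worker)
                               (lambda () (ccl:wait-on-semaphore sem)))))))

(defun resume-worker (worker ranked-locks)
  "Wake a process parked by SUSPEND-WORKER."
  (call-with-ranked-locks ranked-locks
    (lambda ()
      (ccl:signal-semaphore (worker-resume-semaphore worker)))))
```

Because every caller goes through call-with-ranked-locks, the order in which a caller happens to list the locks does not matter; they are always grabbed by ascending rank.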
Do you see any problems with this setup?
If not, can you think of anything else that might go wrong?
The problem is that the target process may be running absolutely any
code that can be built from the available building blocks. Critical
sections should all be limited in processing-time, but otherwise it
might do anything. For instance, it is able to throw and catch errors
and to allocate a lot of resources by building enormous vectors. Can
such actions be interrupted, or are the interrupts delayed? Additionally,
is there anything a process can do that is such a serious mistake that
it bypasses all error handlers, such as infinite recursion or unbounded
memory allocation?