User manual

Interprocessor Synchronization Window
Use of the synchronization window allows TIME_CONTROL to be relaxed from 6 to 2. Full relaxation of
TIME_CONTROL to 0 is not yet supported by the current release of VAX MP. Relaxing TIME_CONTROL
from 6 to 2 allows OpenVMS to generate a bugcheck and restart the system if one or more processors
lock up while holding a spinlock, or stop responding to interprocessor interrupt requests within a
reasonable amount of time. With OpenVMS being mature and stable, these are unlikely events.
The true benefits of using the synchronization window lie elsewhere.
* * *
Another benefit of the synchronization window is that, when it is enabled, VAX MP provides greater
compatibility with poorly written code that can fail even on real hardware VAXen but is more likely
to fail on a simulator, where VCPUs are implemented as preemptable threads that can “drift”
relative to each other.
We are not aware of any actually existing code of this nature, either in baseline OpenVMS VAX 7.3 or
in any layered or 3rd-party components; however, it can theoretically be imagined.
Consider, for example, privileged-mode code, one section of which executes on VCPU1 at IPL RESCHED (3)
and locks a certain interlocked queue header with a BBSSI instruction in order to quickly scan the queue,
relying on elevated IPL as protection from being preempted. Another section of the code, executing on
VCPU2, performs $INSQHI on the queue, i.e. an INSQHI operation with a finite number of retry attempts
(limited either to 900,000 retries as hardwired into $INSQHI and associated macros, or by the SYSGEN
LOCKRETRY parameter, or to any other finite number of retries).
Such code can fail even on a real hardware VAX (and, in the case of the $INSQHI macro, generate a
BADQHDR bugcheck).
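The failure mode described above can be sketched as a small deterministic model. This is only an illustration, not VAX MP or OpenVMS code: the class `InterlockedQueue`, the exception `BadQueueHeader` and the function `insert_with_retries` are hypothetical names, `try_lock` merely stands in for a BBSSI on the queue header, and the parameter `holder_busy_for` is an assumed way of expressing how long the other processor keeps the interlock.

```python
from collections import deque


class BadQueueHeader(Exception):
    """Stands in for the BADQHDR bugcheck (hypothetical name)."""


class InterlockedQueue:
    """Toy model of a queue whose header carries an interlock bit."""

    def __init__(self):
        self.locked = False
        self.items = deque()

    def try_lock(self):
        """Model of BBSSI: set the bit and report whether we acquired it."""
        if self.locked:
            return False
        self.locked = True
        return True

    def unlock(self):
        self.locked = False


def insert_with_retries(queue, item, max_retries, holder_busy_for):
    """Insert at the head of the queue, giving up after max_retries attempts.

    holder_busy_for is the number of failed attempts that elapse before the
    other processor (the one scanning the queue) releases the interlock.
    Returns the number of failed attempts before the insert succeeded.
    """
    for attempt in range(max_retries):
        if attempt == holder_busy_for:
            queue.unlock()  # the other processor finally releases the header
        if queue.try_lock():
            queue.items.appendleft(item)
            queue.unlock()
            return attempt
    raise BadQueueHeader("retry limit exhausted while the header stayed locked")


if __name__ == "__main__":
    q = InterlockedQueue()
    q.try_lock()  # the "VCPU1" side holds the interlock while scanning
    print(insert_with_retries(q, "entry", max_retries=1000, holder_busy_for=10))
```

If the holder stays busy for longer than the retry limit (because of a long interrupt on real hardware, or thread preemption on a simulator), the same call raises `BadQueueHeader` instead of returning.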
Indeed, although the code on VCPU1 is running at IPL 3, it can still be interrupted by a higher-priority
interrupt and might spend an extended time processing that interrupt. Default SYSGEN parameters allow
such an interrupt handler to run for up to 1 second without a system crash. On fast machines, such as
NVAX/NVAX+ based machines, both VCPU1 and VCPU2 are legitimately allowed by default SYSGEN
parameters to execute about 50 million instructions before raising an SMP timeout condition, a number
well in excess of the 2.7 million instruction limit hardcoded into $INSQHI and far in excess of the much
smaller limit controlled by the default value of LOCKRETRY. Thus $INSQHI-based spinning in the
described code can time out even on a real VAX, within default values of system parameters.
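The figures quoted above can be put side by side with a few lines of arithmetic. This is a rough sketch: the constant and function names are made up for illustration, the three-instructions-per-retry value is merely what the quoted 900,000 retries and 2.7 million instructions imply, and the time estimate assumes that the 1-second allowance and the 50 million instruction figure describe the same interval.

```python
# Figures taken from the text; the names are illustrative only.
INSQHI_RETRY_LIMIT = 900_000           # retries hardwired into $INSQHI
INSQHI_INSTRUCTION_LIMIT = 2_700_000   # instruction limit quoted for $INSQHI
SMP_TIMEOUT_INSTRUCTIONS = 50_000_000  # approximate budget before SMP timeout
SMP_TIMEOUT_SECONDS = 1.0              # assumed to cover the same interval


def insqhi_budget():
    """Return (instructions per retry, headroom factor, give-up time in s)."""
    instructions_per_retry = INSQHI_INSTRUCTION_LIMIT / INSQHI_RETRY_LIMIT
    # How many times longer the lock holder may legitimately stall than
    # the $INSQHI spinner is willing to wait.
    headroom = SMP_TIMEOUT_INSTRUCTIONS / INSQHI_INSTRUCTION_LIMIT
    # Approximate wall-clock time after which the spinner gives up.
    give_up_seconds = (
        SMP_TIMEOUT_SECONDS * INSQHI_INSTRUCTION_LIMIT / SMP_TIMEOUT_INSTRUCTIONS
    )
    return instructions_per_retry, headroom, give_up_seconds


if __name__ == "__main__":
    per_retry, headroom, give_up = insqhi_budget()
    print(f"{per_retry:.1f} instructions per retry")
    print(f"holder may stall about {headroom:.1f}x longer than the spinner waits")
    print(f"spinner gives up after about {give_up * 1000:.0f} ms")
```

Under these assumptions the spinner gives up after roughly 54 ms, while the lock holder may legitimately be delayed for about 18.5 times as long.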
Such code would therefore be poorly written, and we are not aware of its actual existence.
Even though such poorly written code is incorrect and prone to failure even on a real VAX, the
probability of such a failure may be small, and the code may “pass” on a real VAX because the factors
required to cause its failure rarely occur. It may, however, fail with greater probability when executed
on a simulator, because VCPU thread preemption on the simulator is a much more likely event than long
high-priority interrupt processing on a real VAX.
We are not aware of any such code actually existing. Should such code exist nevertheless, the
synchronization window comes to the rescue. The synchronization window will not necessarily spare this