Today's mainframe--anything to new 555
Today's mainframe--anything to new 558
Of course there is measurement and security overhead; but you are in error about performance and security monitoring being contrary. They are different views of the same observations. The challenge is to build...
the original vm370 smp support had some fine-grain locking ... but had a global lock on much of the kernel. this bore some similarities to some of the smp kernel/supervisor spin locks of the period .... except there were a small number of very high-use fastpaths that were enabled for concurrent kernel smp execution.
Today's mainframe--anything to new 556
Actually, both are often highly related and can only be fixed by watching both at the same time. Depends on the system in question...
the other difference was that the majority of the global kernel locks from the period were "spin" locks ... a processor that attempted to enter the kernel would spin on the kernel lock until it was made available by some other processor. the vm370 lock was instead what i called a "bounce" lock (instead of spin lock) where if a processor didn't obtain the (almost) global kernel lock ... it queued an extremely lightweight thread and went looking for other activity. as a result there was almost no measurable smp "overhead" (like you might find with spin locks). it also had the side effect of increasing the mip rate on high end cache machines ... since there was a tendency, if a specific processor got into kernel mode ... and other processors also needed kernel services ... for these other kernel service requests to queue against the processor already performing kernel services (better cache hit rate ... and therefore higher mip rate).
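as a rough illustration of the bounce-lock idea described above (not the vm370 code, which isn't shown in this post), here is a minimal python sketch. the class name BounceLock, the method run_or_queue and the small internal guard are all assumptions made for the example; on real hardware the guard would be something like a compare-and-swap rather than a blocking lock.

```python
import threading
from collections import deque


class BounceLock:
    """Sketch of a 'bounce' lock: a caller never spins, it queues its work."""

    def __init__(self):
        self._held = False               # is some caller "in the kernel" now?
        self._pending = deque()          # lightweight deferred kernel requests
        self._guard = threading.Lock()   # held only briefly (stand-in for CAS)

    def run_or_queue(self, request):
        """Queue request(); run the queue only if nobody else holds the lock.

        Returns True if this caller became the holder and drained the queue,
        False if the request was left for the current holder (the "bounce").
        """
        with self._guard:
            self._pending.append(request)
            if self._held:
                return False             # bounce: caller goes to find other work
            self._held = True

        # This caller is the holder: drain everything, including requests
        # that other callers queue while it is running.
        while True:
            with self._guard:
                if not self._pending:
                    self._held = False
                    return True
                work = self._pending.popleft()
            work()                       # run outside the guard
```

note that queued requests end up running on whichever processor already holds the lock, which is the cache-affinity side effect mentioned above.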
this continued for a couple of releases and then had a major rewrite ... which supposedly vastly improved the smp thruput .... but most customers saw 10-20 percent thruput degradation. One large federal TLA was especially concerned.
It turns out that the smp rewrite for this release was targeted at a very specific customer situation. 3081s had been originally targeted at never being available in a non-smp version. the problem was that the airline control program (acp, or its more modern name, transaction processing facility or TPF) didn't have smp support. there were some customers running tpf on multiple 370/195s because they needed the highest thruput possible.
so one way of getting more thruput for tpf out of a 3081 ... was to operate the 3081 under vm .... and provide two separate single-processor virtual machines .... and run two copies of tpf in the different virtual machines (in cluster configuration).
Description of a new old-fashioned programming language
John Savard - requiring the low-end and mid-range 360s & 370s had vertical microcoded engines ... look very much like normal instruction streams. they nominally avg. about ten native instructions per 360-370 instruction executed...
First assembly language encounters--how to get started 560
I'd already been programming assembler on other platforms for years before I first encountered a PC, so my view on this may be different, but I never thought...
another mechanism was to start overlapping some of the virtual machine emulation functions. normally, i/o operation emulation tends to be performed synchronously/serially with virtual machine execution. the smp enhancement for this particular release would split off things like SIOF and ccw translation as an independent task request, broadcast a signal processor and then resume virtual machine execution (instead of finishing the i/o emulation). other processors would get the broadcast signal processor ... and pick up the work request that had just been passed off.
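the hand-off described above can be sketched roughly as follows. this is only a toy model in python: run_guest, the helper threads and the "translated:" strings are made up for illustration, a thread waking on a queue stands in for the signal processor broadcast, and nothing here reflects the actual SIOF or ccw translation code.

```python
import queue
import threading


def run_guest(io_requests, num_helpers=1):
    """Toy model: one processor runs the guest, helpers do the i/o emulation."""
    work = queue.Queue()             # stands in for the shared work-request queue
    translated = []                  # results of the (pretend) ccw translation
    results_lock = threading.Lock()

    def helper():
        # another processor, woken by the simulated signal, picks up work
        while True:
            req = work.get()
            if req is None:          # shutdown marker, used by this sketch only
                work.task_done()
                return
            with results_lock:
                translated.append(f"translated:{req}")
            work.task_done()

    helpers = [threading.Thread(target=helper) for _ in range(num_helpers)]
    for t in helpers:
        t.start()

    guest_steps = []
    for req in io_requests:
        work.put(req)                # pass the request off instead of emulating inline
        guest_steps.append(f"resumed after {req}")   # guest continues at once

    for _ in helpers:
        work.put(None)               # one shutdown marker per helper
    work.join()                      # wait until every request has been handled
    for t in helpers:
        t.join()
    return guest_steps, sorted(translated)
```

the sketch leaves out the cost side entirely: every put() here corresponds to a broadcast and interrupt on the other processors, which is the overhead the rest of the post is about.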
First assembly language encounters--how to get started 561
Well, Duh!, and as you pointed out elsewhere the 8031/8051 style trumps even that. But when Apple, Commodore, PET, Amiga and CP...
the downside was that sigp broadcast, interrupt handling, and associated pathlength started to consume 10% of total processing time of every processor in the smp complex (overhead which had not existed before). the only configuration that benefited was the customer account whose workload consisted almost totally of a single-processor guest operating system.
For many customers the significant diversion of processing power was obfuscated by some changes made to the way i/o interaction was performed on 3270 terminals .... such that multiple i/o operations tended to be batched as one operation ... making the 3270 terminal interactions appear to be much more responsive.
the problem with this specific large federal TLA was that it didn't have any 3270 terminals ... it was almost totally a glass-TTY operation ... and therefore there was no improvement in perceived 3270 interactions to mask the loss in processor capacity.
minor past passing posting that ran as slow as today's supercomputers?