Performance and Capacity Planning

the issue in cp67 was that all time was accounted for ... while in virtual machine mode ... it was all charged to "problem state" of the virtual machine. there were lots of reasons to enter the kernel and "supervisor state" ... lots of things could be performed in the kernel on behalf of the virtual machine in "supervisor state" ... which would also be charged against the virtual machine ... doing things on behalf of the virtual machine.

When i originally started on cp67 there was some non-linear code (linear scanning of certain kinds of lists) whose cost grew in proportion to the number of tasks and virtual machines. At 35 virtual machines it was hitting 15-20 percent of total cpu (all kernel supervisor state) and not charged to a specific virtual machine ... aka two kinds of kernel supervisor state ... that having to do with general system bookkeeping not charged to a specific user (and had to be amortized across all users as "overhead") and kernel supervisor state that was charged directly to the virtual machine, associated with kernel activity done directly on behalf of the virtual machine.

when i restructured various paths in the system ... i did away with almost all of the linear scanning for overhead ... reducing cp67 "overhead" to possibly half a percent of elapsed time ... even with 75-80 users.

the issue with the global kernel spinlock ... in the standard state of the art for the period ...
was that on entry to the kernel (for whatever reason) the kernel interrupt code would spin on the global kernel spinlock ... until that processor obtained the spinlock and could proceed. only one processor could be executing in the kernel at any point in time.

the logic redo for vamps ... and then later ported to a purely software implementation ... moved the kernel lock well past the basic interrupt routines ... so that a much smaller portion of the kernel was serialized, being only able to execute on a single processor at a time.

the other vamps change ... and later morphed to a purely software implementation ... was that the global kernel serialization lock wasn't a spinlock ... i initially referred to it as a "bounce" lock. a processor, when it needed certain kinds of kernel function, would attempt to obtain the kernel serialization lock ... if it obtained it ... it would proceed as normal. if it failed to obtain the kernel lock ... it would queue a super light-weight thread against the kernel lock and go off and look for something else to do.

so access to certain serialized kernel functions could only be performed on one processor at a time (although on a moment to moment basis ... it could be any processor in the complex acting as the kernel server). Since all requests for those kernel services were serialized on a single processor at a time ... that means that the total service time available for performing those services is 100 percent of a single processor (couldn't have more than 100 cpu seconds aggregate of serialized kernel time per 100 seconds of real time).

So you have somewhat standard operations research analysis. Say you have a four processor system ... where for every 100 seconds of general execution ... you needed 25 seconds of global serialized service. With four processors ...
there would be 400 seconds of execution in 100 seconds of real time, generating 4*25=100 seconds worth of serialized workload in 100 seconds of real time. With a five processor system ... that would result in no processor having to wait on serialized kernel processing. The vamps microcode changes actually reduced it to less than 25 seconds of serialized kernel processor time per 100 seconds of non-kernel processing ... but the requirement was that it had to be reduced to no more than 25 seconds of kernel processor time to keep the infrastructure from waiting on kernel services.

now in standard global kernel spin-lock implementations of the period ... you didn't actually see processors in wait state for kernel services ... they were all spinning on the kernel spin-lock ... if there was greater demand for serialized kernel services than 100 percent of a single processor.

in the vamps ... and later software implementation case with the bounce-queued implementation ... a processor that couldn't obtain the global kernel lock, instead of spinning, would queue a super light-weight request (for kernel services) ... and go off and look for other, non-kernel work to do. If it couldn't find other non-kernel work to do, it would enter wait-state ... but it wouldn't be in a tight compute-bound spin-loop.

the super light-weight queueing mechanism was such that on cache machines ... any overhead of actually doing the queue-dequeue operations was more than offset by maintaining kernel instruction cache locality on the same processor (it actually ran faster).
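The bounce-lock behavior described above can be sketched roughly as follows. This is only an illustration in Python, not the cp67/vamps code: the names `BounceLock` and `submit` are made up for the example, and the short `_guard` mutex stands in for whatever atomic operation the real implementation used, which the post doesn't specify. A caller that finds the lock held queues its request and returns right away; whoever holds the lock keeps draining the queue, so the serialized work stays on one processor at a time.

```python
import threading
from collections import deque


class BounceLock:
    """Serialization lock that queues requests instead of spinning (sketch)."""

    def __init__(self):
        self._guard = threading.Lock()  # assumption: stand-in for an atomic op
        self._held = False              # True while some caller is the "kernel server"
        self._pending = deque()         # queued light-weight requests

    def submit(self, request):
        """Run `request` serialized, or queue it and return at once.

        Returns True if this caller became the server and drained the queue,
        False if the request was left for the current holder to run.
        """
        with self._guard:
            self._pending.append(request)
            if self._held:
                return False            # bounce: caller goes to look for other work
            self._held = True           # this caller is now the kernel server
        while True:
            with self._guard:
                if not self._pending:
                    self._held = False  # release only once the queue is empty
                    return True
                work = self._pending.popleft()
            work()                      # only the current holder ever gets here


# Usage: a request made while the lock is held is queued, not spun on.
log = []
lock = BounceLock()


def outer():
    log.append("outer start")
    log.append(lock.submit(lambda: log.append("inner")))  # bounces, returns False
    log.append("outer end")


ran = lock.submit(outer)
# log is now ["outer start", False, "outer end", "inner"] and ran is True
```

In this sketch the caller that bounces is free to do other work, and the queued request runs later on whichever caller currently holds the lock, which is the property the post credits for the better instruction cache locality. Error handling (a request that raises while the lock is held) is left out to keep the example short.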
Performance and Capacity Planning 737 Alt Folklore Computers from Newsgroups