The Soul of Barb's New Machine was creat 1130
Bernd Felsche oops, not exactly. 360-67 had test&set instruction and no caches. charlie, while working on fine-grain locking in cp-67 smp kernel at the science center, invented compare...
Careful design avoids a lot of need for debugging.
Hardware cache coherence isn't hard; just very messy (expensive) when you go to lots of processors.
You can't, however, really do without it if you have a multi-threaded kernel with several CPUs potentially operating in kernel space at the same time. Cache coherence is needed to provide fast atomic locks on critical regions and resources.
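For illustration only, here is a minimal user-space sketch of the kind of atomic lock meant above, built on C11's `atomic_flag` (which compilers typically map onto a hardware test-and-set or exchange instruction). The names `tas_lock`, `worker` and `run_counter_demo` are hypothetical, and POSIX threads stand in for CPUs; a real kernel lock would look rather different.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical names throughout; not taken from any real kernel. */
typedef struct {
    atomic_flag held;               /* clear = free, set = taken */
} tas_lock;

static void tas_lock_acquire(tas_lock *l)
{
    /* test-and-set returns the previous value: spin until it was clear */
    while (atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire)) {
        /* busy-wait; coherence makes the holder's release visible here */
    }
}

static void tas_lock_release(tas_lock *l)
{
    atomic_flag_clear_explicit(&l->held, memory_order_release);
}

struct worker_args {
    tas_lock *lock;
    long *counter;
    long iters;
};

static void *worker(void *p)
{
    struct worker_args *a = p;
    for (long i = 0; i < a->iters; i++) {
        tas_lock_acquire(a->lock);  /* enter critical region */
        (*a->counter)++;            /* shared resource */
        tas_lock_release(a->lock);  /* leave critical region */
    }
    return NULL;
}

/* Runs up to 16 workers on one shared counter and returns its final value. */
long run_counter_demo(int nthreads, long iters)
{
    enum { MAX_THREADS = 16 };
    pthread_t tids[MAX_THREADS];
    tas_lock lock = { ATOMIC_FLAG_INIT };
    long counter = 0;
    struct worker_args args = { &lock, &counter, iters };

    if (nthreads > MAX_THREADS)
        nthreads = MAX_THREADS;
    for (int i = 0; i < nthreads; i++)
        pthread_create(&tids[i], NULL, worker, &args);
    for (int i = 0; i < nthreads; i++)
        pthread_join(tids[i], NULL);
    return counter;
}
```

With the lock in place, `run_counter_demo(4, 20000)` should always come back as 80000; take the acquire/release calls out and increments are likely to go missing.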
If you don't have atomic locking, you end up creating some sort of virtual resource to "manage" access when all CPUs handle system calls, and that will saturate far more quickly, especially if the easy road is chosen and only one such resource is provided system-wide.
There is that. It is also the ability of any CPU to be able to pick up where another CPU left off without having to reexecute the same code to set up...
A kluge is to allow only one particular processor into each process's space and into kernel space at a time, which defeats the potential to multi-thread.
Lots of strategies that are safe on uniprocessor architectures break when you add another processor and try to make it share some resource. Even compilers have broken, usually through optimisations that fail to recognise the "volatility" present in multi-threading.
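As a rough sketch of the compiler problem, consider one thread waiting on a flag that another thread sets. If the flag were a plain `int`, an optimiser would be entitled to read it once and spin forever on the stale value. The version below uses C11 atomics so the load is repeated and ordered; `ready`, `payload`, `producer` and `wait_for_payload` are made-up names for the example.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical example names. */
static atomic_bool ready = false;   /* shared flag, changes "behind the compiler's back" */
static int payload;                 /* ordinary data published via the flag */

static void *producer(void *unused)
{
    (void)unused;
    payload = 42;                                   /* write the data first */
    atomic_store_explicit(&ready, true, memory_order_release);
    return NULL;
}

/* Starts the producer, waits for the flag, and returns the published data. */
int wait_for_payload(void)
{
    pthread_t t;

    atomic_store(&ready, false);
    payload = 0;
    pthread_create(&t, NULL, producer, NULL);

    /* The atomic load must be redone on every pass; a plain `int ready`
       could legally be hoisted out of the loop by the optimiser. */
    while (!atomic_load_explicit(&ready, memory_order_acquire)) {
        /* spin */
    }

    pthread_join(t, NULL);
    return payload;
}
```

The release store paired with the acquire load is what guarantees the waiter sees `payload` as written by the producer, not just the flag.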
Turning the system "upside-down" doesn't remove the problem. Suppose a thread is an indivisible unit that may have exclusive access to no more than one particular physical resource at a time, and each process or kernel space must contain one or more threads. There still has to be some means of sharing resources between threads to get real work done.
And it still requires arbitration of some sort to resolve contention when coincident requests try to attach a particular resource to two or more threads. The easiest way to do that is a hardware "test-and-set" instruction operating on memory, with cache coherence ensuring that all processors have an identical view of the arbitration space. The cache-coherence hardware must directly support test-and-set (or similar) as a special case, because the read-modify-write has to happen as a single indivisible operation as seen by every other processor.
-- Bernd Felsche - Innovative Reckoning, Perth, Western Australia
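The arbitration described in the post above could be sketched roughly as follows, again with C11's `atomic_flag` standing in for the hardware test-and-set. The `resource` type and the `resource_try_attach` / `resource_detach` functions are invented for the example. Unlike a spinlock, a loser is simply told "no" and can go and do something else.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical resource descriptor living in the shared "arbitration space". */
typedef struct {
    atomic_flag attached;   /* set while some thread owns the resource */
    int owner;              /* id of the owning thread; -1 when free */
} resource;

/* Try to attach the resource to thread_id.
   If several threads ask at once, test-and-set lets exactly one see "clear". */
bool resource_try_attach(resource *r, int thread_id)
{
    if (atomic_flag_test_and_set_explicit(&r->attached, memory_order_acquire))
        return false;       /* flag was already set: somebody else won */
    r->owner = thread_id;   /* safe: we are the only winner */
    return true;
}

/* Give the resource back so another thread can attach it. */
void resource_detach(resource *r)
{
    r->owner = -1;
    atomic_flag_clear_explicit(&r->attached, memory_order_release);
}
```

A resource would start out as `resource r = { ATOMIC_FLAG_INIT, -1 };`. A first `resource_try_attach(&r, 1)` then succeeds, a second attempt by thread 2 fails until thread 1 calls `resource_detach(&r)`.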
Alt Folklore Computers Newsgroups