Dual Core CPUs are slower than Dual Single core CPUs

and the multiple threads are conserving cache lines in many cases by making use of the exact same data (so you may be getting higher per instruction cache hit ratio for the same number of cache lines).

there is an analogy to this from long ago and far away involving real storage for tss-360 paging (from the 60s). tss-360 was originally announced to run on a 512kbyte 360-67 ... but the tss-360 (fixed) kernel was rapidly growing. eventually the minimum was 768kbytes and to really get anything done with tss-360 you needed 1024kbytes (largest memory configuration).

they then benchmarked two processor tss-360 on a two processor 360-67 with two megabytes of real storage (each processor came with 1mbyte max. and multiprocessor support allowed the addressing to be linear) ... and tss-360 thruput was coming out around 3.5 times that of tss-360 uniprocessor operation.

somebody made the claim that tss-360 scale-up, multiprocessor support and algorithms were obviously the best in the industry ... being able to get 3.5 times the thruput with only two times the resources.

it turns out that it was a purely relative measurement ... both tss-360 uniprocessor and multiprocessor thruput were quite bad when using an absolute measure.

the issue was that the tss-360 kernel requirements had grown so that if you attempted to perform almost any operation ... with the amount of real storage left over for paging in a 1mbyte configuration ... the system would page thrash. with double the real storage (2mbytes) ... the amount of real storage left over for application paging increased by a factor of 5-10 times (compared to single processor, 1mbyte configuration) ... resulting in tss-360 seeing 3.5 times the aggregate thruput (in two processor configuration) relative to single processor configuration (however, neither number was actually that remarkable).
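
a rough sketch of the arithmetic (python, purely illustrative) ... the kernel sizes below are assumptions picked to land in the 5-10 times range, not measured tss-360 numbers, and the sketch assumes a single fixed kernel is resident regardless of the number of processors:

```python
def pageable_kbytes(real_kbytes, kernel_kbytes):
    """real storage left over for paging once the fixed kernel is resident."""
    return real_kbytes - kernel_kbytes


def pageable_ratio(kernel_kbytes, small_kbytes=1024, large_kbytes=2048):
    """how much more pageable storage the larger configuration has."""
    return (pageable_kbytes(large_kbytes, kernel_kbytes)
            / pageable_kbytes(small_kbytes, kernel_kbytes))


# hypothetical fixed kernel sizes in kbytes (assumed, not from the original post)
for kernel in (768, 832, 896):
    print(kernel,
          pageable_kbytes(1024, kernel),
          pageable_kbytes(2048, kernel),
          round(pageable_ratio(kernel), 1))
```

under those assumptions, a fixed kernel somewhere between roughly 768 and 900 kbytes is enough to turn a doubling of real storage into a 5-9 times increase in pageable storage ... which is where the thrashing relief comes from, not from the second processor.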


