I'm using an Intel Core2 microarchitecture-based Xeon 5440 server, dual socket, totalling 8 cores.
I've read a few docs on "Accounting cycles on Core2 Microarchitecture".
Most of what they cover concerns the data bus:
Measuring FSB saturation is straightforward. This can be done in terms of the fraction of bus cycles used for data transfer: BUS_DRDY_CLOCKS.ALL_AGENTS/CPU_CLK_UNHALTED.BUS
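To make the quoted ratio concrete, here is a minimal sketch of how it can be computed once the two raw counts have been collected. The function name and the sample counter values are hypothetical, chosen only for illustration; how the counts are read (which profiler, which sampling interval) is left out and assumed to happen elsewhere.

```python
def data_bus_utilisation(drdy_clocks: int, bus_clocks: int) -> float:
    """Return the fraction of bus cycles used for data transfer.

    drdy_clocks: count of BUS_DRDY_CLOCKS.ALL_AGENTS over the interval
    bus_clocks:  count of CPU_CLK_UNHALTED.BUS over the same interval
    """
    if bus_clocks <= 0:
        raise ValueError("bus_clocks must be positive")
    return drdy_clocks / bus_clocks


# Hypothetical readings, not measured on real hardware.
sample_drdy = 250_000
sample_bus = 1_000_000
print(f"data bus utilisation: {data_bus_utilisation(sample_drdy, sample_bus):.1%}")
```

Both counts have to come from the same sampling interval, otherwise the ratio is meaningless.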
I'm fine with that formula, but it only shows data bus utilisation.
It does not account for snoop traffic from cache coherency, because those operations travel on the address bus. So this formula cannot tell me whether the bus is saturated with snoops.
So my question is: does anyone know how to compute address bus utilisation? I need to determine whether it is the bottleneck in my case.