    Cache coherence in Xeon



      I need to model and quantify the penalty incurred when data resides in one core's private cache (L1, L2) and another core needs to access it. I am interested in both Xeon and Xeon Phi processors.

      I would like to know at which point the cache coherence logic snoops on Xeon:

      when a core accesses a location in L1, when a request is made to L2, or when a request is made to L3?

      How is this handled by the directory-based coherence unit of Xeon Phi (Knights Landing)?

      Some insight into how to measure these penalties on actual processors would be really appreciated.
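
      To make the measurement question concrete, this is the kind of test I have in mind: a cache-line "ping-pong" between two threads pinned to different cores, so that every hand-off forces the line to move from one core's private cache to the other's. It is only a minimal sketch under assumptions: the names pin_to_cpu, Line and pingpong_ns are my own, the 64-byte line size is assumed, pinning uses the Linux-specific pthread_setaffinity_np and is best effort, and the result is an average round trip (two transfers plus loop overhead), not a pure coherence latency.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <cstdint>
#include <thread>
#ifdef __linux__
#include <pthread.h>
#include <sched.h>
#endif

// Best-effort pinning of the calling thread to one logical CPU.
// Linux-only (assumption); returns false if pinning is unavailable or fails.
inline bool pin_to_cpu(int cpu) {
#ifdef __linux__
    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(cpu, &set);
    return pthread_setaffinity_np(pthread_self(), sizeof(set), &set) == 0;
#else
    (void)cpu;
    return false;
#endif
}

// A single counter alone on its own cache line (64 bytes is an assumption),
// so no unrelated data shares the line being bounced.
struct alignas(64) Line {
    std::atomic<std::uint64_t> value{0};
};

// Average time in nanoseconds for one round trip of the line between
// cpu_a and cpu_b. The first round trip also includes the responder's
// startup, so use a large round_trips count to amortize it.
// Note: the calling thread stays pinned to cpu_a after this returns.
inline double pingpong_ns(int cpu_a, int cpu_b, std::uint64_t round_trips) {
    assert(round_trips > 0);
    Line line;

    std::thread responder([&line, cpu_b, round_trips] {
        pin_to_cpu(cpu_b);
        for (std::uint64_t i = 0; i < round_trips; ++i) {
            // Wait for the initiator's odd value, then answer with the next even one.
            while (line.value.load(std::memory_order_acquire) != 2 * i + 1) {
            }
            line.value.store(2 * i + 2, std::memory_order_release);
        }
    });

    pin_to_cpu(cpu_a);
    const auto start = std::chrono::steady_clock::now();
    for (std::uint64_t i = 0; i < round_trips; ++i) {
        // Writing pulls the line into this core's cache in a writable state;
        // the responder's read and write then pull it back the other way.
        line.value.store(2 * i + 1, std::memory_order_release);
        while (line.value.load(std::memory_order_acquire) != 2 * i + 2) {
        }
    }
    const auto stop = std::chrono::steady_clock::now();
    responder.join();

    const std::chrono::duration<double, std::nano> elapsed = stop - start;
    return elapsed.count() / static_cast<double>(round_trips);
}
```

      For example, calling pingpong_ns(0, 1, 1000000) and then pingpong_ns(0, N, 1000000) for other core numbers N (same socket, other socket, SMT sibling) should show how the cost varies with where the other core sits. What I cannot tell from such a test is which level (L1, L2, or L3) actually triggers the snoop, which is why I am asking.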