2 Replies Latest reply on Aug 4, 2010 11:09 PM by alexeypa

    Caching on read vs. caching on write


      We are seeing an interesting caching effect. A memory copy (dest <- source) runs about 10 times faster if the source buffer was written beforehand. We see no improvement if the source was only read beforehand. Here is the test code, which we run bare-metal:


      UINT64 StartTime;
      UINT64 StopTime;
      UINT32 Output = 0;
      CHAR InputVariable[1024];
      CHAR OutputBuffer[1024];


      StartTime = Rdtsc();


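      for (UINT32 Offset = 0; Offset < 1024; ++Offset) {
      #ifdef WRITE_FIRST
          InputVariable[Offset] = OutputBuffer[Offset];   /* reads the source, writes the destination */
      #else
          OutputBuffer[Offset] = InputVariable[Offset];   /* reads the destination, writes the source */
      #endif
      }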




      StopTime = Rdtsc();


      printf("Initialization : %16I64u\n", StopTime - StartTime);


      StartTime = Rdtsc();


      for (UINT32 Index = 0; Index < 10000; ++Index) {
          for (UINT32 Offset = 0; Offset < 1024; ++Offset) {
              InputVariable[Offset] = OutputBuffer[Offset];   /* the timed copy: dest <- source */
          }
      }


      StopTime = Rdtsc();


      printf("Time Taken: %16I64u\n", StopTime - StartTime);




      The results:


      (1)    With WRITE_FIRST

           Initialization :           124200

           Time Taken:       1119994407 (about 112,000 ticks per iteration, i.e. each iteration takes about the same time as the initialization pass).


      (2)    Without WRITE_FIRST:

           Initialization :           120790

           Time Taken:        112907282 (about 11,300 ticks per iteration, i.e. each iteration is roughly 10x faster than the initialization pass).


      The buffers reside in main memory. They are mapped using the WRITEBACK caching policy.


      Can you please explain the results we are seeing?

        • 1. Re: Caching on read vs. caching on write

          Here is a possible explanation.


          1. SCC does not allocate a cache line on write. Only read misses will fill the cache.

          2. A write miss generates a non-burst write, whereas a read miss brings in a 32-byte cache line. For back-to-back writes, the write buffer should generate a burst write.


          When you enable WRITE_FIRST, the initialization statement InputVariable[Offset] = OutputBuffer[Offset] reads all the source addresses (OutputBuffer[]) into the cache as cache line fills. After this, the source reads in the timed loop are cache hits, but every write to InputVariable[] is a byte write that goes to main memory. I don't think back-to-back writes will occur, since we are operating on char. Hence this case is slower.


          When you do NOT enable WRITE_FIRST, the initialization statement OutputBuffer[Offset] = InputVariable[Offset] reads all the destination addresses (InputVariable[]) into the cache. In the timed loop, the first reads of OutputBuffer[] are cache line fills, but all the writes to InputVariable[] are cache hits. This should be much faster.
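
The reasoning above can be checked against a toy model. The sketch below is not SCC code and measures no timing: it only tracks which 32-byte lines of each buffer are present, assuming that reads allocate a line, writes never do, both buffers start uncached, and nothing is ever evicted. All names in it (ToyCache, toy_read, toy_write_misses, timed_pass_write_misses) are made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define BUF_BYTES  1024   /* buffer size from the test code */
#define LINE_BYTES 32     /* cache line size quoted in this reply */
#define NUM_LINES  (BUF_BYTES / LINE_BYTES)

/* Toy model: one "present" flag per cache line of each buffer.
   Assumption: both buffers fit in the cache, so no line is evicted. */
typedef struct {
    bool input_cached[NUM_LINES];   /* lines of InputVariable[] */
    bool output_cached[NUM_LINES];  /* lines of OutputBuffer[]  */
} ToyCache;

/* A read leaves the line present (a miss triggers a line fill). */
static void toy_read(bool *lines, size_t offset)
{
    assert(offset < BUF_BYTES);
    lines[offset / LINE_BYTES] = true;
}

/* A write never allocates. Returns true when the line is absent,
   i.e. the write has to go out to main memory. */
static bool toy_write_misses(const bool *lines, size_t offset)
{
    assert(offset < BUF_BYTES);
    return !lines[offset / LINE_BYTES];
}

/* Runs the initialization loop, then one pass of the timed copy,
   and returns how many writes of that pass miss the cache. */
static size_t timed_pass_write_misses(bool write_first)
{
    ToyCache c = {0};
    size_t misses = 0;

    /* Initialization: only the buffer that is READ becomes cached. */
    for (size_t off = 0; off < BUF_BYTES; ++off) {
        if (write_first)
            toy_read(c.output_cached, off);  /* InputVariable[off] = OutputBuffer[off]; */
        else
            toy_read(c.input_cached, off);   /* OutputBuffer[off] = InputVariable[off]; */
    }

    /* Timed copy: InputVariable[off] = OutputBuffer[off]; */
    for (size_t off = 0; off < BUF_BYTES; ++off) {
        toy_read(c.output_cached, off);             /* read the source */
        if (toy_write_misses(c.input_cached, off))  /* write the destination */
            ++misses;
    }
    return misses;
}
```

In this model timed_pass_write_misses(true) reports 1024 missed writes per pass and timed_pass_write_misses(false) reports 0. That matches the direction of the measured gap, but says nothing about the actual cycle counts.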



          Let me know if this makes any sense or if I am analyzing it incorrectly.

          • 2. Re: Caching on read vs. caching on write
            "Let me know if this makes any sense or if I am analyzing it incorrectly."

            This sounds like a very plausible explanation. When WRITE_FIRST is not defined, the code ends up reading both the source and the destination, which brings both buffers into the cache.
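
If that explanation holds, one way to get the fast case on purpose would be to read the destination buffer once before copying into it. The sketch below is untested on SCC. The helper name touch_lines is made up, and it assumes a 32-byte line, a buffer aligned to a line boundary, and a cache large enough to keep the lines resident.

```c
#include <assert.h>
#include <stddef.h>

#define CACHE_LINE_BYTES 32   /* line size quoted in reply 1; an assumption here */

/* Reads one byte from every cache line of buf so that, on a cache that
   allocates only on read misses, later writes to buf can hit.
   Assumes buf starts on a cache line boundary.
   Returns the number of lines touched. */
static size_t touch_lines(const volatile char *buf, size_t len)
{
    size_t lines = 0;

    assert(buf != NULL || len == 0);
    for (size_t off = 0; off < len; off += CACHE_LINE_BYTES) {
        (void)buf[off];   /* volatile read: the compiler must perform it */
        ++lines;
    }
    return lines;
}
```

In the test code this would be called as touch_lines(InputVariable, sizeof InputVariable) just before the timed loop. Whether it recovers the full 10x would need to be measured.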