13 Replies. Latest reply: Jul 26, 2011 5:01 PM by mwvantol

    Strange behaviour when reading MPB

    markus_pm

      Hi there,

       

      I'm experiencing some very strange behaviour. In a simple setup, I map the MPB memory as uncacheable I/O memory with the MPBT bit set. When I read the data back afterwards, every 8-byte read within a cacheline returns only the first 8 bytes of that line. For example, if the MPB actually looks like this:

       

      00000000  |  ad de ef be fe ca be ba

      00000008  |  fe ca be ba ad de ef be

      00000010  |  ef be fe ca be ba ad de

      00000018  |  be ba ad de ef be fe ca

       

      00000020  |  00 01 02 03 04 05 06 07

      00000028  |  08 09 0a 0b 0c 0d 0e 0f

      00000030  |  10 11 12 13 14 15 16 17

      00000038  |  18 19 1a 1b 1c 1d 1e 1f

       

      Reading all the data back gives a result as if the buffer contained:

       

      ad de ef be fe ca be ba

      ad de ef be fe ca be ba

      ad de ef be fe ca be ba

      ad de ef be fe ca be ba

       

      00 01 02 03 04 05 06 07

      00 01 02 03 04 05 06 07

      00 01 02 03 04 05 06 07

      00 01 02 03 04 05 06 07
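A minimal sketch of how this symptom could be recognised in a read-back buffer, assuming 32-byte cachelines and 8-byte reads as in the dump above. The function name `line_is_aliased` and the constants are hypothetical, and the buffers here are ordinary arrays, not real MPB mappings.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define LINE_SIZE  32u  /* cacheline size used in the dump above */
#define CHUNK_SIZE 8u   /* size of one repeated block in the dump above */

/*
 * Returns 1 if a line shows the symptom: every 8-byte chunk of `readback`
 * equals the first 8 bytes of `expected`, although `expected` itself holds
 * different data in at least one later chunk. Returns 0 otherwise.
 */
int line_is_aliased(const uint8_t *readback, const uint8_t *expected)
{
    assert(readback != NULL && expected != NULL);
    int expected_varies = 0;

    for (size_t off = 0; off < LINE_SIZE; off += CHUNK_SIZE) {
        if (memcmp(readback + off, expected, CHUNK_SIZE) != 0)
            return 0;               /* this chunk is not a copy of chunk 0 */
        if (memcmp(expected + off, expected, CHUNK_SIZE) != 0)
            expected_varies = 1;    /* the written data really differs */
    }
    return expected_varies;
}
```

Calling it once per 32-byte line, with the written pattern as `expected` and the data read through the mapping as `readback`, would flag each affected line.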

       

      No combination of "volatile" qualifiers and CL1INVD instructions changes the result at all.

       

      Disabling L1 Cache just displays the wrong result a bit more slowly.

       

      Not using the MPBT bit, however, solves the problem. Maybe it has something to do with the fact that MPBT is the same bit as PSE?

       

      Another update: Using the MPBT bit AND declaring the MPB memory as cacheable memory, works as well.

       

      Does anyone have a clue what could go wrong here? If I dump the MPB with sccDump -m, all data is displayed correctly.

       

      Thanks,

       

      Markus

       

      Message was edited by: Markus Partheymueller

        • 1. Re: Strange behaviour when reading MPB
          tedk

          Markus, can you post the code you are using to see this behavior? If you think it is more appropriate, please file a Bugzilla bug and post there. Can you post the source code and a Makefile? I want to be able to build it from source myself and see if I can reproduce this behavior. Thanks.

          • 2. Re: Strange behaviour when reading MPB
            markus_pm

            I'm afraid that's not so easy. I'm using my code on top of an L4 microkernel and the corresponding runtime environment L4Re. In general, I allocate the MPB memory as uncacheable, MPBT I/O memory, write 32 bytes to it, and afterwards read it again. Then I get this weird result. If the memory is cacheable, everything is correct.

             

            sccLinux evidently also uses cacheable memory, which led me to try that as well, and it worked.

             

            I would be very interested in the results of a basic bare-metal application doing the same thing. That way I could get a clue whether it is a system or a software problem.

            • 3. Re: Strange behaviour when reading MPB
              tedk

              How are you allocating MPB memory as uncacheable?

               

              If I understand your post correctly ... you allocate some MPB memory as uncacheable, set MPBT, write to that memory; but then when you read a cacheline, you see the first 8 bytes of that 32-byte line repeated throughout the line. Is this all happening on one core? It seems to me that if it's one core doing the writing and reading, it shouldn't matter whether the memory is cacheable or not; but if you are having one core read what another has written, then the operation of L1 is important.

               

              MPBT memory is intended to bypass L2, so it must be L1 that your allocated memory is not using. You tried using CL1INVD ... I assume just to see what happens, but if you are truly not using L1, CL1INVD isn't going to do anything.

               

              You see the same problem if you disable L1, which is actually not the same as allocating uncacheable memory because L1 is non-unified.

               

              You don't see the problem when you use MPBT and allocate MPB as cacheable. That part is encouraging. This is what I would see as typical operation.

               

              I'm not sure if this is appropriate but have you looked at Bug 46? There is a known hw SCC bug in the MPB bypass logic.

              • 4. Re: Strange behaviour when reading MPB
                jheld

                You are giving conflicting directions to the HW. MPBT is cacheable (L1).

                You can allocate MPB and map it as uncacheable in the pagetable - but don't set it as MPBT. The result is undefined.

                • 5. Re: Strange behaviour when reading MPB
                  markus_pm
                  "How are you allocating MPB memory as uncacheable?"

                  I just map memory from MPB and make sure that in the pagetable entry there is bit 3 set.

                   

                  "I'm not sure if this is appropriate but have you looked at Bug 46? There is a known hw SCC bug in the MPB bypass logic."

                  No, I'm not using this bypass bit, never considered using it because of this hardware bug.

                   

                   

                  Jim, Page 31 of the EAS says

                  "Defining data as UC + MPBT could be used to accelerate UC writes to the DDR3 memory because the hardware uses the write combine buffer when the MPBT bit is set."

                  There's no mention of undefined behaviour for MPB memory. So if this really is not a valid configuration for the MPB, I think it should be mentioned somewhere in the manual. Or did I just miss it somewhere else?

                  • 6. Re: Strange behaviour when reading MPB
                    jheld

                    I see what you mean.  The WCB consolidates the write accesses when MPBT is on, which is an enhancement given the 'write-around' behavior of P54C.

                    The EAS suggests you can trigger the WCB without side effects from the cache aspects of MPBT and use it with UC.  I'm suspicious - we'll check with the folks who did the MPBT implementation to see if they have tested that mode.

                    • 7. Re: Strange behaviour when reading MPB
                      tedk

                      I did check with the implementers as Jim suggested. In theory one should be able to use the WCB without enabling L1.

                       

                      There is some related work that you can look at. Have you seen Efficient Memory Copy Operations on the 48-core Intel SCC Processor?

                      http://communities.intel.com/docs/DOC-6872

                      • 8. Re: Strange behaviour when reading MPB
                        markus_pm

                        As far as I can tell, they never use uncached access to the MPB (or MPBT-tagged memory).

                         

                        But I also can't really see the problem when doing so, so I would be glad to hear from the hardware experts what exactly happens with these flags.

                         

                        For now, I'll stick with MPBT+cached memory.

                        • 9. Re: Strange behaviour when reading MPB
                          mwvantol

                          We have just investigated this problem, and we can confirm that this behavior does happen when both the uncacheable and MPBT flags are set, and that it only occurs when reading data. It affects not only reads from the MPB, but also reads from main memory with these two flags set.

                           

                          What happens (I think) is the following:

                          - A read request of uncacheable MPBT type arrives at either the MPB or the memory controller.

                          - For some reason, not a 32-bit value but a whole cacheline is sent back as the reply.

                          - Then, as the P54C bus is 64 bits wide, the data read by the core is always the first 64 bits of the cacheline (this is what you see in your example).

                           

                          My assumption is that the M-unit sends a request for a whole cacheline when issuing a read request for MPBT data, without checking the uncacheable flag. However, when the data comes back, it is not put in the cache, but just on the P54C bus.
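The hypothesis above can be sketched as a small software model. This only illustrates the suspected mechanism, not verified hardware behaviour: memory is a plain byte array, and the function names `read_uc_expected` and `read_uc_mpbt_hypothesis` are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define LINE_SIZE 32u  /* cacheline size in bytes */
#define BUS_BYTES 8u   /* P54C data bus width: 64 bits */

/* What an uncached read should return: the 8 bytes at the aligned address. */
void read_uc_expected(const uint8_t *mem, uint32_t addr, uint8_t out[BUS_BYTES])
{
    assert(mem != NULL && out != NULL);
    memcpy(out, mem + (addr & ~(BUS_BYTES - 1u)), BUS_BYTES);
}

/*
 * Hypothesised UC + MPBT read: the whole line containing `addr` is
 * returned, and the core only ever sees its first 64 bits.
 */
void read_uc_mpbt_hypothesis(const uint8_t *mem, uint32_t addr,
                             uint8_t out[BUS_BYTES])
{
    assert(mem != NULL && out != NULL);
    uint32_t line_base = addr & ~(LINE_SIZE - 1u);  /* start of the line */
    memcpy(out, mem + line_base, BUS_BYTES);        /* first 8 bytes only */
}
```

Under this model, reads at offsets 0x20, 0x28, 0x30 and 0x38 all return the bytes stored at 0x20, which matches the output reported in the first post.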

                           

                          Is this a viable explanation? At least it fits exactly the behavior that we have observed...

                          • 10. Re: Strange behaviour when reading MPB
                            tedk

                            So does this mean that Markus cannot get at the other values? His MPB contains the following

                            00000020  |  00 01 02 03 04 05 06 07

                            00000028  |  08 09 0a 0b 0c 0d 0e 0f

                            00000030  |  10 11 12 13 14 15 16 17

                            00000038  |  18 19 1a 1b 1c 1d 1e 1f

                             

                            And he reads

                            00 01 02 03 04 05 06 07

                            00 01 02 03 04 05 06 07

                            00 01 02 03 04 05 06 07

                            00 01 02 03 04 05 06 07

                             

                            Markus, is that what you get when you read 0x20 and then 0x28 and then 0x30 and then 0x38? Does that mean that it's not possible to read the correct value at 0x28 ... that, for example, 0x28 really does contain 08 09 0a 0b 0c 0d 0e 0f but your read returns 00 01 02 03 04 05 06 07?

                             

                            Is the reason that the P54C always gets back the cacheline containing the address we are reading (whether or not the memory is cacheable), but if it is not cacheable, only the first 8 bytes get put on the bus, so you cannot access the other addresses?

                             

                            But we access uncacheable shared memory just fine. Is this strange behavior only for the MPB and MPBT memory?

                             

                            We'll of course try out some example here, but I wanted to understand the specifics of the issue first.

                            • 11. Re: Strange behaviour when reading MPB
                              markus_pm

                              Yes, that's the behaviour I'm experiencing. No matter which 8-byte offset of a cacheline I read, I always get the value of the first eight bytes. There's no way to get the other 24 bytes of the cacheline.

                               

                              Michiel's suggestion sounds very reasonable to me. Maybe that's the thing to investigate.

                              • 12. Re: Strange behaviour when reading MPB
                                tedk

                                Yes, Michiel's suggestion does seem logical. I do know though that when we have uncacheable shared memory we can access all the locations. The problem you are experiencing seems specific to MPBT memory, and I don't understand why that would be true.

                                • 13. Re: Strange behaviour when reading MPB
                                  mwvantol

                                  We actually ran a test today for all possible configurations of the PCD, PWT and MPBT flags. The situation occurs only when both PCD and MPBT are set, so not when using uncached shared memory. Therefore I suspect that the M-unit makes a wrong request (always a whole cacheline) in MPBT mode.
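The outcome of that test can be summarised as a small table generator. This is only a sketch: the PWT and PCD positions follow the standard IA-32 page table entry layout (bits 3 and 4), while placing MPBT at bit 7 is an assumption based on the remark earlier in the thread that it shares the PSE bit. The names `read_is_affected` and `print_flag_matrix` are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

#define PTE_PWT  (1u << 3)  /* page write-through (IA-32 layout) */
#define PTE_PCD  (1u << 4)  /* page cache disable (IA-32 layout) */
#define PTE_MPBT (1u << 7)  /* ASSUMPTION: MPBT at the PSE bit position */

/* Per the test reported above: reads misbehave only with PCD and MPBT set. */
int read_is_affected(uint32_t flags)
{
    return (flags & PTE_PCD) && (flags & PTE_MPBT);
}

/* Prints all eight flag combinations; returns how many are affected. */
int print_flag_matrix(FILE *out)
{
    assert(out != NULL);
    int affected = 0;

    fprintf(out, "PCD PWT MPBT  read result\n");
    for (unsigned combo = 0; combo < 8; combo++) {
        uint32_t flags = 0;
        if (combo & 4u) flags |= PTE_PCD;
        if (combo & 2u) flags |= PTE_PWT;
        if (combo & 1u) flags |= PTE_MPBT;

        int bad = read_is_affected(flags);
        affected += bad;
        fprintf(out, " %d   %d   %d    %s\n",
                (flags & PTE_PCD) != 0, (flags & PTE_PWT) != 0,
                (flags & PTE_MPBT) != 0,
                bad ? "first 8 bytes repeated" : "correct");
    }
    return affected;
}
```

Calling `print_flag_matrix(stdout)` lists the eight configurations and marks the two with PCD and MPBT both set (with and without PWT).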

                                   

                                  Perhaps we should also test whether the data ends up in L1 at the same time even though PCD is set, but unfortunately I don't really have time to write such tests at the moment. And if PCD + MPBT is broken anyway, it's not very important either; it's more a matter of curiosity.