8 Replies Latest reply on Jan 15, 2013 10:48 PM by dagger

    How can I measure time across different cores on SCC baremetal?

    vjain27

      Hi,

      I am working on SCC baremetal and I need to measure the time taken by some communication between two cores.

      What is the best way to go about it?

       

      Thanks

      Vaibhav Jain

        • 1. Re: How can I measure time across different cores on SCC baremetal?
          JanArneSobania

          Hi,

           

          starting with sccKit 1.4.0, you can use the global time-stamp counter (GTSC) in the FPGA. It is a 64-bit counter running at 125 MHz, available as two 32-bit (dword) values in registers 0x8224 (lower 32 bits) and 0x8228 (upper 32 bits). In the default LUT mapping, these are physical addresses 0xF9008224 and 0xF9008228, respectively.
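The register layout described above can be sketched in C as follows. This is only an illustration: the macro and function names are made up for this example and are not part of sccKit. The numeric values are the ones given in the post.

```c
#include <stdint.h>

/* Values from the post above; the names are hypothetical. */
#define GRB_PHYS_BASE   0xF9000000u  /* GRB base in the default LUT mapping */
#define GTSC_LO_OFFSET  0x8224u      /* register holding the lower 32 bits */
#define GTSC_HI_OFFSET  0x8228u      /* register holding the upper 32 bits */

/* Physical address of a GRB register, assuming the default LUT mapping. */
static uint32_t grb_phys_addr(uint32_t offset)
{
    return GRB_PHYS_BASE + offset;
}

/* Combine the two 32-bit register values into one 64-bit tick count. */
static uint64_t gtsc_combine(uint32_t hi, uint32_t lo)
{
    return ((uint64_t)hi << 32) | lo;
}
```

For example, `grb_phys_addr(GTSC_LO_OFFSET)` yields 0xF9008224, the address quoted above.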

           

          SCC Linux can use the GTSC as its clocksource. Even if you are using baremetal, you may be interested in how the corresponding "scc" clocksource is implemented:

          https://github.com/hpi-scc/linux-kernel/commit/50e55b43ceefd32896a99db644fe91d619280501

           

          The helper function sccsys_read_grb_entry is implemented in sccsys, as a simple read from the mapped GRB range. sccsys->grb is the virtual address of the ioremap-ed GRB range, grb_offset is its physical address (set to 0xF9000000):

          https://github.com/hpi-scc/linux-kernel/blob/aa243173f471da427d177c9b85196ba0f53ae932/drivers/char/sccsys.c

           

          Regards,

          Jan-Arne

          • 2. Re: How can I measure time across different cores on SCC baremetal?
            dagger

            Hi Jan-Arne,

             

            I am using Barrelfish on the SCC, and I also want to access this time-stamp counter.

            I mapped the physical address to a virtual address.

            However, the 64-bit value I read at this address is always zero.

             

            I am wondering whether I need to write the LUT to map the registers to physical address 0xF9008224. If so, what values (8-bit destination ID, 3-bit sub-ID, etc.) should I write into the LUT entry?

             

            Thanks,

            Zhiquan

            • 3. Re: How can I measure time across different cores on SCC baremetal?
              vjain27

              Hi Jan-Arne,

               

              Thanks a lot for the solution. However, I am not very clear about the GTSC. Is it a count-down timer?

              What value is it initialized to, and when? Could you please give some basic code to measure time across two cores?

              I found the code for reading the counters in the source you mentioned, but I am not sure how to proceed.

               

              Thanks

              Vaibhav Jain

              • 4. Re: How can I measure time across different cores on SCC baremetal?
                JanArneSobania

                Hi Zhiquan,

                 

                I am using Barrelfish on the SCC, and I also want to access this time-stamp counter.

                I mapped the physical address to a virtual address.

                However, the 64-bit value I read at this address is always zero.

                 

                I am wondering whether I need to write the LUT to map the registers to physical address 0xF9008224. If so, what values (8-bit destination ID, 3-bit sub-ID, etc.) should I write into the LUT entry?

                unfortunately I'm not familiar with the boot process of Barrelfish. Does it rely on sccBoot or sccGui, like SCC Linux? The LUT should be initialized automatically by these programs.

                 

                Which attributes do you use in your page table? The GRB range must be mapped non-cached (PCD=1), and accessed using 32-bit operations. To read the counter, you need to use two dword accesses ("mov e*x, dword ptr [...]" in Intel assembly language). In C, I would try using volatile int ("int value = *(volatile int*)(p)").
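A minimal sketch of such a read in C, assuming the GRB range has already been mapped uncached at some virtual address (the mapping itself is OS-specific and not shown). The function name `read_global_tsc` matches the pseudo-code used later in this thread, but the body is an illustration rather than the Linux driver's code. The thread does not say whether the FPGA latches the upper half when the lower half is read, so the upper half is read twice as a precaution against a carry between the two loads.

```c
#include <stdint.h>

/* Read the 64-bit counter using 32-bit volatile loads only.
 * `grb` is the virtual address of the uncached GRB mapping. */
static uint64_t read_global_tsc(const volatile void *grb)
{
    /* Do the offset arithmetic on a byte pointer. */
    const volatile uint8_t *base = (const volatile uint8_t *)grb;
    const volatile uint32_t *lo_reg = (const volatile uint32_t *)(base + 0x8224);
    const volatile uint32_t *hi_reg = (const volatile uint32_t *)(base + 0x8228);
    uint32_t hi, lo, hi_again;

    /* Repeat until the upper half is the same before and after reading
     * the lower half, so a carry in between cannot tear the value. */
    do {
        hi       = *hi_reg;
        lo       = *lo_reg;
        hi_again = *hi_reg;
    } while (hi != hi_again);

    return ((uint64_t)hi << 32) | lo;
}
```

The `volatile` qualifier keeps the compiler from caching or merging the loads. It does not replace the uncached page mapping.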

                 

                Regards,

                Jan-Arne

                • 5. Re: How can I measure time across different cores on SCC baremetal?
                  JanArneSobania

                  Hi Vaibhav,

                  Thanks a lot for the solution. However, I am not very clear about the GTSC. Is it a count-down timer?

                  What value is it initialized to, and when? Could you please give some basic code to measure time across two cores?

                  I found the code for reading the counters in the source you mentioned, but I am not sure how to proceed.

                  the global TSC is a 64-bit counter that counts up at 125 MHz, so it is not a count-down timer. That frequency is supplied by the FPGA and is independent of any core frequencies you can set via RCCE.

                   

                  According to the EAS, the counter is read-only. However, that should not be a problem. Just read its value before and after the operation whose duration you want to measure, take the difference, then divide by 125000000 to convert it to seconds.
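The conversion described above can be written with integer arithmetic only. The sketch below splits a tick difference into whole seconds and a nanosecond remainder. The struct and function names are hypothetical, and only the 125 MHz rate comes from the post.

```c
#include <stdint.h>

#define GTSC_HZ 125000000u   /* 125 MHz, as stated above */

struct gtsc_duration {
    uint32_t seconds;       /* whole seconds */
    uint32_t nanoseconds;   /* remainder, 0..999999992 */
};

/* Convert the difference of two counter readings into a duration. */
static struct gtsc_duration gtsc_elapsed(uint64_t start_ticks, uint64_t end_ticks)
{
    uint64_t ticks = end_ticks - start_ticks;   /* the counter counts up */
    struct gtsc_duration d;

    d.seconds     = (uint32_t)(ticks / GTSC_HZ);
    d.nanoseconds = (uint32_t)(ticks % GTSC_HZ) * 8u;   /* 1 tick = 8 ns */
    return d;
}
```

At 125 MHz one tick is 8 ns, and a 64-bit counter takes several thousand years to wrap, so the plain subtraction is safe in practice. On a 32-bit bare-metal target the compiler's runtime support for 64-bit division has to be linked in.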

                   

                  The more interesting problem is how you define "before" and "after" when multiple cores are involved. In that case, you need some kind of logical ordering of events (compare for example Lamport's "happened before" relation), and send the timestamps (GTSC values) around. Try something like this (pseudo-code):

                   

                  First Core (executes first part of your operation):

                  unsigned long long start_time = read_global_tsc();

                  <first part of operation>

                  send_to(other_core, <intermediate data of your operation>, start_time);

                   

                  Second Core (executes second part of your operation):

                  unsigned long long start_time;

                  (<intermediate data of your operation>, start_time) = receive_from(first_core);

                  <second part of operation>

                  unsigned long long end_time = read_global_tsc();

                  unsigned long duration_in_microseconds = (unsigned long)((end_time - start_time) / 125);
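The pseudo-code above could be turned into C along these lines. This is a sketch under assumptions: the message layout and function names are invented for illustration, and the actual transport between the cores (for example RCCE or a message-passing buffer) is left out. Counter readings are passed in as parameters so the logic can be tried without hardware.

```c
#include <stdint.h>

/* Hypothetical message layout: the payload travels together with the
 * GTSC value read by the first core. */
struct timed_msg {
    uint64_t start_ticks;   /* GTSC reading taken by the first core */
    uint32_t payload;       /* stand-in for the intermediate data */
};

/* First core: stamp the message before the measured work begins. */
static struct timed_msg make_timed_msg(uint64_t now_ticks, uint32_t payload)
{
    struct timed_msg m = { now_ticks, payload };
    return m;
}

/* Second core: microseconds elapsed since the sender's stamp.
 * At 125 MHz, 125 ticks correspond to 1 microsecond. */
static uint32_t elapsed_us(const struct timed_msg *m, uint64_t now_ticks)
{
    return (uint32_t)((now_ticks - m->start_ticks) / 125u);
}
```

Because both cores read the same FPGA counter, the two readings can be subtracted directly. No clock synchronization between the cores is needed.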

                   

                  Regards,

                  Jan-Arne

                  • 6. Re: How can I measure time across different cores on SCC baremetal?
                    dagger

                    Dear Jan-Arne,


                    Thanks a lot.

                    unfortunately I'm not familiar with the boot process of Barrelfish. Does it rely on sccBoot or sccGui, like SCC Linux? The LUT should be initialized automatically by these programs.

                    Yes, the boot process of Barrelfish also relies on sccMerge and sccBoot. I have checked LUT entry 249 at runtime, and it was written correctly, just as described in the sccMerge script. So I think there is no problem with the LUT.

                    Which attributes do you use in your page table? The GRB range must be mapped non-cached (PCD=1), and accessed using 32-bit operations. To read the counter, you need to use two dword accesses ("mov e*x, dword ptr [...]" in Intel assembly language). In C, I would try using volatile int ("int value = *(volatile int*)(p)").

                    I think the PCD is set correctly, as the pages were mapped with a bitmap of DEVICE_PAGE_BITMAP, which is defined as follows:

                     

                    #define DEVICE_PAGE_BITMAP                              \
                        (X86_32_PTABLE_PRESENT | X86_32_PTABLE_READ_WRITE | \
                         X86_32_PTABLE_CACHE_DISABLED | X86_32_PTABLE_USER_SUPERVISOR | TABLE_GLOBAL_PAGE)

                     

                    Also, I use two dword accesses to read the low 32 bits and the high 32 bits respectively.

                     

                    But after all this, it still does not work.

                    What's even weirder, I found that the 64-bit values read from physical address 0xf9008224 differ depending on which base physical address I map (using the corresponding offset, of course). For example:

                    When I map one page at base address 0xf9008000, the 64-bit value read from (this virtual address + 0x224) is always 0x0.

                    And when I map 16 pages at base address 0xf9000000, the 64-bit value read at offset 0x8224 is always a constant non-zero value, and it does not change even when I reboot Barrelfish.

                     

                    Thanks,

                    Zhiquan

                    • 7. Re: How can I measure time across different cores on SCC baremetal?
                      JanArneSobania

                      Hi Zhiquan,

                       

                      I think the PCD is set correctly, as the pages were mapped with a bitmap of DEVICE_PAGE_BITMAP, which is defined as follows:

                       

                      #define DEVICE_PAGE_BITMAP                              \
                          (X86_32_PTABLE_PRESENT | X86_32_PTABLE_READ_WRITE | \
                           X86_32_PTABLE_CACHE_DISABLED | X86_32_PTABLE_USER_SUPERVISOR | TABLE_GLOBAL_PAGE)

                      assuming the constants are right, that definition looks fine.

                       

                      What's even weirder, I found that the 64-bit values read from physical address 0xf9008224 differ depending on which base physical address I map (using the corresponding offset, of course). For example:

                      When I map one page at base address 0xf9008000, the 64-bit value read from (this virtual address + 0x224) is always 0x0.

                      And when I map 16 pages at base address 0xf9000000, the 64-bit value read at offset 0x8224 is always a constant non-zero value, and it does not change even when I reboot Barrelfish.

                      Indeed, that is very weird and should never happen. Are you sure your address arithmetic is correct? That is, do you define the variable that holds the virtual address of your mapping as "void*" or (unsigned) "char*", and not something else like "int*"? Example:

                       

                      void* grb_base = map_physical_memory(0xF9000000, 0x10000, NO_CACHE);

                      unsigned long value = *(volatile unsigned long*)(grb_base + 0x8224);

                       

                      You can replace the "void*" before grb_base with "unsigned char*", and it would still access the same location in memory. But you cannot use any base type that is larger than 1 byte, without also recalculating the offset. If, for example, you change the definition of grb_base to "int*", the access would target a wrong location (offset 0x8224*sizeof(int) = 0x20890), and likely crash (if no mapping exists) or just return garbage.

                       

                      If you are unsure, try to cast the base address to (unsigned char*) beforehand, like this:

                       

                      unsigned long value = *(volatile unsigned long*)(((unsigned char*)grb_base) + 0x8224);
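The scaling effect described above can be checked on any machine, without SCC hardware. In the sketch below, the first function measures how many bytes are really skipped when an offset is added to an `int*`. The second is a hypothetical helper (not from Barrelfish or sccKit) that always treats the offset as a byte count, regardless of the pointer type the caller holds.

```c
#include <stddef.h>
#include <stdint.h>

/* Bytes actually skipped when n is added to an int pointer: n * sizeof(int).
 * `base` must point into an array with more than n elements. */
static size_t bytes_skipped_int_ptr(int *base, size_t n)
{
    return (size_t)((unsigned char *)(base + n) - (unsigned char *)base);
}

/* Hypothetical helper: locate a 32-bit register by BYTE offset. The cast
 * to a byte pointer happens before the addition, so no scaling occurs. */
static volatile uint32_t *grb_reg32(volatile void *base, size_t byte_offset)
{
    return (volatile uint32_t *)((volatile unsigned char *)base + byte_offset);
}
```

With a 4-byte `int`, `bytes_skipped_int_ptr(p, 0x8224)` returns 0x20890, the wrong offset mentioned above, whereas `grb_reg32(p, 0x8224)` points exactly 0x8224 bytes past `p`.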

                       

                      Regards,

                      Jan-Arne

                      • 8. Re: How can I measure time across different cores on SCC baremetal?
                        dagger

                        Hi Jan-Arne,

                         

                        Sorry for the late reply.

                        The key point of my problem is just as you said. I defined the base address with type "volatile int*" and added the offset directly when accessing the memory, which targets the wrong location.

                        I made such a basic mistake...

                         

                        Thanks very much:)

                         

                        Regards,

                        Zhiquan