16 Replies Latest reply on Jan 30, 2016 1:51 AM by xbolshe

    ttyS0 input overruns

    dfwJones

      We recently tried the new 1.2.0 BSP for the Quark which gives us kernel version 3.14.28 instead of the old 3.8.7.

       

      The problem is that with the new kernel we are seeing large numbers of ttyS0 input overruns. We weren't seeing this problem with 3.8.7.

      For example:

      Dec 15 02:22:41 SN01291 kernel: [  640.831598] ttyS0: 1 input overrun(s)

      Dec 15 02:22:43 SN01291 kernel: [  643.228205] ttyS0: 1 input overrun(s)

      Dec 15 02:22:53 SN01291 kernel: [  653.291803] ttyS0: 2 input overrun(s)

      Dec 15 02:23:19 SN01291 kernel: [  679.421426] ttyS0: 1 input overrun(s)

      Dec 15 02:23:29 SN01291 kernel: [  689.429202] ttyS0: 2 input overrun(s)

       

      We tend to see higher numbers of overruns when the processor is busier.

       

      We tried to add error correction to the data being passed over the serial port, but that only helps a little bit. When we get too much data lost we can't recover.

       

      Is there a setting we can change to adjust the FIFO trigger level of the 16550? It doesn't appear that we can make that change through setserial. Was there a change made to the serial drivers between 3.8.7 and 3.14.28? Is there another buffer that can have its size adjusted?

        • 1. Re: ttyS0 input overruns
          PabloM_Intel

          Hi dfwJones,

           

          Apart from BSP 1.0.1, which was developed with the Galileo board in mind, the later BSPs are designed for Quark environments in general rather than for the Galileo platform specifically. So the BSP that you're using now may show this kind of issue when used in this environment. We'll investigate your issue and let you know when we have more updates.

          Could you please tell us which processes you are running? We would like to know under what conditions you get these "overrun" messages.

           

          Regards,

          PabloM_Intel

          • 2. Re: ttyS0 input overruns
            dfwJones

            The faults are happening on our own board, not on a Galileo board. Our board's design leverages the Galileo design.

             

            The faults happen when the processor is busy. During normal operation our board has a number of things running almost all the time.

             

            We have a sensor that streams data in bursts over the ttyS0 port. The bursts can be many per second.

             

            The software on the Quark takes the sensor data and does limited processing before sending it over the Ethernet connection.

             

            The board also has a USB webcam which enumerates as /dev/video0. A low frame rate video stream is sent over the Ethernet connection.

            • 3. Re: ttyS0 input overruns
              PabloM_Intel

              Hi dfwJones,

               

              I think I understand your problem now. Have you checked the thread "Problem:configuration of 8250/16550 uart driver and its effect on ethernet"? Another user had a similar request and needed to configure the UART driver, and xbolshe provided a possible solution. I suggest checking that thread; you might find some useful information.

               

              Regards,

              PabloM_Intel

              • 4. Re: ttyS0 input overruns
                danstokes

                Hi PabloM,

                 

                I am also seeing this same issue, with exactly the same symptoms as dfwJones.  I reviewed the link you suggested, but I don't see the relevance to the ttyS0 overrun issue.  In our application, we need both the serial and Ethernet interfaces active.  The behavior I'm seeing is exactly what you might see if the serial buffers were too shallow or if interrupts were being disabled by some other process for too long.

                 

                Best regards,

                John

                • 5. Re: ttyS0 input overruns
                  xbolshe

                  Hi,

                   

                  The Intel Quark processor has only one core and one thread (Intel® Quark™ SoC X1000 (16K Cache, 400 MHz) Specifications).

                  If a heavy task prevents the Linux driver from running in time, tty overruns are expected.

                   

                  The Intel Quark UART has a 16-byte FIFO buffer, and there is no way to increase it.

                   

                  For now the internal buffer length is 4095 bytes. It is possible to increase it.

                  But I guess a larger buffer will not remove the requirement to read data out of the FIFO buffer in time.
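
To put rough numbers on the FIFO point above, here is a minimal sketch of the time budget the driver has once the receive interrupt fires. The port speed has not been given in this thread, so the baud rates below are hypothetical, and 8N1 framing is an assumption. The trigger levels 1, 4, 8 and 14 are the standard 16550 ones. The estimate ignores the receive shift register, so it is slightly conservative.

```python
# Rough time budget for servicing a 16550-style RX FIFO.
# Assumptions (not from the thread): 8N1 framing and the baud rates listed below.

BITS_PER_CHAR_8N1 = 10   # 1 start + 8 data + 1 stop bit
FIFO_DEPTH = 16          # bytes, as stated in the post above


def fifo_slack_us(baud, trigger_level, fifo_depth=FIFO_DEPTH):
    """Microseconds from the RX interrupt firing until the FIFO is full,
    assuming a continuous back-to-back stream (the worst case)."""
    char_time_us = BITS_PER_CHAR_8N1 * 1_000_000 / baud
    free_slots = fifo_depth - trigger_level
    return free_slots * char_time_us


for baud in (115200, 460800, 921600):      # hypothetical port speeds
    for trigger in (1, 4, 8, 14):          # standard 16550 trigger levels
        print(f"{baud:>7} baud, trigger {trigger:>2}: "
              f"{fifo_slack_us(baud, trigger):8.1f} us before the FIFO fills")
```

If the interrupt handler is delayed longer than this budget, characters are lost in hardware before they ever reach the larger software buffer, which is consistent with the guess that enlarging that buffer alone would not help.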

                   

                   

                  By the way, could you provide several snapshots over time of the output of the command below while a heavy task is running, for both the 3.8.7 and 3.14.28 kernels?

                   

                  Command
                  cat /proc/interrupts

                   

                  Could you also provide more information about the serial port speed and the actual data rate?

                   

                  BR,

                  xbolshe

                  • 6. Re: ttyS0 input overruns
                    dfwJones

                    I've spent a lot of time digging deeper into the problem.  I have three separate versions of the kernel: 3.8.7, 3.14.28 and 3.19.8.  The problem happens on both 3.14.28 and 3.19.8, but does not happen on 3.8.7.

                     

                    When we see the problem, we get system messages like this:

                    [  334.896442] ttyS0: 10 input overrun(s)

                    [  336.293599] ttyS0: 13 input overrun(s)

                    [  337.328057] ttyS0: 16 input overrun(s)

                    [  338.591951] ttyS0: 11 input overrun(s)

                    [  340.215313] ttyS0: 9 input overrun(s)

                    [  341.360737] ttyS0: 14 input overrun(s)

                    [  342.553417] ttyS0: 15 input overrun(s)

                    [  343.600646] ttyS0: 6 input overrun(s)

                     

                    As you can see, we are seeing very large numbers of overruns every second.
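
For anyone who wants to quantify this from their own logs, here is a small sketch that turns the messages above into an average rate. It assumes each line reports the overruns accumulated since the previous message, and the function name is just illustrative.

```python
import re

# Matches kernel log lines such as "[  334.896442] ttyS0: 10 input overrun(s)".
OVERRUN_RE = re.compile(r"\[\s*(\d+\.\d+)\]\s+\w+: (\d+) input overrun\(s\)")

# The log excerpt posted above.
SAMPLE_LOG = """
[  334.896442] ttyS0: 10 input overrun(s)
[  336.293599] ttyS0: 13 input overrun(s)
[  337.328057] ttyS0: 16 input overrun(s)
[  338.591951] ttyS0: 11 input overrun(s)
[  340.215313] ttyS0: 9 input overrun(s)
[  341.360737] ttyS0: 14 input overrun(s)
[  342.553417] ttyS0: 15 input overrun(s)
[  343.600646] ttyS0: 6 input overrun(s)
"""


def overrun_rate(log_text):
    """Return (total_overruns, elapsed_seconds, overruns_per_second)."""
    samples = [(float(m.group(1)), int(m.group(2)))
               for m in OVERRUN_RE.finditer(log_text)]
    if len(samples) < 2:
        raise ValueError("need at least two overrun messages")
    elapsed = samples[-1][0] - samples[0][0]
    # The first message counts overruns from before the window began, so skip it.
    total = sum(count for _, count in samples[1:])
    return total, elapsed, total / elapsed


total, elapsed, rate = overrun_rate(SAMPLE_LOG)
print(f"{total} overruns in {elapsed:.2f} s, about {rate:.1f} per second")
```

Comparing this rate against the sensor's actual byte rate would show what fraction of the stream is being lost.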

                     

                    I dumped the contents of /proc/interrupts before and after running the tests.  We are seeing very large increases in the counts in all cases. Assuming I'm reading the output correctly, it looks like the serial port is set to the same interrupt in all 3 kernels, but in the case of 3.8.7, it doesn't share the interrupt with anything else. The other two kernels appear to have multiple peripherals sharing the same interrupt.

                    3.8.7:

                        17:        2255   IO-APIC-fasteoi   serial

                    3.14.28:

                        17:     162316   IO-APIC-fasteoi   dw_dmac, dw_dmac, pxa2xx-spi.1, serial

                    3.19.8: 

                        17:          795   IO-APIC  17-fasteoi   INTEL_MID_DMAC2, intel_quark_uart, INTEL_MID_DMAC2, intel_quark_uart, pxa2xx-spi.1

                     

                    Assuming that the sharing is taking place, how do we move those other peripherals to other interrupts?

                     

                    If you need the full output of /proc/interrupts, let me know and I can post it.
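
The before/after comparison described above can be automated. Below is a rough sketch that lists which numbered IRQ lines carry more than one handler and how much each counter grew between two dumps. It assumes the single-CPU layout shown in this thread (one count column), and the function names are made up for illustration.

```python
def shared_irqs(text):
    """Map IRQ number -> handler names for lines listing more than one handler."""
    shared = {}
    for line in text.splitlines():
        label, sep, rest = line.partition(":")
        tokens = rest.split()
        if not sep or not label.strip().isdigit():
            continue  # skip the header and NMI/LOC-style summary rows
        if not tokens or not tokens[0].isdigit():
            continue
        # Handlers are a comma-separated list at the end of the line:
        # the last token is a name, and earlier tokens ending in ',' join it.
        names = [tokens[-1]]
        for tok in reversed(tokens[1:-1]):
            if not tok.endswith(","):
                break
            names.insert(0, tok.rstrip(","))
        if len(names) > 1:
            shared[int(label)] = names
    return shared


def irq_deltas(before, after):
    """Per-row increase in the interrupt count between two dumps."""
    def counts(text):
        out = {}
        for line in text.splitlines():
            label, sep, rest = line.partition(":")
            tokens = rest.split()
            if sep and tokens and tokens[0].isdigit():
                out[label.strip()] = int(tokens[0])
        return out
    old, new = counts(before), counts(after)
    return {key: new[key] - old.get(key, 0) for key in new}


# Lines quoted from the post above.
LINE_3_8_7 = "17:        2255   IO-APIC-fasteoi   serial"
LINE_3_14_28 = "17:     162316   IO-APIC-fasteoi   dw_dmac, dw_dmac, pxa2xx-spi.1, serial"

print(shared_irqs(LINE_3_8_7))
print(shared_irqs(LINE_3_14_28))
```

On a live system the input would be the contents of /proc/interrupts read once before and once after the heavy load.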

                    • 7. Re: ttyS0 input overruns
                      xbolshe

                      Hi,

                       

                      Could you test how it works with this image?

                      It has UARTs on different interrupts:

                       

                      24:   72   PCI-MSI-edge      INTEL_MID_DMAC2, intel_quark_uart
                      25: 2319   PCI-MSI-edge      INTEL_MID_DMAC2, intel_quark_uart

                       

                      And please post the full output of /proc/interrupts after a heavy load.

                       

                      BR,

                      xbolshe

                      • 8. Re: ttyS0 input overruns
                        dfwJones

                        Thank you for producing a new kernel. The kernel as you packaged it boots, but it lacks our product's environment. I tried merging the kernel with all of our environment, but it didn't go well. It looks like there are a number of devices (/sys/proc/gpio, eth0, etc.) that aren't loading which prevent our stuff from running.

                         

                        Is it possible for you to tell us how you managed to move the other devices away from the interrupt that the serial port is using? That way I can make the change and rebuild the kernel here. At the moment, I think we'd prefer to try and continue using the 3.19 we are building from here:

                        galileo-sources/iot_1.2.0_kernel_3.19.8 at master · xbolshe/galileo-sources · GitHub

                        • 9. Re: ttyS0 input overruns
                          xbolshe

                          Hi,

                           

                          The repository you mentioned above now has an update.

                          It separates the UART interrupts.

                          I guess you could try using it.

                           

                          Now it looks like:

                           

                          root@quark:~# cat /proc/interrupts
                                     CPU0
                            0:         29   IO-APIC-edge      timer
                            7:          2   IO-APIC-edge
                            8:          1   IO-APIC-edge      rtc0
                            9:          2   IO-APIC-fasteoi   acpi, gpio_sch
                           16:         91   IO-APIC  16-fasteoi   pxa2xx-spi.0, ohci_hcd:usb2
                           17:          0   IO-APIC  17-fasteoi   pxa2xx-spi.1
                           19:          4   IO-APIC  19-fasteoi   ehci_hcd:usb1
                           24:          0   PCI-MSI-edge      INTEL_MID_DMAC2, intel_quark_uart
                           25:       9098   PCI-MSI-edge      INTEL_MID_DMAC2, intel_quark_uart
                           26:       3948   PCI-MSI-edge      mmc0
                           35:        287   PCI-MSI-edge      intel_qrk_gip
                           36:          1   PCI-MSI-edge      pch_udc
                           37:       4157   PCI-MSI-edge      enp0s20f6
                           40:          2       gsi-sch_gpio_irq  0-0020
                           46:         29   PCI-MSI-edge      iwlwifi
                          100:          2  cy8c9540a-irq  gpiolib
                          NMI:          0   Non-maskable interrupts
                          LOC:      16500   Local timer interrupts
                          SPU:          0   Spurious interrupts
                          PMI:          0   Performance monitoring interrupts
                          IWI:          1   IRQ work interrupts
                          RTR:          0   APIC ICR read retries
                          TRM:          0   Thermal event interrupts
                          THR:          0   Threshold APIC interrupts
                          MCE:          0   Machine check exceptions
                          MCP:          0   Machine check polls
                          ERR:          2
                          MIS:          0
                          
                          

                           

                          BR,

                          xbolshe

                          • 10. Re: ttyS0 input overruns
                            CMata_Intel

                            Hi dfwJones,

                             

                            Do you have updates on this?

                            Have you tried with the suggestion from xbolshe?

                             

                            Regards,

                            Charlie

                            • 11. Re: ttyS0 input overruns
                              dfwJones

                              Sorry for the delays, we've encountered a few other issues with 3.19. I may start separate threads for them.

                               

                              We can't yet fully test the new build.  For some reason we aren't getting the /dev/video0 device to show up like it used to with the 3.14 in the official BSP. I've tried enabling everything I can think of with menuconfig.

                               

                              Without the streaming video, we were still seeing the overrun errors under the original 3.19.  With the new version using the interrupt fixes, we haven't yet seen any overruns. This is a very good sign so far.  We will keep testing as soon as we can figure out the video0 problem.

                               

                              I can't yet call it fixed, but it is looking good.

                               

                              Thanks.

                              • 12. Re: ttyS0 input overruns
                                0andriy

                                What kind of interrupt fixes are you talking about?

                                • 13. Re: ttyS0 input overruns
                                  xbolshe

                                  To understand the difference, just compare the interrupt list for kernel 3.19.8 shown above with the one from the original Intel BSP 1.2.0 below:

                                   

                                   

                                  root@quark:~# cat /proc/interrupts
                                             CPU0
                                    0:         46   IO-APIC-edge      timer
                                    7:          1   IO-APIC-edge
                                    8:          1   IO-APIC-edge      rtc0
                                    9:          1   IO-APIC-fasteoi   acpi, gpio_sch
                                  16:       3554   IO-APIC-fasteoi   mmc0, pxa2xx-spi.0, ohci_hcd:usb2
                                  17:        887   IO-APIC-fasteoi   dw_dmac, dw_dmac, pxa2xx-spi.1, serial
                                  19:         79   IO-APIC-fasteoi   ehci_hcd:usb1
                                  32:          1         --sch_gpio_irq_chip  0-0020
                                  40:       7373   PCI-MSI-edge      intel_qrk_gip
                                  41:          1   PCI-MSI-edge      pch_udc
                                  42:          0   PCI-MSI-edge      enp0s20f6
                                  100:          1  cy8c9540a-irq  gpiolib
                                  NMI:          0   Non-maskable interrupts
                                  LOC:       3370   Local timer interrupts
                                  SPU:          0   Spurious interrupts
                                  PMI:          0   Performance monitoring interrupts
                                  IWI:          0   IRQ work interrupts
                                  RTR:          0   APIC ICR read retries
                                  TRM:          0   Thermal event interrupts
                                  THR:          0   Threshold APIC interrupts
                                  MCE:          0   Machine check exceptions
                                  MCP:          0   Machine check polls
                                  ERR:          1
                                  MIS:          0
                                  
                                  

                                   

                                  As you can see, several devices sit on the same shared interrupt:

                                   

                                  17:        887  IO-APIC-fasteoi  dw_dmac, dw_dmac, pxa2xx-spi.1, serial

                                   

                                  The interrupt fixes separate them.

                                   

                                  BR,

                                  xbolshe

                                  • 14. Re: ttyS0 input overruns
                                    xbolshe

                                    BTW, I have /dev/video0 with kernel 3.19.8:

                                     

                                    cam2.png

                                     

                                    BR,

                                    xbolshe
