7 Replies Latest reply on Apr 17, 2017 4:50 PM by Intel Corporation

    USB Linux Kernel Panic

    mdergosits

      I am using an edison to drive a USB camera and writing frames of video to a USB drive. I have noticed repeated kernel panics when recording video and writing to a usb drive simultaneously. Here is an example kernel panic:

      [  434.553324] dwc3-host dwc3-host.2: ERROR Transfer event TRB DMA ptr not part of current TD

      [  434.557323] dwc3-host dwc3-host.2: ERROR Transfer event TRB DMA ptr not part of current TD

      [  434.561323] dwc3-host dwc3-host.2: ERROR Transfer event TRB DMA ptr not part of current TD

      [  434.565323] dwc3-host dwc3-host.2: ERROR Transfer event TRB DMA ptr not part of current TD

      [  434.569322] dwc3-host dwc3-host.2: ERROR Transfer event TRB DMA ptr not part of current TD

      [  434.573320] dwc3-host dwc3-host.2: ERROR Transfer event TRB DMA ptr not part of current TD

      [  434.577320] dwc3-host dwc3-host.2: ERROR Transfer event TRB DMA ptr not part of current TD

      [  434.581320] dwc3-host dwc3-host.2: ERROR Transfer event TRB DMA ptr not part of current TD

      [  434.585319] dwc3-host dwc3-host.2: ERROR Transfer event TRB DMA ptr not part of current TD

      [  434.589319] dwc3-host dwc3-host.2: ERROR Transfer event TRB DMA ptr not part of current TD

      [  434.593318] dwc3-host dwc3-host.2: ERROR Transfer event TRB DMA ptr not part of current TD

      [  434.597317] dwc3-host dwc3-host.2: ERROR Transfer event TRB DMA ptr not part of current TD

      [  434.601318] dwc3-host dwc3-host.2: ERROR Transfer event TRB DMA ptr not part of current TD

      [  434.605316] dwc3-host dwc3-host.2: ERROR Transfer event TRB DMA ptr not part of current TD

      [  434.609317] dwc3-host dwc3-host.2: ERROR Transfer event TRB DMA ptr not part of current TD

      [  434.613315] dwc3-host dwc3-host.2: ERROR Transfer event TRB DMA ptr not part of current TD

      [  434.617314] dwc3-host dwc3-host.2: ERROR Transfer event TRB DMA ptr not part of current TD

      [  434.621317] dwc3-host dwc3-host.2: ERROR Transfer event TRB DMA ptr not part of current TD

      [  434.625314] dwc3-host dwc3-host.2: ERROR Transfer event TRB DMA ptr not part of current TD

      [  434.629315] dwc3-host dwc3-host.2: ERROR Transfer event TRB DMA ptr not part of current TD

      [  434.633314] dwc3-host dwc3-host.2: ERROR Transfer event TRB DMA ptr not part of current TD

      [  434.637312] dwc3-host dwc3-host.2: ERROR Transfer event TRB DMA ptr not part of current TD

      [  434.641312] dwc3-host dwc3-host.2: ERROR Transfer event TRB DMA ptr not part of current TD

      [  434.645311] dwc3-host dwc3-host.2: ERROR Transfer event TRB DMA ptr not part of current TD

      [  434.649311] dwc3-host dwc3-host.2: ERROR Transfer event TRB DMA ptr not part of current TD

      [  434.653310] dwc3-host dwc3-host.2: ERROR Transfer event TRB DMA ptr not part of current TD

      [  434.657309] dwc3-host dwc3-host.2: ERROR Transfer event TRB DMA ptr not part of current TD

      [  434.661309] dwc3-host dwc3-host.2: ERROR Transfer event TRB DMA ptr not part of current TD

      [  434.665309] dwc3-host dwc3-host.2: ERROR Transfer event TRB DMA ptr not part of current TD

      [  434.669308] dwc3-host dwc3-host.2: ERROR Transfer event TRB DMA ptr not part of current TD

      [  434.673307] dwc3-host dwc3-host.2: ERROR Transfer event TRB DMA ptr not part of current TD

      [  434.677307] dwc3-host dwc3-host.2: ERROR Transfer event TRB DMA ptr not part of current TD

      [  434.681307] dwc3-host dwc3-host.2: ERROR Transfer event TRB DMA ptr not part of current TD

      [  434.685305] dwc3-host dwc3-host.2: ERROR Transfer event TRB DMA ptr not part of current TD

      [  434.689306] dwc3-host dwc3-host.2: ERROR Transfer event TRB DMA ptr not part of current TD

      [  434.693304] dwc3-host dwc3-host.2: ERROR Transfer event TRB DMA ptr not part of current TD

      [  434.697303] dwc3-host dwc3-host.2: ERROR Transfer event TRB DMA ptr not part of current TD

      [  434.701304] dwc3-host dwc3-host.2: ERROR Transfer event TRB DMA ptr not part of current TD

      [  434.705302] dwc3-host dwc3-host.2: ERROR Transfer event TRB DMA ptr not part of current TD

      [  434.709302] dwc3-host dwc3-host.2: ERROR Transfer event TRB DMA ptr not part of current TD

      [  434.713302] dwc3-host dwc3-host.2: ERROR Transfer event TRB DMA ptr not part of current TD

      [  434.717301] dwc3-host dwc3-host.2: ERROR Transfer event TRB DMA ptr not part of current TD

      [  434.721302] dwc3-host dwc3-host.2: ERROR Transfer event TRB DMA ptr not part of current TD

      [  434.725300] dwc3-host dwc3-host.2: ERROR Transfer event TRB DMA ptr not part of current TD

      [  434.726710] EXT4-fs warning (device sda1): __ext4_read_dirblock:908: error reading directory block (ino 2, block 1)

      [  434.729301] dwc3-host dwc3-host.2: ERROR Transfer event TRB DMA ptr not part of current TD

      [  434.733299] dwc3-host dwc3-host.2: ERROR Transfer event TRB DMA ptr not part of current TD

      [  434.737298] dwc3-host dwc3-host.2: ERROR Transfer event TRB DMA ptr not part of current TD

      [  434.741299] dwc3-host dwc3-host.2: ERROR Transfer event TRB DMA ptr not part of current TD

      [  434.745297] dwc3-host dwc3-host.2: ERROR Transfer event TRB DMA ptr not part of current TD

      [  434.749297] dwc3-host dwc3-host.2: ERROR Transfer event TRB DMA ptr not part of current TD

      [  434.753296] dwc3-host dwc3-host.2: ERROR Transfer event TRB DMA ptr not part of current TD

      [  434.757296] dwc3-host dwc3-host.2: ERROR Transfer event TRB DMA ptr not part of current TD

      [  437.338466] mmc2: Timeout waiting for hardware interrupt.

      [  443.017829] mmc0: Timeout waiting for hardware interrupt.

      [  443.017911] mmc0: dump APIC RTE reg - L32: 0x00008982, H32: 0x03000000

      [  452.116384] BUG: soft lockup - CPU#0 stuck for 23s! [swapper/0:0]

      [  452.116450] Modules linked in: iptable_filter ip_tables usb_f_acm u_serial g_multi libcomposite bcm4334x(O)

      [  452.116527] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G           O 3.10.17-yocto-standard #6

      [  452.116600] Hardware name: Intel Corporation Merrifield/BODEGA BAY, BIOS 542 2015.01.21:18.19.48

      [  452.116678] task: c19481c0 ti: f6c0a000 task.ti: c1942000

      [  452.116735] EIP: 0060:[<c1433666>] EFLAGS: 00200246 CPU: 0

      [  452.116763] EIP is at delay_tsc+0x3d/0xb0

      [  452.116781] EAX: 3c082ca5 EBX: 3c082aa2 ECX: 00000000 EDX: 00000035

      [  452.116799] ESI: 000001f4 EDI: 00000000 EBP: f6c0bf68 ESP: f6c0bf54

      [  452.116816]  DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068

      [  452.116834] CR0: 8005003b CR2: 18e73000 CR3: 01a67000 CR4: 001007f0

      [  452.116850] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000

      [  452.116865] DR6: ffff0ff0 DR7: 00000400

      [  452.116878] Stack:

      [  452.116907]  f6c0a000 3c082c2d 00000000 c1b2c110 000026eb f6c0bf70 c14335e4 f6c0bf78

      [  452.116967]  c1433600 f6c0bfac c147b1d6 0000002e c1953e30 c1b302c4 c1b45110 c1b450f0

      [  452.117024]  c1b30234 080031b4 00000000 c1b2c110 c1b2c20c 00000000 f6c0bfb8 c147c2bb

      [  452.117081] Call Trace:

      [  452.117137]  [<c14335e4>] __delay+0xe/0x10

      [  452.117167]  [<c1433600>] __const_udelay+0x1a/0x1c

      [  452.117198]  [<c147b1d6>] serial_hsu_command+0x30b/0x720

      [  452.117229]  [<c147c2bb>] serial_hsu_tasklet+0x1a/0x2d

      [  452.117260]  [<c1238945>] tasklet_action+0x74/0xb8

      [  452.117289]  [<c12382f5>] __do_softirq+0xcf/0x1f8

      [  452.117319]  [<c1238226>] ? local_bh_disable+0x17/0x17

      [  452.117334]  <IRQ>

      [  452.117364] [  452.117400]  [<c123852d>] ? irq_exit+0x57/0x91

      [  452.117427]  [<c12026ae>] ? do_IRQ+0x70/0x84

      [  452.117460]  [<c16c4931>] ? common_interrupt+0x31/0x38

      [  452.117493]  [<c154cba1>] ? cpuidle_enter_state+0x45/0xb1

      [  452.117523]  [<c154ccc4>] ? cpuidle_idle_call+0xb7/0x107

      [  452.117554]  [<c12a0519>] ? trace_hardirqs_off+0x11/0x25

      [  452.117585]  [<c1206856>] ? arch_cpu_idle+0xd/0x21

      [  452.117615]  [<c1265354>] ? cpu_startup_entry+0xc8/0x124

      [  452.117644]  [<c16acf6f>] ? rest_init+0x73/0x75

      [  452.117677]  [<c19d690e>] ? start_kernel+0x2e2/0x2e7

      [  452.117708]  [<c19d62ac>] ? i386_start_kernel+0x82/0x86

      [  452.117722] Code: 00 00 00 e8 c0 f4 28 00 e8 85 63 00 00 89 c7 8d 76 00 0f ae e8 0f 31 89 c3 89 e0 25 00 e0 ff ff 89 45 ec 8d 76 00 0f ae e8 0f 31 <89> 45 f0 29 d8 39 f0 73 46 b8 01 00 00 00 e8 50 f5 28 00 8b 45

      [  452.118367] Kernel panic - not syncing: softlockup: hung tasks

      [  452.118432] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G           O 3.10.17-yocto-standard #6

      [  452.118501] Hardware name: Intel Corporation Merrifield/BODEGA BAY, BIOS 542 2015.01.21:18.19.48

      [  452.118571]  00000000 c19484c0 f6c0be5c c16bb6e2 f6c0be74 c16b9ab2 00000000 c19484c0

      [  452.118700]  f6c0bf18 00000017 f6c0be94 c128731c c1878240 c19481c0 00000000 f73eafc0

      [  452.118826]  00000069 c1287209 f6c0beb4 c124ea05 f6c0beec f73eac40 f73eac7c f73eac40

      [  452.118951] Call Trace:

      [  452.118999]  [<c16bb6e2>] dump_stack+0x16/0x18

      [  452.119056]  [<c16b9ab2>] panic+0x86/0x180

      [  452.119115]  [<c128731c>] watchdog_timer_fn+0x113/0x132

      [  452.119178]  [<c1287209>] ? __touch_watchdog+0x19/0x19

      [  452.119242]  [<c124ea05>] __run_hrtimer+0x8e/0x132

      [  452.119305]  [<c124f28f>] hrtimer_interrupt+0x10f/0x211

      [  452.119375]  [<c1219a80>] smp_apic_timer_interrupt+0x5d/0x70

      [  452.119443]  [<c16c0682>] apic_timer_interrupt+0x32/0x38

      [  452.119510]  [<c16c00d8>] ? _raw_spin_unlock_irqrestore+0x37/0x47

      [  452.119578]  [<c1433666>] ? delay_tsc+0x3d/0xb0

      [  452.119637]  [<c14335e4>] __delay+0xe/0x10

      [  452.119692]  [<c1433600>] __const_udelay+0x1a/0x1c

      [  452.119751]  [<c147b1d6>] serial_hsu_command+0x30b/0x720

      [  452.119816]  [<c147c2bb>] serial_hsu_tasklet+0x1a/0x2d

      [  452.119878]  [<c1238945>] tasklet_action+0x74/0xb8

      [  452.119938]  [<c12382f5>] __do_softirq+0xcf/0x1f8

      [  452.119998]  [<c1238226>] ? local_bh_disable+0x17/0x17

      [  452.120046]  <IRQ>  [<c123852d>] ? irq_exit+0x57/0x91

      [  452.120131]  [<c12026ae>] ? do_IRQ+0x70/0x84

      [  452.120189]  [<c16c4931>] ? common_interrupt+0x31/0x38

      [  452.120254]  [<c154cba1>] ? cpuidle_enter_state+0x45/0xb1

      [  452.120319]  [<c154ccc4>] ? cpuidle_idle_call+0xb7/0x107

      [  452.120383]  [<c12a0519>] ? trace_hardirqs_off+0x11/0x25

      [  452.120448]  [<c1206856>] ? arch_cpu_idle+0xd/0x21

      [  452.120509]  [<c1265354>] ? cpu_startup_entry+0xc8/0x124

      [  452.120573]  [<c16acf6f>] ? rest_init+0x73/0x75

      [  452.120634]  [<c19d690e>] ? start_kernel+0x2e2/0x2e7

      [  452.120697]  [<c19d62ac>] ? i386_start_kernel+0x82/0x86

      [  452.120824] emmc_ipanic: panic notified

      [  452.120878] sdhci_pci_power_up_host: host controller power up is done

       

      The I am write to the USB drive the faster the kernel will panic. It seems like there is an issue with the onboard mmc controller as well. I have tried this with multiple Edisons and multiple usb drives. I am trying to under what triggers this and what can be down about it.

        • 1. Re: USB Linux Kernel Panic
          Intel Corporation
          This message was posted on behalf of Intel Corporation

          Hello mdergosits,

          Thank you for interest in the Intel® Edison Breakout Board.

          I would like to try and reproduce the issue you are experiencing.

          Could you please provide the details that surround your project (image, external circuitry, additional hardware, commands needed to reproduce the issue, connections, power supply, etc.)?

          Any visual aid that you think may help us gain better understanding of the issue is appreciated.

          I’ll be waiting for your response.

          Regards,
          Andres V.

          • 2. Re: USB Linux Kernel Panic
            mdergosits

            steps to reproduce:

             

            1. attach a usb hub to one of the edison's usb ports

            2. attach a thumb drive to one of the usb ports and format it as ext4

            3. attach a uvc camera to another usb port

            4. mount the usb drive to /tmp/usb

            5. use ffmpeg to record to the usb drive `ffmpeg -f v4l2 -pixel_format h264 -video_size 1920x1080 -r 30 -vcodec h264 -i /dev/video1 -vcodec copy -reset_timestamps 1 -f segment -strftime 1 -format mp4 -segment_time 2 /tmp/usb/%s.mp4`

            6. head -c 1000000 /dev/urandom > rand.txt

            7. write to the usb drive in a loop "while true; do cp rand.txt /tmp/usb/rand-`date +'%s'`; done"

            • 3. Re: USB Linux Kernel Panic
              Intel Corporation
              This message was posted on behalf of Intel Corporation

              Hello mdergosits, 

              Thank you for sharing the steps requested.

              I will now proceed to try to reproduce the issue you are experiencing.

              As soon as I find something that you may find relevant, I’ll post it here.

              Have a nice day.

              Regards,
              Andres V.

              • 4. Re: USB Linux Kernel Panic
                Intel Corporation
                This message was posted on behalf of Intel Corporation

                Hello mdergosits,

                Sorry for the late reply.

                I just wanted to inform you that I’m still trying to reproduce the issue that you are experiencing.

                I'm experiencing a little issue, when calling the ffmpeg command, I don’t know if it is a matter of the parameters (that may not be compatible with the camera I’m using) or if I’m calling the command in a different directory, but no data is transferred to the USB drive.

                Could you please share more details about how are you calling the ffmpeg command so I can try and reproduce it? And please share the specifications of your camera in order to use a similar one or adapt the parameters to the one I have.

                Also, could you please share a picture of your setup?

                Thank you for your patience.

                I’ll be waiting for your response.

                Regards,
                Andres V.

                • 5. Re: USB Linux Kernel Panic
                  Intel Corporation
                  This message was posted on behalf of Intel Corporation

                  Hello mdergosits,

                  I was wondering if you had the chance to check the questions asked.

                  If you have any other question or update, don’t hesitate to contact us.

                  Regards,
                  Andres V.

                  • 6. Re: USB Linux Kernel Panic
                    mdergosits

                    I believe this issue was caused because the camera and usb drive were drawing to much power and browned out from the current limiting that our system was performing. This caused the usb drive to disconnected, which caused the ext4 and usb drivers to log messages many times. All these messages were being logged to our serial port which ended up taking a large amount of the CPU.  See these lines below:

                    [  452.117198]  [<c147b1d6>] serial_hsu_command+0x30b/0x720

                    [  452.117229]  [<c147c2bb>] serial_hsu_tasklet+0x1a/0x2d

                    When we turned off logging to the serial console

                    echo 0 > /sys/kernel/debug/printk

                    we didn't see the issue.  Additionally I applied a kernel patch:

                    From bf5e027d4fe446628073baa8988f026dab1476e5 Mon Sep 17 00:00:00 2001

                    From: Auto Configured <auto.configured>

                    Date: Tue, 4 Apr 2017 15:25:20 -0700

                    Subject: [PATCH] fs: ratelimit some printk statements, mostly in ext4

                     

                     

                    These spewed a lot of messages and caused a panic

                    ---

                    fs/buffer.c       | 2 +-

                    fs/ext4/page-io.c | 2 +-

                    fs/ext4/super.c   | 2 +-

                    3 files changed, 3 insertions(+), 3 deletions(-)

                     

                     

                    diff --git a/fs/buffer.c b/fs/buffer.c

                    index 7e0240f..e7f7b4e 100644

                    --- a/fs/buffer.c

                    +++ b/fs/buffer.c

                    @@ -113,7 +113,7 @@ static int quiet_error(struct buffer_head *bh)

                    static void buffer_io_error(struct buffer_head *bh)

                    {

                      char b[BDEVNAME_SIZE];

                    - printk(KERN_ERR "Buffer I/O error on device %s, logical block %Lu\n",

                    + printk_ratelimited(KERN_ERR "Buffer I/O error on device %s, logical block %Lu\n",

                      bdevname(bh->b_bdev, b),

                      (unsigned long long)bh->b_blocknr);

                    }

                    diff --git a/fs/ext4/page-io.c b/fs/ext4/page-io.c

                    index 4acf1f7..e7fbec5 100644

                    --- a/fs/ext4/page-io.c

                    +++ b/fs/ext4/page-io.c

                    @@ -214,7 +214,7 @@ ext4_io_end_t *ext4_init_io_end(struct inode *inode, gfp_t flags)

                    static void buffer_io_error(struct buffer_head *bh)

                    {

                      char b[BDEVNAME_SIZE];

                    - printk(KERN_ERR "Buffer I/O error on device %s, logical block %llu\n",

                    + printk_ratelimited(KERN_ERR "Buffer I/O error on device %s, logical block %llu\n",

                      bdevname(bh->b_bdev, b),

                      (unsigned long long)bh->b_blocknr);

                    }

                    diff --git a/fs/ext4/super.c b/fs/ext4/super.c

                    index 3f7c39e..c4667f2 100644

                    --- a/fs/ext4/super.c

                    +++ b/fs/ext4/super.c

                    @@ -601,7 +601,7 @@ void __ext4_warning(struct super_block *sb, const char *function,

                      va_start(args, fmt);

                      vaf.fmt = fmt;

                      vaf.va = &args;

                    - printk(KERN_WARNING "EXT4-fs warning (device %s): %s:%d: %pV\n",

                    + printk_ratelimited(KERN_WARNING "EXT4-fs warning (device %s): %s:%d: %pV\n",

                            sb->s_id, function, line, &vaf);

                      va_end(args);

                    }

                    --

                    2.7.4

                    which rate limited most of the messages that were causing the lions share of the output (over 4MB in one second). I still see panics occasionally though.

                    • 7. Re: USB Linux Kernel Panic
                      Intel Corporation
                      This message was posted on behalf of Intel Corporation

                      Hello mdergosits,

                      Thank you for sharing your experience with the community, we really appreciate it.

                      I think you are right regarding the relation between the kernel panics and the limited power. You should consider using an externally powered USB hub in your setup to lower the possibility of experiencing kernel panics.

                      If you have any other question or comment, don’t hesitate to contact us.

                      Have a nice day.

                      Regards,
                      Andres V.