5 Replies Latest reply on Feb 13, 2015 4:18 PM by David_Intel

    CentOS 6.6 Intel  SRCSAS18E

    Penguinpages

      I have been running an Intel motherboard with hardware RAID and 6 x 1TB drives for about four years. The server stopped booting a few weeks ago, and I am trying to resurrect it and, if possible, salvage the data. The controller was not even showing up during POST, so I moved it to another board I had around. It shows up there but is still not working correctly. The original system was running CentOS 5.x; the controller is now in a system running CentOS 6.6.

       

      Issue: The RAID 5 volume on the SRCSAS18E is not showing up. I cannot use "ctrl+G" during POST (it pauses for about 45 seconds during POST). The OS driver is installed, as are the CLI and web tool, but the controller does not show up in the web tool. When I issue the command to re-flash (likely it would be an upgrade, as I don't recall the last time I flashed the firmware), it does not give me a very useful response.

       

       

       

      Hardware:

      [root@titan1 ~]# lspci

      00:00.0 Host bridge: NVIDIA Corporation C55 Host Bridge (rev a2)

      00:00.1 RAM memory: NVIDIA Corporation C55 Memory Controller (rev a1)

      00:00.2 RAM memory: NVIDIA Corporation C55 Memory Controller (rev a1)

      00:00.3 RAM memory: NVIDIA Corporation C55 Memory Controller (rev a1)

      00:00.4 RAM memory: NVIDIA Corporation C55 Memory Controller (rev a1)

      00:00.5 RAM memory: NVIDIA Corporation C55 Memory Controller (rev a2)

      00:00.6 RAM memory: NVIDIA Corporation C55 Memory Controller (rev a1)

      00:00.7 RAM memory: NVIDIA Corporation C55 Memory Controller (rev a1)

      00:01.0 RAM memory: NVIDIA Corporation C55 Memory Controller (rev a1)

      00:01.1 RAM memory: NVIDIA Corporation C55 Memory Controller (rev a1)

      00:01.2 RAM memory: NVIDIA Corporation C55 Memory Controller (rev a1)

      00:01.3 RAM memory: NVIDIA Corporation C55 Memory Controller (rev a1)

      00:01.4 RAM memory: NVIDIA Corporation C55 Memory Controller (rev a1)

      00:01.5 RAM memory: NVIDIA Corporation C55 Memory Controller (rev a1)

      00:01.6 RAM memory: NVIDIA Corporation C55 Memory Controller (rev a1)

      00:02.0 RAM memory: NVIDIA Corporation C55 Memory Controller (rev a1)

      00:02.1 RAM memory: NVIDIA Corporation C55 Memory Controller (rev a1)

      00:02.2 RAM memory: NVIDIA Corporation C55 Memory Controller (rev a1)

      00:03.0 PCI bridge: NVIDIA Corporation C55 PCI Express bridge (rev a1)

      00:06.0 PCI bridge: NVIDIA Corporation C55 PCI Express bridge (rev a1)

      00:07.0 PCI bridge: NVIDIA Corporation C55 PCI Express bridge (rev a1)

      00:09.0 RAM memory: NVIDIA Corporation MCP51 Host Bridge (rev a2)

      00:0a.0 ISA bridge: NVIDIA Corporation MCP51 LPC Bridge (rev a3)

      00:0a.1 SMBus: NVIDIA Corporation MCP51 SMBus (rev a3)

      00:0a.2 RAM memory: NVIDIA Corporation MCP51 Memory Controller 0 (rev a3)

      00:0b.0 USB controller: NVIDIA Corporation MCP51 USB Controller (rev a3)

      00:0b.1 USB controller: NVIDIA Corporation MCP51 USB Controller (rev a3)

      00:0d.0 IDE interface: NVIDIA Corporation MCP51 IDE (rev a1)

      00:0e.0 RAID bus controller: NVIDIA Corporation MCP51 Serial ATA Controller (rev a1)

      00:0f.0 RAID bus controller: NVIDIA Corporation MCP51 Serial ATA Controller (rev a1)

      00:10.0 PCI bridge: NVIDIA Corporation MCP51 PCI Bridge (rev a2)

      00:10.1 Audio device: NVIDIA Corporation MCP51 High Definition Audio (rev a2)

      00:14.0 Bridge: NVIDIA Corporation MCP51 Ethernet Controller (rev a3)

      01:00.0 PCI bridge: Intel Corporation 80333 Segment-A PCI Express-to-PCI Express Bridge

      01:00.2 PCI bridge: Intel Corporation 80333 Segment-B PCI Express-to-PCI Express Bridge

      02:0e.0 RAID bus controller: LSI Logic / Symbios Logic MegaRAID SAS 1068

      06:08.0 VGA compatible controller: Matrox Electronics Systems Ltd. MGA 1064SG [Mystique] (rev 02)

      06:0a.0 Ethernet controller: Intel Corporation 82544EI Gigabit Ethernet Controller (Copper) (rev 02)

      [root@titan1 ~]# lspci -vv

      <snip>

      02:0e.0 RAID bus controller: LSI Logic / Symbios Logic MegaRAID SAS 1068

              Subsystem: Intel Corporation RAID Controller SRCSAS18E

              Control: I/O- Mem+ BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping+ SERR- FastB2B- DisINTx-

              Status: Cap+ 66MHz+ UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-

              Interrupt: pin A routed to IRQ 16

              Region 0: Memory at cfef0000 (32-bit, prefetchable) [size=64K]

              Region 2: Memory at cfdc0000 (32-bit, non-prefetchable) [size=128K]

              [virtual] Expansion ROM at cfe00000 [disabled] [size=32K]

              Capabilities: [c0] Power Management version 2

                      Flags: PMEClk- DSI- D1+ D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)

                      Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-

              Capabilities: [d0] MSI: Enable- Count=1/2 Maskable- 64bit+

                      Address: 0000000000000000  Data: 0000

              Capabilities: [e0] PCI-X non-bridge device

                      Command: DPERE- ERO- RBC=512 OST=4

                      Status: Dev=02:0e.0 64bit+ 133MHz+ SCD- USC- DC=bridge DMMRBC=1024 DMOST=4 DMCRS=16 RSCEM- 266MHz- 533MHz-

              Kernel modules: megaraid_sas
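A note on reading the `lspci -vv` output above: it lists `Kernel modules: megaraid_sas` but no `Kernel driver in use:` line, and the Control line shows `BusMaster-`. Both would be consistent with the driver never finishing its probe, though that is an interpretation rather than something the output proves. A minimal sketch of a helper that pulls out just those two facts (the function name `pci_probe_summary` is made up for illustration):

```shell
#!/bin/sh
# Hypothetical helper: reads `lspci -vv` text for ONE device on stdin and
# prints whether bus mastering is enabled and which driver (if any) is bound.
pci_probe_summary() {
    awk '
        /^[ \t]*Control:/       { bm = ($0 ~ /BusMaster\+/) ? "on" : "off" }
        /Kernel driver in use:/ { drv = $NF }
        END {
            printf "busmaster=%s driver=%s\n",
                   (bm ? bm : "unknown"), (drv ? drv : "none")
        }
    '
}

# Demo using two lines taken from the output above.
printf '%s\n' \
    '        Control: I/O- Mem+ BusMaster- SpecCycle- MemWINV-' \
    '        Kernel modules: megaraid_sas' | pci_probe_summary
```

On the live system this would be run as `lspci -vv -s 02:0e.0 | pci_probe_summary`. A healthy, bound controller would normally report `driver=megaraid_sas`.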

       

      [root@titan1 CmdTool2]# dmesg |less

       

       

      <snip>

      scsi 2:0:0:0: Direct-Access ATA  Hitachi HTS72322 FCDO PQ: 0 ANSI: 5

      ata4: SATA link up 3.0 Gbps (SStatus 123 SControl 300)

      ata4.00: ATA-8: WDC WD2500BEKT-60PVMT0, 01.01A01, max UDMA/133

      ata4.00: 488397168 sectors, multi 16: LBA48 NCQ (depth 31/32)

      ata4.00: configured for UDMA/133

      scsi 3:0:0:0: Direct-Access ATA  WDC WD2500BEKT-6 01.0 PQ: 0 ANSI: 5

      ata6: SATA link down (SStatus 0 SControl 300)

      megasas: 06.803.01.00-rh1 Mon. Mar. 10 17:00:00 PDT 2014

      megasas: 0x1000:0x0411:0x8086:0x1001: bus 2:slot 14:func 0

      megaraid_sas 0000:02:0e.0: enabling device (0080 -> 0082)

      ACPI: PCI Interrupt Link [AXV7] enabled at IRQ 16

        alloc irq_desc for 16 on node -1

        alloc kstat_irqs on node -1

      megaraid_sas 0000:02:0e.0: PCI INT A -> Link[AXV7] -> GSI 16 (level, low) -> IRQ 16

      megasas: Waiting for FW to come to ready state

      sr0: scsi3-mmc drive: 40x/40x writer cd/rw xa/form2 cdda tray

      Uniform CD-ROM driver Revision: 3.20

      sr 0:0:1:0: Attached scsi CD-ROM sr0

      STARTING CRC_T10DIF

      sd 2:0:0:0: [sda] 488397168 512-byte logical blocks: (250 GB/232 GiB)

      sd 3:0:0:0: [sdb] 488397168 512-byte logical blocks: (250 GB/232 GiB)

      sd 2:0:0:0: [sda] Write Protect is off

      sd 2:0:0:0: [sda] Mode Sense: 00 3a 00 00

      sd 2:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA

      sd 3:0:0:0: [sdb] Write Protect is off

      sd 3:0:0:0: [sdb] Mode Sense: 00 3a 00 00

      sd 3:0:0:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA

      sda:

      sdb: sdb1 sdb2

      sd 3:0:0:0: [sdb] Attached SCSI disk

      sda1 sda2

      sd 2:0:0:0: [sda] Attached SCSI disk

      INFO: task modprobe:351 blocked for more than 120 seconds.

            Not tainted 2.6.32-504.3.3.el6.x86_64 #1

      "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.

      modprobe      D 0000000000000000     0   351      1 0x00000000

      ffff880237281eb8 0000000000000082 0000000000000000 0000000000000001

      ffff880237281e18 ffffffff8115c876 0000002a3dab586f ffffffff810bfcff

      ffff880237281e48 00000000fffe3007 ffff88023725baf8 ffff880237281fd8

      Call Trace:

      [<ffffffff8115c876>] ? vfree+0x36/0x80

      [<ffffffff810bfcff>] ? load_module+0x1abf/0x1cd0

      [<ffffffffa003d000>] ? wait_scan_init+0x0/0xd [scsi_wait_scan]

      [<ffffffff8136cbe5>] wait_for_device_probe+0x55/0x90

      [<ffffffff8109eb00>] ? autoremove_wake_function+0x0/0x40

      [<ffffffff810a5155>] ? __blocking_notifier_call_chain+0x65/0x80

      [<ffffffffa003d009>] wait_scan_init+0x9/0xd [scsi_wait_scan]

      [<ffffffff8100204c>] do_one_initcall+0x3c/0x1d0

      [<ffffffff810bfff1>] sys_init_module+0xe1/0x250

      [<ffffffff8100b072>] system_call_fastpath+0x16/0x1b

      FW state [0] hasn't changed in 180 secs

      pcidata = 30400

      megaraid_sas 0000:02:0e.0: megasas: FW restarted successfully from megasas_init_fw!

      megasas: Waiting for FW to come to ready state

      INFO: task modprobe:351 blocked for more than 120 seconds.

            Not tainted 2.6.32-504.3.3.el6.x86_64 #1

      "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.

      modprobe      D 0000000000000000     0   351      1 0x00000000

      ffff880237281eb8 0000000000000082 0000000000000000 0000000000000001

      ffff880237281e18 ffffffff8115c876 0000002a3dab586f ffffffff810bfcff

      ffff880237281e48 00000000fffe3007 ffff88023725baf8 ffff880237281fd8

      Call Trace:

      [<ffffffff8115c876>] ? vfree+0x36/0x80

      [<ffffffff810bfcff>] ? load_module+0x1abf/0x1cd0

      [<ffffffffa003d000>] ? wait_scan_init+0x0/0xd [scsi_wait_scan]

      [<ffffffff8136cbe5>] wait_for_device_probe+0x55/0x90

      [<ffffffff8109eb00>] ? autoremove_wake_function+0x0/0x40

      [<ffffffff810a5155>] ? __blocking_notifier_call_chain+0x65/0x80

      [<ffffffffa003d009>] wait_scan_init+0x9/0xd [scsi_wait_scan]

      [<ffffffff8100204c>] do_one_initcall+0x3c/0x1d0

      [<ffffffff810bfff1>] sys_init_module+0xe1/0x250

      [<ffffffff8100b072>] system_call_fastpath+0x16/0x1b

      INFO: task modprobe:351 blocked for more than 120 seconds.

            Not tainted 2.6.32-504.3.3.el6.x86_64 #1

      "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.

      modprobe      D 0000000000000000     0   351      1 0x00000000

      ffff880237281eb8 0000000000000082 0000000000000000 0000000000000001

      ffff880237281e18 ffffffff8115c876 0000002a3dab586f ffffffff810bfcff

      ffff880237281e48 00000000fffe3007 ffff88023725baf8 ffff880237281fd8

      Call Trace:

      [<ffffffff8115c876>] ? vfree+0x36/0x80

      [<ffffffff810bfcff>] ? load_module+0x1abf/0x1cd0

      [<ffffffffa003d000>] ? wait_scan_init+0x0/0xd [scsi_wait_scan]

      [<ffffffff8136cbe5>] wait_for_device_probe+0x55/0x90

      [<ffffffff8109eb00>] ? autoremove_wake_function+0x0/0x40

      <snip>

       

       

       

       

       

      I don't think any of the above kernel messages are anything more than long timeout values for the drives. That is odd, but ??   I can post the full kernel message output if needed.
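For anyone reading the log above, the lines that seem most relevant are the `megasas: Waiting for FW to come to ready state` and `FW state [0] hasn't changed in 180 secs` messages. They suggest the controller firmware itself is not reaching a ready state, which would also explain the blocked `modprobe` task. A rough sketch for counting those messages in a longer log (the function name `fw_wait_summary` is hypothetical, and the patterns are copied from the excerpt above rather than from driver documentation):

```shell
#!/bin/sh
# Hypothetical helper: reads kernel log text on stdin and counts how often the
# megasas driver waited for firmware and how often it reported a stalled state.
fw_wait_summary() {
    awk '
        /megasas: Waiting for FW to come to ready state/    { waits++ }
        /FW state \[[0-9]+\] hasn.t changed in [0-9]+ secs/ { stalls++ }
        END { printf "ready_waits=%d stalls=%d\n", waits, stalls }
    '
}

# Demo with lines copied from the excerpt above.
printf '%s\n' \
    'megasas: Waiting for FW to come to ready state' \
    "FW state [0] hasn't changed in 180 secs" \
    'megasas: Waiting for FW to come to ready state' | fw_wait_summary
```

On the live system the input would come from `dmesg | fw_wait_summary`. Non-zero counts on every boot would point at the card or its firmware rather than at drive timeouts.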

       

       

      Installed Intel RAID services / tools:

      [root@titan1 RAID]# ls

      ir3_1068SASHWR_FW_v1.12.280-0826_pkg-v7.0.1-0075.zip  ir3_Linux_x86_RWC2_v14.05.02.03.tar.gz  Linux_x64_RWC2_v14.08.01-04.tar.gz

      ir3_CmdTool2_Linux_v8.07.16.zip                       ir3_UEFI_CmdTool2_v2.03.00.s6.zip       MR_Linux_drv_v6.705.07.00.tgz

       

      [root@titan1 RAID]# cd /usr/local/RAID\ Web\ Console\ 2/

      [root@titan1 RAID Web Console 2]# ./start

      starthelp.sh         startmonitorhelp.sh  startupui.sh

      [root@titan1 RAID Web Console 2]# ./startupui.sh

      Is the message above just a lack of connection from the web UI to an agent?  What I am not clear on is which agent or service I should check is running so that this UI can connect to it.

       

       

      Attempt to flash firmware to controller:

      [root@titan1 tmp]# cp /root/RAID/ir3_1068SASHWR_FW_v1.12.280-0826_pkg-v7.0.1-0075.zip .

      [root@titan1 tmp]# unzip ir3_1068SASHWR_FW_v1.12.280-0826_pkg-v7.0.1-0075.zip

      Archive:  ir3_1068SASHWR_FW_v1.12.280-0826_pkg-v7.0.1-0075.zip

        inflating: update.nsh

         creating: CmdTool2/DOS/

        inflating: CmdTool2/DOS/CmdTool2.exe

        inflating: CmdTool2/DOS/CMDTool2_DOS_v8.00.11_rel-notes.txt

        inflating: CmdTool2/DOS/LICENSE_DOS32A.txt

         creating: CmdTool2/Linux/

        inflating: CmdTool2/Linux/CmdTool2-8.00.13-1.i386.rpm

        inflating: CmdTool2/Linux/CMDTool2_Linux_v8.00.13_rel-notes.txt

        inflating: CmdTool2/Linux/Lib_Utils-1.00-07.noarch.rpm

        inflating: CmdTool2/Linux/Lib_Utils2-1.00-01.noarch.rpm

         creating: CmdTool2/Solaris/

        inflating: CmdTool2/Solaris/CmdTool2

        inflating: CmdTool2/Solaris/CmdTool2.pkg

        inflating: CmdTool2/Solaris/CMDTool2_Solaris_v8.00.06_rel-notes.txt

         creating: CmdTool2/UEFI/

        inflating: CmdTool2/UEFI/CmdTool2.efi

        inflating: CmdTool2/UEFI/CMDTool2_UEFI_v2.01.00.S6_rel-notes.txt

         creating: CmdTool2/Windows/

        inflating: CmdTool2/Windows/CmdTool2.exe

        inflating: CmdTool2/Windows/CmdTool2Support.zip

        inflating: CmdTool2/Windows/CMDTool2_Windows_v8.00.11_rel-notes.txt

        inflating: 68_fw826.rom

        inflating: ir3_1068SASHWR_Firmware_v1.12.280-0826_readme.txt

        inflating: License_v2.pdf

        inflating: update.bat

        inflating: 68_fw826_4MB.rom

      [root@titan1 tmp]#

      [root@titan1 tmp]# mv *.rom /opt/MegaRAID/CmdTool2/

      [root@titan1 CmdTool2]# ./CmdTool264  -adpfwflash -f 68_fw826.rom

      Invalid input at or near token 68_fw826.rom

       

      Exit Code: 0x01

      [root@titan1 CmdTool2]#
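One possible reason for the `Invalid input at or near token` error above: MegaCLI-style tools normally expect an adapter selector such as `-a0` or `-aALL` after the file name, and the command as typed has none. The sketch below assumes CmdTool2 follows that same syntax; I have not verified it against this exact CmdTool2 version. It only prints the command for review (the helper name `build_flash_cmd` is made up) and does not touch the controller:

```shell
#!/bin/sh
# Hypothetical dry-run helper: prints the flash command instead of running it.
# Assumption: CmdTool2 accepts the MegaCLI-style form
#   -AdpFwFlash -f <file> -aN
# where N is the adapter index.
build_flash_cmd() {
    tool=$1
    rom=$2
    adapter=${3:-0}   # default to adapter 0 when no index is given
    printf '%s -AdpFwFlash -f %s -a%s\n' "$tool" "$rom" "$adapter"
}

build_flash_cmd ./CmdTool264 68_fw826.rom 0
```

Before flashing, it is probably worth checking whether the tool sees the card at all, for example with `./CmdTool264 -AdpCount` (again assuming MegaCLI-compatible options). If the firmware never reaches a ready state, the count may well be zero and the flash will not proceed from within the OS.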

       

       

       

      Questions:

       

      1) Does anyone have experience with this, or further direction on how to debug it?

      2) Does anyone have this RAID controller running CentOS / RHEL 6.6?

      3) The inability to use <ctrl+G> and UEFI is not good. I do have a note in my system change control about something like this, which says to get past it by "disabling all other controller BIOS on motherboard". I would like to know whether others have seen this issue and whether there is a workaround for it.

       

      Thanks,

        • 1. Re: CentOS 6.6 Intel  SRCSAS18E
          sylvia_intel

          Hello Penguinpages, thanks a lot for posting your question at Intel communities.

          I would like to inform you that I have moved your post to the Server Community at the following URL https://communities.intel.com/community/tech/servers

           

          Regards,

          • 2. Re: CentOS 6.6 Intel  SRCSAS18E
            David_Intel

            We have validated this controller with Red Hat operating systems up to 5.0 Update 3. My concern is that if the RAID controller is not initializing properly, you might not be able to access the volume.

             

            I would probably disconnect the data cable from the RAID controller and see if you can access the RAID BIOS (Ctrl+G) without the drives. You might also want to try the firmware update with the drives disconnected.

            • 3. Re: CentOS 6.6 Intel  SRCSAS18E
              Penguinpages

              I will try unplugging the drive connections and see if I can get into <ctrl+g>  / RAID BIOS.

               

              I did also note your comment about "Tested to CentOS 5.3", but the website does note RHEL (CentOS) 6 drivers, hence why I moved the card to a system that already had CentOS 6.6 running, so I could use that and avoid driver / base OS load issues.     https://downloadcenter.intel.com/SearchResult.aspx?lang=eng&keyword=srcsas18e+

               

              Please confirm that CentOS 6.x is supported.

              *****************

              MR_Linux_drv_v6.705.07.00.tgz

               

              ==========================

              Supported RAID Controllers

              ==========================

              <snip> SRCSAS18E*,

               

              ===================

              Package Information

              ===================

              Driver Version = 6.705.07.00

               

              Driver update for RHEL 5.11, RHEL 6.6, SLES 12

               

              *****************

              Thanks,

              • 4. Re: CentOS 6.6 Intel  SRCSAS18E
                Penguinpages

                I know it has been several weeks on this thread, but I have been plodding away on this.  I have tried every way I can think of to resurrect this card:

                1) Ruled out the USB keyboard not sending Ctrl+G by digging out an old PS/2 keyboard

                2) Removed all drives from the SATA ports

                3) Unplugged all devices from the motherboard that could interrupt the bus

                4) Disabled all components on the motherboard that could interrupt

                5) Tried two different motherboards

                6) Pulled the cache upgrade off the card to see if it would boot without cache (though this just ended in the card not even showing up during POST)

                 

                I spent $500+ on this card only a few years ago... so I am really trying not to give up on it.  Any other suggestions? 

                • 5. Re: CentOS 6.6 Intel  SRCSAS18E
                  David_Intel

                  I believe you have exhausted all possible troubleshooting steps to get the controller functioning properly again. I would recommend contacting our Intel® Customer Support team for a proper follow-up on this situation.