14 Replies Latest reply: Oct 26, 2012 12:58 PM by Michael Gregg

IBM x3550 M2, x3650 M2, HS22, and dx360

MichaelBrinkman-IBM Community Member

Wanted to know if anyone here has tried out one of IBM's new servers. I was on the team that developed the new UEFI code stack for these servers and would like to get first impressions from the community. This was a large effort and will provide many avenues for innovation as we go forward. I would also be interested in any new functions that you would like to have in the pre-boot or systems management areas of your system.

 

Michael Brinkman

Mgr. UEFI Team.

  • 1. Re: IBM x3550 M2, x3650 M2, HS22, and dx360
    jwise@arrow.com Community Member

    Yes... dozens of times :)


    ****

    Background:

    I am a Senior Solutions Architect who helps the Engineers at our integration facility find resources when a new product is having issues, or when complex designs require a more solutions-based set of technical resources. I only supplement the already great and talented people who debug and work every day with the IBM Modular product line (Intel). Any time a new product is released, I ask to be informed so I can get some 'hands on' time with the new systems and keep my knowledge of the changing technologies sharp.

    ***

     

    The change to the Nehalem processor marks a major change for anyone who has to deal with integration. I had been getting feedback from our facility about more than "the usual issues" of a new product, so I made time to go over, take an example system and document it. This may not reflect all issues and, as with any newly released product, they are sure to be worked out. I have to say up front that the issues are NOT in any way symptomatic of a core technology issue, or even critical in nature. IBM and Intel have for years done very well at avoiding and mitigating issues at that level. What is reflected below are things that, IMAO, need to be reviewed to see if there are ways to improve. These opinions reflect the integration and setup aspect of the system and do not capture long-term runtime experience (I am always looking for demo gear to prove that out for our labs).

     

     

    ****************

    System: HS22 Model 7870

    Product ID: 7870C3U

    SN: (not listed to protect the innocent)

     

    System Specs:

    1 socket (quad core)

    RAM

    PC3-10600 PN:43x5061 (qty 6)

     

    Areas of Improvement:

    1) System integration times: This is how long it takes us to get from unboxing to the point where we can test, update and ship a box. This "T Time" has more than doubled, which impacts how quickly we can meet customer demands for shipping. Most of the issues stem from boot times: the time from when the hardware has been installed and power is applied to the point where a technical person can interact with the system for loading (or updating) has more than doubled, and this has the greatest impact on how long it takes to debug and root-cause a hardware issue.

                   Blade placed into chassis and powered: 0:00

                   Chassis acceptance of blade: 2:12

                   First screen: 0:34

                   BIOS options: 0:52

                   Legacy boot option (select media for loading): 2:24

                   (Total time until a boot target can be selected to load the OS: 3:16)

     

    2) System boot error from the factory (and, I believe, the cause of the long boot times after the initial flash): "PXE-99: Unexpected Network Error". I did not have time to root-cause this further with any kind of sniffer on the Blade Switch.

     

    3) RAM debug issues: With the new processors binding to RAM, there is a "difficult" process to debug bad RAM and replace it. Given the long boot cycles, if the system is NOT stable, RAM is the first (and most usual) culprit. Errors on RAM with the new form factor are definitely higher. My comments relate to a few things about how this impacts the Nehalem systems. The BIOS does not always indicate the slot, so 16 DIMMs x 3:16 min per boot cycle makes root-causing a DIMM very laborious. When a DIMM is noted as being bad, it can be easily replaced, BUT the system locks out the slot. This requires that the battery be pulled from the system board for 10 sec or so to clear the error. That in turn affects EFI / BIOS settings and configurations in ways that will cause issues for customers. It also affects systems with the "minimum" DIMM configuration sets, which will see an impact on system core speed and boot capabilities due to the RAM failure. (I did not personally see this, but errors are indicated in the BIOS as to the slot, and I saw no indicators on the planar board for a bad DIMM, such as light path. I need to confirm this and document how a customer can check whether their system "changes" are due to a bad DIMM.)

     

     

    4) IPMI issues: Many customers use IPMI to manage and monitor system components. The Blade Management Module does not have the ability to affect the IPMI definition of the HS22 systems, nor can you make the changes in the BIOS (view only at this time). This is even more of an issue when the systems are laid down with the VMware hypervisor, which does not have an OS-level agent to report system events back and so relies purely upon IPMI for reporting and statistics. This is such an issue that it is in debate whether customers will accept shipment of product, as it is not "properly configured". (Ex: BSMP IP address range 192.168.70.200; the host IP was set but cannot be changed to reflect our customer's IP range of 10.x.x.x.)


    5) Missing serial numbers: We still on occasion receive units which have no serial number. When you insert them into the chassis, the management module shows blank data for the model and serial number. Because of the IPMI issue listed above, the only fix at this time is to declare the unit 'bad' and replace the entire system. This immediately throws an entire day of productivity out, at best, assuming we have spares in stock. Though I did not see this process myself, they had the same issue on the x3650 M2 Nehalem systems, but there the fix is to use a flash utility from IBM. The issue with the blade units is being worked on by IBM, but there is no ETA.

     

     

    6) Logic and understanding of EFI and its relation to the "Legacy boot" option: It is likely a learning issue, but in the last 10 min I had to work on this topic, I tried to learn how to make what I would classify as basic changes to the system that affect boot times and boot targets (such as SAN boot, iSCSI targets, setting BOOT for ONLY the Fibre HBA, and then changing the boot option for the ONE time that the initial OS would be loaded). I was not able to lay this out and need to play with it. The Engineer did explain that there seems to be some level of reliance on the "Legacy boot" option that has to be used and worked with, and that has not been fully worked out.
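
    As a rough check of the boot figures quoted above, the short sketch below adds up the timeline and estimates the worst case for DIMM debugging. It assumes the 3:16 total is the BIOS options mark (52 sec) plus the legacy boot wait (2:24), and that finding a bad DIMM without a slot indication could take one full boot cycle per slot; both are readings of the post, not confirmed measurements, and the helper names are illustrative.

```python
# Sanity check of the boot timings quoted in the post (values copied from it).
# Assumption: the 3:16 total is the BIOS-options mark plus the legacy-boot wait.

def to_seconds(mark: str) -> int:
    """Convert an 'm:ss' mark such as '2:24' into seconds."""
    minutes, seconds = mark.split(":")
    return int(minutes) * 60 + int(seconds)


def to_mark(total_seconds: int) -> str:
    """Format a number of seconds back into 'm:ss'."""
    minutes, seconds = divmod(total_seconds, 60)
    return f"{minutes}:{seconds:02d}"


bios_options = to_seconds("0:52")
legacy_boot = to_seconds("2:24")
boot_cycle = bios_options + legacy_boot

# Hypothetical worst case: one full boot cycle for each of the 16 DIMM slots.
dimm_slots = 16
worst_case = dimm_slots * boot_cycle

print(to_mark(boot_cycle))   # 3:16
print(to_mark(worst_case))   # 52:16
```

    On these assumptions, a full pass over 16 slots costs close to an hour of waiting on boot cycles alone, before any time spent swapping parts.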

     

     

    Good things: I like to make sure to also take note of things the Engineers say "that is great!"

    1) RAID capabilities in BIOS save enormous amounts of time and allow us to quickly set up what is almost always basic RAID 1 for hardware integration and testing.

     

    2) System performance is VERY impressive. I have not had any HPC builds on which to do my own "kicking of the tires", but in the little time I have had my hands on the systems (post BOOT), load and application times for VMs (which is much of what we design) have been very fast.

     

    3) Legacy boot options are still very helpful.

     

     

     

    I hope this helps. Look me up if you have any additional questions or need any points clarified.

  • 2. Re: IBM x3550 M2, x3650 M2, HS22, and dx360
    masbe Community Member

    I have limited experience with the x3650 M2. Unfortunately I haven't had the opportunity to explore all of its features, so the only comment I'll make is with regard to the time it takes a) to enable the power button so the server can be powered on and b) to load the EFI: each of these takes longer than I would like and has made troubleshooting DIMM problems very time consuming. Sorry to be negative: if I have anything positive to comment in the future I'll post again!

  • 3. Re: IBM x3550 M2, x3650 M2, HS22, and dx360
    MichaelBrinkman-IBM Community Member

    Thank you for your feedback. We have identified boot time from an AC or DC boot cycle as an issue and have been working to improve this. We have made some incremental improvements in this area in our 2Q and soon-to-be-released 3Q maintenance drops, but we expect to make a large improvement in boot time in 4Q.

     

    As for your issue with debugging DIMMs, please refer to the white paper we just released on this subject:

     

    "Troubleshooting Memory on IBM System x and BladeCenter Servers"

     

    https://www-947.ibm.com/systems/support/supportsite.wss/docdisplay?lndocid=MIGR-5081319&brandind=5000008

     

    Sincerely,

     

    Michael Brinkman

  • 4. Re: IBM x3550 M2, x3650 M2, HS22, and dx360
    Community Member

    Hi Michael,

     

    I have installed many of these servers as a System x & Storage technical specialist at one of IBM's larger partners. I have mainly faced two problems with the new EFI:

     

    1- Boot time is too slow. It takes about 20 minutes to boot VMware ESX when EFI is used, compared to less than 2 minutes with the normal BIOS.

    2- Boot from SAN, especially with Qlogic cards, does not seem to make life any easier. I even had to write a post on how to set this up, to help our customers through the process. The post can be found at:

    The file ql2300.sys is corrupted. press any key to continue.

     

    That said, I can tell that the performance of the new systems after they boot up is about 1.5 times what the earlier models used to give. I am comparing performance as a VMware setup, where I can run about 1.5 times the number of VMs on an HS22 compared to what I used to get on an HS21 XM.

     

    I hope that helps,

    Eiad

  • 5. Re: IBM x3550 M2, x3650 M2, HS22, and dx360
    MichaelBrinkman-IBM Community Member

    Thanks for your feedback. I would like to know your boot performance once we release our October code drop.

  • 6. Re: IBM x3550 M2, x3650 M2, HS22, and dx360
    virtualpete Community Member

    Hi,

     

    Working on a couple of x3550 M2s in a remote co-location. I don't have KVM hardware or IBM Director in this case, so I have been relying on the remote admin features of the IMM. I would have loved to use the remote console, but the requirements for getting a Java Web Start app going are impossible for me: the host I'm running the browser on can't get to the internet, and due to site rules there's no chance of fixing that. So the remote console feature is basically useless for now. Could someone consider this scenario and maybe come up with something that doesn't need to pull .jar files down from the internet? Anyway, we're using Red Hat, so we have the option of a serial console at least, and I did figure out that COM2 is accessible via the IMM CLI 'console 1' command, so I prevailed.

     

    Our host was shipped to us 3 weeks ago and had such old firmware on it that it really wasn't workable. Perhaps you could push an alert up the supply chain to try to ensure units are not shipped out with known bad firmware. I spent the best part of a day with a hung IMM (which died simply because I rebooted the host OS), arranging for remote staff to power-cycle boxes, applying firmware and waiting 5-15 minutes for each restart until I got them to a stable configuration. It's not the sort of experience that motivates customers to buy another 100 or so units.

     

    Anyway, the servers are now at these levels:

    IMM   YUOO32F-2009/08/26   08/26/2009
    UEFI  D6E128A-2009/08/20   08/20/2009
    DSA   DSYT19A-2009/08/20   08/20/2009

     

    Able to boot/build/manage OK, but looking forward to your speedups in the Q4 release. Hopefully you can do something about the warm-reboot times: there seems to be about 5 minutes of dead time during a simple reboot, which is not desirable. Maybe if I was looking at the console I'd see what the holdup is...

     

    One other thing I've noticed is that the 'onetime PXE Network Boot' option on the HTTP IMM interface does not work for me (and BTW I can't find it on the IMM CLI). The documentation refers to certain conditions needing to be met but does not explain what these are. Maybe it's a bug, maybe it's something I'm doing wrong, so I'm hoping your documentation people are planning to add more detail on this feature. I had to go down the path of getting someone at the remote site to use F1 on the console to set BootOption.BootOption to "PXE Network=Hard Disk 0", build off PXE, then reboot with the DHCP and TFTP server disabled and wait about 10 minutes for PXE to try 12 or so times on each network interface before proceeding to the hard disk boot, and only then was I able to fix BootOrder with asu. This is because I can't for the life of me find where you can modify the UEFI settings via the IMM.

     

    And also, while I think of it: is there a way to cut back the access rights of asu and MegaCli to just read-only? I'm worried that a mischievous individual on a compromised system can do things like 'reset to factory' on the RAID controller or UEFI. It seems the only option open to me currently is to disable the USB interconnect to the IMM completely, but that doesn't help with the RAID.

     

    Otherwise - love your work. Keep it up...

  • 7. Re: IBM x3550 M2, x3650 M2, HS22, and dx360
    pmcfee Community Member

    Just got done getting the HS22 to Boot From SAN (SVC). We recently received an HS22 blade (7870-AC1) and tried to configure it to Boot From SAN. After several attempts to configure the Qlogic card (QMI2572), a support call was placed to IBM since the Qlogic card was unable to save any settings. Another Qlogic card was also ordered to rule out a defective card, and it had the same issue. I wouldn't even recommend trying to Boot From SAN without these firmware levels.

     

    Firmware levels should be at:
    BIOS 1.04, Build P9E130AUS
    Diagnostics 1.13, Build P9YT40A
    Blade Sys Mgmt Processor 1.05
    Qlogic QMI2572, BIOS Revision 2.08
    IMM (Integrated Management Module) 1.05, Build YU0032F

     

    Overall we were able to configure the HS22 to Boot From SAN, but only by loading an OS (on the internal drives) and applying the updates, since Update Express didn't have the latest firmware for the Qlogic card. The time to deploy a blade has increased: the HS21 took less than 30 minutes to have up and running when booting from the SAN, compared to the 2 1/2 hrs to deploy the HS22. Please share this email as needed.

     

    Message was edited by: William Lea, adjusted to smaller font and removed Bold.

  • 8. Re: IBM x3550 M2, x3650 M2, HS22, and dx360
    MichaelBrinkman-IBM Community Member

    We have just released a new white paper to help people tune their UEFI-based systems to speed up boot time. Please review this document:

     

    http://www-947.ibm.com/systems/support/supportsite.wss/docdisplay?lndocid=MIGR-5083207&brandind=5000020

     

    Regards,

     

    Michael Brinkman

  • 9. Re: IBM x3550 M2, x3650 M2, HS22, and dx360
    sthu01 Community Member

    Hi Mike;

     

    I have tried to install ESX 4.0 on an IBM x3550 M2 server. As ESX 4.0 is a non-uEFI-aware OS, I have
    set the "Legacy Only" flag as per the white paper. For some reason, the system failed to boot
    from the CD/DVD ROM.

     

    The boot manager is setup as follows:

     

    Legacy Only
    CD/DVD Rom
    Floppy
    Hard Disk0

     

    I am sure the DVD media is good because I was able to install from it on another system. Also,
    the system can successfully boot from a Windows 2008 (uEFI-aware OS) DVD.

     

    Please advise

     

    Thanks
    -Soe

  • 10. Re: IBM x3550 M2, x3650 M2, HS22, and dx360
    Community Member

    Critical updates released 12 months after market release are UNACCEPTABLE

    And the BIOS/RAID interface is one of the most user-unfriendly I have seen in 20 years.

    Attempting to download said updates from 3 different sites also timed out. You have inadequate bandwidth for the job.

    And this is my second attempt to write this, as I didn't get the verify code right and you guys couldn't even write a session variable to hold my comments.

    Overall an extremely disappointing result

  • 11. Re: IBM x3550 M2, x3650 M2, HS22, and dx360

    I would like to agree with Unbiased on the stupidity of this supposedly new-age system.

     

    I have had one dumped on me to remove VMware and put Server 2008 on it.

     

    The thing will not boot off the DVD unless it is a VMware DVD.

     

    After hours of trawling through IBM's useless junk I hit Google. After at least an hour of reading about others with the same issue, I found this: http://www-947.ibm.com/support/entry/portal/docdisplay?lndocid=MIGR-5085881&brandind=5000020 which tells me to install two MS apps before I install my new OS. Now, being an MS tech, I do know how to do this. But where on this page are IBM's account details for issuing the invoice for the time I had to waste due to their exceptionally interesting systems and installation techniques? Quite frankly, any company that pins to the WWW a solution requiring the client to hire an MS-certified tech to spend hours and hours fixing the company's own buggy junk should be disbarred from ever selling said junk again.

     

    I realise this post will go wholly unnoticed, but that will not stop me bagging the manufacturers of this buggy junk till they stop or go bankrupt. I think the funny thing this time is that the owner of said system happens to be a very large IBM dealer. When I explain what the issue is, I am sure they will annoy their upstream people till they get something for free. So this one will cost you, IBM. It will cost me too, but at least it will cost you.

     

    Now off to see if I can find a way of fixing IBM's stuff-up without having to dismantle the rack and return to the workshop, or spend hours in the car going back and forth trying to get the boot version right.

     

    Hope others read this and realise they too have been duped.

  • 12. Re: IBM x3550 M2, x3650 M2, HS22, and dx360

    I am using the following on a new IBM x3550 M3 (12GB RAM, 2.266GHz Xeon processor):

    Firmware Type   Version String   Release Date
    IMM             YUOOC7E          09/30/2011
    UEFI            D6E154A          09/23/2011
    DSA             DSYT89P          10/28/2011

    I must say I am sorely disappointed with the "speed" of USB booting in the legacy BIOS mode in the IBM UEFI implementation.

    Basically a "similar CPU" in a Sun X4275 will boot a 275 MB USB key image in just 32 seconds, while the IBM x3550 M3 takes 393 seconds for the same image. Measured from the time the IBM starts a legacy USB key boot until I get an OS prompt, this is ridiculously long.


     

      BEG: 1:27:05 pm (start SmartOS USB 2.0 USB key)
      END: 1:33:38 pm (done into running Solaris 11)
      ---
      TOOK:   6:33    (six minutes and 33 seconds - pretty slow - only about 0.70 MB/sec.)
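
    A small sketch of the arithmetic behind the timing above: it takes the two wall-clock stamps and the 275 MB image size quoted in the post and derives the elapsed time and the effective read rate. The helper name is illustrative, and both stamps are assumed to fall in the same afternoon.

```python
# Wall-clock stamps and image size are copied from the post.
# Assumption: both stamps are "pm" on the same day, so a plain difference is enough.

def hms_to_seconds(stamp: str) -> int:
    """Convert an 'h:mm:ss' stamp into seconds since the start of the half-day."""
    hours, minutes, seconds = (int(part) for part in stamp.split(":"))
    return hours * 3600 + minutes * 60 + seconds


elapsed_s = hms_to_seconds("1:33:38") - hms_to_seconds("1:27:05")
image_mb = 275
rate_mb_s = image_mb / elapsed_s

print(elapsed_s)             # 393
print(f"{rate_mb_s:.2f}")    # 0.70
```

    On these figures the legacy USB boot path moves roughly 0.70 MB per second, against about 8.6 MB per second implied by the 32-second Sun boot of the same image.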

     

    It is almost as if the UEFI implementation uses a tiny block size, like 512-byte reads, rather than a larger buffer during reads. Once I am in the OS I can benchmark the performance of the USB key I booted off. IMHO, if the IBM UEFI code read with an 8192 or, better yet, a 32768 block size, booting would be super fast.

     

    In a Solaris 11 operating system we see the following performance characteristics for my USB key, with block sizes ranging from 512 bytes to 131072 bytes. It looks like either 8192 (12.33 MB/sec in a booted OS) or, better yet, 32768 (20.02 MB/sec in a booted OS) would be a nice read size. It also looks like a 512 block size (0.64 MB/sec in a booted OS) matches the results I seem to experience in my lengthy boots.

     

    time dd if=/dev/dsk/c1t0d0p0 of=/dev/null bs=512 count=524288

        524288+0 records in

        524288+0 records out

        real 31m19.499s

        => 00.64MB/sec. on Solaris 11 (this matches the IBM BIOS boot speed)


    time dd if=/dev/dsk/c1t0d0p0 of=/dev/null bs=1024 count=262144

        262144+0 records in

        262144+0 records out

        real 1m39.989s

        => 02.56MB/sec. Solaris 11


    time dd if=/dev/dsk/c1t0d0p0 of=/dev/null bs=2048 count=131072

        131072+0 records in

        131072+0 records out

        real 0m50.215s

        => 05.09MB/sec. Solaris 11


    time dd if=/dev/dsk/c1t0d0p0 of=/dev/null bs=4096 count=65536

        65536+0 records in

        65536+0 records out

        real 0m33.056s

        => 07.74MB/sec. Solaris 11


    time dd if=/dev/dsk/c1t0d0p0 of=/dev/null bs=8192 count=32768

        32768+0 records in

        32768+0 records out

        real 0m20.757s

        => 12.33MB/sec. Solaris 11


    time dd if=/dev/dsk/c1t0d0p0 of=/dev/null bs=32768 count=8192

        8192+0 records in

        8192+0 records out

        real 0m12.785s

        => 20.02MB/sec. on smartos (as expected and seen on a Win7 box)


    time dd if=/dev/dsk/c1t0d0p0 of=/dev/null bs=131072 count=2048

        2048+0 records in

        2048+0 records out

        real 0m11.532s

        => 22.19MB/sec. Solaris 11
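
    The rates listed after each dd run can be reproduced from the block size, record count and wall-clock time. The sketch below does that for the runs above, treating "MB" as 1024 x 1024 bytes (an assumption that matches the quoted figures). The 512-byte run is left out because its listed time of 31m19s does not appear to reproduce the quoted 0.64 MB/sec; the function and variable names are illustrative.

```python
# (block size in bytes, record count, wall-clock seconds) copied from the dd runs.
# The 512-byte run is omitted: its listed time does not reproduce the quoted rate.
runs = [
    (1024, 262144, 99.989),
    (2048, 131072, 50.215),
    (4096, 65536, 33.056),
    (8192, 32768, 20.757),
    (32768, 8192, 12.785),
    (131072, 2048, 11.532),
]

MIB = 1024 * 1024  # assumption: the post's "MB" means 2**20 bytes


def throughput_mb_s(block_size: int, count: int, seconds: float) -> float:
    """Bytes read divided by elapsed time, expressed in MB (2**20 bytes) per second."""
    return block_size * count / MIB / seconds


for block_size, count, seconds in runs:
    rate = throughput_mb_s(block_size, count, seconds)
    print(f"bs={block_size:>6}  {rate:6.2f} MB/sec")
```

    Every run reads the same 256 MB, so the differences come entirely from the read size: the rate climbs steeply up to 32768 bytes and gains little beyond that.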

    Making a UEFI-compliant USB key might be possible but seems like a lot of effort, and of course the "same slow" USB key read issue might also exist in the UEFI implementation. Others complain about slow UEFI boot from USB keys; see Google searches such as:

     

    ESXi 4 hypervisor slow usb boot +IBM +"system x"

     

    +IBM +"system x" slow usb boot

     

    The only reference to a workaround is a cryptic document section, "XSW02525-USEN-00", or Introducing UEFI-Compliant Firmware on IBM System x and BladeCenter Servers, which doesn't really say anything useful.

     

    Jon Strabala

  • 13. Re: IBM x3550 M2, x3650 M2, HS22, and dx360

    I'm installing CentOS on a new IBM x3250 M4. The machine itself is nice. However, I've never seen a server take sooo long to boot! It's a good 6 minutes before it even starts to load from CD or the installed OS. Surely IBM management modules shouldn't take that long?

  • 14. Re: IBM x3550 M2, x3650 M2, HS22, and dx360

    I received 20 x3650 M4s last week.

     

    My feedback would be about the time it takes for the system to POST. It is incredibly long.

     

    These machines are meant for a test system that reboots and re-provisions them several times a day.

     

    These systems typically take more than 5 minutes to finish POST, which is arduous to wait for. They even take a minute or more just to power on.

     

    Is there a way to disable most of what they are doing to get to a sub-minute POST time?

     

    I have called IBM support about this already, and they seem to suggest there is nothing that I can do about this.
