9 Replies Latest reply on May 5, 2011 1:16 PM by JayDubya

    Scheduled Optimization on Heavily Used Data Drive

    JayDubya

      We're testing an X25-V 40GB on a data ingester running XP Pro, receiving a feed at around 10Mbps. Total drive use is consistent at around 10GB, with thousands of small files updating every hour. So far the drive has performed much better than the striped RAIDs we've been using. I've only optimized the drive twice in a week & it's still doing a better job on scheduled output. I would really like to schedule the optimization to run once every few days, but I don't want to shut down the service & interrupt the feed. I've seen the warning that only 1GB will be available for use during optimization, & I don't believe the system would exceed that in the brief period required (though the service was shut down during both of those optimization runs).

       

      Would this be worth trying on a non-OS drive? What are the potential hazards?

       

      Also, how urgent should our move to the 320 series be? We received 3 last Friday after noting the performance on this test unit and contending with speed/fragmentation issues on other ingesters running striped RAID. I'm thinking perhaps we should swap out the value-series drive on the test box prior to putting it into production.

        • 1. Re: Scheduled Optimization on Heavily Used Data Drive
          DuckieHo

          Your application is writing 10GB an hour, nearly 24/7, for weeks at a time?

           

          If this is the case, I would highly suggest you look at an SLC SSD, since SLC flash sustains far more P/E cycles.

           

          The 320 provides much better sequential writes and improved random writes as well.

          http://www.anandtech.com/bench/Product/359?vs=148

          • 2. Re: Scheduled Optimization on Heavily Used Data Drive
            JayDubya

            The feed runs at just over 10 megabits per second, & the service filters the data heavily according to each ingester's specialization. It's nowhere near 10GB per hour. But back to the original question: can the optimization scheduler be set to run safely without shutting the ingest service down?

            • 3. Re: Scheduled Optimization on Heavily Used Data Drive
              DuckieHo

              10Mbps... 24/7?  Let's assume 0.25-1.0MB/s after your filter.... that would be roughly 22-86GB per day.  That is quite a lot of writes.  I would recommend you get an SLC drive or increase overprovisioning.
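 
              For the math behind that range, here's a quick back-of-the-envelope check in Python (the 0.25-1.0MB/s figures are just the assumed post-filter rates from above):
 
                  SECONDS_PER_DAY = 24 * 60 * 60  # 86,400

                  def gb_per_day(mb_per_s: float) -> float:
                      """Convert a sustained write rate in MB/s to GB written per day (decimal GB)."""
                      return mb_per_s * SECONDS_PER_DAY / 1000

                  for rate in (0.25, 1.0):  # assumed post-filter write rates in MB/s
                      print(f"{rate:.2f} MB/s -> {gb_per_day(rate):.0f} GB/day")
                  # 0.25 MB/s -> 22 GB/day
                  # 1.00 MB/s -> 86 GB/day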

               

               

              "Safely" is a relative term.... Yes, you can run the SSD Optimizer while the drive in use.  The risk is if there is a large burst of writes while the optimizer is running.

               

              Have you noticed any performance degradation from not running the Optimizer as often?  Is there no <5 min window of downtime during the week in which to schedule the process?

              • 4. Re: Scheduled Optimization on Heavily Used Data Drive
                JayDubya

                Sorry for the confusion. Total drive space used is only 10GB of the 40GB available.

                 

                As for degradation, the only thing I've noticed is a slight delay on the output of two high-res images that go out every half hour. It looks like a delay of 1 minute after 3 days of running 24/7. Immediately after optimization, output is right back on schedule.

                 

                The problem with downtime is that data from the stream gets missed while the service is off, & there is no re-capturing it, not to mention the issues with automating the service shutdown. I'd really like to automate the whole process, as I'm the only one handling it & I've got over 50 other machines to maintain along with new production & development. Sometimes the service takes longer to shut down, & I'd hate to fire off optimization prematurely.

                 

                I'll probably have to script something in the scheduler software we use. Can the optimizer be fired from the command line?
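 
                Roughly, here's the shape of what I'd want to script, sketched in Python just for illustration (the service name is made up, & the optimizer step is exactly the piece that's missing without a command line option):
 
                    import subprocess
                    import time

                    SERVICE = "IngestService"  # hypothetical service name

                    def service_state(name: str) -> str:
                        """Read the STATE line from `sc query <name>` (sc.exe ships with XP)."""
                        out = subprocess.run(["sc", "query", name],
                                             capture_output=True, text=True).stdout
                        for line in out.splitlines():
                            if "STATE" in line:
                                return line.split()[-1]  # e.g. "RUNNING" or "STOPPED"
                        return "UNKNOWN"

                    def wait_for_state(name: str, state: str, timeout: float = 300.0) -> bool:
                        """Poll until the service reaches `state`, so nothing fires prematurely."""
                        deadline = time.monotonic() + timeout
                        while time.monotonic() < deadline:
                            if service_state(name) == state:
                                return True
                            time.sleep(5)
                        return False

                    def run_optimizer() -> None:
                        # Placeholder: no command line switch exists for the Toolbox today,
                        # so this step would have to drive the GUI or be done by hand.
                        pass

                    subprocess.run(["sc", "stop", SERVICE], check=True)
                    if wait_for_state(SERVICE, "STOPPED"):
                        run_optimizer()
                    subprocess.run(["sc", "start", SERVICE], check=True)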

                • 5. Re: Scheduled Optimization on Heavily Used Data Drive
                  koitsu

                  The "Intel SSD Toolbox.exe" is a GUI-only binary from what I can tell; no argv parsing exists in it.  There's a utility under the Utilities\ directory called "Analyzer.exe" but all it ever spits back when run is the text "Invalid".  I have no idea what Analyzer.exe does anyway, so messing with it (or even focusing on it) is probably a bad idea.

                   

                  Effectively what you want is a utility that limits the LBA ranges that TRIM applies to (e.g. doing them in very small batches over time, rather than one large LBA range all at once).  I know Linux can provide something like this (search Google for "wiper.sh"), and FreeBSD could as well, but there we don't have the ability for userland applications to get a list of cylinder groups which have been freed by the filesystem, so right now the kernel does TRIM itself, and in a very sub-optimal way (see the links below).
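 
                  The batching part itself is trivial; something like this sketch (Python, purely illustrative -- it issues no real TRIM commands):
 
                      from typing import Iterator, Tuple

                      def batch_lba_range(start_lba: int, sectors: int,
                                          batch: int = 8192) -> Iterator[Tuple[int, int]]:
                          """Yield (start_lba, sector_count) pairs no larger than `batch` sectors."""
                          end = start_lba + sectors
                          while start_lba < end:
                              count = min(batch, end - start_lba)
                              yield (start_lba, count)
                              start_lba += count

                      # e.g. 1,000,000 free sectors handled in 8192-sector batches
                      for lba, count in batch_lba_range(2048, 1_000_000):
                          pass  # a real tool would issue TRIM for this range, then sleep briefly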

                   

                  Relevant FreeBSD thread where the methodology used has come under scrutiny:

                   

                  http://lists.freebsd.org/pipermail/freebsd-fs/2011-April/011341.html

                  http://lists.freebsd.org/pipermail/freebsd-fs/2011-April/011345.html

                  http://lists.freebsd.org/pipermail/freebsd-fs/2011-May/011360.html

                  http://lists.freebsd.org/pipermail/freebsd-fs/2011-May/011362.html

                   

                  Possibly the Intel SSD Toolbox utility could offer something similar ("gradual" TRIMming rather than all-at-once).

                   

                  Also, are you sure you 4K-aligned your partitions on the SSD, re: performance concerns?  Just making sure.  I imagine you have, but wanted to ask anyway.  (I'm also a little surprised you're using XP Professional in what sounds like a commercial, borderline server-esque environment. Don't worry, I have nothing against XP Pro -- I use it happily on my home workstation -- but given your work model I'm surprised...)

                  • 6. Re: Scheduled Optimization on Heavily Used Data Drive
                    JayDubya

                    Haven't run any alignment tools yet, but really haven't had any problems yet either. When we would configure our striped RAIDs, we would set the block size at 4KB since the machines ingest so many small files. What kind of performance gain can we get from running a tool such as Paragon's Alignment Tool?

                     

                    We use XP Pro mostly because it's been very stable & cost-effective since the ingesters were set up. The downside has been the connection limit on mapped drives, given the number of renderers tapping the ingesters. We've been adding ingesters as needed to accommodate.

                     

                    I spoke with the ingester software developers, & they said they would write something for the service that would shut it down for a brief period, fire off the optimizer, then wait for a status return before starting the service back up. I told them there was no command line option for the optimizer & they said "Oh... why not?"

                    Seems perfectly reasonable to me. How difficult would it be for Intel to put a command line option together?

                    • 7. Re: Scheduled Optimization on Heavily Used Data Drive
                      DuckieHo

                      I will forward the suggestion to Intel for command line options for the SSD Optimizer.

                      • 8. Re: Scheduled Optimization on Heavily Used Data Drive
                        koitsu

                        JayDubya wrote:

                         

                        Haven't run any alignment tools yet, but really haven't had any problems yet either. When we would configure our striped RAIDs, we would set the block size at 4KB since the machines ingest so many small files. What kind of performance gain can we get from running a tool such as Paragon's Alignment Tool?

                         

                         

                        When you say "block size at 4KBytes", are you referring to the RAID (presumably RAID-0) stripe size, or are you referring to the NTFS filesystem cluster size?  They're two completely unrelated things.

                         

                        RAID-0 stripe size defines how much data is written across the members of the array at once.  Example: two disks in a RAID-0 array, with a stripe size of 32768 bytes.  This would result in drive 0 getting 16KBytes of data written to it, and drive 1 getting 16KBytes of data written to it.  Stripe size doesn't matter with RAID-1 because there's no striping.  (RAID-5 does stripe, with parity rotated across the members, so stripe size matters there too.)  Trust me on this: this is how it works.  Don't believe nonsense like what you might see in the YouTube video below (the individual describing it gets "most" aspects of RAID correct, but he's absolutely wrong about how the data gets written to the physical disks):

                         

                        http://www.youtube.com/watch?v=RYBtmVMtH1g

                         

                        A much better visual diagram (read the image caption very slowly and carefully) is here:

                         

                        http://www.pcguide.com/ref/hdd/perf/raid/levels/singleLevel0-c.html
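 
                        To make the chunk arithmetic concrete, here's a little sketch of the model I just described (my illustration only, not any controller's actual firmware logic):
 
                            def raid0_locate(offset: int, members: int = 2, stripe_size: int = 32768):
                                """Map a logical byte offset on the array to (member disk, offset on disk)."""
                                chunk = stripe_size // members            # bytes per member per stripe
                                stripe, within = divmod(offset, stripe_size)
                                disk, disk_off = divmod(within, chunk)
                                return disk, stripe * chunk + disk_off

                            print(raid0_locate(0))      # (0, 0)      first 16KByte chunk -> drive 0
                            print(raid0_locate(16384))  # (1, 0)      next 16KByte chunk  -> drive 1
                            print(raid0_locate(32768))  # (0, 16384)  second stripe starts back on drive 0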

                         

                        NTFS cluster size is the allocation unit of the filesystem, in bytes: space for a file is allocated in whole clusters.  E.g. if you choose a cluster size of 16KBytes and you write a file to the disk that is only 500 bytes in size, on the *actual filesystem* the file takes up 16KBytes.  A larger cluster size can result in improved performance when very large files are used.  Using a benchmarking tool like ATTO can help determine what the "optimal" cluster size is.  On XP, NTFS filesystems default to 4096-byte clusters.  I tend to use that for my OS disks, but increase the cluster size on my storage disks (filled with movies, games, etc.) to 32KBytes simply because there's a performance gain.
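 
                        The 500-byte example as plain arithmetic (ignoring NTFS's trick of storing very small files resident in the MFT):
 
                            import math

                            def on_disk_size(file_bytes: int, cluster_bytes: int) -> int:
                                """Bytes actually consumed on the filesystem for a file of this size."""
                                return math.ceil(file_bytes / cluster_bytes) * cluster_bytes

                            print(on_disk_size(500, 16384))  # 16384 -> a tiny file costs a whole cluster
                            print(on_disk_size(500, 4096))   # 4096  -> XP's default 4096-byte clusters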

                         

                        More importantly, neither stripe size nor NTFS cluster size has anything to do with partition alignment.

                         

                        Partition alignment defines which LBA offset an NTFS/FAT32/FAT/ext2fs/ext3fs/FreeBSD slice starts at.  You can't start at LBA 0 because that's reserved for OS bootstraps and the MBR/PBR, and there are blocks after LBA 0 which are also used for bootstrapping (usually the OS bootloader).  Most people use an offset of 128 (e.g. 128*512 = 65536, or 64KBytes, which is 4K-aligned), which is enough for the "stage 0" and "stage 1" bootstraps (enough to figure out how to load things like NTLDR off a filesystem, etc.).  Compare that to what XP might pick when creating a partition on its own (it could use offset 37, for example; 37*512 = 18944, and 18944 / 4096 = 4.625, which means it's not 4KByte-aligned, thus a performance hit).  Make sense?
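 
                        The same check in a couple of lines of Python, just to illustrate:
 
                            def is_4k_aligned(start_lba: int, sector_bytes: int = 512) -> bool:
                                """A partition is 4K-aligned when its starting byte offset divides by 4096."""
                                return (start_lba * sector_bytes) % 4096 == 0

                            print(is_4k_aligned(128))  # True:  128 * 512 = 65536, 4K-aligned
                            print(is_4k_aligned(37))   # False: 37 * 512 = 18944, misaligned -> slower I/O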

                         

                        Think of SSD flash like RAM; memory has to be aligned in what are called "pages" (usually 512 bytes, 4096 bytes, 64KBytes, or 2MBytes for superpages; I think there are even larger ones).  Non-aligned allocation within the kernel results in *really* bad system performance, so that's why the kernel manages memory in pages.  CPUs also require this aligned paging.  Remember: I'm talking about RAM/memory here and not SSDs, but the *concept itself* is the same.

                         

                        There are SSD benchmarks all over the web showing what the performance difference is between a non-4K-aligned partition and an aligned partition.  There's quite a substantial difference (almost double in some cases).  I urge you to use the resources available to you to find evidence for yourself -- it's an educational process.  :-)

                         

                        I cannot comment on "Paragon's Alignment Tool" because I've never used it; I use DISKPAR (not DISKPART) to partition an SSD first.  OCZ has a topic on how to do this.  You can accomplish the same using DISKPART but the syntax/user interface is very different.

                         

                        Finally, be aware that overall Windows 7 offers significant improvements with SSDs that XP simply does not have.  Am I telling you to upgrade to Windows 7?  No (example: I myself don't run it!).  I'm saying that W7 supports native TRIM (e.g. no need to run the Intel SSD Optimizer; W7 takes care of all of this at the NTFS and storage controller driver layer).  Just a point worth considering.

                         

                        Hope this helps, or at least is educational.  There's so much misinformation on the 'net about all of this because everyone and their dog thinks they're a "tech whiz" when in fact only some people truly understand storage technology.

                        • 9. Re: Scheduled Optimization on Heavily Used Data Drive
                          JayDubya

                          Thank you guys for the valuable input. We went ahead & purchased Paragon's Alignment Tool & ran it on one of the newer 320 series 40GB drives, then transferred the data off the X25-V drive to the 320 series & swapped it out. The ingester seems very happy & dead on schedule. I wrote a script in the scheduler software to shut down the service, fire off the Toolbox executable, run optimization through the GUI, then restart the service. A kluge no doubt, but it works. Again, thank you kindly. -jw