The Server Room Blog

7 Posts tagged with the shannon_cepeda tag

Here's the 6th follow-up post in my 10 Habits of Great Server Performance Tuners series. This one focuses on the sixth habit: Try 1 Thing at a Time.

http://communities.intel.com/openport/servlet/JiveServlet/downloadImage/1900/IMG_2382-edit-x350-noExif.jpg

Like habit 2, Start at the Top, this habit looks easy to understand and to keep. But, due to the constant desire for productivity, I and most others I know in the performance community have broken it many times. Sometimes I even get away with it. But trying to keep this habit is important, because when I don't get away with it, breaking this rule results in even more work than I was trying to save.

The concept behind this habit is simple - when you are optimizing your platform or your code, make only one change at a time. This allows you to measure the effect of each change, and accumulate only the positive changes (however small) into your workload. I have seen instances, for example, where 2 small changes applied at the same time to a workload cancelled each other out: one caused a small decrease in performance and the other a small increase. If these changes hadn't been tested individually, we would have missed out on that performance gain.
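
To make the cancelling-out case concrete, here is a minimal Python sketch. The function names, the change labels, and the throughput numbers are all invented for illustration; it assumes a higher-is-better metric and that each change was measured on its own against the same baseline.

```python
from statistics import mean

def evaluate_changes(baseline_runs, change_runs):
    """Compare each change, measured on its own, against the baseline.

    baseline_runs: metric values (higher is better) for the unmodified system.
    change_runs:   dict mapping a change name to metric values measured with
                   only that one change applied.
    Returns a dict of change name -> percent difference from the baseline mean.
    """
    base = mean(baseline_runs)
    return {
        name: (mean(runs) - base) / base * 100.0
        for name, runs in change_runs.items()
    }

def changes_to_keep(deltas, threshold_pct=0.0):
    """Keep only the changes whose individual effect beats the threshold."""
    return sorted(name for name, pct in deltas.items() if pct > threshold_pct)

# Hypothetical data: transactions per second from three runs each.
baseline = [1000.0, 1002.0, 998.0]
measured = {
    "change_a": [1020.0, 1019.0, 1021.0],  # a small gain on its own
    "change_b": [981.0, 980.0, 979.0],     # a small loss on its own
}

deltas = evaluate_changes(baseline, measured)
keep = changes_to_keep(deltas)
print(deltas)
print("keep:", keep)
```

Applied together, these two hypothetical changes would look like "no effect"; measured separately, one of them is worth keeping.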

Another thing that can happen in a complex workload is that two changes that seem independent can interact with each other. As many developers know from fixing bugs, changing one thing may affect something else. Keeping all your changes separate can help you identify these interactions more easily.

You may be wondering when it is acceptable to break this habit. I think of performance methodology, and this rule in particular, as similar to the scientific method we learned in school. It's always good to follow it - doing so will help you quantify your successes and failures, stay organized, and defend your conclusions - but you can still make a big breakthrough without it. In some cases, like when you are making small local changes to source code in completely different modules, or when you are changing two things you are certain won't interact, the habit can be broken. But the advice I give, especially to those involved in long-term optimization projects, is to follow it.

What has your experience been? Please share your "changing multiple things at one time" stories.

Keep watching The Server Room for information on the other 4 habits in the coming weeks.


Here's the 5th follow-up post in my 10 Habits of Great Server Performance Tuners series. This one focuses on the fifth habit: Know Your Workload.

http://communities.intel.com/openport/servlet/JiveServlet/downloadImage/1481/IMG_3749-x200-noexif.jpg
Spend some time getting to know your workload.


The idea of a "workload" is integral to the concept of performance. The workload is the set of software and tests that you run on the server in order to measure its performance. Also part of the workload is the concept of the "metric", meaning the number you will use to quantify performance. You should understand as much as you can about your workload in order to characterize and interpret your system's execution.

Let's look at the real-life example of a car's fuel economy. The EPA measures fuel economy using 2 workloads: city and highway. Each workload tests different aspects of the car's performance, and the metric used to quantify that performance is miles per gallon (MPG). Like the EPA's fuel economy test, a good workload for server performance tuning should have the following three characteristics:

  • Measurable - There is a quantifiable metric.
  • Reproducible - Measurements are repeatable and consistent.
  • Representative - The workload should be typical of normal operating conditions and should stress the parts of the system (including code) where performance is most critical.

Depending on the usage model for the server(s) you are tuning, some examples of appropriate workloads might be: loading websites, processing XML, encoding/decoding MP3s, responding to database queries, rendering frames, etc. Metrics could be time to run, number of users serviced, transactions processed per second, etc. If your metric is time, take special care that you are measuring it accurately.
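
If time is your metric and you are measuring from a script, one reasonable approach is a monotonic, high-resolution clock. The sketch below uses Python's `time.perf_counter`; the helper name `time_workload` and the stand-in workload are assumptions made up for this example, not part of any particular benchmark.

```python
import time

def time_workload(workload, repeats=5):
    """Run `workload` several times; return each run's wall-clock time in seconds."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()  # monotonic, high-resolution clock
        workload()
        durations.append(time.perf_counter() - start)
    return durations

def example_workload():
    # Placeholder for one real unit of work (a query, a page load, a frame...).
    return sum(i * i for i in range(100_000))

durations = time_workload(example_workload, repeats=3)
print(["%.6f s" % d for d in durations])
```

Keeping every individual duration, rather than only an average, makes it easier to spot an outlier run later.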

After choosing or creating a suitable workload, spend some time getting to know it. Measure the variance between runs. Use O/S and processor-level tools (to be discussed in the blog for habit #8) to sample the workload's characteristics at various points during its execution.
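
One simple way to put a number on run-to-run variance is the coefficient of variation (standard deviation as a percentage of the mean). This is only a sketch: the function name and the sample results are hypothetical, and what counts as "too much" variation depends on your workload.

```python
from statistics import mean, stdev

def run_to_run_variation(results):
    """Return (mean, sample standard deviation, coefficient of variation in %)."""
    if len(results) < 2:
        raise ValueError("need at least two runs to estimate variance")
    avg = mean(results)
    spread = stdev(results)
    return avg, spread, spread / avg * 100.0

# Hypothetical metric values from three identical runs.
runs = [98.0, 100.0, 102.0]
avg, spread, cv_pct = run_to_run_variation(runs)
print(f"mean={avg:.1f} stdev={spread:.1f} cv={cv_pct:.1f}%")
```

If a tuning change moves the metric by less than this run-to-run variation, more runs are probably needed before drawing a conclusion.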

One thing to remember about sampling is that you want to make your sample interval at least as long as the amount of time it takes to complete a unit of work in your workload. For example, suppose your workload is a stream of web page requests and you are measuring response time. If the longest response time you see is about 2 seconds, then you want to make sure each sample covers more than 2 seconds. It's best to use a multiple of your longest operation time, so 4 or 6 seconds in this case. This way you can be sure each sample includes at least one complete operation in the workload. Then try to determine if the workload is stable - meaning, do the characteristics stay the same at different times during execution? (If they vary, you will need to sample more often to understand the workload, or possibly split it into phases.) Use the data to get an idea of your workload's CPU, memory, network, and I/O usage.
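
The interval rule and the stability question can be sketched in a few lines of Python. Both helper names are invented here, and the 10% tolerance used for "stable" is an arbitrary assumption for illustration rather than a recommended value.

```python
def sample_interval(longest_operation_s, multiple=2):
    """Pick a sample interval that is a whole multiple of the longest operation."""
    if multiple < 1:
        raise ValueError("multiple must be at least 1")
    return longest_operation_s * multiple

def is_stable(samples, tolerance_pct=10.0):
    """True if every sample lies within tolerance_pct of the mean of all samples."""
    avg = sum(samples) / len(samples)
    if avg == 0:
        return all(s == 0 for s in samples)
    return all(abs(s - avg) / abs(avg) * 100.0 <= tolerance_pct for s in samples)

# Longest response time of about 2 seconds, as in the example above.
print(sample_interval(2.0, multiple=2), sample_interval(2.0, multiple=3))

# Hypothetical CPU utilization samples (percent) from two different workloads.
steady = [50.0, 52.0, 49.0, 51.0]
phased = [20.0, 80.0, 25.0, 85.0]
print(is_stable(steady), is_stable(phased))
```

A workload like the second one is a candidate for shorter sample intervals or for being split into phases.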

At the application level, become familiar with the software stack you will use. How is the workload generated (user, clients, test files, etc)? Understand the major operations that occur - what components of the O/S are needed? What device drivers are used? And finally, study the application(s). Know whether the application(s) being tested are single- or multi-threaded and as much as you can about the internals.

Choosing (or developing) an appropriate workload is necessary for correct performance measurement and tuning. Being as familiar as you can with the workload will help you to interpret your performance data and identify areas for optimization.

Keep watching The Server Room for information on the other 5 habits in the coming weeks.



Here's the 4th follow-up post in my 10 Habits of Great Server Performance Tuners series. This one focuses on the fourth habit: Know Your BIOS.

http://communities.intel.com/openport/servlet/JiveServlet/downloadImage/1357/IMG_2318-noExif.jpg

My last blog talked about beginning your system tuning by consulting a block diagram. The other thing you should always look at is your system's BIOS. Many server BIOSes these days allow you to configure options that affect performance. As with everything in the performance world, the best set of BIOS options depends on your workload!

First things first, how do you find this "BIOS"? Most servers have a menu called "Setup" (or something similar) that you can access while the system is booting, before it starts loading the operating system. This "Setup" menu allows you to access your system's BIOS. Changes that you make here will affect how the operating system can utilize your hardware, and in some cases how the hardware works. If you change something here, you usually have to reboot and then the change will "stick" through all future reboots (until you change it again). As platforms grow increasingly sophisticated, they are offering a widening array of user-configurable options in Setup. So a good practice is to examine all the menu options available whenever you get a new platform. Here are some of the most common options on Intel platforms that could affect performance:

  • Power Management - Intel's power management technology is designed to deliver lower power at idle and better performance/watt (without significantly lowering overall performance) in most circumstances. There are 2 types - P-States, which attempt to manage power while the processor is active, and C-States which work while the processor is idle. In some BIOSes, both of these features are combined into one option which you should enable. In other cases they are separated. If they are separate, here's what to look for:
    • Intel EIST (or "Enhanced Intel Speedstep" or "Intel Speedstep" or "GV3" on older platforms) - This is the P-State power management that works while the processor is active. Leave it enabled unless directed to change it by an Intel representative.
    • Intel C-States - If you have this option or something similar, it is referring to the power management used when the processor is idle. Enable all C-States unless directed otherwise by an Intel representative.
  • Hardware Prefetch or Adjacent Sector Prefetch - These options try to lower overall latencies in your platform by bringing data into the caches from memory before it is needed (so the application does not have to wait for the data to be read). In many situations the prefetchers increase performance, but there are some cases where they may not. If you don't have time to test these options, then go with the default. Intel tests the prefetch options on a variety of server workloads with each new processor and makes a recommendation to our platform partners on how they should be set. If, however, you are tuning and you have the time to experiment, try measuring performance using each of the prefetch setting combinations.
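
If you do have time to experiment with the prefetch settings, it helps to write down the full test matrix before you start. The sketch below simply enumerates the combinations; the option names come from the list above, while the function name and the "Enabled"/"Disabled" labels are assumptions (your BIOS may word them differently, and the settings still have to be changed by hand in Setup).

```python
from itertools import product

def setting_combinations(options, states=("Enabled", "Disabled")):
    """List every combination of states for the given BIOS option names."""
    names = list(options)
    return [dict(zip(names, combo)) for combo in product(states, repeat=len(names))]

combos = setting_combinations(["Hardware Prefetch", "Adjacent Sector Prefetch"])
for number, combo in enumerate(combos, start=1):
    print(number, combo)
```

Two options give four configurations to measure, and each one needs a reboot and a full run of your workload.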

There are several other options that might affect performance on specific platforms. Some examples might be a snoop filter enable/disable switch, a setting to emphasize either bandwidth or latency for memory transactions, or a setting to enable or disable multi-threading. In these cases, if you don't have time to test, use your Intel or OEM representative's suggestion or go with the default setting.

Being familiar with how your system's BIOS is configured is another basic component of system tuning.

Keep watching The Server Room for information on the other 6 habits in the coming weeks.


Here's the 3rd follow-up post in my 10 Habits of Great Server Performance Tuners series. This one focuses on the third habit: Know Your Platform.

http://communities.intel.com/openport/servlet/JiveServlet/downloadImage/1247/IMG_2376-edit-x350-noExif.jpg

As we learned in my last blog, we should start our server performance tuning by looking for system-level bottlenecks. This involves understanding exactly how data flows into and out of your platform - and to do this, you need a block diagram. A block diagram shows the major components on the server's motherboard and the paths between them. From a good block diagram you can derive the maximum data transfer rate (aka bandwidth or throughput) achievable as data flows along those paths.

I usually look at my block diagram before beginning system tuning in order to identify potential bottlenecks. But some people use them in parallel: they measure the bandwidth of various parts of the system and then confirm what they see using the block diagram. You can determine if various parts of your system are heavily stressed, bottlenecked, or lightly utilized. In general you want to trace the path from where data enters your server (NIC, HBA, etc) up to the processor and back to memory or out of the server. The paths connecting one component to another are commonly known as buses. For each bus, multiply the speed by the width to determine the maximum potential bandwidth.

Let's use the block diagram for the Intel S5400SF server board as an example. It has 2 FSBs, each capable of 1333 or 1600 Mega-Transfers/second (MT/s). Each transfer on the FSB is 64 bits (8 bytes), so at 1600 MT/s, 8 bytes * 1,600,000,000 transfers per second gives a maximum theoretical bandwidth of 12.8 GB/s per FSB segment. Keep in mind, though, that in reality a bus will not achieve its theoretical maximum bandwidth - depending on the type of bus it will probably realize 66-80% of the possible throughput.
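
The same arithmetic is easy to script for every bus on a block diagram. This is a small sketch with invented function names; it assumes decimal gigabytes (10^9 bytes) and reuses the 66-80% rule of thumb from the paragraph above.

```python
def peak_bandwidth_gb_s(mega_transfers_per_s, bytes_per_transfer):
    """Theoretical peak bandwidth in GB/s (decimal: 10**9 bytes per GB)."""
    return mega_transfers_per_s * 1e6 * bytes_per_transfer / 1e9

def expected_range_gb_s(peak_gb_s, low=0.66, high=0.80):
    """Rough achievable range, using the 66-80% rule of thumb."""
    return peak_gb_s * low, peak_gb_s * high

# Both FSB speeds mentioned above, with 8-byte (64-bit) transfers.
for speed in (1333, 1600):
    peak = peak_bandwidth_gb_s(speed, 8)
    low_end, high_end = expected_range_gb_s(peak)
    print(f"{speed} MT/s: peak {peak:.2f} GB/s, likely {low_end:.2f}-{high_end:.2f} GB/s")
```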

http://communities.intel.com/openport/servlet/JiveServlet/downloadImage/1246/block_diagram.JPG

So, where do you find these diagrams? If you are using an Intel server platform, the block diagrams can usually be found in the technical product specification for each board. If you purchase a platform from one of our OEM partners, ask your salesperson where to get it.

Look at the maximum bandwidth achievable on each link your data will travel over to gain a deeper understanding of how your workload will run on your platform.

Keep watching The Server Room for information on the other 7 habits in the coming weeks.


Here's the 2nd follow-up post in my 10 Habits of Great Server Performance Tuners series. This one focuses on the second habit: Start at the top.

Let me start by relating a true (although simplified) story. My team at Intel has built up years of expertise running a particular benchmark. So when the time came to start running a new, similar benchmark, we thought: "No problem." We began running tests while the benchmark was still in development. Immediately we had an issue: the type of problem that would normally indicate our hardware environment wasn't set up properly. We checked everything that we had seen cause the issue in the past, and we couldn't find anything. So, we blamed the new benchmark. After all, we were experts and we had been setting up these environments for years! We knew what we were doing. You can probably guess where this story is going: after weeks of doing things to work around the "benchmark issue", we figured out that we had mis-configured the environment, resulting in a bottleneck on one part of our testbed. We didn't thoroughly test that part of the environment because it had never caused us problems with the old benchmark. And of course, on the new benchmark it was critical. We had broken one of the most important rules of performance tuning: Start at the Top.

So now you know how easy it can be to not Start at the Top. Even seasoned performance engineers can get overconfident and forget this rule. But the consequences can be dire:

  • You have to eat major crow when you realize your mistake. I'm just now getting over the humiliation.
  • You might have put tunings in place to address issues that weren't really there. This is at best wasted work and at worst something that you have to painstakingly undo when you fix the real issue.

So...how do you avoid this situation? Simple: use the Top-Down Performance Tuning process. This means you start by tuning your hardware. Then you move to the application/workload, then to the micro-architecture (if possible). What you are looking for at each level are bottlenecks: situations where one component of the environment or workload is limiting the performance of the whole system. Your goal is to find any system-level bottlenecks before you move down to the next level. For example, you may find that your network bandwidth is bottlenecked and you need to add another NIC to your server. Or that you need to add another drive to your RAID array, or that your CPU load is being distributed unevenly. Any bottleneck involving your server system hardware (processors, memory, network, HBAs, etc), attached clients, or attached storage is a system-level bottleneck. Find these by using system-level tools (which I will touch on in the future blog for Habit #8), remove them, then proceed to the application/workload level and repeat the process.
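
Once you have measurements from your system-level tools, a first pass at spotting system-level bottlenecks can be as simple as comparing each component's measured throughput with its capacity. The sketch below is illustrative only: the component names, the numbers, and the 90% threshold are all assumptions, and a real analysis would also look at latency, queueing, and how load is distributed.

```python
def find_bottlenecks(measured, capacity, threshold=0.9):
    """Return (component, utilization) pairs at or above the threshold,
    most heavily utilized first. Utilization is measured / capacity."""
    utilization = {name: measured[name] / capacity[name] for name in measured}
    hot = [(name, u) for name, u in utilization.items() if u >= threshold]
    return sorted(hot, key=lambda item: item[1], reverse=True)

# Hypothetical capacities and measurements for one server.
capacity = {"network_gbps": 1.0, "disk_mb_s": 400.0, "memory_gb_s": 10.0}
measured = {"network_gbps": 0.97, "disk_mb_s": 120.0, "memory_gb_s": 3.5}

bottlenecks = find_bottlenecks(measured, capacity)
print(bottlenecks)
```

In this made-up case only the network link is near its limit, which matches the "add another NIC" example above.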


Being vigilant about using the top-down process will ensure you don't waste time tuning a non-representative system. And it just may save you some embarrassment!

http://communities.intel.com/openport/servlet/JiveServlet/downloadImage/1225/IMG_2506-measureBottleneck-edit2-x250.jpg
Always measure your bottlenecks!

Keep watching The Server Room for information on the other 8 habits in the coming weeks.



As a follow-up to my first post on the 10 Habits of Great Server Performance Tuners, this post focuses on the first habit: Ask the Right Question.

http://communities.intel.com/openport/servlet/JiveServlet/downloadImage/1207/Greg_Questioning_100by100_no_exif.JPG
6 years of performance work have taught me to start all my projects with this habit. Before I explain the kinds of questions I ask, let me demonstrate why this is important. Here are some example undesirable outcomes of performance tuning:


  • You spend months of experimentation trying to match a level of performance you saw reported in a case study on the internet, only to find out later that it used un-released software you can't get yet.
  • You spend months optimizing your server for raw performance. As part of your optimization you fully load it with the best available memory and adapters. Then you find out that your management/users would have been happier with a lower level of performance but a less costly system.
  • Your team works hard to maximize the performance of your application server for the current number of users you have, but makes decisions that will result in bottlenecks and re-designs when the number of users increases.

The outcome we are all hoping for with our tuning projects is that we provide the best level of performance possible within the budgetary, time, and TCO constraints we have. And of course, without sacrificing any other critical needs we'll have for our server, either now or in the future. Since performance optimization can take a lot of time and resources, consider the following questions before embarking on a project:

  • Why are you tuning your platform? (This helps you decide the amount of resources to dedicate.)
    • As part of this question, consider this one: How will the needs and usage models for this server change over the course of its life?
  • What level of performance are you hoping to achieve?
  • Are your expectations appropriate for the software and server system you are using?
    • In determining if your expectations are appropriate, refer to benchmarking results or case studies where appropriate and make sure any comparisons you make are apples to apples!
    • A corollary to this question is: is the server being used appropriate for the application being run?
  • What qualities of your platform are you trying to optimize: raw performance, cost/performance, energy efficiency (performance/watt), or something else?
  • Is performance your top priority for the system, or is scalability, extendibility, or something else a higher goal?

Thinking about the answers to these questions can help you navigate the trade-offs and tough decisions that are sure to pop up, and will help make your tuning project successful.

Keep watching The Server Room for information on the other 9 habits in the coming weeks.


I have been working as a full-time performance engineer at Intel for 6 years. I started by benchmarking server products for performance validation and now I focus on the TPC-C and TPC-E OLTP server benchmarks. I have used a variety of workloads in this job and spent time optimizing each level of the performance hierarchy: application, system, and processor. I, like many of you, have learned the "tricks of the trade" the hard way: by trial, error, and success. I'm sharing now, so you can all benefit from the things I've picked up along the way.

Let's start with some general methodologies to follow when tuning performance, whether you do it full-time, as a hobby, or just in your spare cycles after getting your "regular work" done. I will follow up with a more detailed post on each habit individually.


1. Ask the right question: Why are you tuning your platform? What level of performance are you hoping to achieve? What do you (or your users) care most about: raw performance, cost/performance, performance/watt, or something else?

2. Start at the top: The first and easiest part of your application server to tune is the hardware itself. Move on to the software and workload only after you feel confident that you have removed any system-level bottlenecks.

3. Know your Platform: This should be where you begin your system (hardware) tuning. The first thing, which I can't stress enough, is to get a block diagram of your platform. Then study it!

4. Know your BIOS: Server BIOSes these days come with more and more options. Be sure to give your new platform's BIOS a once-over. Pay particular attention to options relating to performance and power.

5. Know your Workload: To quantify performance, you need a workload! Some examples: web server response time, boot time, frames rendered per second, simultaneous connections supported, etc. Understand as much as possible about how the work gets done.

6. Try one thing at a time: Little changes that seem harmless can significantly alter the behavior of your system. Or worse, they can interact with each other to wreak havoc. Always try one change at a time, and for goodness' sake, do habit number 7.

7. Document and Archive: When you change something, log it! For each experiment you do, store your hardware and software configuration, performance level, and any collected data.

8. Use the right tool for the job: There are free data collection tools out there for various levels of the tuning process. System tuning tools include Performance Monitor for Windows and sar for Linux. Application-level tools include Intel® VTune™ for both Windows and Linux.

9. Don't break the law: Amdahl's Law, that is. Amdahl's Law tells us the maximum amount of performance improvement we will get from a particular enhancement. Amdahl can help you set your expectations properly and clue you in to when you should be suspicious.

10. Compare apples to apples: Todd Christ reminds us of this habit in the last paragraph of this post. Don't compare the performance of mis-matched systems. If you must do it, know exactly what the differences are: the processor, memory type/speed/vendor, a software component, chipset, etc. Dig into the configuration details!
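
As a quick illustration of habit 9, Amdahl's Law in its usual form says that if a fraction of the run time is sped up by some factor, the overall speedup is 1 / ((1 - fraction) + fraction / factor). The Python sketch below is just that formula; the function names and the example numbers are my own choices for illustration.

```python
def amdahl_speedup(fraction_enhanced, enhancement_speedup):
    """Overall speedup when `fraction_enhanced` of the run time
    is made `enhancement_speedup` times faster."""
    if not 0.0 <= fraction_enhanced <= 1.0:
        raise ValueError("fraction_enhanced must be between 0 and 1")
    if enhancement_speedup <= 0:
        raise ValueError("enhancement_speedup must be positive")
    return 1.0 / ((1.0 - fraction_enhanced) + fraction_enhanced / enhancement_speedup)

def amdahl_limit(fraction_enhanced):
    """Upper bound on overall speedup, however fast the enhanced part becomes."""
    if fraction_enhanced >= 1.0:
        return float("inf")
    return 1.0 / (1.0 - fraction_enhanced)

# Hypothetical case: 40% of the run time is made twice as fast.
print(f"overall speedup: {amdahl_speedup(0.4, 2.0):.2f}x")
print(f"best possible:   {amdahl_limit(0.4):.2f}x")
```

If someone reports a gain larger than the limit for the part of the workload they changed, that is your cue to be suspicious.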

So now you have the high-level list! Stay tuned to The Server Room for more information about each habit in the coming weeks.
