
The Data Stack

21 Posts authored by: WallyP
  1. IT departments will be embroiled in IT transformation from now on.

  2. Private Clouds proliferate

  3. IT shops are surprised at the number of corporate applications in the public cloud

  4. MicroServers will grow to more than 10% of the server market by the end of 2013

  5. Mission Critical will continue the steady migration to virtualized infrastructure

 

Prediction #1:  IT departments will be embroiled in transformation because they will now have competition.  Their customers will go to the public web if they can’t get support from the corporate shop.  This transformation is already happening.  One shop I’ve talked with is striving to provide their business unit customers with rapid deployment of virtual servers.  The old way of forcing these business units to wait six months or more for a server is rapidly ending.

 

When a business unit is told they’ll have to wait, the business opportunity that they are trying to tackle might just slip away.  Instead of letting that happen they’ll go to a public cloud provider and get a server in minutes.   One article recently referred to this as ‘Rogue Clouds’.  (See prediction #3 above!)

 

Of course IT shops don’t want to lose control, but the proliferation of alternatives for their customers is turning all this around.  IT shops have to transform to remain relevant.

 

To do this they are choosing to move to a homogeneous environment of low-cost two-socket servers or even MicroServers (see prediction #4 above).  By having a standardized environment on which to build a highly virtualized compute farm or even a private cloud (see prediction #2 above), the IT shop can provide flexibility and high availability to their customers.  Again, a shop I’ve talked to is using VMware’s vSphere to provide high availability to their customers.  They say that they have better availability and uptime at a much lower cost than they were getting in their AIX environment.  Suddenly Mission Critical in the private cloud starts to make sense.  (See prediction #5 above.)

 

Let’s build out that AIX example.  Say the AIX server is hosting 100 virtual machines in combinations of LPARs and virtualization software.  Let’s say that there is an unanticipated fault that brings down the entire machine.  Unless these virtual machines are clustered to virtual machines on another server, all 100 have just gone down.  To bring the machine back up, the structure of the virtual machines has to be recreated, followed by restarting restored images of the 100 servers.  That’s going to take a while.  One way to avoid this is to purchase or lease a second AIX server to act as a passive high availability cluster host.  The impact to the corporation could be hours of outage, or costs in the seven to eight figures for the clustered solution.

 

Another choice is to port the applications to a virtual server in a private cloud, where high availability can be selected as a configuration option when instantiating the server.  If the host server happens to fail, the virtual machine is instantiated on another identical server immediately; there is no need to wait while a replacement is configured.  The net impact to the corporation is possibly a second or so of outage.  A two-socket server doesn’t cost much, so keeping spares on hand and simply swapping a failed unit out of the rack is an affordable option.  And now with server disaggregation (yes, Intel is a collaborator) the task of configuring server farms is becoming simpler than assembling a model from toy blocks.

 

The role of IT is rapidly changing.  The role of the IT staff is changing.  The role of the server is changing.  The architecture of different brands of servers is losing any distinction.  The change is occurring at an ever more rapid rate.  The only thing we in IT can be assured of is that the ground under our feet is shifting and that flexibility is a guiding principle for our careers.

 

Follow me on Twitter @WallySP for more!

I'm one of the featured speakers on this webinar.  I'd sure like you to join me and ask questions. 

 

Get expert insight for modernizing your mission-critical IT environment

Intel® IT Center Talk to an Expert webinar series:
Tuesday, December 18, 2012
10 a.m. Pacific standard time

Keeping the most important workloads running 24-7 can keep IT managers up at night. It is also essential to be able to rapidly analyze the ever-increasing amounts of data being generated every day.  Increasingly, these challenges need to be met in a tight budgetary environment.

You’re faced with a strategic choice: Continue to try to meet these growing demands with legacy infrastructure, or migrate your mission-critical deployments to an open-standards-based environment built upon Intel® Xeon® processor-based solutions.

Join me and our panel of experts to learn a proven approach to modernization. Topics include:

  • Trends and market data related to mission-critical application modernization and a perspective on the updated solution stack
  • Strategic options to address the current software, hardware, and business challenges in your mission-critical environment
  • Practical steps for your mission-critical migration projects, including which workloads to transition first, and ways to divide the project into manageable pieces
  • Lessons learned in modernizing mission-critical environments, based on a rich history of innovation and global experience from IBM and Intel

Register now >

If you are using legacy UNIX infrastructure to host those applications…

 

What is a Mission Critical application?

 

Everybody has their own definition, but regardless of how the definition is framed, it boils down to the applications that are required to run the business: if the application is not running, the mission of the business is threatened or even shut down.  Consequently, mission critical applications are hosted on the most robust and redundant hardware platforms available.

 

Thirty-five years ago it was unthinkable to host such an application on anything but a mainframe.  Midrange servers existed, but only the VAX/VMS cluster could meet the high availability requirements needed to rival the mainframe.  Where these servers were used in business at all, they mostly handled departmental operations.  Minicomputers were relegated to the periphery of the business.

 

Then SMP (Symmetric Multi-Processing) servers began showing up in the ’90s, allowing for large UNIX servers and for configuring servers into high availability architectures.  Processors used in these servers ranged from proprietary RISC processors like SPARC and POWER to industry-standard processors like the 386, the 486, and eventually the early Pentium processors from Intel.  Lower costs compared to the mainframe led IT departments to host Mission Critical applications on these UNIX servers.  The last ten years have seen another shift: these large UNIX servers are increasingly being supplanted by lower cost servers built on high-performance, reliable Intel Xeon processors.  The chart below from Intel’s promotion of the Itanium Processor dramatically shows this shift.

 

IT Spend 2011.png

 

But the shift in IT spending isn’t the only message from this chart.  It also shows that the total amount spent on hosting Mission Critical applications has GONE DOWN over the last decade.  In 2001 IT spent $58.136B hosting Mission Critical applications; in 2011 the spend was $56.5B, about $1.6B less.  In addition, the chart shows that about 26% of that spend, the portion going to RISC servers, supports only about 3% of the hardware in the data center.

 

How can this be?  The answer is that IT Directors are increasingly turning to Intel Xeon based servers to host their Mission Critical applications, even as the number of Mission Critical applications in the business keeps growing.
How is it possible that the number and scope of Mission Critical applications has skyrocketed while the amount of money IT spends to run these applications has gone down?

 

The answer lies in Moore’s Law.


Moores Law.png

 

Basically, Moore’s Law says that the number of transistors on a processor will double every 18 to 24 months, which means that a processor’s capability to process data increases significantly on the same cadence.  The low cost Intel Xeon processor has been adding functionality along with speed, and has been capable of running just about any Mission Critical application in IT’s portfolio.  So the capability and performance of Intel processors increase with Moore’s Law.  Rather than doubling the physical size of the processor every 18 to 24 months, Intel is relentlessly driven to shrink the manufacturing process used to build the processor’s features.  Shrinking feature size keeps the processor similar in size to its predecessor while doubling the number of transistors, and it supports high manufacturing volume.  By manufacturing a high volume of processors, Intel is able to sell them at a reasonable price, near the price point of the previous generation.
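As a rough back-of-the-envelope illustration (my arithmetic, taking the 24-month end of the range, not a figure from any chart): starting from a transistor count of $T_0$, the count after $t$ years is

$T(t) = T_0 \cdot 2^{\,t/2} \;\Rightarrow\; T(10)/T_0 = 2^{5} = 32$

so a decade of Moore’s Law buys roughly a 32x transistor budget in a similar-sized, similarly priced part.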

 

Shrinking the manufacturing process of the processor features each generation requires new machinery in a whole new processor fabrication facility.  Constructing a new fab today runs from $3 to $5 billion, and it is estimated that the cost of a 300mm fab will soon rise to $10 billion.  The chart below shows that the number of companies with their own fabs will shrink in the next generation.

 

Fab Size.png

 

So what does this have to do with your paying too much for your Mission Critical compute infrastructure?  It’s pretty simple.  Intel makes hundreds of millions of chips each generation.  Fab costs, the costs of ramping up a new process technology, and R&D costs are amortized across hundreds of millions of processors allowing Intel to keep the costs for current generation processors and succeeding generations down.

 

All developers of processors face these issues in the development and manufacture of the processors they design.   For companies making only hundreds of thousands or even just a few million processors, the fixed costs of a fab, the variable costs of ramping up a new process, and the variable costs of R&D have to be amortized across just those processors.  With fewer processors to spread the costs across, something has got to give.  Either the manufacturing process has to stay the same as the previous generation or the price has to go up to absorb the costs.

 

Instead Intel has been able to provide increased performance and features generation after generation while keeping the price to the consumer at a reasonable level.  Most IT managers are aware of this.  They are faced with needing to meet ever tightening Service Level Agreements while their budgets are being kept flat or even being reduced.  These managers are able to meet these demands by putting their Mission Critical application on servers built on the reliable and powerful Intel family of Xeon processors.

I get questions occasionally from customers.

 

One recently was, ‘Can Intel Xeon Processors handle a 20TB Oracle database?’

 

We get this question occasionally, and the question doesn’t quite make sense to me.  I understand the basis of the question; the customer is concerned about whether Xeon can tackle a very large database.  Is the question really, ‘Can Xeon read in a lot of data and process it efficiently and quickly?’  We can easily show that the Xeon E7 family of processors can do this faster in benchmark tests than most proprietary RISC processors.

 

xeone7_tpch_1kGB.jpg

Higher is better

 

 

Where the question falls apart is in the premise: can a 64-bit Xeon address 20TB?  If a 64-bit RISC processor can address 20TB, then a 64-bit Xeon can as well.  No database is going to read 20TB of data at a time and besides, an Oracle database is going to contain a lot of space that is either empty or unused.  (For instance, is there really 20TB of data, or is it really 12TB or less?)  But the concern of the customer usually goes deeper.  So let’s break this issue down.
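Just for scale (my arithmetic, not part of the original question): a 64-bit address space spans

$2^{64}\ \text{bytes} = 16\ \text{EiB} \approx 840{,}000 \times 20\ \text{TiB}$

so 20TB is nowhere near an addressing limit for any 64-bit processor; the practical limits are platform memory and I/O, which are covered below.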

 

What is the number of users?  This is a useful question.  For instance, is it a data warehouse with only a handful of users?  Or is it a highly transactional database with thousands of users?  In either scenario Xeon is great.  (In 2008 and 2009 I was a DBA for Oracle on a benchmark they were running of a 10TB medical database with between 10 and 20 thousand users.  The Xeon processors used for that benchmark were several generations older than today’s.)

 

Another question that may really be being asked is: ‘What is the largest data file I can create for my 20TB database?’  What I’ve found behind this question is a concern about the manageability of the database, given the number of datafiles that would need to be created to get to 20TB.  (For that benchmark three years ago it took me all weekend to build a 10TB database with 1GB datafiles.  I had them spread out, but there were an awful lot of them.  Today, with much faster I/O, creating a 20TB database will go much faster.)
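To put a number on that concern (my arithmetic, assuming 1GB datafiles as in the benchmark above):

$20\ \text{TB} \div 1\ \text{GB per file} = 20 \times 1024 = 20{,}480\ \text{datafiles}$

which is a file-management problem, not a processor problem; larger datafiles shrink the count proportionally.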

 

Another concern raised by the original question is memory addressability.  For large databases, the thinking goes, the datasets being processed in memory are very large.  Can Xeon address as much memory as a proprietary RISC processor?  In other words, can Xeon scale up?  Do platforms built on the Xeon E7 processor have the same memory capacity as servers with a proprietary RISC processor?  We can easily demonstrate that Xeon fills the bill: various vendors offer platforms supporting 2TB to 6TB of RAM.

 

Another concern raised by the question might be concurrent processing.  With a 20TB database, a lot of the processing may use Oracle’s parallel query capability.  The Xeon E7 family, with its multi-core and Hyper-Threading technologies, can easily handle significant parallel processing.  For example, I started running Oracle’s Parallel Query Option (PQO) in 1996, when the feature first came out, on a 24-processor Sequent server using Pentium processors.

 

I imagine there are additional ways to break this question down but overall the question: "Can the Xeon E7 processor run a 20TB database?" deserves an answer that addresses the real issues.   The simple answer is a resounding YES!

You have 60 UNIX servers from one vendor running a consumer-facing application; the finest money could buy four years ago.  They have performed adequately and the vendor has provided good support, but now the bill is due as the vendor keeps increasing the cost of support.  Upgrades are expensive forklift changes with no promise of reduced costs.  That’s the hard place.  The rock is that the increasing cost of keeping this consumer-facing application going is leading marketing to impose a fee for use on the consumers.  Consumers are up in arms and setting up petitions on change.org.  The news channels are featuring this new fee as an example of the squeeze on consumers.

 

This is the big squeeze that many enterprises find themselves in.  How do I reduce the Total Cost of Ownership (TCO) of an application while at the same time providing the new services that consumers have come to expect for free?  And if you don’t do it, the competition will.

In fact, the ZDNet report on IT budget priorities in 2012 describes the divide between IT budget winners and IT budget losers:

 

… these results paint a picture of a two-speed IT market in which some organizations are pushing ahead aggressively with transformative projects based on new technology or new delivery methods while others are bunkering down and looking inside for cost cutting opportunities. The danger for the latter group is they may find it increasingly hard to compete against leaner, more agile, more modern and more automated competitors.

 

You can get ahead in this game.  ‘New technology and new delivery methods’ is another way of saying that these budget winners are moving to servers based on low cost Intel Xeon processors.  These processors bring the qualities you have relied on in proprietary UNIX processors into a low cost, high performance family.

 

You’ll have to make sacrifices.  It will mean some work and those expensive servers will have to find a new home.  Maybe there is a ‘Rescue’ organization in your community that will take them off your hands and find them a new good home.

The hardest part will be going from a sole-source vendor to having a wide range of vendor choices.  Gosh, you can choose from IBM, Dell, HP, Bull, Cisco, Oracle, SGI, SuperMicro, and Lenovo, just to name a few.  You also have a choice of rock-solid enterprise operating systems, from Linux distributions such as Oracle Linux, Red Hat, and SUSE to non-Linux operating systems such as Windows and Solaris.  Now you can be sure to get the lowest cost possible, and you are no longer locked into one vendor.

e7 Server OEM slide 1 slide.jpg

 

Now you have options but you need a plan to get there.  With this many servers hosting one application the effort can seem daunting.

Break the effort down.  What are the easy parts to migrate?  For instance, COTS (Commercial Off The Shelf) products that have both a UNIX version and a Linux version and don’t involve extensive customization can be the low-hanging fruit, so to speak.  Databases that can be migrated using a tool from the database vendor can come next; Oracle Streams for Oracle databases and Replication Server for Sybase are examples of these tools.  In this manner a lot of the application can be converted with relative ease.

 

So now the user-facing layer has been migrated, and the backend layer of databases has been migrated.  The middleware layer can be complex: it may include logic servers with lots of custom code, or, with a big ERP package, it may be entirely COTS software.  This middleware can be a big challenge.  But now you have migration experience and can approach it as a veteran.

 

One word of caution here: I keep hearing stories from veterans of these IT modernization efforts about something being missed during the migration.  Usually what is missed are the undocumented or poorly documented data feeds that were added in a hurry to or from some application.  You can try to avoid this by running a rehearsal migration before the production one and having the application tested as if in UAT, or with a test harness.

What am I talking about here?  See my white paper on migration methodologies on our website.

 

Let me end this with a story about how using the Intel Xeon processor family significantly helped a university meet its data processing requirements.  The University of Colorado had to address rapidly rising compute costs.  They decided to migrate their mission-critical IT environment from legacy UNIX servers running RISC processors to Intel® Xeon® processors. The results were staggering. Their data center footprint dropped to 1/20th (5%) of its former size. Power consumption dropped to 1/10th (10%) of its pre-migration level. And their ERP performance jumped by 400% to 600%. In Year 1 alone, they saved $600,000.  While your results may vary, you can read the white paper on this achievement on the Intel.com web site.

 

Do you find yourself caught in the squeeze I discuss here?  What are some of the steps you have taken?

Costs are being closely watched in every division of an enterprise, but particularly in the IT department - controlling IT spending is an objective in nearly every organization. There are also corporations where what has traditionally been described as IT is now the basis of the business (think of Amazon, Google, PayPal, Salesforce.com, etc.), so limiting IT costs directly reduces the cost of operations. This transition point in IT is being driven by the appearance of clouds, whether dark or with a silver lining. The choices for IT administrators tend to fall into a) an internal, private cloud, b) an external, public cloud, or c) a hybrid cloud consisting of both private and public clouds.

 

It is readily apparent that running applications in the cloud reduces costs by providing IT customers with self-service provisioning. The paradigm shift in retailing that has significantly reduced staffing and customer service representatives is now reaching IT. Increasingly the individual, whether a staff member in a corporation or an entrepreneur starting a new Web based business, can self-select the computing resources required for an application. Businesses and governments are moving into the cloud.

 

Furthermore, the cloud has turned the economics of backup and recovery on its head. In the public cloud, the cloud vendors maintain backups for virtual servers. It would be just as easy for administrators of private clouds to maintain standby backups of the applications at significantly lower cost than the traditional duplicate architecture used in many DR schemes.

 

(I’ll admit that the vision I embrace here doesn’t exist without glitches but the hurdles are technical and will be overcome. The ODCA is working to develop solutions to address these issues.)

 

The cloud is great for new and dynamic applications. Developers can work in test environments in the cloud, shortening development cycles dramatically.  Because a virtual server in the cloud is dynamic in and of itself, testing performance parameters of applications - both new apps and upgrades - can result in previously unavailable accuracy in provisioning the eventual production server. Costs and time associated with application development can be significantly reduced.

 

Hybrid Clouds, discussed here by Billy Cox, add to the mix by offering an alternative to building the entire infrastructure in-house. Many components of an application can be hosted outside the walls when the application does not use proprietary data.

 

The old consultant adage is "You want it right? You want it soon? You want it cheap? Choose only two." The cloud is allowing some managers to choose all three.

 

This is great news for those developing and deploying new applications. Those legacy applications running in Linux* or Windows* servers can likely take advantage of cloud economics. Migration to the cloud from Windows or Linux can be done carefully and efficiently using automated P2V tools from various vendors.

 

But what about all those applications running on AIX or Solaris* platforms: how can they take advantage of the economies of the cloud? This may require a complex migration, as most clouds are built on servers from various manufacturers running Intel Xeon processors. With this we’re back to the old RISC-to-Xeon migration issues (i.e., big-endian Power* and SPARC* processors versus little-endian Xeon processors).

 

The first step in planning to move an application running on expensive legacy servers is to look at my migration spectrum. What sort of application is being targeted to run in the cloud? Where does it fit on the spectrum? Determining this will give you a 10,000 foot view of the effort that will be required.

 

Most applications running on expensive legacy servers are running in partitions. I suppose some of you were thinking, how does an application on a Power 5 server fit into a server carved out of the cloud? Isn’t that something like trying to fit the size 23 feet of Shaquille O’Neal into size 10.5 sneakers? But the applications are usually smaller than the entire server because they are running in a partition, or the utilization rate is near the data center norm of 15%.   The best news is that the performance difference between modern Intel Xeon processors and Power 5 servers is such that an application running in the legacy server can easily fit in a server carved out of the cloud.

 

Please share some of your experiences in migrating legacy UNIX* applications into the cloud. Did you find it difficult?

In the first post on this topic on server modernization, I started us off on moving an Oracle Database the Old Way "Quickly" and have gotten us to this point.

 

Now we will set up all of the scripts for the application and the network links. This is the tedious part - there are many individual scripts to export/import the data in the tables. Organize these into a giant script that will run them all at once. Below is a sample of the beginning of the script. Make one for each separate group of tables you are moving. (You can see how old this is from the ORACLE_HOME.)

 

 

#!/bin/sh
######################################################################
# set -x
#  FILENAME: expimp.sh
#  USAGE: expimp.sh 'table group to move'
#         The argument selects the exp/imp parameter files to use.
#
#  for example:
# nohup expimp.sh struct &
# Will export/import the database structure from one system to the other
#  and use a dummy label for the USER
#
######################################################################
RET=0
E_ORAHOME="Legacy server ORACLE_HOME"
I_ORAHOME="Target server ORACLE_HOME"
WORKDIR=`pwd`                       # working directory (don't clobber PATH)
TABLE_NAME=$1
RESH="/usr/bin/resh target-1"       # remote shell to the legacy server
SRC_MKNOD="/etc/mknod"
DST_MKNOD="/etc/mknod"
DATE=`date`
echo $DATE
ORATAB=/etc/oratab
EXP="/home/oracle/product/7.3.4/bin/exp"
IMP="/home/oracle/product/7.3.4/bin/imp"
EXPFILE="/export/home/wpereira/exp${TABLE_NAME}.par"
IMPFILE="$WORKDIR/imp${TABLE_NAME}.par"
EXPLOG="/export/home/wpereira/exp${TABLE_NAME}.log"
IMPLOG="$WORKDIR/imp${TABLE_NAME}.log"
SRC_USER="system/password"          # substitute the real credentials
DST_USER="system/password"
EXP_PIPE="/tmp/eximpipe.$$"         # named pipe on the legacy server
IMP_PIPE="/tmp/eximpipe.$$"         # named pipe on the target server
# Start the export on the legacy server, writing into its named pipe
${RESH}  "
    $SRC_MKNOD ${EXP_PIPE} p
    export ORACLE_SID=ORCL
    export TWO_TASK
    export ORACLE_HOME=${E_ORAHOME}
    $EXP userid=${SRC_USER} file=${EXP_PIPE} log=${EXPLOG}  parfile=${EXPFILE} & " &
#
#
ORACLE_SID="ORCL"; export ORACLE_SID
ORACLE_HOME=$I_ORAHOME; export ORACLE_HOME
$DST_MKNOD ${IMP_PIPE} p
# Stream the remote export pipe across the network into the local pipe
${RESH} "cat ${EXP_PIPE}" > ${IMP_PIPE} &
# run the import
$IMP userid=${DST_USER} file=${IMP_PIPE} log=${IMPLOG} parfile=${IMPFILE} &
# wait for the export/import pipeline to finish before exiting
wait
exit 0

 

 

Be sure to set up a high speed link between the legacy server and the new server. This link will allow for remote execution of the scripts.
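Before the real run, it is worth sanity-checking that link. A minimal sketch, reusing the resh command and the target-1 host name from the sample script above (adjust for your environment):

# Confirm that remote execution works
/usr/bin/resh target-1 uname -a
# Rough throughput check: push 1GB of zeros across the link and time it
time /usr/bin/resh target-1 "dd if=/dev/zero bs=1024k count=1024" | dd of=/dev/null bs=1024k

The sustained rate you see here puts an upper bound on how fast the export/import pipes can move data.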

 

 

Once the database has been created, you need to create all the objects in the database for the application.  Just as it allocates space for tablespaces on disk, Oracle has to allocate space for objects like tables while the rows are being loaded. This space allocation takes time, so the better approach is to pre-allocate the space before the actual data movement, which takes place while the application is shut down.

 

In this step you are also creating the VIEWS, PROCEDURES, PACKAGES, TRIGGERS, SYNONYMS, etc. of the application in the database. (You’ll have to disable the TRIGGERS before the data loads.)
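A minimal SQL*Plus sketch for generating those DISABLE statements (the APPUSER owner is a hypothetical placeholder); spool the output, review it, then run it:

SET PAGESIZE 0 FEEDBACK OFF
SPOOL disable_triggers.sql
SELECT 'ALTER TRIGGER ' || OWNER || '.' || TRIGGER_NAME || ' DISABLE;'
FROM DBA_TRIGGERS
WHERE OWNER = 'APPUSER';
SPOOL OFF

The same query with ENABLE in place of DISABLE produces the re-enable script used later.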

 

Now that you have the export/import scripts set up, invoke them in one super script. You probably won’t be able to do the entire database in parallel. During the rehearsal, you’ll find where to put in the ‘wait’ statements.
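Here is a minimal sketch of such a super script (the group names are hypothetical); each call runs one of the export/import scripts described above, and the wait statements throttle how many run at once:

#!/bin/sh
# First batch: the biggest tables, each streaming through its own pipe
nohup ./expimp.sh bigtab1 &
nohup ./expimp.sh bigtab2 &
nohup ./expimp.sh smallgrp1 &
wait            # don't start the next batch until these finish
# Second batch
nohup ./expimp.sh smallgrp2 &
nohup ./expimp.sh smallgrp3 &
wait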

 

Once all the ROWS have been imported, you can invoke the set of scripts that create the INDEXes. When these are finished, run the script to ENABLE the CONSTRAINTS, which will itself generate more INDEXes. The next major step is to ENABLE the TRIGGERS. This is also done by script.
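A sketch of how the ENABLE script can be generated from the data dictionary (APPUSER is again a placeholder owner); sorting referential constraints last keeps foreign keys from being enabled before their parent keys:

SET PAGESIZE 0 FEEDBACK OFF
SPOOL enable_constraints.sql
SELECT 'ALTER TABLE ' || OWNER || '.' || TABLE_NAME ||
       ' ENABLE CONSTRAINT ' || CONSTRAINT_NAME || ';'
FROM DBA_CONSTRAINTS
WHERE OWNER = 'APPUSER'
ORDER BY DECODE(CONSTRAINT_TYPE, 'R', 2, 1);
SPOOL OFF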

 

The audit of the database is among the last steps. Match the output of DBA_OBJECTS in both databases. You also need to get the number of ROWs in the legacy database and compare that to the number of ROWs imported into the target database. This is where the team comes in handy; each member is doing a different audit task.
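A minimal audit sketch (APPUSER is a placeholder owner): run the first query on both databases and diff the spooled files; the second query generates per-table row count statements to run and compare the same way.

SET PAGESIZE 0 FEEDBACK OFF
SPOOL object_counts.out
SELECT OBJECT_TYPE, COUNT(*)
FROM DBA_OBJECTS
WHERE OWNER = 'APPUSER'
GROUP BY OBJECT_TYPE
ORDER BY OBJECT_TYPE;
SPOOL OFF

SPOOL count_rows.sql
SELECT 'SELECT ''' || TABLE_NAME || ''', COUNT(*) FROM APPUSER.' || TABLE_NAME || ';'
FROM DBA_TABLES
WHERE OWNER = 'APPUSER';
SPOOL OFF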

 

Now open the application for use by the testers. Because you don’t want to add any test data to the production database, this has to be done carefully.

 

If everything passes muster, then the target database can be opened for production. For the paranoid, the legacy database can be opened for parallel production in the event that the firm wants to move back to the old system.

 

Finally, document the entire effort so that the configuration is clear for others to understand. And go get some rest.

 

This process doesn’t take advantage of Oracle Streams or Oracle’s GoldenGate. But I believe there is a use for this methodology even today. What do you think? Do you agree or disagree?  Do you have migrations that will require this older approach because no other will work? Do you have any suggestions to improve this process? What have you seen when doing this with Datapump? Do you recommend any changes to this methodology when using 10.2 or 11.2 as the target database?

Let’s say you’re moving an Oracle database from a legacy RISC server to Linux on Intel architecture. You’ve been asked to improve the performance of the application along with the migration. That’s easy: the move itself will likely accelerate the performance (pdf) of the application by hosting it on a faster platform. But you know you can get more from the database by taking advantage of the migration to reorganize its structure. There is also a limited outage window in which to execute the migration. To reorganize the database you’ve chosen to use the Oracle export and import tools (or Data Pump, if moving from a newer Oracle database). These tools, you know, will give you the best opportunity to reorganize the database.

 

So, how are you going to do this on the target server? Let’s go through the steps:

 

  1. Configure the hardware, memory, BIOS, operating system (sysctl.conf?), storage (HBA Driver anyone?), and network configuration.
  2. Install Oracle Database Enterprise Edition and create a sample database.
  3. Document the tablespaces and data file sizes of the production database (query the data dictionary).
  4. Start the tablespace creation process on the new database to replicate the tablespaces of the production database. (See below for a speed up idea.)
  5. While these are running:
    1. Export the source database from a quiesced Q/A database (ROWS=N)
    2. Create an INDEX create script by Importing the export file you just made with the parameter INDEXFILE=’filename’ (FULL=Y)
    3. Edit the ‘filename’ file to have each INDEX run in Parallel Query mode
    4. Generate a file of the constraints for each user
      • Edit this file and make a copy, one to disable the constraints, the other to enable the constraints
  6. Configure a private network (the fastest available) between the source server and the target server. Allow for remote procedure calls.
  7. Create a series of scripts to export the source database table by table, starting with the largest table first.  Bundle the smaller tables into separate export/import jobs. Create import scripts that import the same tables as the export scripts (ROWS=Y, INDEXES=N, CONSTRAINTS=N); you’re just moving the data. A sample pair of parameter files is sketched after this list.
  8. Put these all into a shell script where they are called all at the same time. Be sure to have the output from each script sent to a file.
  9. Run an import to pre-create the various OBJECTS or structures of the application in the database but without ROWs, INDEXes, or CONSTRAINTS.
  10. Start the data migration.
    1. Shut down the application and put the legacy database into single user mode
    2. Disable the TRIGGERS in the target database
    3. Fire off the script to export the data from the legacy database to a named pipe and import into the target from the same named pipe
    4. Once the row data has been migrated, start the script to create the INDEXes in the target database
    5. Run the script to ENABLE the CONSTRAINTS in the target database
    6. Run the script to ENABLE the TRIGGERS
  11. Audit the database to ascertain that all OBJECTS in the Legacy database are in the target database and that each TABLE has the same number of ROWS in both databases.
  12. Open the new database to production (or mirror this with the legacy database).
  13. Disable the private network used for the migration of the data.
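To make step 7 concrete, here is a minimal sketch of one export/import parameter file pair for a single large table (the APPUSER schema and BIGTAB table are hypothetical; the file= and log= settings are supplied on the exp/imp command line by the wrapper scripts):

# expBIGTAB.par - rows only, structure comes from the pre-create step
userid=system/password
tables=(APPUSER.BIGTAB)
rows=y
indexes=n
constraints=n
grants=n
triggers=n
direct=y

# impBIGTAB.par - load the rows into the pre-created, empty table
userid=system/password
fromuser=APPUSER
touser=APPUSER
tables=(BIGTAB)
rows=y
indexes=n
constraints=n
ignore=y
commit=y
buffer=1048576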

 

 

First create a database with just the minimal tablespaces (SYSTEM, ROLLBACK, USER, SYSAUX, etc.), but make each tablespace a size optimal for the application. Then create the application-specific tablespaces, laid out on storage the way you envisioned initially. To get the list of application-specific tablespaces and their sizes, use the following script:

 

COLUMN TABLESPACE_NAME FORMAT A45
COLUMN SIZE_IN_BYTES FORMAT 999,999,999,999
SPOOL tbs_size.out
SELECT TABLESPACE_NAME, SUM(BYTES) SIZE_IN_BYTES
FROM DBA_SEGMENTS
GROUP BY TABLESPACE_NAME;
SPOOL OFF

 

 

For these custom tablespaces there is a trick to build them in parallel. While you can’t CREATE one tablespace in parallel with another, you can add data files to the tablespaces in parallel. For example, create the initial data file for each tablespace at 100MB. Then do an ADD DATAFILE for each tablespace at a respectable size. You can execute as many of the ADD DATAFILE DDL commands in parallel as your server and storage can handle. (This activity will also give you a good opportunity to measure the maximum sustained I/O to the storage.)
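A minimal sketch of that trick (tablespace names, file paths, sizes, and credentials are all hypothetical): one background sqlplus session per data file, all running at once.

#!/bin/sh
# Add a second data file to each application tablespace concurrently
for TS in APP_DATA APP_INDEX APP_HIST
do
  sqlplus -s system/password <<EOF &
  ALTER TABLESPACE $TS ADD DATAFILE '/u02/oradata/ORCL/${TS}_02.dbf' SIZE 2000M;
  EXIT
EOF
done
wait    # all ADD DATAFILE statements have completed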

 

While the data files are being added is a good time to generate the INDEX creation DDL. To get the CREATE INDEX text, use import with the INDEXFILE option. Edit the DDL to put the INDEXes in the tablespaces you want them in, with optimal EXTENT sizes. The same file also contains the CREATE TABLE statements (commented out); run those to create the empty tables. Now you have completed the space allocation for the tables and eliminated this time-consuming process from the migration schedule.
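A sketch of the command (the dump file name is hypothetical; INDEXFILE, FULL, ROWS, and LOG are standard imp parameters):

# Write the index (and commented-out table) DDL to create_indexes.sql
imp system/password file=expstruct.dmp full=y rows=n indexfile=create_indexes.sql log=indexfile.log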

 

In part two I will continue the steps to get this move complete, including a snippet of a Bash shell script to run the export/import processes in parallel.

(I'm just giving you one thread in the sample; you'll have to duplicate the lines, edited for your particular circumstances.)

It’s time for the final migration.  You’ve completed a Proof of Concept and a rehearsal for the process but now you have to complete the process and move the actual production application while it remains in production. Regardless of the migration methodology you’ve chosen, if you don’t get a lump in your stomach at the start, you’ve got ice water running in your veins. I’ve heard this process compared to changing all the tires on an 18-wheeler while it is still running down the road.

 

All of the previous steps taken lead up to this event.  The initial analysis defined the scope of the migration.  The proof of concept determined what the scope really is, and defined all of the components involved in the migration.  Also, the proof of concept likely ported and proofed the components that could be rewritten or modified without impact on the production system.  The rehearsal provided a timeline for the migration, along with an estimate of the production outage that may need to occur.

 

With all this data, the application owners and those charged with the migration need to sit down and plan this final step.  Cold feet can’t stop this intrepid team. Resumes prepared and recruiters notified, the team pushes on.

 

The final step plan starts with targeting the outage for the most reasonable time. This is often a weekend or a late evening, depending upon how long the outage will last.  Often, holiday weekends are chosen (I once did a migration over the Fourth of July holiday).  Include best case and worst case scenarios for the outage duration in your plan.

 

Start the migration by cleaning up the target server.  Set the BIOS to the optimal settings you determined during the POC.  Load the operating system and set the parameters. If you’re installing Linux, add any required packages and, of course, add sysstat.  Install any application systems that are required by the application you run. Tune these for the application and platform.

 

Begin the migration by setting up the application system.  If you are moving a database, you set up the database.  If you are hosting a custom application, install the source code and compile the executable.  Set up the network links for the final conversion.

 

As soon as all the preliminary tasks are done, begin the migration.  For a migration of a custom coded application, this could be as simple as pointing the other applications dependent on it to the new host.  If a binary data store like a database is involved, then the migration can be a lot more complex.  But you know what to do -- you’ve found out the best method in the POC and practiced it in the migration rehearsal.

 

Once the data is moved over and the application is ready to run, if possible run the test harness against the new system.  Audit the database to be sure all the tables, indexes, procedures, views, referential integrity constraints, database links, synonyms, and other database objects are all there. Finally, the last step is opening it back up to production. Then, go get some sleep!

 

After one migration I went back to the hotel and slept hard. I’d had maybe 4 hours rest in 72 hours, and I was pretty bleary.  When I went in Monday morning, there were cars in the parking lot.

 

"Hurrah! It worked! I thought. The company wasn’t going to let any workers into the plant if the migration had failed, but here they were working.   Subsequent testing by the user team assembled by the company to proof the process showed a few glitches on Monday (and for a few more days) but those were quickly tackled by our QA team. The lesson here is that once you’re finished with the steps, you aren’t completely done. You still have to monitor the situation to ensure all is well.

 

However, the migration is now finished.  You’ve got an all new platform for the application.  The old one can be put out to pasture (but it’s probably not a good idea to use it to make a fishing reef, given all the chemicals in a computer system!)

The modern application can be quite a complex beast.  Some applications are self-contained in one huge binary (or more likely a collection of binaries) working together.  It is surprising how many times I encounter someone struggling in a customer service scenario with a ‘green screen’.  They fill in the blanks on the screen by pressing all sorts of function keys.  These applications are often distinguished by the frustration of the individual cursed by having to interact with it.  The logic and the data of the application are all in the binary and associated data store.  This application architecture has mostly been found in mainframes with transaction processors like CICS controlling the action.

 

Other applications are still part of the older Client Server architecture, where the backend is a database server supporting clients running on desktop computers. Here, the logic and data are in a back-end database. The logic is invoked by the clients running executables on the backend server. The backend server in this architecture has to be big enough to run the database and all of the client processes.   The Oracle Applications 10.2 is a good example of this architecture.  As for hardware, this architecture drove the sales of ever larger and more powerful backend UNIX SMP servers.  Processing threads and memory made this application run efficiently.

 

Client Server Applications

 

I consider other modern applications a variant of the Client Server, in that the back-end database is still there holding all the data but the clients have changed to web servers that support the users interacting with the application using browsers.  Logic is now running in the web servers on smaller Linux or Windows boxes but the application is still beholden to a big backend SMP UNIX or Linux server.  Examples of this sort of application are SAP and Oracle Applications 11.

 

 

ApplicationArchitecture.jpg

 

There are the ‘XaaS’ applications.  Add your favorite letter in place of the X such as an ’I’ for Infrastructure or an ‘S’ for Software.  These applications can have multiple databases, as well as other data sources like data from the web to support the application.  In this architecture, there is no one backend database controlling the application.  The logic of the application is in the web servers that make up the application.  The application gets data from multiple services.

 

 

DC_SOA_Diagram.jpg

 

Sometimes, the application servers and database servers are all co-resident in the same server.  Each application is segregated from the others using a form of virtualization in big UNIX servers, such as AIX LPARs or Solaris Containers.

 

LPAR2RRD-arch1.jpg

 

All of these application types lead to different migration procedures, but the individual migration steps from the migration spectrum don’t change.  Each component of the architecture running on a big UNIX (RISC) or mainframe server needs to be migrated separately.  In other words, whether the middle tier is on a discrete hardware platform or on a virtualized server, move the applications in each tier as a distinct effort.  By breaking the work down into distinct parts, the process can be completed with less risk to the overall application.

 

Migration of Client Server


This may prove to be the most challenging migration.  I’ve had a lot of experience here, as this was the predominant architecture of systems into the mid-’90s.  This architecture has the application code and the data store on the same server.

 

This migration presents a number of temptations: the temptation to update the application system to a multi-tier, web-based architecture; the temptation to upgrade the data store at the time of the migration (then again, if you are running Oracle 8 or Oracle 9i, maybe you have to upgrade because the new system cannot host these older versions); and the temptation to ‘fix’ problems with the application system. If you can’t resist the temptations, then add several weeks or months to the migration plan.

 

If you haven’t succumbed to the temptations, then this migration includes some of the steps from the code conversion migration methodology (for the application systems) AND the steps found in the database migration methodology. All of this has to be done concurrently to minimize any outage. This requires a lot of testing and planning to ensure the migration steps converge at the proper time.

 

An example is an Oracle database with a custom application system accessing the data.  The first step would be to convert the application system (shell scripts, PERL, make files, OCI programs, etc.) on a test platform and proof it. Then, test the conversion of the database to the target system and possibly the new version of the database. If it is possible, upgrade the database on the legacy platform first.

 

Migration of a Multi-Tier Application


This application architecture has a web tier that faces the world and an application tier that contains the application logic and sits between the web tier and the data store tier. The data store tier hosts all of the data supporting the application and is usually a commercially available database server like Sybase, Oracle, or DB2 (if it is SQL Server, then your work is done for this tier).

 

This migration may be close to finished already.  Often, the web tiers and application tiers already run on Intel based servers.  Other configurations will have the tiers in virtual machines in one large server, such as LPARs in an AIX server.

 

In the first case, where only one or two tiers run on legacy servers, the migration resembles the client server migration, but because the application systems are isolated from the other tiers, each tier can be migrated separately. The migration process would be:

 

  1. Migrate the web tier to Linux servers.
    1. Install the native Linux version of the web server program and modify the configuration.
    2. Use these web servers for the production application.
  2. Migrate the application tier.
    1. Install a native Linux version of the application, or
    2. Convert the application using the code conversion methodology.
    3. Use these application servers for the production application.
  3. Migrate the database using the database migration methodology.

 

Migration of Cloud Services


Not all of the services supporting a cloud application run on Intel hardware. Some run on mainframes, and others on legacy UNIX servers. Determining the number, effort, and sequence of the migrations of these services is part of the initial analysis of the systems.

 

Tackle each one following the appropriate methodology. Bring them online individually to avoid the problems associated with adding too many variables when determining solutions to problems.

 

The point here is to demonstrate that the UNIX or Mainframe to Linux migration methodologies are modules that can be combined or called upon as needed to migrate applications. The savvy manager can add these to the overall migration plan.

 

Here’s to your successful upgrading of your application infrastructure!

I was at a conference recently, and while I demonstrated the Xeon E7 processor RAS (Reliability, Availability, Serviceability) features, I discussed RISC to IA migration with the attendees. Occasionally I got blank stares. "Oh, we’re doing server migration," was a common response, along with "we’re migrating our UNIX servers to Linux."

 

Perhaps we here at Intel get ourselves tripped up over our usage of terms. Many of us live in the world of processor architectures.  This leads us to use the term RISC to IA, which refers to the different processor architectures.  Moving up the stack to the hardware layer, some refer to Server Migration or Platform Migration.  Let’s take another step up the stack to the operating systems. Here, what we discuss is referred to as UNIX to Linux migration.  This also includes moving applications from proprietary servers to the cloud, and even data center consolidation, where large proprietary servers are replaced by virtual servers running on Intel based hardware.

 

If you’re doing any of the above, then you need to read these blogs for hints and suggestions.  You should also visit the server migration website for helpful tips.  The website is public and is filled with white papers, case studies and How-Tos. We add to the site all the time, so keep visiting us!

 

The term proprietary hardware usually refers to servers that are marketed uniquely by one vendor.  For example, POWER 7, sold uniquely by IBM, or SPARC sold by Oracle. Both fall into this definition.  While these processors have many good features, they have a serious limitation.  Usually, once you buy into the server line, you are limited to the same processor based servers for future upgrades. These upgrades can involve significant physical changes in the data center.  Often, the term ‘forklift upgrade’ is used as this type of upgrade entails the replacement of an entire rack of hardware.

 

The term ‘Open Servers’ is used for hardware that utilizes the x86 architecture.  This hardware is characterized by two processor vendors and numerous choices for server platform manufacturers, as well as form factors.

 

Another term used here is commodity server.  This usually suggests something cheap, but low cost Xeon processors are anything but ‘cheap’ in the pejorative sense.   Here, commodity refers to availability from a wide range of vendors in a variety of form factors.

 

Another term used at a very technical level is Endianness. This refers to the byte order of multi-byte data. In a crude example, say the number 1758 is represented by two bytes. On a Big Endian machine, like SPARC or POWER, the first byte holds 17 and the second byte holds 58. On a Little Endian machine, like the Intel Xeon, the first byte holds 58 and the second byte holds 17. There are a lot of reasons for each architecture, but suffice it to say that multi-byte data has to be converted before it can be used by a machine with a different byte order.  While some programs are written to be ‘endian neutral,’ most databases are not, and custom built programs usually aren’t either.
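A quick way to see the difference on a live system (a minimal sketch; bash's printf and od are assumed to be available on both machines):

# Write two bytes, 0x11 then 0x22, and let od interpret them as one 16-bit value
# in the host's native byte order.
printf '\x11\x22' | od -An -tx2
# A little-endian Xeon host prints 2211; a big-endian SPARC or POWER host prints 1122.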

 

This endian difference is the reason we have to go through this migration protocol. If you search for endian difference on Intel.com, you can find white papers and even a video addressing how to code for this.

 

I want to end with where the term ‘endian’ came from (I worked with Danny Cohen 10 years after he wrote this paper). Almost 300 years ago, Jonathan Swift wrote Gulliver’s Travels, a satire on the politics of the day.  He had the characters of the world he created go to war over which end of an egg to break, the big end or the little end, when taking the shell off of a hardboiled egg.  To me, the bottom line is that we have to be careful how seriously we take ourselves. While endian differences have created considerable work for us, there should never be an argument about who is right or wrong.

Never do something in IT that ends up in the newspaper and embarrasses your boss.  That’s the second rule of business.  The first rule is ‘Never Surprise your Boss.’

 

It appears that the Bank of America team that manages the on-line banking website system forgot this.  They did a feature update AND a platform migration at the same time.  This works only when it is done with careful planning and a lot of testing.

 

I do not know if the server platform migration was a RISC to RISC migration or a RISC to IA migration. Either way, it appears that basic steps were overlooked.

 

The first basic step is planning.  Plan for how the application upgrades intersect with the platform migration.  Plan to ensure that the application upgrades will not burden the overall service to the end customer.  Plan the day of the release. Conduct a Proof of Concept and test the whole configuration there before Release To Market (RTM).

 

Enterprises know the traffic volumes for their web sites.  In the case of a bank, they know that traffic increases at the beginning of the month, when paychecks and Social Security payments come in and a lot of bills are paid. In the case of a retail firm, the web site and back end need to be locked down by Halloween to shake out problems before ‘Black Friday’.  I did a retail application (RETEK) RISC to IA migration starting in August, and we were hard pressed to get everything done by early November.

 

Following my blog theme of RISC to IA migration, testing is crucial.  Test after you complete the PoC to determine whether the new hardware really handles the load. Frequently, you may discover that you need new hardware for your production system.  This is the hardware you target for the dress rehearsal of the production migration.  Test it to ensure that the application meets the firm’s Service Level Agreements (SLAs) and that the eventual production migration won’t end up in the paper.

 

A quick perusal of the web shows that there are a wide range of testing tools for web sites and application performance. I don't recommend any particular tool; testing tool selection must take into account your budget and your environment.

 

Plan the testing to have the application hit by more users than expected in production.  With some testing tools, this can get expensive. But what is the cost of an embarrassment where the application doesn’t perform and it causes you to lose customers?

 

Testing can reveal problems before the application is released.  One firm had a large Bill of Materials for their product all in one child table in an Oracle database.  The BOM explosion was perfectly fine in the original environment but testing revealed that performance was terrible on the new, more powerful platform.  Since this was done prior to the release of the application to the users, root cause analysis could be done and it did not affect the business.   (The problem was an instruction in one processor that wasn’t in the other processor.)

 

Once you complete testing, the team can ensure that the production migration will deliver adequate performance to meet the firm’s requirements.  Once that is confirmed, the team can plan for the production migration.   Plan carefully -- avoid the critical dates, like the end of the month for a bank, the weeks after Halloween for a retail firm, or the window around a launch date for firms working with NASA.  Most enterprises know when the application environment should remain frozen.   Give yourself a week or so on either side of those dates for slip-ups and other unforeseen occurrences, and give yourself time to test the final migration release if you can.

 

 

The world lost a leader this week.  Many of the comments about Steve Jobs point back to this commercial as a key to understanding him.  I agree with them.

Think Different – RIP Steve Jobs

You’ve done the Proof of Concept (PoC) and now everyone clamors to put the system into production. Some see the application running after the PoC and ask you to link it into the network.  Yet you know how you put this system together: the finishing touches were done with chewing gum and baling wire.

 

Looking at the production system, everyone can see that the source application has been receiving data since you started the PoC.  How is that new data going to get into the migrated application?  The bottom line is that it really can’t. The referential integrity issues alone would stop you, and of course you can’t just apply the application’s log files (unless they are in character form).

 

What are your options here?  You could export all of the data that has changed during the period, but how do you distinguish or even find the data that has changed?  Not to mention, when you apply the changed data, new data still comes in. What are you going to do with that?

 

The solution is a migration of the actual production system while it runs.  There are various techniques to do this, but all require you to assure the boss that everything will be OK.  You know now that you can do the migration (the POC proved that), but how will it be done on the production system?  Most importantly, how do you minimize downtime, and how long is the downtime going to be?

 

So, now is the time to take those lessons learned in the PoC and apply them to the migration process.  What does this mean?  In the PoC process you probably moved a part of the system, and then moved another part.  You discovered you needed to adjust the tuning of the server’s operating system, and then the DBA told you that you forgot a link.  This was followed by problems in the execution of the application that required analysis and correction.  (See? Baling wire and chewing gum.)

 

Luckily, you documented all of the changes and determined the proper implementation sequence (or at least you think it is the proper implementation sequence). You also discovered that you can parallelize a lot of the steps and reduce the down time of the application.  Meanwhile, you have management asking when you can have the application migrated to the new platform.  For instance, at one site we were moving a retail program, RETEK, and Thanksgiving was fast approaching.  They needed the extra horsepower to handle the Christmas rush, and they had to input all of the Christmas marketing programs into the system.

 

To answer these questions and address the concerns, execute a rehearsal of what you plan to do for the actual migration.  Plan is the operative word here.

Take what you learned from the PoC and organize it into a coherent plan with the events sequenced correctly.  Plan on running as many tasks as you can in parallel.  Plan on doing as many tasks as you can before you have to shut down the production system for the cutover.

 

Back in the day when export/import was all that was available for the migration of an Oracle database, I would build the target database, create all of the production tablespaces as small as I could, and create the users.  Then I could ‘ADD DATAFILE’ in parallel until each tablespace was big enough for the data. After that, I would create all of the tables with enough extents to hold all the data (we don’t want Oracle to spend time extending the tables while the production system is down). With the tables in place, I could create and compile all of the Packages and Procedures.  I then turned off all referential integrity constraints and Triggers. Only then was the production system put into single user mode for the export of the data.  All we needed to export were the rows of data; no indices.  Once the data was inserted, we fired off the index create statements and ran them in parallel.  Once the indices were completed, the referential integrity constraints were ‘ENABLED’, thus setting off another round of index building.

Now that you've moved the application, you'll need to run the application in your test harness for regression testing.  The testing will 'prove' that the migration is complete.

 

A final check of the database, and it opened up.    Here is an illustration of the process I just described.

 

[Image: Exp-imp Migration.jpg]

 

Today, you can use GoldenGate or Streams for Oracle database migration, or Replication Server for Sybase.  GoldenGate allows you to run the production database and the new database in parallel, and even roll the migration back to the original database in the event that something goes wrong.
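
During that parallel-run window it pays to spot-check that the two databases really are in step.  GoldenGate has its own verification tooling, but even a simple row-count comparison like the hypothetical sketch below can catch gross divergence.  The connection details, schema, and table list are placeholders, and small transient differences are expected while replication lag catches up.

# Illustrative spot check during the parallel-run window: compare row counts
# per table between the legacy and the new database.  Not GoldenGate's own
# verification tooling, just a quick sanity check with placeholder names.
import oracledb

TABLES = ["ORDERS", "ORDER_LINES", "CUSTOMERS"]

legacy = oracledb.connect(user="ro_user", password="***", dsn="legacyhost/LEGACYDB")
target = oracledb.connect(user="ro_user", password="***", dsn="newhost/NEWDB")

def count_rows(conn, table):
    with conn.cursor() as cur:
        cur.execute(f"SELECT COUNT(*) FROM app_schema.{table}")
        return int(cur.fetchone()[0])

for table in TABLES:
    old, new = count_rows(legacy, table), count_rows(target, table)
    flag = "OK" if old == new else "MISMATCH"
    print(f"{table:15s} legacy={old:>12d}  new={new:>12d}  {flag}")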

Wouldn’t it be better if that ‘something that goes wrong’ showed up while you were rehearsing the migration against the Q/A database?  A little patience results in a much greater likelihood of complete success.

 

Here are the rehearsal steps in a flow chart:

 

[Image: Production Rehearsal.jpg]

 

Now that you’ve done the rehearsal, documented every step, and developed scripts to automate every process that can be automated, you are ready to plan the actual production migration and answer the question, ‘When is the application going to be ready on the cheaper platform?’
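
To answer that question with something better than a guess, turn the rehearsal timings into a downtime estimate.  Here is a rough sketch of the arithmetic; the step names and durations are invented for illustration, and the rule is simply that parallel groups cost their longest member while serial groups cost the sum.

# Rough sketch: estimate the cutover window from rehearsal timings (minutes).
# Step names and durations are made up for illustration.
rehearsal = [
    ("pre-cutover build (runs before shutdown)", "parallel", {"tablespaces": 45, "plsql": 20}),
    ("export rows",                              "serial",   {"exp": 90}),
    ("import rows",                              "serial",   {"imp": 120}),
    ("index builds",                             "parallel", {"idx_1": 60, "idx_2": 75, "idx_3": 50}),
    ("enable constraints / final checks",        "serial",   {"constraints": 40, "checks": 15}),
]

downtime = 0
for name, mode, steps in rehearsal:
    minutes = max(steps.values()) if mode == "parallel" else sum(steps.values())
    if "pre-cutover" not in name:          # work done before the shutdown costs no downtime
        downtime += minutes
    print(f"{name:45s} {minutes:4d} min")
print(f"Estimated production outage: roughly {downtime} minutes")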

I’ve been writing about executing the Proof of Concept for a RISC to IA migration.  I want to step back with this post and present a context for this activity.  The PoC is best suited for migrating an application running in production, and for a one-off migration at that.

 

But what if that’s not what you’re facing?  What if you are converting your firewall servers from an old RISC-based platform to a modern IA server?  A full PoC seems like a lot of unnecessary work; all you really need to prove is that an IA server in place of the old RISC server works and meets the SLA requirements.

 

Then again, what if your IT Modernization effort entails moving your old applications from the mainframe to a new IA-based server?  The PoC in that case is a significantly more involved effort, what with code conversions and possibly application replacement on top of porting the data.

 

I’ve developed a spectrum of the types of migrations.  It doesn’t cover every migration type, but I’d bet that your IT Modernization effort falls somewhere on it.  Here’s my graphical representation of the spectrum.

 

[Image: R2IA Continuum.jpg]

 

The more replicable migration types include the migration of infrastructure applications from RISC to IA.  These are applications with a native IA port from the vendor; they are diverse and can be backup and restore applications, firewalls, web servers, file and print servers, and so on.  This migration consists of testing the application on the new target and measuring performance.  If it measures up, develop a plan to replicate the steps and begin migrating the applications, somewhat like a cookie cutter.  The next type of light-lifting migration is moving an application that is more complex but is commonly found throughout the enterprise.  Applications in bank branches or retail outlets come to mind.  Migrate one of these in test, document the steps, plan the logistics, and train a cadre to go forth and migrate.  This too is similar to a cookie-cutter operation.  The following flow chart captures the tasks:

 

[Image: Pilot Flow.jpg]

 

Next in the spectrum are the more unique migrations, which get more complex as the details grow.  From here on out you’ll need to execute a PoC, and the migrations tend to be ‘one-offs’: each migration is a discrete event that frequently doesn’t inform any other migration.  My forthcoming posts to this blog will cover the steps in these migrations in greater depth, but right now I just want to briefly touch on the general types that occur.

 

  • Migrate using the same software stack and same versions.  The software stack varies only to the extent that Linux varies from UNIX.  The other software has ports for both RISC and IA, and the software on each platform is nearly identical.  Only the data unique to the enterprise needs to be transferred.

 

  • Migrate using the same software stack but different versions.  The variance here is the different versions of the underlying software, for instance Oracle9i on the legacy server migrating to Oracle11g.  While the software has ports for both RISC and IA, the software on each platform now differs enough to add complications: migration tools available in Oracle11g aren’t available for Oracle9i.  Beyond transferring the data unique to the enterprise, the application code in the database may need conversion to address the differences introduced by the upgrade.

 

  • Migrations get a lot more difficult with the porting of code written specifically for the enterprise.  This can be C++, C, or even Java code that needs to be converted.  Some OEM vendors, such as HP and IBM, have tool sets for assisting in the migration from Solaris to Linux.  These tools read the source code (you do have the source code, don’t you?) and provide in-line notations where changes need to be made in library calls or syntax.  Often this code conversion effort is long and tedious and is outsourced to a third party.  How many lines of code is the application you want to migrate: 100,000, 400,000, or 1,000,000?

 

  • Suppose you want to migrate your application from DB2 on AIX to SQL Server on Windows 2008 R2.  You are changing the platform, the operating system, and the database, and you have to modify the program code that runs in the database, such as stored procedures.  Whew!  There are a lot of variables here, and each needs to be carefully monitored and managed.  This is a complex process, and I would recommend breaking the overall plan into discrete steps: for instance, migrate platforms first and ensure that everything works correctly, then move to the new operating system and database.

 

  • The far right of the spectrum is mainframe migration to IA.  This doesn’t necessarily mean it is the most difficult or complicated process.  It could be a migration of Oracle on the mainframe to Oracle on an IA platform, and that would follow the process outlined above.  Other mainframe applications can be 50 years old or more, and these are the applications that bring the greatest challenges.  They would likely involve a complete application rewrite, and then we’re into application development, which is a topic for another series of blogs.

 

Overall, the migration processes are all challenging in their own way, but the rewards of lower costs and better utilization of data center resources (electricity, cooling, and floor space) make the journey worth the adventure.

[Image: finish_line.jpg (Flickr image by Pat Guiney)]

In my previous blog I described the methodology for running a Proof of Concept when migrating a production database that underlies a mission-critical application.  This blog covers the analysis of the results of the Proof of Concept.  In other words, you’ve done the PoC; now what?

The Proof of Concept has three primary objectives:

  • To ‘prove’ that the application can run in the new environment
  • To see how the application performs in the new environment
  • To specify the steps necessary to conduct the actual migration.

 

The proof that the application can run in the new environment seems pretty straightforward.  Sure, if it runs after being ported, it runs.  Check that one off the list.  However, when we start the PoC, there’s no guarantee that it actually will run.  We use the UAT procedures, whether a regression test harness or a select team of users, to bang on the application and ensure that we have ported and hooked up everything.

 

As mentioned before, these tests are run frequently, usually after ‘flashing back’ the application to the initial starting point.  Once this is done, all of the components and setup steps need to be carefully documented.  These steps will be repeated in the subsequent phases of the entire migration.

 

The steps taken for the proof of concept can be seen as covering a continuum.  Some steps, such as porting the data, will need to be repeated in each phase.  Other steps, like rewriting shell scripts or editing the C++ code, only need to be done once: when those ports are done to the satisfaction of the customer, they can be set aside for reuse in each of the subsequent phases.  Still other steps fall in between, such as setting the environmental parameters this application needs.  For each subsequent phase we need to reset the parameters, but we don’t need to rediscover how to set them; documenting the settings is all that needs to be done.

 

So now that we know that the application will run, we then look at how it runs in the environment.   This is where environmental tuning parameters are adjusted.  This is where code may need to be rewritten.

 

Some proofs of concept start off with the application performing significantly worse on the new target than in the original environment.  An application can behave differently when hosted on a different platform.  For instance, an application that was CPU bound on the old platform could start having I/O queues or memory usage issues.  If you’ve been measuring application performance by looking at queues on the original host, you’ll likely miss this.  The only queues you really should be looking at are the queues (CPU, I/O, and network) on the PoC hardware, which simulates the eventual target hardware.

 

I once brought up an application that had been ported from a mainframe to a relational database.  Performance was terrible, even though on paper the new environment would significantly outperform the old host.  We looked at the tuning parameters, and they were optimal.  We looked at performance reports from the operating system, and it was OK.  We looked at the performance reports from the database and, sure enough, the CPUs were barely working but the I/O to disk was pegged.  The customer was looking to us and the application vendor for a fix.  I looked at the code and found that the application vendor had written VSAM-style data access methods in SQL!  In other words, instead of using relational set theory to winnow the data, his application read each row in sequence through the entire table.  The PoC stopped right there, and the customer kicked the application vendor out.

 

As in that story, we need to observe carefully how the application is performing.  We can use the output from the test harness to tell us how long each tested task took and compare that to the SLAs for the application.  We can look at the tools built into the operating systems, like Perfmon for Windows or the ‘stat’ tools (vmstat, iostat, etc.) in Linux.  (Let’s not neglect the application itself as the primary source of performance problems, but we’ll discuss that later.)
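
As a concrete, and purely hypothetical, example of watching those queues, here is a small sketch that samples vmstat on the PoC host during a test run and flags possible CPU or I/O pressure.  The core count, sample count, and thresholds are assumptions to tune for your own environment, and column positions can shift between vmstat versions.

# Sample vmstat on the PoC host during a test run and flag possible pressure.
# Thresholds and the core count are arbitrary starting points, not tuning guidance.
import subprocess

CPU_COUNT = 16          # assumed core count of the PoC server
SAMPLES = 10            # number of one-second samples

out = subprocess.run(["vmstat", "1", str(SAMPLES + 1)],
                     capture_output=True, text=True, check=True).stdout
rows = [line.split() for line in out.strip().splitlines()[2:]]   # drop the two header lines

for r in rows[1:]:      # the first data row is an average since boot; ignore it
    runq, iowait = int(r[0]), int(r[15])    # 'r' and 'wa' columns on a typical Linux vmstat
    if runq > CPU_COUNT:
        print(f"CPU queue building: {runq} runnable tasks on {CPU_COUNT} cores")
    if iowait > 20:
        print(f"I/O wait high: {iowait}% of CPU time waiting on disk")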

 

The data from the tools needs to be analyzed.  The performance times measured by the test harness are pretty obvious, since it usually reports response time.  If a task took more than a second to complete but requires sub-second response time, then we know there is a problem that needs to be looked into.  Is the application just running poorly?  Is the hardware inadequate?  Is the operating system or underlying software in need of tuning?  Perhaps the application architecture needs adjusting.  Our performance monitoring tools will provide clues.  We’re looking for bottlenecks.  Bottlenecks are found where data travels, like I/O, or, in the case of a database, in logging the database changes.  Look for queues in network and storage I/O and for CPU queues at the operating system level.  At the database level, we are looking at where ‘waits’ are occurring.  For instance, in Oracle we first look for waits in the single-threaded operation of writing to the log files, as well as anything affecting the logging process.
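
For that Oracle logging example, one quick way to watch those waits is to snapshot the log-related events in v$system_event before and after a test run and compare.  The sketch below assumes the python-oracledb driver and read access to the v$ views; the connection details are placeholders.

# Snapshot the log-related wait events from an Oracle instance so they can be
# compared before and after a test run.  Connection details are placeholders.
import oracledb

conn = oracledb.connect(user="perf_ro", password="***", dsn="pochost/POCDB")
with conn.cursor() as cur:
    cur.execute("""
        SELECT event, total_waits, time_waited_micro / 1e6 AS seconds_waited
          FROM v$system_event
         WHERE event LIKE 'log file%'
         ORDER BY time_waited_micro DESC
    """)
    for event, waits, seconds in cur:
        print(f"{event:40s} {waits:12.0f} waits  {seconds:10.1f} s")
conn.close()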

 

Now we can more precisely determine the capacity requirements of the target platform, and project with greater confidence the characteristics of the platform we will be porting to.  For an application requiring bare metal this is critical; for an application that can be hosted in a virtual machine, this process defines the initial setup requirements.  Remember that the new platform will need to handle the projected growth as well as the unexpected black swan.
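
Here is the kind of back-of-the-envelope sizing arithmetic I have in mind; the measured figures, growth rate, planning horizon, and headroom factor are all invented for illustration.

# Project peak demand on the new platform from what the PoC measured.
# Every number here is a made-up example.
measured_peaks = {"CPU cores": 10, "IOPS": 8000}   # busiest points observed during the PoC load test
annual_growth = 0.25        # projected yearly growth in load
horizon_years = 3           # how far ahead we are sizing
headroom = 0.40             # buffer for the unexpected 'black swan' spike

growth = (1 + annual_growth) ** horizon_years
for name, measured in measured_peaks.items():
    required = measured * growth * (1 + headroom)
    print(f"{name}: measured {measured}, plan for roughly {required:.0f}")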

 

Now the PoC is over.  The results have been converted to foils to be presented to management for planning the conversion.  The documentation we wrote is now the recipe for the next step, the migration rehearsal.
