In a previous blog on IT Security Metrics as part of a Good IT Security Program, I described the need for a security metrics program that can provide ways to show the value of the security program to the organization. A simple way to begin with metrics is to define the reason or goal, the question, and the metric for what needs to be measured. After that, work can begin on data collection, analysis, and reporting.

 

Data Collection

The process by which the data is collected should be well understood and documented, so that as changes occur in security processes and controls, these documents can be reviewed to determine what updates to the data collection methods may be necessary. The collection methods are vital because measurement units may need to be transposed in order to correlate the analysis with other units of measurement. Additionally, if qualitative measurements are being analyzed, the approach to this transposition of units must be well documented so that future reviews can take into consideration changes that could affect the analysis.

 

Analysis

The analysis process for a given metric should be documented so that, if necessary, other individuals can take ownership of the analysis, interpret the results, and draw conclusions. Quantitative metric data provides numerical values for the units of measurement, and once data is collected over time, a mean and standard deviation can be calculated to describe an expected range for the metric. For qualitative data to be used as a metric, the measurement techniques should also be clearly documented with the security analysts. To avoid bias during analysis of qualitative metrics, a purposeful effort should be made to get agreement from all stakeholders and members of the analysis team on how the qualitative data will be analyzed.
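To illustrate the quantitative analysis described above, a mean and standard deviation can describe an expected range for a metric. This is a minimal sketch in Python; the weekly alert counts are purely hypothetical:

```python
import statistics

# Hypothetical weekly counts collected for one quantitative metric
weekly_alerts = [42, 38, 51, 45, 40, 48, 44, 39]

mean = statistics.mean(weekly_alerts)
stdev = statistics.stdev(weekly_alerts)  # sample standard deviation

# Describe an expected range as mean +/- 2 standard deviations;
# values outside this range may warrant a closer look.
lower, upper = mean - 2 * stdev, mean + 2 * stdev
print(f"mean={mean:.1f} stdev={stdev:.1f} range=({lower:.1f}, {upper:.1f})")
```

Documenting a calculation like this alongside the metric definition makes it easy for another analyst to take over the analysis later.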

 

Presenting Results

When presenting results, it is important to understand the audience to which the information will be reported or presented. Some audiences may not be interested in the technical details of the data, but rather in the cost-benefit ratio or return on investment. Indicators can be very valuable to a business group in presenting the current state of security within their area of responsibility. Key Performance Indicators can be presented on a scorecard dashboard so that the team can collectively review the status of security and provide input for more creative analysis from other perspectives.

 

One more important aspect of a security metrics program is that it should not be a set-it-and-forget-it effort, even if much of the data collection is automated. Ongoing work should review previously defined metrics and attempt to find other relevant metrics for continuous improvement.

Most large IT shops run their applications on computer systems with different CPU architectures. These systems include not just RISC or IA, but also 32-bit and 64-bit processors. These are not always strictly different CPU architectures, but in most cases the decision to move from one to the other presents similar challenges. So why do we decide to move from one architecture to another, and most importantly, what do we need to take into consideration when making such a decision? To start, I will outline some of these challenges as we faced them in a high volume manufacturing environment. No two organisations that require large scale computing environments will have the same issues and challenges, so there is no one solution for everyone. I do believe, however, that we all have to answer the same set of questions before we start planning for a migration, executing the plan, and subsequently supporting the new environment.


When you start thinking about migrating to a different architecture, make a list of all these questions and start recording your answers. This will give you sufficient information to figure out whether you really have to go through all this work and whether it is really worth it. And trust me, it is a huge amount of work you will have to go through…

 

  1. Why are we doing this? Try to capture your main drivers in simple statements. Nothing technical is required here; you are really trying to pinpoint your problem statement. It could be as simple as the application being end-of-life (EOL) on your current architecture, or it could be a political decision. The latter is usually the easiest to justify, and most of the points below are irrelevant for that type of reasoning.
  2. Is the application going to work on the other architecture? The important piece here is to outline what is required to port your application to the new architecture and how much work is involved. It may turn out in the end that it is simply too much work and cost, and not worth doing.
  3. What is the cost of ownership of the new equipment? IT professionals often forget that the purchase price of new equipment is not the only cost to worry about. Ongoing service maintenance can quickly add up to a sum that was never budgeted for, and the worst part is that you have to pay it regularly for the lifetime of the equipment (I plan to write a separate blog on service contracts and maintenance, an interesting subject in itself).
  4. What are the operational requirements post-migration? This is where you need to answer a series of questions about your ongoing support. Some questions you may want to ask yourself: will your operational support model have to change once you migrate to the new equipment? Is the support organisation sufficiently skilled to support the new environment? Is the headcount adequate?
  5. How will the migration impact my day-to-day business? This is one of the most important implementation questions once you have decided that the move to the new architecture is going to happen. If your organisation runs an application (or more than one) that is not time sensitive and your business can survive happily with a few hours of downtime, then this is easy. On the other hand, if you are running an operation that is highly time sensitive and cannot afford prolonged downtime because your business would incur huge losses, this is where it gets interesting, and the migration strategy is something you will have to spend a considerable amount of time preparing.

My experience here is in high volume manufacturing, an environment with multiple CPU architectures running different operating systems that support highly time sensitive applications. I will try to answer some of the above questions as we addressed them in our environment in the next few blogs, so stay tuned; I hope you will find it useful.

By Joe Sartini

As both an automation engineer and an IT automation manager for many years, I have both contributed to and observed how many IT standard operating procedures (SOPs) can introduce errors into a system. The challenge for many IT operations teams is how to eliminate human-induced errors and provide closed-loop feedback to process developers on how to create and maintain more robust project insertions. Every IT engineer I've ever known has great intentions to make SOP changes flawlessly; however, we tend to find that a fair proportion of our operational incidents result from human error during the change process. Intel Factory Automation has strict change control procedures to help engineers through the change process and protect them from human error. Beyond the change control processes that exist in many organizations, I believe a key to success in this area is to automate as much as possible and, where that is not feasible, to use an automated checklist.

Let me give you an example to illustrate the issues experienced by many IT organizations, and a way to avoid or mitigate them by putting more IT solutions into the manual processes that will always exist.

The Problem

Suppose you have an engineer performing a standard server build or decommission. In each case the engineer would deem this a fairly straightforward task, and your IT organization, I'm sure, has a documented standard operating procedure for each hardware model and OS revision, right? The problem can arise when our engineers are multitasking on many projects at once, under time constraints. In their mind, the trivial server build/decom SOP needs to be completed before they rush to their next important meeting. They are in the data centre (DC) with no access to the SOP instructions unless they print them out or log in to a PC in the DC to view them. Needless to say, an engineer in a hurry who has performed this task many times in the past will proceed from memory. However, suppose something has changed in the process since they last performed the build/decom, or nothing has changed but they simply forget a task, such as disabling a SAN switch port for the decommissioned server. Down the road, we run into SAN switch port capacity problems that should not exist. The IT organization may needlessly purchase more switches to handle the perceived capacity problem, or have another engineer compare server assets against active port usage only to find that something doesn't add up. More time is needlessly spent finding the unused ports and disabling them, because engineers in the past have skipped that step of the server decommission SOP.

One Solution

From my experience as an IT engineer and manager, I focus on automated checklists for SOPs. Using simple, easily configurable web-based solutions, the IT manager or engineer can develop checklists for all SOPs, requiring engineers to check each box on an online form that is centrally tracked via standard reports. The IT manager can then monitor the completion and success of SOPs via %PAS reports. Furthermore, because the engineer knows that his or her name is tracked against the tasks with timestamps, he or she is more inclined to follow the checklist and complete all tasks. The beauty of an online checklist is that the engineer can access it wherever there is a network connection (LAN, WiFi, etc.) and can use any form factor device (PC, laptop, MID, iPhone, etc.). The IT manager can also easily report on the average time it takes to perform each SOP, which helps with resource allocation per task and provides feedback to development teams on time to market for new project insertions. In the example above, the engineer who was rushing to a meeting would have accessed the checklist in the data centre via laptop or phone and clicked each box as the task was completed. Say he still forgot to de-assign the switch port, or, more typically, didn't have time to complete all tasks in one visit to the DC. In that case the checklist would not be 100% complete, and in the daily or weekly operational review the team would notice that this SOP is still in flight and would follow up with the engineer to complete it, since everything is centrally tracked and remains open until all tasks are actually complete.
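The tracking idea behind such a tool is easy to sketch. The snippet below is only an illustration, not any particular product's schema: a checklist is a set of tasks with completion timestamps, and a completion percentage plus a list of open tasks is all an operational review needs to spot an in-flight SOP.

```python
from datetime import datetime

# Hypothetical SOP checklist record; all field names are illustrative.
decom_checklist = {
    "sop": "Server decommission",
    "engineer": "jsmith",
    "tasks": [
        {"name": "Back up configuration", "done_at": datetime(2012, 4, 2, 9, 15)},
        {"name": "Remove from monitoring", "done_at": datetime(2012, 4, 2, 9, 30)},
        {"name": "Disable SAN switch port", "done_at": None},  # forgotten task
    ],
}

def completion_pct(checklist):
    """Percentage of tasks with a completion timestamp."""
    tasks = checklist["tasks"]
    done = sum(1 for t in tasks if t["done_at"] is not None)
    return 100.0 * done / len(tasks)

def open_tasks(checklist):
    """Names of tasks still awaiting completion."""
    return [t["name"] for t in checklist["tasks"] if t["done_at"] is None]

print(f"{completion_pct(decom_checklist):.0f}% complete, open: {open_tasks(decom_checklist)}")
```

Anything below 100% at review time surfaces the forgotten switch-port step automatically, instead of waiting for a capacity problem to reveal it.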

Let’s take the pen and paper and human guesswork out of our IT operations and use our IT skills to develop foolproof solutions to our daily routines. In this way, we’ll have a better chance of removing human errors. I’m sure you’d agree we need as much time as possible to handle the h/w and s/w errors that affect our operations’ availability and reliability.

For some time now there has been no doubt that cloud computing offers many benefits over the traditional data center. For that reason, many traditional data centers have migrated to a cloud computing architecture. In addition, it has become easier to migrate existing servers into a cloud. So why have all data centers not migrated?
There are some valid reasons why not, including ROI (which will be discussed in the next blog), especially when we’re talking about production environments that have zero tolerance for downtime. In this blog I’ll talk about risk and downtime.
Here are some of the challenges we face when migrating a production environment:


1. As I mentioned above: why migrate? Most stakeholders will reject the change; for them, “if it works, leave it”. What action needs to be taken to satisfy their needs after the change?


2. Of course you do not migrate all servers, so which ones will you migrate?


3. How do we make the migration transparent to the stakeholders? After all, we want the stakeholders to have the same level of support…
4. How can we avoid downtime?


5. How can we prevent the migration from being the scapegoat for unrelated failures after the migration?


There are no clear answers to those questions, but I’ll try to give some tips that can answer some of them, or at least give a direction.
When you want to migrate existing physical production servers to virtual machines, consider the following for planning, design, and implementation:


1. Note that virtualizing production servers is a major change, so consider in advance what to migrate, and enroll your stakeholders in the process to understand the business impact and get buy-in.


2. Do not migrate every server by default. Choose the servers to be migrated well in advance and avoid unwanted migrations. Start with server criticality; together with the application owners, define whether a server is critical enough that it should get “personal” treatment rather than being part of the farm.


3. For the same reason as #2, decide whether to migrate a server based on its resource utilization. If a server’s utilization is close to a host’s resource capacity, the host will probably end up hosting only that server, and there is no real benefit in that.


4. When designing the virtual environment to host the migrated servers, leverage a capacity planning process and understand the resource requirements for each application (capacity planning is a process that checks the overall capacity usage of the physical servers). Even if capacity planning shows low resource utilization, you must take into account the servers’ current resources and the server owners’ requirements. There may be a reason for the amount of physical resources, and we don’t want a production server to lack resources, not even for one minute.


5. As we’re talking about the production environment, we don’t want to be surprised: add the future growth of your factory to your plan and add resources accordingly. Check the production forecast for the years ahead with your management, and together with the server and application owners assess future resource needs and design the virtual environment accordingly.


6. Note that the migration process may require system downtime. Although the migration can often be done online, some operating systems require a server restart, so it is preferable to schedule server or application downtime for each migration. Understand with your management the possibilities for downtime and plan your migration accordingly.
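Points 4 and 5 can be made concrete with a first-pass sizing exercise: sum the servers' peak demands, apply growth headroom, and divide by the capacity of one virtualization host. All numbers and field names below are illustrative assumptions, not a sizing recommendation:

```python
import math

# Hypothetical peak resource demands per physical server slated for migration.
servers = [
    {"name": "mes-app-01",   "peak_cpu_ghz": 6.0,  "peak_mem_gb": 24},
    {"name": "mes-db-01",    "peak_cpu_ghz": 10.0, "peak_mem_gb": 48},
    {"name": "reporting-01", "peak_cpu_ghz": 3.0,  "peak_mem_gb": 16},
]

HOST_CPU_GHZ = 24.0   # assumed capacity of one virtualization host
HOST_MEM_GB = 96.0
GROWTH_FACTOR = 1.3   # 30% headroom for the production growth forecast

def hosts_needed(servers, growth=GROWTH_FACTOR):
    """Hosts required so that neither CPU nor memory is oversubscribed at peak."""
    cpu = sum(s["peak_cpu_ghz"] for s in servers) * growth
    mem = sum(s["peak_mem_gb"] for s in servers) * growth
    return max(math.ceil(cpu / HOST_CPU_GHZ), math.ceil(mem / HOST_MEM_GB))

print(f"Hosts needed: {hosts_needed(servers)}")
```

Sizing on peak demand rather than average utilization reflects the point above: a production server must never lack resources, not even for one minute.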


To summarize: as with every technology improvement, when we’re talking about a production environment we need to look at all the considerations and find answers to them.
I hope you find these tips helpful. Please share any tips you have, or let me know of any additional concerns.


Have a fun and safe migration

Security is a tough and elusive nut to sell.  Everyone wants to be secure, but few can articulate what they want.  It is almost like buying insurance, but not quite.  It can be technical and behavioral.  It exists, but only in a transitive state.  It can be measured, but mostly in a relative way.  History has shown using fear is not the right strategy to sell security.  Customers may not even accept the need for it, if they have never had a security breach.  So how do you sell security?

The answer sounds simple, but it is not: make it ‘meaningful’.

In order for security to be meaningful, a problem must be recognized by customers, they must be in the ‘action’ state of mind, the solution must be effective to a desired level, and the economics need to be right. 

If you are struggling, you are in good company.  Right now, the entire industry has problems in all of these areas. 

Making security meaningful to customers:
1. Recognizing a problem exists: Most people don’t recognize the problem, until they feel the pain.  This was true for the longest time in the medical and dental industries.  People only went to the doctor/dentist when they felt pain.  Over time we have embraced preventative medicine.  Security is in the same early stages with people begrudgingly investing when they feel the pain or believe it is imminent.  Basically “security is not relevant, until it fails”.

Recommendation: Timely education and awareness, without propagating false fears, is key.

 

2. Action state of mind: We are creatures of habit.  We rarely diverge from our mental framework of choices.  In order to make a change, our brains must reach a tipping point to decide a different path.  Here is a great article about key life events which drive changes in consumer spending and how the retail industry targets these moments in our lives to sell products.  In security, the same holds true.  We must be in a proper state of mind to invest in security.  In most cases, it is when we become a victim or are forced to change due to external requirements.

Recommendation: Be in the minds of people at the point when they move into the ‘action’ zone.

 

3. Effective solution: There is no single ‘fix’ to security, it is a gradient.  Any solution may provide a better level of security to some aspects, but will not solve all potential problems.  In a cost/benefit analysis, it is important to know the benefits.  This is difficult as the threats, environments, and customer expectations are difficult to quantify and will likely change over time.  The key for the user is achieving whatever they believe is the right level of security.

Recommendation: Have a well thought out solution, coupled with accurate/realistic and clear messages of the benefits to users.  Design and sustain with a defense-in-depth model for longevity. 

 

4. Positive Economics: Security costs.  In one way or another, the customer will pay.  It may be money, time, system performance, annoyance, or any combination thereof.  On the positive side, it also provides some level of benefit, which may include better confidentiality, integrity, and availability.  This can lead to a better emotional state and satisfaction.  Measuring the benefits and costs is extremely difficult, as a multitude of contributing factors are constantly changing in radical and unpredictable ways.  Just because you institute a protection mechanism does not mean you would ever be attacked in that manner.  Investing in strong security against one threat may seem a waste when attacks come from a different direction.  Even if a control does a spectacular job at preventing loss, will you know?  It is hard to measure something which does not occur.  Instituting a security control may make you feel strong today and less so tomorrow.  Right now, the industry does not have a standard for measuring Return on Security Investment (ROSI).  This is a difficulty for consumers who want to know they are getting good value for the cost.

Recommendation:  Leverage one of many different methods to determine security value.  Use the best model for the specific security capabilities and user environment/expectations.   Make it real for the consumer, in terms they understand and cherish.
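One commonly cited model (not an industry standard, as noted above) expresses ROSI as the reduction in annual loss expectancy (ALE) attributable to a control, net of the control's cost, divided by that cost. A minimal sketch with purely illustrative numbers:

```python
def rosi(ale_before, ale_after, solution_cost):
    """Return on Security Investment as a fraction of the solution cost.

    ale_before / ale_after: annual loss expectancy before and after the control.
    Positive values mean the control is expected to pay for itself.
    """
    risk_reduction = ale_before - ale_after
    return (risk_reduction - solution_cost) / solution_cost

# Illustrative numbers only: a control that costs $20k and is estimated
# to cut expected annual losses from $100k to $40k.
print(f"ROSI = {rosi(100_000, 40_000, 20_000):.0%}")  # 200%
```

The hard part, of course, is not the arithmetic but estimating the loss expectancies, which is exactly the measurement difficulty described above.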

Can you use ONE WORD to describe the biggest challenge facing information security today?


I was asked this very question this morning.  After a few minutes of pondering the vast possibilities with coffee in hand, filtering out inappropriate language choices, and digging deep to find a constructive perspective, I declared my one word which depicts the current challenges in the security industry.

 

Ambiguity.  In one word, it captures the grand breadth of the challenges and the great diversity of perspectives among those involved: what security is, what it encompasses (i.e. emotions, beliefs, states, events), what it is trying to deliver (no, not invulnerability), how to achieve it (e.g. technical, behavioral, process) and maintain/sustain it, what drives it (threat agents, losses, opportunities, fears, etc.), how to measure it (risk assessments, ROI/ROSI, compliance, value across tangible/intangible losses, etc.), who is involved (attackers, defenders, victims, and bystanders), and how and why the landscape and equation change so drastically over time (the complexities of factors which create the ever-changing fabric of security).

 

There exists both a lack of understanding as well as an overabundance of inconsistent concepts of the above items.

 

Defining the problem is the first hurdle.

Recently, I read an article on Harvard Business Review by Brad Power, entitled "Look to IT for Process Innovation?"

 

http://blogs.hbr.org/cs/2012/03/look_to_it_for_process_innovat.html

 

The blog article poses the question: are companies missing out on the product and process ingenuity of IT people? The author thinks many companies are missing out by not including IT in business strategy and process innovation discussions, and he provides examples of key tools IT organizations can bring to the table, including a process improvement ("lean") development framework, rapid ("agile") development techniques, and project teaming.

What other IT competencies can help drive process innovation?

 

And what else are you doing to make sure your IT organization has a seat at the business strategy table so that you can help drive real process innovation?

I just read a really interesting article on Forbes: "CIO's Love-Hate Relationship with the Cloud Revealed".  The article captures both the risks and opportunities that IT leaders face as more DIY, easily accessible services become available to business leaders.  Backed by some interesting data-based research, these trends are real, and each IT group will deal with the challenges differently.

 

At Intel IT, we are embracing these trends and are focused on improving business partnerships and knowledge, accelerating our pace, and employing new technology to drive both efficiency and growth, while changing the culture and roles within IT.  We see it as an exciting opportunity that must be seized, despite the risks and unknowns.  Our initiatives are captured in the Intel IT Annual Report.

 

What do you think?

I think yes.  IT consumerization is a trend that transcends industries and geographical boundaries.  Personally, I had a discussion about this with one of my local school board reps last year, who shared his vision of removing textbooks from the K-12 school system and using BYO technology and other solutions to streamline learning and cut costs.  His vision made perfect sense to me, and I discussed some of the things we were looking at inside Intel IT back then – now I’m in a position to share more details.

 

Recently, some of Intel IT’s senior engineers (Dave Buchholz, Alan Ross, John Dunlop) co-authored a paper sharing their insights on how improving security and mobility for personally owned devices in an enterprise IT environment can be applied to schools and classrooms.

 

Although the classroom does differ from an enterprise office, when it comes to BYO technology there are some significant similarities:

  • Security is a top concern
  • Both environments benefit from device configuration and management
  • Expanded access to data helps employees and students
  • A consistent user experience across a range of devices is required
  • Shared concern that internet enabled devices tempt users off task
  • Controlling support costs is important

 

Central to finding a solution that addresses the requirements above are four innovation areas: HR/legal, device management, technical infrastructure, and training (across users, service desk, and developers).

 

I invite you to learn more about how some Intel IT best practices for “BYOD to work” can help support “BYOD to school”.

"I think execs are bought into the concept but do not know how to execute."

- a senior leader, when asked about sponsorship for enterprise social collaboration in his organization.

 

Looking back at our journey as well as our current challenges at Intel, 3 things stand out in my mind as essential elements of a successful social collaboration program.

 

1. Know your goals

Is increasing adoption of your enterprise social capabilities a goal for your organization? Think again. Having adoption as a leading success indicator could drive the wrong behaviors (over-collaboration, social ‘butterflies’ – M.Hansen).

 

Social tools are a means to an end – and the end goal is always business results. So start with your business goals and see how social collaboration can help you achieve them.  For example, your division needs to reduce the time to market for new products: evaluate whether social tools can cut down the time for handoffs and knowledge transfer across teams, or whether forums can help you report and act on issues in a timely manner.  Your responsiveness in closing customer issues is poor and your support costs are high: maybe social CRM can help. Such changes will shift the focus from adoption for adoption’s sake to true drivers of business value.

 

Adoption is a great lagging indicator that tells you the story of how social capabilities helped to achieve organizational results through crowdsourcing of ideas, timely communication, etc.

 

2. Serve in bite sizes

Your organization probably has a varied demographic mix, where many employees are overwhelmed with concerns such as: Do I need to start blogging now? What if nobody “likes” or comments on my post? Can’t I just go back to email?

 

You can make the transition easy for everyone in the company by embedding social collaboration capabilities into standard operating procedures. For example, a team might decide upon wikis as the means to update status or to document best known methods. They might use forums for discussions, or to seek feedback on interim work products. Using these tools helps to improve productivity and collaboration within the team, and is not seen as one more thing that employees need to do. Also, it helps them to understand what is expected of them.

 

3. Make it safe

The easiest thing to do to ensure security of your knowledge assets is to lock everything down, and then padlock it some more. That way, nothing ever gets stolen – but then, nothing ever gets used either. The difficulty of determining the right level of controls is what makes information security such a challenging and valuable function in any organization.

 

Generations of office workers have been told to “err on the side of caution” when it comes to safeguarding information. Now they are being told to foster open collaboration so that their organization can derive the full value of their social enterprise investments. This is a big behavioral shift and you need to enlist information security professionals in your organization to help employees adopt the new paradigm. Empower employees with knowledge on information security (data classification, policies and guidelines) that will help them to make the right calls confidently. It is also important to make it safe for individual users by plugging any infosec holes in your systems (access to contract workers, ability to report abuse, logging and traceability, encryption etc.). Also recognize behaviors that foster an open and collaborative culture – such as freely sharing knowledge, and contributing valuable ideas on community forums.

 

Well, that's my list based on what I have experienced and learnt. Do share your insights, or get in touch to discuss.

Security metrics are a highly discussed topic within IT these days, mainly due to the need to understand how well security controls are protecting against threats to confidentiality, integrity, and availability (the CIA triad). But one of the most important aspects of any effort to collect data and present it properly is to understand the goal or purpose of the effort. If data is collected just for the sake of collection, a great deal of time can be wasted on something that is not beneficial.

 

Good metrics are the difference between information and data. Good information is more important than a large amount of data that is not meaningful. Too often, there is a bias toward an expected result, where security metrics are used to demonstrate the need for, or justify, a certain control. This can actually be detrimental to a security program. Security-related metrics can be used not only to provide good information about security events, but also to help make better risk-management decisions. Security metrics benefit the organization if they are well understood, used, and provide value and insight. The benefits can include describing the business value of what is being spent on IT security controls, or identifying a security process that could use some improvement.

 

Taking a simplistic approach to defining security metrics, one may use a simple three-step process known as Goal-Question-Metric (GQM). The GQM concept comes from software engineering metrics collection and can also be used for security metrics to provide a direct link between metrics and goals. The goal-setting effort also provides a clear understanding among all stakeholders when the data collection requirements for a metric cross business group boundaries.

 

Goal – A goal or objective is specific and should relate back to some system, process, or characteristic of your security program. It should also relate to something measurable and verifiable. A goal should be defined before the start of any effort to collect metrics so that all stakeholders can provide input and agree that the purpose is clearly understood.

Example:

  • Outcome (decrease, understand)
  • Element 1 -  Malware detections (Anti-Virus software)

 

Question – A goal can be translated into a series of questions that allow for the information to develop into attributes and targets. This enables the components of the goal to be achieved or evaluated for success.  There could be multiple questions with multiple metrics for a specific goal.

Example:

  • What is the current percent of systems with malware detections?

 

Metric – After questions have been developed to define the goal operationally, the goal can begin to be characterized at a data level, and metrics can be assigned that will provide answers.

Example:

  • % of systems with malware detections found during weekly anti-virus scans over a one-month period.
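Once the scan data is collected, computing a metric like this is straightforward. The sketch below assumes a hypothetical record format in which each weekly scan maps a system name to whether malware was detected; any real anti-virus tool will have its own export format:

```python
# Hypothetical weekly anti-virus scan results over a one-month period:
# each entry maps a system name to whether malware was detected that week.
weekly_scans = [
    {"host-a": False, "host-b": True,  "host-c": False},
    {"host-a": False, "host-b": False, "host-c": True},
    {"host-a": False, "host-b": False, "host-c": False},
    {"host-a": False, "host-b": True,  "host-c": False},
]

def pct_systems_with_detections(scans):
    """Percent of scanned systems with at least one detection in the period."""
    systems = set().union(*scans)  # every system seen in any weekly scan
    flagged = {h for scan in scans for h, detected in scan.items() if detected}
    return 100.0 * len(flagged) / len(systems)

print(f"{pct_systems_with_detections(weekly_scans):.0f}% of systems had detections")
```

Because the metric was defined from a goal and a question first, everyone already agrees on what this number means before it is reported.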

 

This concept can also help for an annual review of security metrics and indicators to ensure they all are well understood, used and provide insight and value to the organization.

After defining the goal, question, and metric, the next steps in the process are the data collection, analysis, and reporting strategy. Those concepts will be the subject of my next blog on this topic.
