ITSM Basics: How to Do Configuration Management – Part 1

tribal-knowledge-1

One of the biggest pain points I’ve encountered during my years of working in IT service management (ITSM) is the fallout from the absence of a properly-defined configuration management process. Whether it’s related to change, incidents, problems, service levels, or service costing, the lack of configuration management and accurate configuration data can cause an organization significant harm.

Implemented properly, not only can configuration management help you to resolve incidents by the identification of what’s actually broken, it can also help to prevent them by assisting change owners in properly assessing their changes.

The IT service architecture, for even small organizations, can be extensive and complex. And without knowing which services and systems are related, and affected by each other, within the architecture, you’re opening yourself and your organization up to a great deal of uncertainty and risk. Thus configuration management is critical to maintaining order and keeping all hardware and software, and IT services, performing at optimum levels.

The Value of Configuration Management

It baffles and bewilders me when senior managers say “We don’t have the money for expensive configuration management systems (CMSs).” They must be oblivious to all of the money being consumed by: delays to restoring IT and business services; implementing changes without really knowing what services are going to be affected and the potential for adverse business impact; managing, and optimizing, the IT estate costs; and managing complex releases without knowing the full impact.

Besides the financial costs, what about all the resources wasted on paying people who sit around unable to work because a system they require is down? Perhaps because nobody realized that it was connected to another IT service that was just stopped.

What about the time taken explaining to the business why a key business service was down due to a change-related oversight? Or explaining to customers why it took longer than it should have to identify a faulty infrastructure component due to the lack of configuration management data? Or explaining to end users why a release went in with defects as a result of poor estate management?

Then there’s the additional money potentially lost on: downtime, as a result of failed changes/releases; fines, as a result of failed software license audits; service credits, given to irate customers; and customer churn.

What I’m trying to say is: You can’t do change management without it being underpinned by effective configuration management. As without it you’re effectively bouncing a server/network device and praying that no customers will be adversely affected. Is this really an acceptable way to do IT in 2015?

Is it also still acceptable that many organizations rely upon “tribal knowledge” when evaluating the risk and impact of a change? When the information about a service or system is passed verbally from person to person, or even if written down, but is only accessible by a select group of people, then you’re in dangerous territory.  Tribal knowledge causes huge problems for IT teams, yet still remains the preferred method for knowledge sharing in organizations avoiding configuration management.

The Configuration Conversation

So what can be done about it?

I’d suggest starting with a mature conversation about configuration management, and how to use it properly, with everyone that has a vested interest including system owners, network managers, etc. That the configuration management team, or those responsible for configuration management data, must be able to supply accurate current states to the teams solving incidents and problems, and those planning changes. Similarly, as teams make changes to production, that the changes are communicated to the configuration team so that updates are made to the CMS or the configuration management database (CMDB). All sides need to do their part for configuration management to succeed. And so it’s important that all involved parties understand what’s at stake.

I’ve had many such conversations over the years and, as a result, have created a method for doing configuration management that I’d like to share with you.

Getting Started With Configuration Management

The first thing to start with is how you plan to approach it. You need to have a plan. As someone much wiser than me once said “failing to plan, is planning to fail.” Doing configuration management in the real world means that you have to look at the following areas:

  • Planning
  • Identification (or baselining)
  • Control
  • Status accounting
  • Verification & audit

But we need to walk before we try to run – as configuration management will be the foundation of services and service delivery, we need to get it right and to do that we need a plan.

What’s the Scope of the Plan?

Is it the whole IT estate all at once (good luck with that) or just key IT services? For what it’s worth, I would always start with the most critical service and systems. You know the ones, if they even think of falling over, your service desk gets overrun with calls and senior management starts looking anxious. If you start with your most critical services, you’ll get a quicker and more visual realization of the available benefits, and the rest of the business will quickly notice the positives such as quicker incident resolution, less failed changes, and less downtime.

When considering scope, there are different levels to consider:

  • Configuration
  • Asset
  • Inventory

As per the following diagram:

diagram-blog

Where:

  • Configuration Items (CIs) are the building blocks for your critical services, including servers, routers, software platforms, and their underpinning systems. It takes a lot of time, money, and effort to manage these effectively so you need to focus on key services, otherwise your CMS or CMDB will become so overrun with “stuff” that it will become almost impossible to keep it updated.
  • Asset Items include PCs, laptops, anything you care about and need to keep track of for financial reasons but don’t need to control to the Nth
  • Inventory Items are the small stuff – peripherals such as keyboards, mice, etc.

Consistency Is Key

Your plan will also need to set out the naming conventions for CIs. And each CI should have a unique name, whether that’s generated by the CMSB, an asset tag, or by site. An example could be: Type of CI – Location – Support Team. Thus for example, MSserver-Manchester-Level 3 would immediately tell a service desk analyst that a Windows Server at the Manchester site was unavailable and needs support from the 3rd line support team. By having this level of information, at the point of incident logging, not only helps the service desk analyst notify all users of the server of the issue, but also simplifies the ability to assign the ticket directly to the correct support team, potentially saving a couple of functional escalations along the way.

Another way to name servers is to have themes. I’ve worked at companies where servers were named after Terry Pratchett characters, e.g. Rincewind, Vimes, Weatherwax, etc. Whatever you decide in terms of naming conventions, ensure that it’s applied consistently such that incident, problem, and change records can be updated with the correct information.

Finally, Your Configuration Plan Must Be Scalable

When you map out your key services, only document the critical details needed to provide management and support. This way, when you expand your scope to include more of your infrastructure it’s an easier task. You can always add in more detail later.

That’s it for now. So far we have covered planning; stay tuned for Part 2 where we discuss identification, control, status accounting, and auditing. If you have any questions, or something to add, please leave a comment below.