Measuring success: Data centre management

Published: Oct'08 Financial Sector Technology

What gets measured gets done. If you want to improve the efficiency and performance of your data centre to cope with rising demand, one of the best things you can do, suggests Tony Dennis, is to introduce effective monitoring equipment and processes to enhance its management.

It has never been more important for financial institutions to get the most out of their data centres. IT budgets are falling; demand for computing power is rising, due to algo trading, compliance and real-time processing requirements; electricity prices are high; space is scarce in the City of London; and there are environmental concerns about carbon dioxide emissions. All of this is driving the need for ever greater efficiency and more effective cooling. That means good management – in terms of technology, planning, product selection, business continuity and getting the best out of staff – is an essential consideration, and it requires effective measurement and metrics to ensure optimum performance.

Various financial institutions and vendors are attacking the capacity and cost demands on data centre management from different angles. For some, introducing virtualisation is a ‘no brainer’ because it reduces the number of servers required; for others, information lifecycle management (ILM) and data de-duplication reduce the demands on storage capacity in the first place; while others look to energy management systems that promise power and space savings via enhanced operational efficiency. Taking a holistic approach and absorbing the best practices from multiple disciplines offers the best way of meeting the challenges of today’s data centre environment. Measuring the results of your chosen policy and learning from the metrics should, however, be part of whatever strategy you put in place.

As Liam Newcombe, secretary of the Data Centre Specialist Group at trade body the British Computer Society (BCS), points out, technologies like virtualisation and ILM are all inter-connected in the data centre. Meanwhile, at VirtualizeIT, a consultancy and vendor, sales director Ben Stollard argues that: “Lifecycle management is not just a concept, but a must in any data centre. The whole process of provisioning and retiring servers can be fully automated, even down to a single server and an end user. In fact, automation and lifecycle management are key to deriving the most benefit from virtualised data centre environments.”

The process of virtualisation, however, brings its own management problems. For instance, instead of a physical sprawl of servers, there is an even greater propensity for servers to sprawl precisely because they are now virtual. Power demand for cooling can also rise dramatically if higher performing machines are packed into small spaces. It is a common misconception that a virtualised environment can be managed in the same way as its physical predecessor. It actually requires a completely different approach to operational management.

According to Dr Jim White, a consultant with Managed Objects, the key is business service management (BSM). “This can enable data centre managers at banks or insurers to understand dependencies and the impact of change, using metrics to create better availability of services and constant ‘up-time’,” he says. Managed Objects is seeing an increasing trend towards BSM as topical issues like energy costs and environmentalism bring the challenge of optimising data centres to the fore. “Techniques such as BSM show strong potential in the data centre context and, in my opinion, it’s becoming widely recognised that there are many good practices from the service management arena that could help to improve data centre optimisation,” adds Dr White. He has advised clients such as Credit Suisse, Fidelity Investments and JPMorganChase on issues surrounding service management, metrics and business dashboards, frequently urging them to study data flows, air flows and input/output factors to achieve optimum performance.

Energy costs
Recent work carried out by Gartner Research found that while energy costs have traditionally accounted for approximately 10 per cent of most IT budgets, this percentage could soon rise to more than 50 per cent of departmental expenses. The consultancy points to many contributing factors such as rising energy costs; increasing demand for computing capacity; the threat of green taxes; and the continuing adoption of high-density computing strategies, such as virtualisation, as drivers.

The key question is: how can data centre managers at financial institutions measure and improve the energy efficiency of a data centre? Migration Solution claims to provide one easy answer with its recently launched Environmental Report and Audit (ERA). Although ERA is not yet three months old (it was first offered in June), the company is already close to full capacity, conducting five to six audits a week for clients. It says that ERA provides businesses with a score by cross-referencing and analysing the dependencies of over 120 different data points. This data covers power and cooling management; technology deployed; the layout of the building; operations; and corporate policies, and can be used to track progress or plan more effectively.

Another viable option is to partner with a company with expertise in this particular area. HSBC has picked Digital Realty Trust to help build its global computing data centre in London’s suburbs – a project that is due for completion by Q1 2009. “Having a partner for this data centre project with the experience and resources that Digital Realty Trust offers is a major asset for us,” explains Roy Adorni, global head of data centre services with HSBC. “During the design and planning phase for this data centre facility, Digital’s team has been an extension of our own, and the result is a plan that we feel very confident in.” The two companies have collaborated closely in developing specifications to support HSBC’s security requirements and environmental goals, and appropriate metrics for them. “Our proven ability to work with local utility companies and identify sites that have or can be provisioned with sufficient power to support today’s data centre power and cooling requirements gives our customers a significant advantage,” claims Chris Crosby, a senior vice president with Digital Realty Trust.

Effective measurement and monitoring to gauge performance are an essential part of the design. The vendor uses the power usage effectiveness (PUE) metric as its methodology for measuring and reporting energy efficiency in all its facilities, including the HSBC one. Digital Realty Trust says PUE is an emerging standard promoted by The Green Grid trade body to provide a simple and consistent method of measuring the ratio of the total amount of power used by the data centre facility to the power delivered to IT equipment. The company also claims PUE provides reliable information about the energy efficiency of data centre facilities by showing how much power is devoted to driving the actual IT components, such as servers, versus the ancillary support elements, such as cooling and lighting. PUE is therefore potentially one of the most useful data centre management tools available.
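
As a rough illustration of the arithmetic, the sketch below computes PUE as total facility power divided by IT equipment power, together with its reciprocal, which The Green Grid calls data centre infrastructure efficiency (DCiE). The function names and the sample figures are assumptions made for illustration only; they are not drawn from Digital Realty Trust or HSBC.

```python
def pue(total_facility_kw: float, it_equipment_kw: float) -> float:
    """Power usage effectiveness: total facility power / IT equipment power.

    A value of 1.0 would mean every watt reaches the IT equipment;
    higher values mean more overhead for cooling, lighting and so on.
    """
    if it_equipment_kw <= 0:
        raise ValueError("IT equipment power must be positive")
    if total_facility_kw < it_equipment_kw:
        raise ValueError("total facility power cannot be less than IT power")
    return total_facility_kw / it_equipment_kw


def dcie(total_facility_kw: float, it_equipment_kw: float) -> float:
    """Reciprocal of PUE: the fraction of facility power that reaches IT equipment."""
    return 1.0 / pue(total_facility_kw, it_equipment_kw)


if __name__ == "__main__":
    # Hypothetical readings: 1,000 kW drawn by the site, 500 kW by the IT load.
    print(pue(1000.0, 500.0))   # 2.0
    print(dcie(1000.0, 500.0))  # 0.5
```

On these hypothetical readings, half of the power entering the building reaches the servers and the other half goes on ancillary support.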

Case study
At Aviva Norwich Union, infrastructure solutions architect Steve Houghton was faced with the typical data centre manager’s dilemma of finding a solution that would allow his company to greatly reduce the number of physical servers running in its data centre and retire ageing hardware. The more than 2,500 power-hungry physical machines running in the data centre were consuming valuable power and cooling resources. “We are committed to green IT/efficiency initiatives and to finding ways to reduce power consumption in the data centre,” he explains. Many of the workloads being converted took the form of workflow solutions and quotation systems – all critical to the day-to-day operations of the business. Houghton picked PowerConvert from PlateSpin to provide Aviva Norwich Union with a solution that enables his team to break down, reconfigure and restart workloads, servers or whole environments. This flexibility can improve performance as it allows facilities managers to take a more active role in running data centres.

The workload migration success rate at Aviva Norwich Union increased from 20 per cent in the initial project stages to an impressive 80 per cent. At present, the virtual-to-physical server consolidation ratio stands at 15:1. This has resulted in noticeable reductions in power and cooling usage, at least in the short term. The ability to quickly migrate workloads from physical to virtual machines has made the insurer’s data centre much more efficient, and the company also expects to free up data centre floor and rack space as old hardware is retired. “Since implementing PlateSpin PowerConvert, we have realised tangible benefits and savings, and it has given us the opportunity to reduce carbon dioxide emissions, cut costs and increase efficiencies across the board,” Houghton concludes.
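
To put a consolidation ratio into perspective, the minimal sketch below works out how many physical hosts a given number of workloads would need at a given virtual-to-physical ratio. It is illustrative only: the article does not say how many of the 2,500 machines were converted, so the figure used in the example is a hypothetical upper bound, and the function name is an assumption.

```python
import math


def hosts_needed(workloads: int, ratio: int) -> int:
    """Physical hosts required to run `workloads` virtual machines
    at `ratio` virtual machines per physical host, rounded up."""
    if workloads < 0 or ratio <= 0:
        raise ValueError("workloads must be >= 0 and ratio must be > 0")
    return math.ceil(workloads / ratio)


if __name__ == "__main__":
    # Hypothetical: every one of 2,500 machines virtualised at 15:1.
    print(hosts_needed(2500, 15))  # 167
```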

Storage
While the main focus in improving data centre efficiency is typically placed on servers, storage systems shouldn’t be overlooked. A survey carried out by Pillar Data Systems, in conjunction with the National Computing Centre, discovered that the majority of UK-based companies are unable to forecast their future storage demands – metrics could definitely help here. “These data storage problems are not limited to any one sector. Around 69 per cent of respondents from the finance sector cited major problems with forecasting their future requirements,” claims Chris Jones, vice president, Pillar Data Systems. “UK companies are struggling with over-complex storage and data environments now, so it’s not surprising they’re finding it difficult to plan for future needs and predict requirements and costs.”

“The industry is partly to blame,” adds Jones. “It has been heavily pushing an ILM strategy using different physical storage tiers. In my opinion, this only serves to accentuate the problem by encouraging more devices, more points of management and more costs.” Data centre managers at financial institutions should be challenging their storage vendors regarding the total cost of ownership, plus the cost of adding more storage or software licences as the company grows. Challenge your vendors on their energy efficiency credentials and their system flexibility. “Can they be re-purposed and assets sweated?” asks Jones.

Blade servers
Over the last five to six years, there has been a very definite trend for financial institutions to opt for high density data centres (blade servers and reduced footprint devices) plus virtualisation to consolidate server estates and maximise precious space. One snag is that when the data centre team clears some computing capacity, using consolidation techniques, the spare capacity is then immediately absorbed by development teams or other users. This can prevent the reduced power usage originally intended from being delivered. It’s not the only snag, though. One well-known side-effect of swapping over from traditional servers to blades has been a massive increase in the requirement for cooling the racks. “Blades don’t change the laws of physics,” says Liam Newcombe. He points out that they might consume less space, but they still require power and generate heat, which can in turn require more energy for cooling.

According to Shri Karve, director of business development with APC: “The IT hardware guys – like IBM and HP – often didn’t tell the electro-mechanical guys, like ourselves, about the changes that blades introduced.” Consequently a typical rack which could accommodate five traditional servers couldn’t offer the same heat dissipation if five blades were fitted. This has led to racks being designed specifically for blades that offer liquid cooling capabilities, which tend to demand more energy. One wonders how companies account for that when calculating their carbon footprint. Planning, monitoring and measurement are the only way to get on top of the challenge and ensure optimum performance.

UPS burnouts
One particular ticking time-bomb for data centre managers caused by the increasing popularity of blades is highlighted by Chris Smith, marketing director with on365 (an APC reseller). “New server infrastructure equipment is ‘power factor corrected’ in line with reduced energy demand policies,” he explains. “On paper, these blade servers are acting more efficiently. However, there have been reports of such power connectors burning out as a result of new ‘green’ IT estate builds. The harsh truth is that many currently installed legacy uninterruptible power supplies (UPS) were never actually designed for this kind of ‘full-on’ power scenario. One answer is to ‘de-rate’ these power supplies to provide a margin of error.” Smith claims that a legacy UPS might actually have to be de-rated by as much as 20 to 25 per cent of its theoretical maximum load. “We’ve actually started to offer a service to customers where we go in and check that their UPS systems are actually man enough for the task,” he reveals. “If such discrepancies were to suddenly show up during peak trading times at wholesale or retail banks, then IT staff would be left with serious egg on their faces.”
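
The de-rating Smith describes is simple arithmetic, sketched below under stated assumptions: the function names are hypothetical, the 100 kW rating and the loads are invented for illustration, and the calculation deliberately ignores the kVA versus kW distinction that a real power audit would have to address.

```python
def derated_capacity_kw(rated_kw: float, derate_fraction: float) -> float:
    """Usable capacity after de-rating, e.g. 0.25 for a 25 per cent de-rate."""
    if rated_kw < 0:
        raise ValueError("rated capacity must not be negative")
    if not 0.0 <= derate_fraction < 1.0:
        raise ValueError("derate_fraction must be in the range [0, 1)")
    return rated_kw * (1.0 - derate_fraction)


def has_headroom(load_kw: float, rated_kw: float, derate_fraction: float) -> bool:
    """True if the load fits within the de-rated capacity."""
    return load_kw <= derated_capacity_kw(rated_kw, derate_fraction)


if __name__ == "__main__":
    # Hypothetical legacy UPS rated at 100 kW, de-rated by 25 per cent.
    print(derated_capacity_kw(100.0, 0.25))   # 75.0
    print(has_headroom(70.0, 100.0, 0.25))    # True
    print(has_headroom(80.0, 100.0, 0.25))    # False
```

In this invented case, an 80 kW load that looks comfortable against the nameplate rating no longer fits once the margin is applied.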

APC’s Shri Karve believes that problems are resulting because, “the IT guys are going ahead and spending their budgets on blades without talking to the facilities management (FM) people.” Traditional backup diesel generator sets can’t cope with the power factor correction used by blades and could shut themselves down. Karve describes this as another “nasty surprise” for the FM people. Naturally, APC provides a solution – in the shape of the MGE Sinewave product family – which it says neutralises the effects of power factor correction and enables the diesel set to function normally. Karve’s recommendation is for data centre managers to carry out a regular power audit – or energy ‘MOT’ – to ensure that backup equipment such as UPS and diesel sets is ‘blade-friendly’. He claims that leading UPS vendors like APC now offer products that are ‘blade-aware’ or can offer advice on how to modify existing UPS systems. In another move designed to help data centre managers get the metrics they need to enhance performance, APC has also recently announced that it will be integrating its software with IBM’s Tivoli monitoring energy management software. This integration will enable APC’s InfraStruXure Central management platform, and TAC’s building management system, to share key data points with IBM’s Tivoli software.

The BCS’s Liam Newcombe urges data centre managers at banks or insurers to be cautious, given that many vendors are “desperately spraying green paint onto their existing product ranges” and to “ensure any claimed efficiency gains are backed up with proven measured results.” He reveals that the BCS itself is moving towards calculating saved revenues and power savings on a ‘per service’ basis, not on a per device basis. That’s because virtualisation of shared resources, such as storage, networking and processing, makes it almost impossible to allocate costs and efficiencies to physical inventory. It’s also the reason why the BCS is working with the Carbon Trust on a simulator that can calculate what effect a new service might have on the carbon footprint – cut that and you invariably improve efficiency as well. “Physical metering is dead and – worse still – storing data from such meters merely increases the amount of storage required,” Newcombe warns. A more dynamic alerts-led monitoring approach seems to be the way of the future.
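
One way to picture ‘per service’ accounting is to share the power drawn by a virtualised host among the services running on it. The sketch below is purely illustrative and is not the BCS or Carbon Trust method: it assumes a simple proportional split by each service’s share of host utilisation, and the function name, service names and figures are all hypothetical.

```python
def allocate_power_by_service(host_power_kw: float,
                              usage_share: dict[str, float]) -> dict[str, float]:
    """Split a host's power draw among services in proportion to usage.

    `usage_share` maps a service name to any non-negative measure of use
    (CPU hours, for example); only the relative sizes matter.
    """
    if any(share < 0 for share in usage_share.values()):
        raise ValueError("usage shares must not be negative")
    total = sum(usage_share.values())
    if total <= 0:
        raise ValueError("at least one service must have non-zero usage")
    return {service: host_power_kw * share / total
            for service, share in usage_share.items()}


if __name__ == "__main__":
    # Hypothetical host drawing 10 kW, shared 3:1 by two services.
    print(allocate_power_by_service(10.0, {"trading": 3.0, "quotes": 1.0}))
    # {'trading': 7.5, 'quotes': 2.5}
```

A proportional split is only one possible design choice; a simulator of the kind Newcombe describes would presumably model much more than utilisation.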