Better Data for Better Service Assurance

More than ever, an operator’s choice of network management software can spell success or failure. An operator that lacks an advanced service assurance solution sets itself up for service quality failures, which degrade the customer experience and may ultimately cause the business itself to fail.

And getting it right is a big challenge thanks to the great diversity and complexity of today’s networks. A good assurance system must not only keep pace with business change, it must also be efficient. Gone are the days when you could afford to hire a large team of engineers or integrators to keep the system accurate and up-to-date.

Well, one company delivering a service assurance system that fits today’s market need is Centina Systems, and their Vice President of Business Development and Marketing, Gregg Hara, joins us. Gregg explains the catalyst for starting Centina, the continuing network challenges they address, and the innovative business model they’ve set up to put advanced service assurance within easier reach.

Dan Baker: Gregg, thanks for joining us. Maybe we could begin by hearing a bit about Centina’s mission.

Gregg Hara: Thanks for the invitation, Dan. The founders of Centina hail from the major equipment providers in communications — places like Nortel, Alcatel, Fujitsu and Cisco. And their background was in building Element Management Systems (EMSs) for those hardware vendors, so these guys are pretty network-savvy, especially in transport and optical networks running the TL1 protocol.

Now having spent a good deal of time in the Network Operation Centers (NOCs) of telco customers, they noticed the immense problems the NOC users were having. Here’s a quick rundown of the issues they noticed:

  • Firefighting instead of prevention: NOC staff spent their time reacting to events and did little in the way of proactive maintenance.
  • Lack of automation to isolate problems and determine customer impact: Having to navigate through multiple systems made it impossible to quickly determine which customers were affected by problems. The current systems could not even point the user in the right direction. If they could, users would be able to head off problems before they impact customers.
  • Users are overwhelmed by the workload: Changes constantly need to be made in the assurance system, yet those changes are very time-consuming to implement because a lot of scripting and programming is involved. As a result, users can’t keep up and are forced to cut corners.

So seeing all these service assurance headaches, Centina’s founders figured there must be a better way. That’s why they decided to go out on their own and build a better product.

What do you believe is the main source of problems in the legacy service assurance systems?

Well, to begin, there’s very little you get out of the box. When you first install legacy products, they don’t do anything because they are merely toolkits. You have to spend months and months — perhaps years — customizing them to monitor, correlate and do the tasks you want them to do.

Take for example integrating all the devices in the network. Every provider usually has between 50 and 250 different device types, and if you want to monitor all of those different device types in one system, that’s a lot of hard work.

First, you have to build out probes to handle all the alarms and traps. And if you want to monitor device performance, you have to define polling and write formulas for all the performance data. And once you’ve established that baseline, you’ve also got to maintain it because engineering upgrades the network devices all the time.

This is very cumbersome, yet this is exactly what life is like in a legacy service assurance deployment. People have dedicated their careers to becoming proficient in these toolkits, and it’s all about writing Perl scripts, coding and dealing in lots of technical detail.
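As an illustration of the kind of hand-written performance formula mentioned above, here is a minimal Python sketch that turns two polled octet counters into a link utilization percentage. It is not taken from any product discussed in the interview; the function names, the 32-bit counter assumption and the sample values are hypothetical.

```python
# Hypothetical example of a hand-maintained performance formula.
# Assumes a 32-bit octet counter that wraps back to zero at 2**32.
COUNTER32_MODULUS = 2 ** 32


def counter_delta(previous, current, modulus=COUNTER32_MODULUS):
    """Return the increase between two samples, allowing for one counter wrap."""
    return (current - previous) % modulus


def utilization_percent(prev_octets, curr_octets, interval_seconds, link_speed_bps):
    """Average link utilization over one polling interval, as a percentage."""
    if interval_seconds <= 0 or link_speed_bps <= 0:
        raise ValueError("interval and link speed must be positive")
    bits_transferred = counter_delta(prev_octets, curr_octets) * 8
    return 100.0 * bits_transferred / (interval_seconds * link_speed_bps)


if __name__ == "__main__":
    # 375,000,000 octets in 300 seconds on a 100 Mbit/s link
    print(utilization_percent(0, 375_000_000, 300, 100_000_000))
```

Multiply a formula like this by every metric on 50 to 250 device types, and the maintenance burden Gregg describes becomes clear.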

Now this situation is only going to get worse as SDN and NFV services roll out because there will then be multiple points you have to monitor and correlate, from the application to the virtual network to the physical network.

How do you get around these network integration and maintenance headaches?

Well, at Centina we feel strongly that the only way an operator or MSO can manage service assurance successfully is to get some major help from their software provider.

A service provider shouldn’t have to manage the updating of network device integrations. Seems like that should be the responsibility of the solution vendor they bought their tool from. And that’s exactly what we’ve done. Centina created what we call a Smart Plug-In for every vendor and model number that we have interfaced to at our customers. Today, that’s over 1,000 different plug-ins across 135 different vendors.

This library works almost like an app on your phone. The software goes out and discovers all the devices in your network, matches them up to the appropriate plug-in and just starts monitoring. It will process alarms and handle every performance metric that the device supports. Then it’s a simple matter of configuring which alarms and metrics you want to collect, setting how often you want to poll the performance data, and defining any thresholds.
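The discover-and-match step described above can be pictured with a small Python sketch. This is only an illustration under stated assumptions: the `Plugin` class, the `PLUGIN_LIBRARY` dictionary, the vendor and model names, and the device records are all hypothetical and do not reflect how Centina's Smart Plug-Ins are actually implemented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Plugin:
    """Hypothetical description of what one plug-in knows how to monitor."""
    vendor: str
    model: str
    alarms: tuple
    metrics: tuple


# Hypothetical plug-in library keyed by (vendor, model), both lower-cased.
PLUGIN_LIBRARY = {
    ("acme", "ex-100"): Plugin("acme", "ex-100", ("link_down",), ("rx_octets", "tx_octets")),
    ("globex", "ot-9"): Plugin("globex", "ot-9", ("loss_of_signal",), ("optical_power_dbm",)),
}


def match_plugins(discovered_devices):
    """Pair each discovered device with a plug-in; list devices with no match."""
    matched, unsupported = {}, []
    for device in discovered_devices:
        key = (device["vendor"].lower(), device["model"].lower())
        plugin = PLUGIN_LIBRARY.get(key)
        if plugin is None:
            unsupported.append(device["name"])  # would need a new plug-in built
        else:
            matched[device["name"]] = plugin
    return matched, unsupported


if __name__ == "__main__":
    devices = [
        {"name": "edge-1", "vendor": "Acme", "model": "EX-100"},
        {"name": "core-7", "vendor": "Initech", "model": "Z1"},
    ]
    matched, unsupported = match_plugins(devices)
    print(sorted(matched), unsupported)
```

Devices that end up in the unsupported list correspond to the case where a new plug-in has to be built.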

Now when we sign up a new customer, they will invariably ask us to support some new devices not yet in our library. It takes us between one and three weeks to build a brand new plug-in. After that, when a new device release comes out, the operator is fully covered under our ongoing maintenance.

What about the service visibility you need to quickly isolate problems and find out what customers are affected? And what about service level management? To what degree is that concept real today?

Dan, if you talk to any service provider managing large networks, they’ll tell you that they have very little visibility into their end-to-end services — how the services are layered. That’s very hard to deliver because these are multi-vendor, multi-technology environments.

When you buy a high speed internet service from a Cox, Time Warner, Level 3, AT&T, or Verizon, you can get a pipe into the internet via Ethernet, but that Ethernet is running over some other technology. It might be running over SONET or DWDM. Many different layers may make up that service. So when a network problem or performance issue occurs, it’s tough for a service provider to figure out where the root cause of the problem lies.

When you first get the alarm or a customer calls to complain about the issue, you find that the different layers are all being managed by different systems. So the user is forced to swivel-chair through all these different systems to figure out what’s going on. What we’ve done at Centina is to give the user the ability to visualize end-to-end services, hierarchies and topological layers, plus various charts and alarm views. And, since we have all this information in our system, it can automatically suppress all the noise in the network and highlight the root cause of the problem to improve the reliability of the network.

It’s a big help because it pulls all the information you need to see all the layers visually. For instance, it lays out in a tree the various Ethernet components, and mixed in there are the performance and alarm data at each level displayed in a vendor- and technology-independent way.
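One simple way to picture noise suppression across service layers is sketched below in Python. It assumes a hypothetical `runs_over` mapping in which each resource points to the resource that carries it (for example, Ethernet over SONET over DWDM), and it treats an alarm as a symptom whenever some underlying resource is also alarmed. The resource names are invented, and this is not a description of NetOmnia's actual correlation logic.

```python
def has_alarmed_underlay(resource, runs_over, alarmed):
    """True if any resource carrying this one, at any depth, is also alarmed."""
    seen = {resource}
    lower = runs_over.get(resource)
    while lower is not None and lower not in seen:  # 'seen' guards against loops
        if lower in alarmed:
            return True
        seen.add(lower)
        lower = runs_over.get(lower)
    return False


def split_alarms(runs_over, alarmed):
    """Return (root_causes, suppressed) for the set of alarmed resources."""
    alarmed = set(alarmed)
    roots = {r for r in alarmed if not has_alarmed_underlay(r, runs_over, alarmed)}
    return roots, alarmed - roots


if __name__ == "__main__":
    # Hypothetical layering: a customer Ethernet circuit over SONET over DWDM.
    runs_over = {"evc-cust-42": "sonet-ring-7", "sonet-ring-7": "dwdm-span-3"}
    roots, suppressed = split_alarms(
        runs_over, {"evc-cust-42", "sonet-ring-7", "dwdm-span-3"}
    )
    print(sorted(roots), sorted(suppressed))
```

With all three layers alarmed, only the lowest layer is reported as the root cause, and the two alarms above it are treated as symptoms.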

We have such a system up and running at WOW (Wide Open West), which is the ninth-largest cable provider in the US, with about one million subscribers.

Gregg, what does your solution look like from the user (operator) perspective?

Well, as I mentioned before, the catalyst for launching Centina in the first place was user problems in the NOC, so we’ve devoted a lot of time thinking how to improve the user experience.

Number one, everything is browser-based, making it easy to learn and configure. Being browser-based reduces ongoing administration time and costs, too.

Now in a legacy assurance system you usually get a couple of alarm dashboards, but the ability to interact with the alarm system is very limited. Users don’t have a lot of control over visualization and don’t have the ability to change the configuration of the system. With our NetOmnia product, however, the user can create new views and filter the results without touching the underlying database or doing any scripting. So if the user wants to look at performance in five separate regions or five separate technologies, it’s easy to create that view.

Say you want to automate action on an event. Well, you can set up a little workflow that says, “If I get a critical alarm and it’s been active for more than 10 minutes and nobody has acknowledged it, then send an SMS to everybody in the NOC to let them know somebody needs to get on this problem fast.” A workflow like that only takes a couple of minutes to create.
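The escalation rule in that example can be written down in a few lines of Python. This is a sketch under stated assumptions, not the product's workflow engine: the `Alarm` fields, the `send_sms` callback and the message text are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

ESCALATION_DELAY = timedelta(minutes=10)  # threshold taken from the example rule


@dataclass
class Alarm:
    """Hypothetical alarm record."""
    alarm_id: str
    severity: str
    raised_at: datetime
    acknowledged: bool = False
    cleared: bool = False


def needs_escalation(alarm, now):
    """Critical, still active, unacknowledged, and older than the delay."""
    return (
        alarm.severity == "critical"
        and not alarm.cleared
        and not alarm.acknowledged
        and now - alarm.raised_at > ESCALATION_DELAY
    )


def run_workflow(alarms, now, send_sms):
    """Call send_sms once per alarm that needs escalation; return their ids."""
    escalated = []
    for alarm in alarms:
        if needs_escalation(alarm, now):
            send_sms(f"Critical alarm {alarm.alarm_id} unacknowledged for over 10 minutes")
            escalated.append(alarm.alarm_id)
    return escalated


if __name__ == "__main__":
    now = datetime(2024, 1, 1, 12, 0)
    alarms = [
        Alarm("A1", "critical", now - timedelta(minutes=15)),
        Alarm("A2", "critical", now - timedelta(minutes=5)),
    ]
    print(run_workflow(alarms, now, send_sms=print))
```

Passing the SMS sender in as a callback keeps the rule itself separate from whatever messaging gateway an operator uses.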

Service providers and operators are looking at SDN and NFV to transform their business, so how has Centina prepared to meet the demands of virtualized networks?

Well, our solution can successfully support the current and evolving assurance needs of virtualized networks, and we have been seeing an increasing interest in service assurance. Through virtualization, service providers are not only decreasing their network maintenance and associated costs but are also increasing their capacity for service innovation and improving customer satisfaction, provided they have the right proactive service assurance solutions in place to monitor network lifecycle events. However, although the benefits of SDN and NFV are significant, service assurance is much more challenging, and even more necessary, than with traditional networks.

What is needed today is a comprehensive and adaptable service assurance solution that provides holistic service health and performance visualization across hybrid legacy and next-generation SDN/NFV networks. Ultimately, having an end-to-end, strategic service assurance solution in place will help providers successfully transition to virtual networks.

Whereas most vendors are working towards supporting virtual services and presenting mostly slideware at this point, NetOmnia is already integrated with OpenStack and OpenDaylight to support the assurance of SDN and NFV services.

Thanks, Gregg. It’s nice to know that an affordable and full-featured assurance solution like this is available. And for large telcos who are wedded to their current solution partner, awareness of these nice capabilities should help move some elephants.

This article was originally published by Black Swan and has been reproduced with their permission.

Dan Baker
Dan is a founder of the Technology Research Institute (TRI), which has published studies about the telecom software market since 1994.

As a journalist, Dan wrote for B/OSS magazine and recorded webinars with VanillaPlus before launching his own publication, Black Swan Telecom Journal.