Author Archives: kdahlke

Drive Down OpEx and Drive Up Efficiency

Want to reduce the time and money you spend on network operations? Start with smarter automation.

It’s Monday morning, and your network operations team is already buried in alarms. You know there’s a real problem, because the trouble tickets keep pouring in. Your customers don’t want excuses; they just want their services back up and running. But you’re still trying to figure out what exactly went wrong.

A network topology issue? A device misconfiguration or faulty software update? A problem with the application? Maybe the virtual machine (VM) that application is running on? Or the server hosting the VM? Or the SD-WAN edge? It may take hours to pinpoint the problem. In the meantime, you need to figure out which customers the issue will affect and make sure to resolve it in time to meet your contracted service-level agreement (SLA). The worst part: it’s still early. You’ve only seen a fraction of the hundreds of thousands of network events that hit your NOC every day.

It’s not a pretty picture. Unfortunately, it’s not a rare one either. As networks get more complex, managing services across heterogeneous network resources (physical, virtual, SD-WAN, NFV) keeps getting harder and more expensive. But it doesn’t have to. You can cut through the complexity and get to the root of network problems more quickly. You can automate many of the tasks that take hours to accomplish now. You can deliver excellent customer experience with less effort.

To do any of this though, you’ll need a new kind of service assurance—one that’s a lot more automated and intelligent.  One that can span the legacy networks in place and the new virtual networks and services you and your enterprise customers are deploying.  One that is designed for operator networks as they are today, not the way they were a decade ago.

Automate Everything

Too much of current network operations still depends on human decision-making, even as it’s become impossible for human beings to keep up with the complexity of service provider environments. And looking at the latest trends for global IP and mobile networks, the problem is only going to get worse.

According to the Cisco Annual Internet Report, over the next three years:

  • There will be 5.3 billion internet users by 2023
  • The number of devices connected to IP networks will be more than 3X the global population (29.3 billion networked devices by 2023)
  • Machine-to-Machine connections will represent 50% of the globally connected devices (approximately 14.7 billion M2M connections by 2023)

And that’s just raw growth in data and devices. The major technology transformations happening in networks will make service delivery even more complex:

  • The global edge computing market is expected to grow from USD 2.8 billion in 2019 to USD 18.36 billion by 2027 per Fior Markets
  • According to IDC Research, the SD-WAN infrastructure market is poised to reach USD 5.25 billion in 2023, growing at a 30.8% CAGR from 2018 to 2023

The only way to get a handle on this massive growth in the scale and complexity of network services: automate. VMware Telco Cloud Operations puts assurance automation in the hands of network operators. Telco Cloud Operations automatically tracks device configurations across your entire multivendor network, including the transport, physical, virtual and services layers. It continually maintains a stable configuration by monitoring for any configurations that are out of compliance, and it can take action to bring those devices back into compliance. When other issues are detected and alarms raised, Telco Cloud Operations correlates information and status from the entire network to determine the root cause and generates an alert or remediation workflow, all in an automated fashion, without the need for human intervention.
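
To make the compliance idea concrete, here is a minimal sketch in Python of checking running device settings against a baseline and correcting any drift. It is not how Telco Cloud Operations is implemented; the device names, setting keys and the `find_drift` and `remediate` helpers are all hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Drift:
    device: str
    setting: str
    expected: str
    actual: Optional[str]  # None means the setting is missing on the device


def find_drift(baseline, running):
    """Compare each device's running settings against the baseline policy."""
    drifts = []
    for device, settings in sorted(running.items()):
        for key, expected in sorted(baseline.items()):
            actual = settings.get(key)
            if actual != expected:
                drifts.append(Drift(device, key, expected, actual))
    return drifts


def remediate(running, drifts):
    """Set every drifted setting back to its baseline value (in place)."""
    for d in drifts:
        running[d.device][d.setting] = d.expected


# Hypothetical baseline policy and device state (illustrative values only).
baseline = {"ntp_server": "10.0.0.1", "snmp_version": "v3"}
running = {
    "edge-router-1": {"ntp_server": "10.0.0.1", "snmp_version": "v3"},
    "core-switch-7": {"ntp_server": "10.0.0.9", "snmp_version": "v3"},
    "vfw-12": {"ntp_server": "10.0.0.1"},
}

drifts = find_drift(baseline, running)
for d in drifts:
    print(f"{d.device}: {d.setting} expected {d.expected!r}, found {d.actual!r}")

remediate(running, drifts)
```

In a real network the remediation step would push configuration to the device rather than edit a dictionary, and it would usually sit behind an approval or change-control policy.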

What does all that automation mean for your network teams? Much faster incident detection and response, and a lot less manual effort and “firefighting.” With much of that manual effort automated, operations personnel can move from incident response to more strategic functions, working to resolve more complex issues and prevent future incidents rather than reacting to routine ones.

Take Network Intelligence to a Higher Level

Network teams already spend less time on mundane, repetitive tasks than they used to, because many of these tasks have been automated. What operators really need is a way to streamline the more complex processes that currently require human judgement and decision-making. This is where Telco Cloud Operations intelligence makes a significant leap.

  • Automate root-cause analysis: Current network management systems (NMS) bury teams in information. They provide massive amounts of data about the “symptoms” of an issue but leave it to human operators to translate those thousands of alerts into something that makes sense. Telco Cloud Operations correlates all active, inactive and unknown alarm statuses together with the network topology and relationship between devices to quickly uncover the root cause of the problem. It determines not only what’s causing the issue, but also which services and customers are affected. Your network operations team sees only the alerts that actually matter.
  • Adapt to changes dynamically: Many event management system (EMS) tools use rule-based engines to suppress redundant alarms and reduce the alarm storm. However, this requires a significant amount of time by skilled operators to first create the rules based on the network topology and relationship between devices and then continuously update these rules as new devices and services are changed or added. VMware Telco Cloud Operations updates itself automatically. It uses an advanced, multi-dimensional deterministic model-based engine that continually adapts to dynamic networks—saving thousands of personnel hours per year.
  • Address the most important problems first: Conventional NMS and assurance solutions are designed to solve technical problems, not business ones. If there’s a problem affecting multiple services and customers, for example, it’s still up to human beings to figure out how best to triage the response. Telco Cloud Operations business impact analysis tools can automate even this process. By assigning business impact scores to your various tenants and services, you can automatically prioritize incidents affecting your most important services, your highest-profile customers and the problems most likely to lead to costly SLA violations.
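
The three ideas above (separating root causes from symptoms, deriving the answer from the topology rather than hand-written rules, and ranking incidents by business impact) can be sketched in a few lines of Python. This is a toy graph traversal under stated assumptions, not the Telco Cloud Operations engine: the topology, the alarm set, the impact scores and the function names are all hypothetical.

```python
# Hypothetical topology: each element lists what it directly depends on.
DEPENDS_ON = {
    "host-1": [],
    "host-2": [],
    "vm-a": ["host-1"],
    "vm-b": ["host-1"],
    "vm-c": ["host-2"],
    "svc-voice": ["vm-a"],
    "svc-video": ["vm-b"],
    "svc-iot": ["vm-c"],
}

# Hypothetical business impact scores per service (higher = more important).
IMPACT_SCORE = {"svc-voice": 90, "svc-video": 40, "svc-iot": 70}


def upstream(element):
    """Return every element that `element` depends on, directly or indirectly."""
    seen = set()
    stack = list(DEPENDS_ON[element])
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(DEPENDS_ON[node])
    return seen


def root_causes(alarmed):
    """Alarmed elements with no alarmed upstream dependency; the rest are symptoms."""
    return sorted(e for e in alarmed if not (upstream(e) & alarmed))


def affected_services(cause):
    """Services that are, or depend on, the root cause, highest impact first."""
    hits = [s for s in IMPACT_SCORE if s == cause or cause in upstream(s)]
    return sorted(hits, key=lambda s: IMPACT_SCORE[s], reverse=True)


# Six raw alarms arrive; only two of them point at an actual cause.
alarms = {"host-1", "vm-a", "vm-b", "svc-voice", "svc-video", "vm-c"}
causes = root_causes(alarms)

# Rank incidents by the most important service each one affects.
incidents = sorted(
    ((c, affected_services(c)) for c in causes),
    key=lambda item: max((IMPACT_SCORE[s] for s in item[1]), default=0),
    reverse=True,
)

for cause, services in incidents:
    print(f"root cause {cause} affects {services}")
```

Because the symptoms are derived from `DEPENDS_ON`, adding a device or service means updating the topology model only; there are no suppression rules to rewrite by hand.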

There’s a Better Way to Run Your Network

How much time and money could you save if your network operations teams didn’t have to function as a human correlation engine? For large service providers and enterprises around the world using VMware Telco Cloud Operations, this is not an academic question. These operators are automating network discovery, device configurations, and compliance management. They’re automating the process of identifying true problems versus symptoms and driving down the mean time to detect and repair them. Most importantly, they’re focusing their time and effort where it matters most, on areas that directly impact revenue, instead of constantly putting out fires.

Want to learn more about what VMware Telco Cloud Operations can do for your organization? Visit: https://www.vmware.com/products/telco-cloud-operations.html


Blog by Karina Dahlke

Image source: stock.adobe.com



Fragmented Assurance Doesn’t Cut It Anymore

Today’s networks are too complex to manage domain by domain. It’s time for a unified view with unified assurance.

Imagine you’re the head of a crack team running video surveillance at a big department store. You have all the latest tools and security systems at your fingertips. Unfortunately, each department has its own separate system. Want to see what’s happening in Housewares? Just open this tool. How about Men’s Clothing? Swivel over to a different screen and open this other tool. Now Women’s Shoes? Swivel yet again—new screen, new tool. And by the way, each system is totally different, using different UIs, terminology and processes. So, if a problem crops up that spans multiple departments, hope you’re not in a big hurry to fix it.

Of course, this is ridiculous—nobody would design a surveillance system in such a fragmented way. It would take forever to investigate problems (not to mention dealing with the constant whiplash caused by all that swiveling). Unfortunately, it’s not that far from how communications service providers (CSPs) monitor and manage their networks today. One system for managing legacy physical devices. Another for virtualized networks. Another for SD-WAN.  Another for NFV.

It’s not that operators set out to make their lives more complicated. It’s just that, as CSP networks have evolved, service assurance hasn’t really kept pace. The result is an increasingly fractured view of the network and the services running on top of it. How does this lack of visibility impact CSPs and their customers? And what kind of capabilities do they need to stay on top of their networks as they are today, instead of a decade ago? Let’s take a closer look.


CSP Networks Keep Getting More Complicated

Not long ago, service provider networks were (from the perspective of assurance, at least) fairly uniform. Every IP service was delivered over physical devices that operators could monitor and manage, usually with just a few tools. Today, CSP networks are a complex mix of legacy physical devices, virtualized network functions (VNFs), software-defined wide-area network (SD-WAN) overlays that are often provided on a customer-by-customer basis, and more. What customers view as a “service” may now traverse multiple domains and both physical and virtual devices. But within CSP organizations, each area typically has its own set of monitoring tools, with specialized experts supporting them.

As Anil Rao, principal analyst for Analysys Mason, notes in Reimagining service assurance for NFV, SDN and 5G, “Existing physical networks will coexist with new NFV networks for the foreseeable future, creating a complex network environment and introducing a new dimension of assurance and operations complexity. New-age automated assurance systems must provide monitoring and operations automation capability for hybrid physical, virtual and cloud native networks and services.”


The Costs of Fractured Monitoring

Why is it imperative to gain a more unified view of the network? Because escalating complexity, paired with diminished visibility, leads to a host of negative business outcomes. As Appledore Research details in Rapid Automated Service Assurance in the NFV and SDN Network, running multiple assurance systems in “silos” leads to:

  • High operating costs as networks require more and more specialized tooling and expertise
  • Inability to quickly translate device- or domain-specific alarms into real-world business impact to customers
  • Slow, inefficient workflows to detect, diagnose and repair issues
  • Need to maintain multiple duplicate data sets and models, further increasing costs and complexity
  • Higher license fees for tools and software dedicated to each separate domain

These issues can carry a significant cost. According to some analysts, service providers lose $11,000 for every minute of downtime while they try to piece together what’s happening, where, and which customers it’s affecting. And that doesn’t include the hit CSPs take to customer satisfaction, which can be even more significant. As Analysys Mason analyst Terry van Staden notes, “We know from our related business services research that customer satisfaction has a substantial impact on churn and cross-selling potential. Our data supports the common-sense assumption that satisfied customers are far less likely to churn and more likely to purchase additional services than those that are unsatisfied.”

Addressing this fractured network view is important for current networks. But in the near future, as you roll out 5G services, it will become absolutely essential. 5G places demands on the network unlike anything CSPs have had to contend with before: Exponentially higher traffic density. Up to 77,000% higher throughput. Latency requirements up to 6,000x lower than previous-generation services. In a 5G world, network issues that introduce just a few milliseconds of delay can render critical applications (autonomous vehicles, remote telemedicine, industrial automation) completely unusable.

Anatomy of a Modern Approach to Service Assurance

These are serious issues, but they’re not intractable. To fix them, you need a more holistic way to monitor and manage services across your entire environment—physical, virtual, SD-WAN and more—in one place. A modern approach to service assurance should enable you to:

  • Bridge physical and virtual worlds: If you’re going to deliver better service experiences and meet more stringent SLAs, future service assurance tools should allow you to monitor and manage services traversing both physical and virtual domains through a single solution. They should provide a unified view centered on your customers and their services, not on the various infrastructure domains.
  • Visualize everything:  In today’s sprawling, dynamic operator networks, just knowing what’s out there is a huge challenge. A next-generation service assurance solution should automatically discover the topology of the entire multivendor network—including the transport, physical, virtual and services layers. It should automatically recognize when something in the network changes and update its relational map, so you’re never working from out-of-date information.
  • Manage all networks through a single pane of glass:  Next-generation assurance solutions should integrate service monitoring and network management end to end. That means correlating devices such as hosts, switches and routers with VMs, NFV, SDN and SD-WAN environments. It’s this correlation that empowers operations teams to identify faults and performance issues quickly. It’s also a necessary prerequisite for systems to respond to issues automatically and remediate the actual service rather than just a portion of the network.
  • Support multi-tenant, multivendor environments:  Just as you don’t want to have to rely on different tools for different network domains, you don’t want to have to use different tools for different customers. Instead, you should be able to monitor and proactively manage multiple customers, even with diverse environments, in one place. Your operations team should be able to visualize, analyze and optimize your environment to accelerate resolution times, assure high availability and meet stringent SLAs.
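
As a rough illustration of what a unified, customer-centered view means in practice, the sketch below merges inventories from three separate domain tools into one view keyed by tenant, then refreshes a single domain after rediscovery. It is a simplified sketch in Python, not any vendor's data model; the record fields, device names, tenants and the `unify` and `refresh` helpers are hypothetical.

```python
from collections import defaultdict

# Hypothetical records as three separate domain tools might export them.
physical = [
    {"id": "rtr-1", "kind": "router", "tenant": "acme"},
    {"id": "sw-4", "kind": "switch", "tenant": "globex"},
]
virtual = [{"id": "vm-a", "kind": "vm", "tenant": "acme", "host": "srv-9"}]
sdwan = [{"id": "edge-3", "kind": "sdwan-edge", "tenant": "globex"}]


def unify(**domains):
    """Merge per-domain inventories into one view keyed by tenant."""
    view = defaultdict(list)
    for domain, records in domains.items():
        for rec in records:
            view[rec["tenant"]].append({**rec, "domain": domain})
    return dict(view)


def refresh(view, domain, records):
    """Replace one domain's records after rediscovery; other domains are untouched."""
    fresh = {t: [r for r in recs if r["domain"] != domain] for t, recs in view.items()}
    for rec in records:
        fresh.setdefault(rec["tenant"], []).append({**rec, "domain": domain})
    return {t: recs for t, recs in fresh.items() if recs}


view = unify(physical=physical, virtual=virtual, sdwan=sdwan)

# A new VM appears for tenant "globex"; only the virtual domain is re-read.
rediscovered = virtual + [{"id": "vm-b", "kind": "vm", "tenant": "globex", "host": "srv-2"}]
updated = refresh(view, "virtual", rediscovered)

for tenant, records in sorted(updated.items()):
    print(tenant, [(r["id"], r["domain"]) for r in records])
```

Keying the view by tenant rather than by domain is the design choice that matters here: an operator asks "what does this customer depend on?" once, instead of asking each tool separately.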

Fortunately, capabilities like these are no longer science fiction. You don’t have to rely on network monitoring and management tools built for an earlier time. It’s time to step up to holistic, multi-layer service assurance.

To learn how VMware can help, visit us at VMworld 2019 or: VMware Smart Assurance

To learn how Dell EMC can help, visit: Dell EMC Service Provider Solutions

Blog by Karina Dahlke, Telco and Edge Cloud Business Unit, VMware