Small Wins, Big Impact: Seeking Early Wins in Telecom Sustainability

Much of the work happening in telco sustainability today has the potential to deliver groundbreaking new capabilities. New tools for real-time emissions tracking, carbon-aware workload scheduling, deep cooling, and more will make tomorrow’s telecom networks much more energy-efficient and less expensive to operate. In many cases, however, these projects are still in early stages, and communication service providers (CSPs) won’t realize those outcomes for several years.

There are, however, steps that CSPs can take right now: the “low-hanging fruit” of sustainability interventions. These interventions don’t involve huge technological or architectural change, but they do require CSPs to think differently about how they optimize and run their networks.

If you’re exploring green initiatives for your organization and looking for some early wins, start by revisiting the strategies you employ for redundancy, protection, and workload prioritization. You may find they’re based on assumptions that made sense in yesterday’s telecom networks, but that no longer apply. You might even find opportunities to realize significant power-efficiencies and cost savings right now—even as you wait for the more advanced sustainability initiatives of tomorrow.

Changing Ideas of Risk

CSPs have long prioritized “carrier-class reliability” above all else, doing everything possible to avoid failures. Just as networks and data centers have evolved, however, the way we think about concepts like risk and redundancy should evolve too. CSPs can gain significant efficiencies by embracing the modern cloud approaches to resiliency and fast recovery used in hyperscale data centers. Instead of emphasizing failure avoidance, they may find it’s far less expensive, and better for customers, to focus on minimizing the impact of failures and recovering quickly.

Rethinking how you assess risk doesn’t have to mean sacrificing 99.999% availability. For example, it’s now possible to assure “five-nines” reliability at the application stack level, where it matters most to customers, without necessarily building that level of reliability into the underlying infrastructure stack too. Ultimately, services need to be carrier-class reliable, not infrastructures. Customers don’t care if a host fails, or even an entire data center. They care that they can still make calls, send texts, stream video, and access business applications.
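
To see why application-level redundancy can reach five nines on less reliable infrastructure, consider a rough sketch. It assumes instance failures are independent and that failover is instantaneous, which real deployments only approximate; the function names and the 99% and 99.9% figures are illustrative, not measurements.

```python
def combined_availability(instance_availability, replicas):
    """Availability of a service that is up while any one replica is up.

    Assumes replicas fail independently (an idealization).
    """
    if not 0.0 < instance_availability <= 1.0:
        raise ValueError("instance_availability must be in (0, 1]")
    if replicas < 1:
        raise ValueError("replicas must be at least 1")
    return 1.0 - (1.0 - instance_availability) ** replicas


def replicas_needed(instance_availability, target=0.99999):
    """Smallest replica count whose combined availability meets the target."""
    if not 0.0 < target < 1.0:
        raise ValueError("target must be in (0, 1)")
    count = 1
    while combined_availability(instance_availability, count) < target:
        count += 1
    return count


# Illustrative inputs: instances that are individually 99% or 99.9% available.
print(replicas_needed(0.99))
print(replicas_needed(0.999))
```

Under these assumptions, a handful of modestly reliable instances can meet a five-nines target at the service level, which is the point of shifting reliability up the stack.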

More risk-averse operators have been slow to explore these possibilities. But the truth is that many concepts that used to be considered risky are quite safe in today’s virtualized, cloud-based environments. In fact, some longstanding ideas about the “right” way to build telco networks are now little more than myths. By busting some of these myths—recognizing when risk mitigation strategies are holdovers from legacy networks—you can achieve some significant sustainability wins.

Optimizing Power Consumption with Host and Distributed Power Management

Myth: Every part of the CSP infrastructure must be designed and run to deliver 100% reliability.

Reality: Modern clouds optimize for fast recovery, not nonstop availability.

Today, many CSPs still architect their networks for total protection in the event of a failure. This allows them to promise carrier-class reliability, but it also means that most of the time, just a fraction of the full capacity of the network is used, and large amounts of resources sit idle. Since an idle host—even when not running a VM—consumes 20-50% of its peak utilization power, those idle resources add up to significant energy consumption and costs.
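
As a back-of-the-envelope illustration of how idle draw adds up, the sketch below applies the 20-50% range to a hypothetical fleet. The host count and peak wattage are assumptions chosen for the example, not data from any real deployment.

```python
HOURS_PER_YEAR = 24 * 365


def idle_energy_kwh(idle_hosts, peak_watts, idle_fraction, hours=HOURS_PER_YEAR):
    """Energy drawn by idle hosts over a period, in kWh."""
    if not 0.0 <= idle_fraction <= 1.0:
        raise ValueError("idle_fraction must be between 0 and 1")
    return idle_hosts * peak_watts * idle_fraction * hours / 1000.0


# Hypothetical fleet: 100 idle hosts, each drawing 500 W at peak.
low = idle_energy_kwh(100, 500, 0.20)
high = idle_energy_kwh(100, 500, 0.50)
print(f"Idle draw: {low:,.0f} to {high:,.0f} kWh per year")
```

Multiplying the result by your own electricity tariff gives a first estimate of what idle standby capacity costs each year.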

In many cases, you can reduce wasted resources by consolidating workloads and using VMware Distributed Power Management (DPM). Using vSphere power management tools, you may be able to run some workloads more efficiently or, depending on your environment, shut hosts off entirely. The first step is to identify workloads and environments that can be optimized. You can then configure power profiles for various hosts that are better aligned with actual usage, all controlled by the VMware ESXi hypervisor.

The level of optimization possible will depend on the workload. For signaling and packet core workloads, for example, it can be difficult to gain much efficiency, and you’ll need to work with your vendors to determine where such efforts make sense. IT and OSS/BSS workloads, however, are excellent targets, as are dev/test and staging environments. These environments may be sized for full-load testing, with hosts continually consuming full power even though they only perform actual testing a few times per year.

You can also de-risk the environment to run at lower power states at baseline, while retaining full control to ramp up to high-power states in seconds when needed. Effectively, you use software-defined data center (SDDC) controls to configure a “break-glass-in-case-of-emergency” mechanism. Under normal conditions, the environment prioritizes power efficiency and savings. But if you have increased load or demand, you can flip a switch (or have ESXi automatically flip a switch) to maximum performance, and quickly revert to getting every last cycle or bit possible. You can also schedule these interventions, optimizing for performance during certain periods or times of day, or running at max performance at baseline and dialing back to low-power states during known periods of inactivity.
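
The decision logic behind such a “break-glass” mechanism can be sketched in a few lines. This is not a vSphere or ESXi API: the policy names, the 70% load threshold, and the function names are hypothetical, and a real deployment would apply the result through your own platform tooling.

```python
from datetime import time

HIGH_PERFORMANCE = "high-performance"
LOW_POWER = "low-power"


def in_window(now, start, end):
    """True if `now` falls in [start, end), including windows that wrap midnight."""
    if start <= end:
        return start <= now < end
    return now >= start or now < end


def choose_power_policy(now, load_pct, baseline=LOW_POWER, quiet_window=None,
                        break_glass_pct=70.0, manual_override=False):
    """Pick a power policy from time of day, load, and an operator override."""
    # Break glass: an operator override or high load always wins.
    if manual_override or load_pct >= break_glass_pct:
        return HIGH_PERFORMANCE
    # Scheduled dial-back during a known period of inactivity.
    if quiet_window is not None and in_window(now, *quiet_window):
        return LOW_POWER
    # Otherwise fall back to whatever baseline this environment uses.
    return baseline


# Max-performance baseline that dials back between 01:00 and 05:00.
print(choose_power_policy(time(3, 0), 20.0, baseline=HIGH_PERFORMANCE,
                          quiet_window=(time(1, 0), time(5, 0))))
```

Checking the override and load threshold first keeps the emergency path independent of any schedule, so a misconfigured quiet window cannot hold hosts in a low-power state under load.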

Minimizing Overprotection

Myth: Telco networks must be built with up to 200% capacity to assure against faults.

Reality: Modern clouds can manage capacity much more efficiently.

Many CSPs still employ the N+1 or N+2 model of building spare capacity, keeping up to 200% capacity in reserve, so that it’s available if needed. They might maintain hot or cold standby resources, depending on application and regulatory requirements, or use a mixed approach across different hosts. Regardless, this approach entails significant power consumption for resources that rarely get used.  

You don’t want to sacrifice service reliability for efficiency, but you can certainly look closely at your redundancy approach to assess whether the benefits are worth the cost. Do you really need that much spare capacity? Or are there places where fast horizontal scaling could protect you just as effectively with fewer wasted resources?

Look for easy wins in dimensioning as well. It’s common to find CSPs with multiple layers of spare capacity and failover redundancy operating in parallel. When you’re using multiple vendors and services, it’s easy to end up double- or triple-booking redundancy at the infrastructure, cloud, and application layers. Effectively, you’re paying for “N+2+2” redundancy (and the energy to power it), with little additional benefit. 
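
A simple tally makes stacked redundancy visible. The sketch below assumes each layer independently adds its own spare units on top of the N units actually required; the layer names and counts are hypothetical.

```python
def provisioned_units(required, spares_per_layer):
    """Total units deployed when every layer adds its own spares."""
    return required + sum(spares_per_layer.values())


def overhead_pct(required, spares_per_layer):
    """Spare capacity as a percentage of what the workload requires."""
    if required <= 0:
        raise ValueError("required must be positive")
    return 100.0 * sum(spares_per_layer.values()) / required


# Hypothetical "N+2+2" case: 4 required units, 2 spares at each of two layers.
spares = {"infrastructure": 2, "application": 2}
print(provisioned_units(4, spares), "units deployed")
print(f"{overhead_pct(4, spares):.0f}% overhead")
```

Listing spares per layer in one place is often enough to reveal where two teams or vendors are protecting against the same failure.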

Prioritizing Workloads

Myth: All VMs should be running full throttle at all times.

Reality: Some workloads are less critical than others and can be optimized for efficiency with little risk.

Another holdover from telecom networks past is an unstated belief that every VM must be implemented to deliver maximum performance at all times. But you can use modern cloud and traffic engineering techniques to prioritize those VMs that really are critical, while overcommitting and load-balancing those that can tolerate on-demand activation.

To take advantage of these interventions, you’ll first need to analyze your environment to determine which VMs do need to be running at 100% performance. You can then employ targeted mechanisms to ensure that those critical workloads are protected. Meanwhile, it may be possible to consolidate less critical workloads during periods of low utilization and shut down excess unused hosts. That consolidation alone can yield significant power savings. (Though again, optimizations available will depend on the type of workload, level of overprovisioning, and vendor sizing requirements.)
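
The consolidation step can be approximated with a first-fit-decreasing packing heuristic, sketched below. This is a simplification and not how any particular scheduler works: it considers a single resource dimension, and the VM names, loads, and 20% headroom are hypothetical.

```python
def consolidate(vm_loads, host_capacity, headroom=0.2):
    """Pack VM loads onto as few hosts as first-fit-decreasing allows.

    vm_loads maps VM name to load, in the same units as host_capacity.
    headroom is the fraction of each host kept free for bursts.
    """
    usable = host_capacity * (1.0 - headroom)
    hosts = []  # each host is a dict of VM name to load
    # Place the largest VMs first; they are the hardest to fit.
    for name, load in sorted(vm_loads.items(), key=lambda kv: kv[1], reverse=True):
        if load > usable:
            raise ValueError(f"{name} does not fit on a single host")
        for host in hosts:
            if sum(host.values()) + load <= usable:
                host[name] = load
                break
        else:
            # No existing host has room, so keep another host powered on.
            hosts.append({name: load})
    return hosts


# Hypothetical low-utilization period: five lightly loaded VMs.
loads = {"a": 30, "b": 25, "c": 20, "d": 10, "e": 10}
placement = consolidate(loads, host_capacity=100)
print(len(placement), "hosts needed")
```

If those five VMs currently occupy five hosts, the difference between that count and the packed result is the number of hosts that are candidates for standby during quiet periods.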

Utilizing SDDC Capabilities

Myth: Every environment should be built to the same minimum specifications.

Reality: Different environments have different real-world needs.

One of the easiest quick sustainability wins for CSPs is to just take better advantage of the flexibility and control of their software-defined data center. Simply using SDDC virtualization and abstraction capabilities as intended, you can achieve significant efficiencies.

For example, there’s no need to use dedicated storage arrays in all environments. Here again, test/dev environments can present excellent targets for optimization, especially when using tools like vSAN storage abstraction. Using just a few disks in a host for test/dev data will consume significantly less energy than an entire storage array, while reducing both CapEx and OpEx.

Keep Your Eye on the Numbers

Clearly, CSPs don’t have to wait for tomorrow’s innovations to start optimizing sustainability. By embracing modern cloud and SDDC approaches, it’s possible to achieve real wins in power consumption and operational efficiency now. Beyond the interventions discussed here, however, there are opportunities for bigger gains in the future.

In many cases, it’s not actually CSPs themselves insisting on building their environments to outdated specifications. Often, it’s application vendors requiring maximum performance at all times as a condition of support contracts. There’s no malice with this prescribed overprovisioning. It’s just vendors playing it safe, and it happens with every major technology evolution. In the early days of virtualization, for example, vendors demanded minimum resource requirements far higher than what was actually needed. It took years of real-world deployments before everyone understood what applications actually required in virtualized environments.

The same phenomenon is happening now in telco clouds. But CSPs can accelerate this process by using the tooling they’re already paying for to capture data about how applications perform. Using tools like Aria Operations, you can show your vendors hard data about the resources their workloads actually require, versus what they’re asking you to provision. As sustainability becomes more important to operators, enterprises, and vendors alike, these conversations will help everyone move to a greener future more quickly.
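
The comparison you bring to a vendor can be as simple as provisioned capacity versus a high percentile of observed usage. The sketch below assumes you have already exported utilization samples from your monitoring tool; it does not call any Aria Operations API, and the sample values and function names are hypothetical.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile of a non-empty list of samples."""
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct / 100.0 * len(ordered)))
    return ordered[rank - 1]


def provisioning_gap(provisioned, samples, pct=95.0):
    """Compare provisioned capacity with the observed percentile of usage."""
    observed = percentile(samples, pct)
    unused_pct = 100.0 * (provisioned - observed) / provisioned
    return {"provisioned": provisioned, "observed": observed, "unused_pct": unused_pct}


# Hypothetical vCPU usage samples for a workload sized at 16 vCPUs.
usage = [2, 3, 3, 4, 4, 5, 5, 6, 6, 12]
print(provisioning_gap(16, usage))
```

Using a high percentile rather than the average keeps the estimate conservative, which tends to make the conversation with a risk-averse vendor easier.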

Ready to talk to us to learn more? Contact us here.