Performance and Security Optimizations on Intel Xeon Scalable Processors – Part 2

Contributors

Manish Chugtu — VMware

Ramesh Masavarapu, Saidulu Aldas, Sakari Poussa, Tarun Viswanathan  — Intel

Introduction

Intel and VMware have been working together to optimize and accelerate microservices middleware and infrastructure, in both software and hardware, so that developers get best-in-class performance and low latency when building distributed workloads. The work focuses on improving performance, accelerating crypto operations, and strengthening security.

In Part 1 of this blog series, we looked at how Tanzu Service Mesh uses eBPF, in a non-disruptive manner, to achieve network acceleration by bypassing the TCP/IP networking stack in the Linux kernel. We loved the interest and feedback that post received. In Part 2, we take a deep dive into how Intel and VMware have been working together to accelerate Tanzu Service Mesh (and Istio) crypto use cases, specifically mutual TLS, and to improve the performance of asymmetric crypto operations by using the Intel AVX-512 Crypto instruction set available on 3rd Generation Intel Xeon Scalable processors.

Security is one of the key areas that a service mesh addresses. Tanzu Service Mesh provides multiple security features, including authentication and encryption through TLS and mTLS. A large number of TLS connections, whether at the Ingress Gateway or as mTLS connections between microservices, can affect both the performance and the availability of the proxies serving them.

Solution Design with Tanzu Service Mesh

Crypto operations can be symmetric or asymmetric. Asymmetric crypto operations are CPU intensive when implemented in software. TLS handshakes in cloud-native environments such as Tanzu Service Mesh rely on asymmetric crypto operations, which leads to performance challenges.

The solution uses the following libraries:

IPP (Intel® Integrated Performance Primitives) Crypto Multi Buffer Library

Intel AVX-512 uses the Single Instruction Multiple Data (SIMD) vector instruction capabilities of the CPU. Crypto instructions, called Intel AVX-512 Crypto, were recently added to this vector instruction set. Multi-buffer cryptography implemented with Intel AVX-512 uses the SIMD mechanism to gather up to 8 RSA or ECDSA operations and process them at the same time. As a result, TLS handshakes in Tanzu Service Mesh that are accelerated with Intel AVX-512 are executed in parallel, which improves performance.

Library location: Crypto Multi Buffer Library (see reference 3).

Figure 1: (Asynchronous TLS) using Intel® AVX-512 Crypto
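
The multi-buffer idea can be illustrated with a small Python sketch. This is only a model under stated assumptions: the function names (`process_batch`, `sign_all`) and the toy key values are hypothetical, and a plain loop stands in for the eight SIMD lanes, so nothing here runs in parallel or comes from the Crypto Multi Buffer Library API. It shows how pending private key operations are grouped into batches of up to 8, so that one batched call replaces up to eight single calls.

```python
# Illustrative model of multi-buffer batching; not the IPP Crypto MB API.
LANES = 8  # from the text above: up to 8 RSA or ECDSA operations per batch


def process_batch(messages, d, n):
    """Stand-in for one multi-buffer call handling up to LANES operations.

    Real hardware would run the lanes simultaneously with AVX-512; here a
    loop computes the same modular exponentiations one after another.
    """
    if len(messages) > LANES:
        raise ValueError("a batch holds at most %d operations" % LANES)
    return [pow(m, d, n) for m in messages]


def sign_all(messages, d, n):
    """Split pending operations into batches; return results and batch count."""
    results = []
    batch_calls = 0
    for start in range(0, len(messages), LANES):
        results.extend(process_batch(messages[start:start + LANES], d, n))
        batch_calls += 1
    return results, batch_calls


# Toy textbook RSA numbers (p=61, q=53), far too small for real use.
N, E, D = 3233, 17, 2753

if __name__ == "__main__":
    pending = list(range(2, 22))           # 20 pending handshake operations
    signatures, calls = sign_all(pending, D, N)
    print(calls)                           # 3 batched calls instead of 20
    # Each result verifies with the public exponent.
    print(all(pow(s, E, N) == m for s, m in zip(signatures, pending)))
```

In this model, 20 pending operations need 3 batched calls (8 + 8 + 4). The last batch is only half full, which is why the real provider also needs a timer.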

Tanzu Service Mesh Data Plane (Envoy – “BoringSSL”) Acceleration

The data plane Envoy proxies in Tanzu Service Mesh use “BoringSSL” as the default TLS library. BoringSSL supports setting private key methods for offloading asynchronous private key operations, and Envoy implements a private key provider framework that allows Envoy extensions to handle the private key operations of TLS handshakes (signing and decryption) through the BoringSSL hooks.

Figure 2: Envoy with Intel® AVX-512 Crypto

The CryptoMB private key provider is an Envoy extension that handles BoringSSL TLS RSA operations using AVX-512 multi-buffer acceleration. When a new handshake happens, BoringSSL invokes the private key provider to request the cryptographic operation, and control then returns to Envoy. The RSA requests are gathered in a buffer. When the buffer is full or the timer expires, the private key provider invokes AVX-512 processing of the buffer. When processing is done, Envoy is notified that the cryptographic operation has completed and that it may continue with the handshakes.

Figure 3: TLS Handshake with BoringSSL and CryptoMB PrivateKeyProvider
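
The gather-and-flush behavior described above can be sketched as follows. This is a simplified Python model, not Envoy's implementation: the class name `BatchQueue`, its methods, the 8-slot capacity, the 10 ms delay, and the injected `now` timestamps are all assumptions made for illustration, and the "processing" step is just a callback. It shows the two triggers from the text: the buffer becoming full, and the timer expiring.

```python
# Simplified model of the CryptoMB queue; names and structure are hypothetical.
class BatchQueue:
    def __init__(self, process, capacity=8, poll_delay=0.010):
        self.process = process        # called with the list of gathered requests
        self.capacity = capacity      # assumed: one slot per multi-buffer lane
        self.poll_delay = poll_delay  # seconds to wait for a partial buffer
        self.pending = []
        self.deadline = None          # set when the first request is queued

    def submit(self, request, now):
        """Queue one request; control returns to the caller right away."""
        if not self.pending:
            self.deadline = now + self.poll_delay  # start the timer
        self.pending.append(request)
        if len(self.pending) == self.capacity:     # trigger 1: buffer is full
            self._flush()

    def poll(self, now):
        """Called by the event loop; trigger 2: the timer has expired."""
        if self.pending and now >= self.deadline:
            self._flush()

    def _flush(self):
        batch, self.pending, self.deadline = self.pending, [], None
        self.process(batch)  # stands in for AVX-512 processing and notification


if __name__ == "__main__":
    flushed = []
    queue = BatchQueue(flushed.append)

    for i in range(8):                  # eight handshakes arrive together
        queue.submit("rsa-%d" % i, now=0.0)
    print(len(flushed), len(flushed[0]))    # 1 8 (flushed because full)

    queue.submit("rsa-8", now=1.0)      # a lone handshake arrives later
    queue.poll(now=1.005)               # timer not yet expired: nothing happens
    queue.poll(now=1.020)               # timer expired: partial batch is flushed
    print(len(flushed), flushed[1])     # 2 ['rsa-8']
```

The timer bounds how long a lone handshake waits for others to share its batch. A shorter delay favors latency under light load, and a longer one favors fuller batches under heavy load.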

In Tanzu Service Mesh, the CryptoMB private key provider configuration can be applied mesh-wide, per gateway, or per pod.

Tanzu Service Mesh – Performance and Benchmark Topology

We have been testing Tanzu Service Mesh (which is based on Istio) with this capability (introduced in Envoy 1.20 and Istio 1.14) enabled. Our measurements use different client tools (k6 and Fortio) and different setups (client, gateway, and server running on separate nodes), and every HTTP request creates a new TLS handshake.

Figure 4: Benchmark Topology
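
To make "a new TLS handshake with every HTTP request" concrete, here is a minimal Python sketch that uses only the standard library. It is not how k6 or Fortio work internally, and the helper names (`tls_handshake_seconds`, `measure`) are hypothetical names chosen for illustration. Each call opens a fresh TCP connection, completes a TLS handshake, and closes the connection, so no connection is reused across calls.

```python
import socket
import ssl
import statistics
import time


def tls_handshake_seconds(host, port=443, timeout=5.0):
    """Open a new connection, perform one TLS handshake, return its duration."""
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as raw:
        start = time.perf_counter()
        # wrap_socket completes the handshake before it returns.
        with context.wrap_socket(raw, server_hostname=host):
            return time.perf_counter() - start


def measure(handshake, count):
    """Run `handshake` `count` times and summarize the latencies in seconds."""
    samples = [handshake() for _ in range(count)]
    return {
        "count": len(samples),
        "mean": statistics.mean(samples),
        "max": max(samples),
    }
```

A call such as `measure(lambda: tls_handshake_seconds("gateway.example.com"), 100)` would collect 100 handshake latencies, where `gateway.example.com` is a placeholder for your own ingress gateway. Running the same measurement with and without acceleration enabled gives the kind of relative comparison discussed in this post.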

The potential performance benefit depends on many factors, such as the size of the ‘cpuset’ the proxy is running on, the incoming traffic pattern, the encryption type (RSA or ECDSA), and the key size. It is important to note that the focus was not on the general performance of Tanzu Service Mesh (TSM), but on relative performance with and without the crypto acceleration enabled.

Figure 5: Performance numbers with AVX-512 acceleration on Istio [Source]

The results we got in our test environments are impressive. The AVX-512 based acceleration improves TLS/SSL handshake and connection establishment by roughly 25% in latency and roughly 30% in performance (on Istio/Envoy), and we expect similar or better numbers in our test environment on Tanzu Service Mesh.

In the first two posts of this series, we showcased the work Intel and VMware have been doing to accelerate data path performance in the service mesh (Part 1: TCP/IP bypass using eBPF, and Part 2: TLS handshake acceleration). In the next part, we will look at another important aspect of service mesh, the security challenge of protecting the service mesh private keys, and discuss our solution for it.

NOTE: The integration of the TLS acceleration feature in Tanzu Service Mesh is in the development/testing phase. The underlying code for the various OSS projects mentioned in this blog (including Istio and Envoy) is open source (see the references below), and we look forward to more contributions in this space.

References

  1. Cryptography for Intel® Integrated Performance Primitives Developer Reference: https://www.intel.com/content/www/us/en/develop/documentation/ipp-crypto-reference/top/multi-buffer-cryptography-functions.html
  2. Envoy CryptoMB private key provider: https://www.envoyproxy.io/docs/envoy/latest/api-v3/extensions/private_key_providers/cryptomb/v3alpha/cryptomb.proto
  3. Crypto MB source code: https://github.com/intel/ipp-crypto/tree/develop/sources/ippcp/crypto_mb
  4. Istio 1.14: https://github.com/istio/istio/tree/release-1.14
  5. Envoy 1.20: https://github.com/envoyproxy/envoy/tree/release/v1.20
  6. https://www.intel.com/content/www/us/en/architecture-and-technology/crypto-acceleration-in-xeon-scalable-processors-wp.html