
Edge AI Success Demands Technology Curated for the Edge

While the media’s spotlight has been on generative AI and large language models (LLMs), these technologies are often impractical for industries that require real-time, local decision-making. Edge AI applications are driving transformation by leveraging computing power closer to where enterprise data is being produced and consumed. With real-time data processing at the edge, AI workloads can operate autonomously, creating opportunities for faster insights, improved customer experiences, and operational efficiency.

It’s crucial to understand the distinct requirements of edge AI applications and how to successfully deploy, run, and manage them to yield maximum business results.

According to Gartner, by 2029, at least 60% of edge computing deployments will utilize composite AI—both predictive and generative—compared to less than 5% in 2023 [Gartner, Market Guide for Edge Computing, March 2024]. This statistic highlights AI’s huge potential, but it’s far from a plug-and-play solution. To unlock key outcomes, you need to consider your application requirements, the locality of its data, and the timeliness of its processing and decision making.

Low latency, for example, is critical in computer vision use cases. In nearly all sectors, security and privacy must be addressed to some degree. Merchants in the retail sector, for instance, must typically protect cardholder data by following the Payment Card Industry Data Security Standard (PCI DSS) when they process credit card transactions. The right hardware is also essential: a smart factory, for example, requires robust edge computing capabilities delivered on ruggedized industrial computers.

Understanding what you intend to do with AI—whether using it in a private data center for IT workloads like LLMs or deploying it at the edge for operational technology (OT) tasks like loss prevention in retail—is key to determining where AI can be deployed most effectively and efficiently. Careful planning and execution are essential to success in either environment. This blog post looks at some high-level factors for distinguishing AI models and expectations, in order to home in on the key considerations for deploying AI successfully.

Artificial intelligence challenges

AI use cases come with distinct challenges, particularly in how applications are built and managed. Each application often involves separate logic and models, developed by different teams using different tools, each with its own lifecycle. Integrating these diverse workflows into your technology portfolio is therefore a significant challenge.

The complexity deepens when you factor in the wide range of hardware and devices involved, especially at the edge. As deployments expand across multiple locations, the complexity multiplies, making visibility and observability critical. Monitoring and measuring model performance becomes harder still because every site has its own environment, and these visibility challenges only grow with the scale of the deployment. A consistent orchestration platform that provides frictionless management of edge applications, devices, and infrastructure, coupled with intent-based planning, is what makes these challenges manageable.

LLMs run in data centers for a reason

When it comes to using large language models (LLMs) like ChatGPT, resource-intensive data centers are typically the ideal environment. LLMs require immense compute power, memory, and storage, which high-performance GPUs or TPUs in data centers are designed to provide. These facilities are equipped with the necessary infrastructure, including robust cooling systems, redundant power supplies, and high-speed networking, to support such demanding workloads. Additionally, data centers offer scalability, allowing training and inference processes for LLMs to run across many servers in parallel, handling large data sets efficiently. The flexibility to scale resources up or down based on demand is another key advantage of data centers.
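To see why, a quick back-of-envelope calculation helps. The Python sketch below is illustrative only: the parameter counts and bytes-per-parameter figures are generic assumptions, not measurements of any specific model.

    # Rough estimate of the memory needed just to hold LLM weights.
    BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "int8": 1}

    def weight_memory_gb(params_billions: float, precision: str) -> float:
        """Approximate memory (GB) to store model weights at a given precision."""
        return params_billions * 1e9 * BYTES_PER_PARAM[precision] / 1e9

    for size_b in (7, 70, 175):
        for precision in ("fp16", "int8"):
            print(f"{size_b}B params @ {precision}: ~{weight_memory_gb(size_b, precision):.0f} GB")

A 70-billion-parameter model at 16-bit precision needs roughly 140 GB for the weights alone, before activations, key-value caches, or batching, which is why such models are typically sharded across multiple data center GPUs rather than squeezed onto a single edge device.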

Latency is also an important factor to consider. Data centers excel in high-throughput batch processing, particularly when handling data already located centrally. However, when real-time data from various locations needs to be processed, latency can become a significant concern.
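As a simple illustration, consider the frame budget of a 30 fps camera feed. The figures below are assumptions chosen only to show how a WAN round trip can consume the entire budget; they are not measurements of any particular network or accelerator.

    # Illustrative latency budget for a real-time vision pipeline (assumed figures).
    FRAME_INTERVAL_MS = 1000 / 30      # 30 fps camera: ~33 ms per frame
    WAN_ROUND_TRIP_MS = 60             # assumed site-to-data-center round trip
    CLOUD_INFERENCE_MS = 15            # assumed inference time on a data center GPU
    EDGE_INFERENCE_MS = 25             # assumed inference time on a local edge accelerator

    cloud_total = WAN_ROUND_TRIP_MS + CLOUD_INFERENCE_MS   # 75 ms: misses the frame budget
    edge_total = EDGE_INFERENCE_MS                          # 25 ms: fits within ~33 ms

    print(f"Frame budget:           {FRAME_INTERVAL_MS:.0f} ms")
    print(f"Data center round trip: {cloud_total} ms")
    print(f"Local edge inference:   {edge_total} ms")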

Data privacy and security are also critical, especially when training large models on proprietary data sets. Data centers have advanced security protocols and can more easily apply compliance measures, ensuring sensitive data is protected throughout the AI development process.

AI at the edge is a different ballgame

For AI users, edge computing presents a different set of opportunities and challenges compared to data centers. Edge devices, such as IoT hardware and edge servers, operate under resource constraints, with limited computational power, memory, and storage compared to data centers. These devices are often optimized for energy efficiency and must work within their hardware limitations. However, their proximity to the data source enables lower latency, making edge computing ideal for real-time AI use cases such as computer vision, business intelligence, and augmented reality. This local processing speeds up response times and improves real-time decision-making, which is critical in these scenarios.

Edge computing also reduces the need for data to be transmitted to centralized data centers, conserving bandwidth and lowering the costs associated with data transmission. By processing data locally, edge devices minimize the volume of data that needs to be sent across networks. Privacy and security are also enhanced because sensitive data can be processed on the device itself, reducing the risks associated with transmitting it to a central server or the cloud. This is particularly important in retail and other fields that process personal or sensitive consumer data, including computer vision tasks that capture information such as facial data. Although some edge devices have more limited security capabilities than servers in a data center, local processing helps mitigate the risk of data interception.
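The sketch below shows this pattern in miniature. It is a minimal, self-contained Python example with hypothetical stubs: capture_frame, run_local_model, and publish_event stand in for a camera driver, an on-device model runtime, and a lightweight messaging client (for example, MQTT); they are not the API of any particular SDK. The structural point is that raw frames never leave the device, and only compact event metadata travels upstream.

    import random
    import time

    # Hypothetical stubs; a real deployment would wire in a camera driver,
    # an on-device inference runtime, and a messaging client instead.
    def capture_frame() -> bytes:
        return b"\x00" * (1920 * 1080 * 3)             # placeholder raw frame (~6 MB)

    def run_local_model(frame: bytes) -> list:
        # stand-in for on-device inference, e.g. an object detector on a local accelerator
        return [{"label": "person", "score": 0.91}] if random.random() > 0.7 else []

    def publish_event(event: dict) -> None:
        print("event:", event)                          # stand-in for publishing upstream

    def edge_loop(iterations: int = 5) -> None:
        for _ in range(iterations):
            frame = capture_frame()                     # raw image stays on the device
            detections = run_local_model(frame)         # local, low-latency inference
            if detections:
                publish_event({                         # only small metadata leaves the site
                    "timestamp": time.time(),
                    "labels": [d["label"] for d in detections],
                })
            time.sleep(1 / 30)                          # pace roughly at 30 fps

    if __name__ == "__main__":
        edge_loop()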

Edge AI vs. data center AI

Overall, edge computing is most suitable for applications that require real-time processing, low latency, and localized data analysis, making it ideal for operational technology (OT) workloads. These scenarios demand immediate data processing and decision-making, which thrive at the edge where devices are optimized to operate within hardware constraints. This is particularly relevant for applications like real-time analytics, predictive maintenance, and other localized AI tasks. In contrast, IT workloads, such as data-intensive processing and analytics, are best suited for the centralized power of data centers. Data centers provide the necessary resources for large-scale processing, offering the scalability and computational power required for tasks that don’t rely on real-time interaction. 
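One way to internalize this split is as a simple placement heuristic. The function below is an illustrative rule of thumb only; the thresholds are assumptions made for the sake of the example, not product guidance or a sizing methodology.

    def suggest_placement(latency_budget_ms: float,
                          data_generated_locally: bool,
                          model_size_gb: float) -> str:
        """Rough, assumption-based placement suggestion for an AI workload."""
        if latency_budget_ms < 100 and data_generated_locally and model_size_gb <= 8:
            return "edge"          # real-time, local data, fits constrained hardware
        if model_size_gb > 8 or not data_generated_locally:
            return "data center"   # large models or centrally pooled data
        return "either"            # no hard constraint; decide on cost and operations

    print(suggest_placement(33, True, 0.5))     # e.g. in-store computer vision -> edge
    print(suggest_placement(2000, False, 140))  # e.g. LLM fine-tuning -> data center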

Thus, while data centers remain the backbone for extensive, resource-heavy IT operations, edge computing excels in environments where immediate action is crucial. VMware Edge Compute Stack enables you to rapidly deploy and replicate edge AI applications at scale. With the power of Edge Compute Stack, you can orchestrate AI applications with zero touch and share resources such as accelerators, while gaining the flexibility to run different kinds of AI applications, such as computer vision, machine learning, or small language models.

Fitting edge AI into your enterprise’s strategy

In conclusion, driving business value with AI requires envisioning the right strategy and building the solution that best fits your enterprise needs. Just like any strategic initiative, AI demands careful planning and deployment. It’s crucial to ensure that the solution aligns with your specific requirements, rather than forcing your needs into a predefined framework. 

For organizations looking to build private AI platforms in data centers or the cloud, the software-defined data center solutions from VMware are ideal, especially when your AI journey involves leveraging proprietary information for LLMs. 

On the other hand, if your focus is on consuming AI at the edge—whether it’s for video inferencing, smart factory technologies, or customer engagement—VMware Edge Compute Stack is the optimal choice. By aligning your AI strategy with the right infrastructure, whether in the data center or at the edge, you can unlock the full potential of AI to drive your enterprise forward. 

To learn more about how the software-defined edge from VMware can help you unlock new opportunities, see our blog Get Ready: Edge AI Will Transform Your Business.