By Serdar Badem and Camille Crowell-Lee
Optimized intelligent applications powered by AI are the future of user-centric technology. With the evolution and adoption of Generative AI (GenAI), application experiences are expanding at an aggressive pace, and organizations across a wide spectrum of industries are struggling to find the right balance between user experience, innovation, and operational efficiency. Safely taking advantage of these rapidly evolving technologies requires careful planning and a clear understanding of the foundational elements that will help you succeed.
Foundational Considerations
In early 2023, the VMware Tanzu team developed a Retrieval-Augmented Generation (RAG) application for enterprises (more technical details on how we built this application are forthcoming). This RAG app enables users to ask questions about their cloud environments, including resources, metadata, configuration, and status, using natural language. Based on this experience building intelligent applications, and on our many customer engagements advising organizations on GenAI app delivery, we'd like to share our recommendations for enterprises that are considering building these apps.
Choosing the right models
Not all language models are equal. It's important to weigh licensing and cost, evaluate inference speed and precision against your performance needs, and account for context length and model size. With our RAG application, we initially leaned heavily on proprietary models: their better training and larger size made it easier to get started. This approach helped us quickly validate our use case and generate accurate GraphQL queries. However, our initial choice of LLM proved both slow and costly, prompting us to search for models better suited to our application. We experimented with faster, cheaper models and conducted rigorous testing to ensure their reliability. The key takeaway is that models are not one-size-fits-all, so be prepared to switch models as the situation warrants.
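Because we ended up changing models mid-project, it paid to keep the model behind a narrow interface so that a swap is a configuration change rather than a rewrite. The sketch below is illustrative, not our production code; both provider branches are hypothetical placeholders for real SDK calls.

```java
// A narrow seam between the application and any particular LLM provider.
// Swapping models then becomes a wiring change, not a rewrite.
public interface CompletionClient {
    String complete(String prompt);
}

// Example wiring: pick the implementation from configuration. Both branches
// below are hypothetical placeholders for real provider SDK calls.
class CompletionClients {
    static CompletionClient fromConfig(String provider) {
        return switch (provider) {
            case "proprietary" -> prompt -> callHostedApi(prompt);
            case "open-source" -> prompt -> callLocalModel(prompt);
            default -> throw new IllegalArgumentException("Unknown provider: " + provider);
        };
    }

    // Placeholders; in a real app these would call the provider's SDK.
    private static String callHostedApi(String prompt) { return "hosted response"; }
    private static String callLocalModel(String prompt) { return "local response"; }
}
```

With a seam like this in place, comparing candidate models on cost, latency, and accuracy becomes a configuration exercise rather than a refactoring project.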
Open source vs. proprietary models
Using open source models can reduce latency and costs and provide more control, especially for on-premises deployments. Enterprises should also weigh LLM safety when comparing model choices. When we started our RAG app journey, the quality of open source models was limited. However, with the release of more advanced open source LLMs, there are now several robust options for enterprise developers to consider.
Malicious actors have been known to exploit LLMs, forcing them to produce misinformation or damaging propaganda that can harm your organization's brand if left unchecked. This is why tooling for AI monitoring is emerging (more on that in a forthcoming blog).
Java enjoys first-class citizenship
Although a great number of intelligent apps powered by AI are written in Python, they don't need to be. There is nothing technical in LLMs that requires using Python to deliver this new breed of intelligent apps. Java remains the leading programming language for enterprise applications thanks to its stability, security, extensive library support, and robust performance. Our numbers show that 65% of all enterprise developers use Java, making it essential for business-critical applications. Development frameworks like Spring AI enable Java developers to interact with AI models and vector databases through familiar abstractions, rather than having to learn a new language and toolchain.
While we couldn't use Spring AI to build our RAG-LLM solution because it wasn't formally launched until November 2023, it should be considered moving forward for the reasons mentioned above: Spring AI can help your Java developers quickly create intelligent applications.
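To give a feel for the developer experience, here is a minimal sketch using Spring AI's fluent ChatClient API in a Spring Boot application. It assumes a chat model configured through standard Spring AI properties; exact class names and configuration have evolved across Spring AI releases, so treat this as illustrative rather than definitive.

```java
import org.springframework.ai.chat.client.ChatClient;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RequestParam;
import org.springframework.web.bind.annotation.RestController;

@RestController
class AskController {

    private final ChatClient chatClient;

    // Spring AI auto-configures a ChatClient.Builder for the model declared
    // in application properties (e.g., a hosted or local provider).
    AskController(ChatClient.Builder builder) {
        this.chatClient = builder.build();
    }

    @GetMapping("/ask")
    String ask(@RequestParam String question) {
        // One fluent call: send the user's question, return the model's text.
        return chatClient.prompt()
                .user(question)
                .call()
                .content();
    }
}
```

Because the model itself is selected through configuration, the earlier advice about being ready to switch models applies here too: changing providers is a properties change, not a code change.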
New types of operational frameworks
In addition to these foundational considerations, you will need to rethink your operational models for implementing GenAI-powered intelligent apps. Though we've mentioned some of the risks around LLM safety, you should also consider the changes that intelligent apps will require in your day-to-day operations.
Working with large language models (LLMs)
Working with LLMs presents unique challenges, one of which is evaluating the output. Unlike traditional software, LLMs can produce text with high confidence even when it's incorrect. This necessitates extensive validation logic that checks generated text for both syntactic and semantic correctness before running any queries.
Validation is crucial
Robust mechanisms are needed to ensure that any generated GraphQL or SQL queries are correct. Standard software engineering techniques should be used to verify the accuracy of the generated scripts.
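In our case the generated artifact was a GraphQL query, so the cheapest gate is a parse check before anything touches the API. The sketch below uses the graphql-java parser for the syntactic half; the semantic half (do the referenced types and fields exist in your schema, is the operation read-only?) depends on your application and is only outlined in comments.

```java
import graphql.parser.InvalidSyntaxException;
import graphql.parser.Parser;

final class GeneratedQueryValidator {

    /** Syntactic gate: refuse to execute anything that is not valid GraphQL. */
    static boolean isSyntacticallyValid(String generatedQuery) {
        try {
            new Parser().parseDocument(generatedQuery);
            return true;
        } catch (InvalidSyntaxException e) {
            return false;
        }
    }

    // Semantic checks come next and are schema-specific, for example:
    //  - validate the parsed document against your GraphQL schema
    //  - reject mutations when only read access is intended
    //  - bound the result size before execution
}
```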
Breaking down complex problems into smaller tasks
One of our first and most important lessons was the need to break complex tasks into smaller, manageable steps. In our case, directly generating a GraphQL query from a prompt was challenging, so decomposing the task into smaller steps improved both accuracy and performance, as the sketch below shows.
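As an illustration, here is how such a decomposition might look in code. The step boundaries and prompts are hypothetical, and `CompletionClient` is the provider-agnostic interface sketched earlier; the point is that each small step is easier to prompt, test, and validate than one monolithic "question to GraphQL" call.

```java
// Illustrative decomposition of "prompt -> GraphQL" into smaller LLM calls.
final class QueryPipeline {
    private final CompletionClient llm;

    QueryPipeline(CompletionClient llm) { this.llm = llm; }

    String generateQuery(String userQuestion) {
        // Step 1: identify the resource types the question is about.
        String entities = llm.complete(
            "List the cloud resource types mentioned in: " + userQuestion);

        // Step 2: pick the relevant fields for those types.
        String fields = llm.complete(
            "Given resource types [" + entities + "], list the fields needed to answer: "
            + userQuestion);

        // Step 3: only now ask for the final GraphQL query, with the
        // narrowed context from steps 1 and 2 included in the prompt.
        return llm.complete(
            "Write a GraphQL query over types [" + entities + "] selecting ["
            + fields + "] to answer: " + userQuestion);
    }
}
```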
Handling latency
Serially running multiple steps can introduce latency. To address this, you can optimize by running independent steps in parallel. However, some operations may still require further optimization. Replacing certain language model interactions with local software solutions can significantly reduce latency.
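For example, two steps that do not depend on each other's output can be dispatched concurrently with plain `CompletableFuture`, so the slower call hides the latency of the faster one. The prompts below are hypothetical, and `CompletionClient` is the interface sketched earlier.

```java
import java.util.concurrent.CompletableFuture;

// Independent LLM calls need not run serially: dispatch both, then join.
final class ParallelSteps {
    static String[] runIndependentSteps(CompletionClient llm, String question) {
        CompletableFuture<String> entities = CompletableFuture.supplyAsync(
            () -> llm.complete("List the resource types in: " + question));
        CompletableFuture<String> intent = CompletableFuture.supplyAsync(
            () -> llm.complete("Classify the intent of: " + question));

        // Block only once, after both calls are already in flight.
        return new String[] { entities.join(), intent.join() };
    }
}
```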
These are just a few things to consider when getting started with the development of AI-powered intelligent applications. The AI space is vast and ever-changing, however, so we'll be dedicating the next several weeks to topics related to this journey.
Moving towards mastery
The future of intelligent app delivery holds immense potential, and being at the forefront of this technological wave is both a challenge and an opportunity. The path to mastery has some known pitfalls, but VMware Tanzu can take some of the guesswork out of developing this next generation of applications.
In addition to Spring AI mentioned above, we just launched our public beta of GenAI on Tanzu Platform, which is open to existing Tanzu Platform customers who are interested in exploring how developers can leverage GenAI for their applications. The existing features are merely a fraction of the comprehensive functionality slated for release by VMware Tanzu, and we will have some exciting announcements leading up to VMware Explore. Watch this space for more to come!
Download the beta now to get started building intelligent applications powered by AI on Tanzu Platform!
Also, check out our latest Cloud Foundry Weekly podcast, where we dive into GenAI topics on Cloud Foundry.