GenAI apps, also known as Intelligent Apps, are the latest formulation of AI-capable applications. Unlike “traditional” AI apps, which chain pre-defined processes to deliver predictive answers, GenAI apps create new and unique interactions with end users. Because of their creative and autonomous nature, integrating generative AI models into apps is like embedding a living brain into a piece of software: they can offer delightfully rich responses, but they also create some unique challenges.
As development teams rush to build and deliver intelligent apps, platform engineering and operations teams face new problem areas. Managing these applications becomes significantly more demanding due to the rapid pace of change, and integrating GenAI models into apps also expands the attack surface. This post outlines what to consider when evaluating app platforms for this new set of challenges.
1) Navigating Faster Iteration Cycles in Intelligent App Development
One of the defining characteristics of GenAI app delivery is rapid iteration. Unlike traditional AI models that operate on set conditions, GenAI models generate new content, requiring frequent updates and experimentation. This demand for rapid iteration can be overwhelming for developers who are used to more deterministic data sources that make outputs and testing more predictable. The amount of hands-on work that model changes require can be an adjustment to a developer’s workflow.
To manage this, it’s important to establish a robust version control system and agile development practices. Incorporating flexible model update mechanisms through an app platform allows developers to quickly adapt to changes. Additionally, fostering a culture of experimentation encourages developers to explore new use cases and push the boundaries of what GenAI models can achieve. By proactively managing iteration cycles with an app platform, developers can keep pace with the dynamic nature of GenAI applications and deliver innovative solutions to end users.
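For example, one flexible update mechanism is to externalize the model identifier into configuration so a new model can be rolled out, or rolled back, without a code change. The sketch below is illustrative only; the environment variable name and model identifiers are hypothetical, not a specific platform’s API.

```java
// Illustrative sketch: resolve the active model from external configuration
// (environment variable, config service, etc.) instead of hard-coding it.
// The key "GENAI_MODEL" and the fallback model name are hypothetical.
public class ModelSelector {

    private static final String DEFAULT_MODEL = "provider/model-v1";

    public String activeModel() {
        String configured = System.getenv("GENAI_MODEL");
        return (configured == null || configured.isBlank()) ? DEFAULT_MODEL : configured;
    }

    public static void main(String[] args) {
        // Swapping models becomes a configuration change rather than a code change,
        // which keeps iteration cycles short.
        System.out.println("Active model: " + new ModelSelector().activeModel());
    }
}
```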
2) Reskilling Teams to Deliver Intelligent Apps
According to Stack Overflow’s 2024 Developer Survey, enterprise developers tend to lean toward Java. Generative AI applications often rely on Python due to its popularity and extensive library support for AI tasks. However, a significant challenge for enterprise developers is the prevalent skills gap in Python. For organizations looking to leverage GenAI quickly, this presents a dilemma: should they reskill their teams, or find a way to incorporate their existing skill sets?
Fortunately, new projects like Spring AI offer a solution, enabling Java developers to use familiar frameworks to deliver GenAI in their apps. By bridging the gap between Python and Java, organizations can leverage their existing teams’ strengths without extensive retraining. Data science teams can continue to use Python, while development teams collaborate more easily through Spring AI. This approach not only accelerates the development process but also ensures a smoother integration of GenAI capabilities into existing or new applications.
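As a rough illustration of how familiar this can feel to a Spring developer, the sketch below uses Spring AI’s ChatClient fluent API (exact method names may vary between Spring AI versions, and the endpoint path is made up for this example):

```java
// Minimal Spring AI sketch: a REST endpoint that forwards a user question to a
// configured chat model. Assumes a Spring Boot app with a Spring AI starter on
// the classpath so that a ChatClient.Builder bean is auto-configured.
import org.springframework.ai.chat.client.ChatClient;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RequestParam;
import org.springframework.web.bind.annotation.RestController;

@RestController
class ChatController {

    private final ChatClient chatClient;

    ChatController(ChatClient.Builder builder) {
        this.chatClient = builder.build();
    }

    @GetMapping("/ai/chat")
    String chat(@RequestParam String question) {
        // Familiar Spring idioms; the GenAI call is just another fluent chain.
        return chatClient.prompt()
                .user(question)
                .call()
                .content();
    }
}
```

The Spring conventions developers already know, such as dependency injection and REST controllers, carry over directly, which is what lightens the reskilling burden.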
3) Integrating and Managing Model Accuracy and Safety
Model accuracy and content safety are paramount in GenAI applications. Since GenAI models produce new content, there is a risk of generating incorrect or nonsensical information, known as model hallucinations. Furthermore, changes in data patterns, or the age of a model, can lead to model drift or stale models that impact the reliability of the outputs.
Figure 1. AI middleware is a category of capabilities to safely operate and monitor models in GenAI applications
To tackle these challenges, organizations need to implement comprehensive monitoring and evaluation frameworks. Establishing constraints and employing safety checks mitigates the risk of AI-generated hallucinations, and model observability becomes scalable when integrated through an app platform. When evaluating app platforms, look for features that automate organizational controls and streamline model audit and evaluation through AI middleware; these can significantly reduce the manual effort of managing models.
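To make the idea of a safety check concrete, here is a deliberately simple, hypothetical post-generation gate; in practice this role is played by AI middleware with richer checks such as groundedness, toxicity, and PII detection:

```java
// Hypothetical post-generation safety gate: the model's answer is only returned
// if it passes basic checks; otherwise a policy fallback message is used.
// The blocklist and length bound below are placeholders for real guardrails.
import java.util.List;

public class SafetyGate {

    private static final List<String> BLOCKED_TERMS =
            List.of("social security number", "internal only");
    private static final int MAX_RESPONSE_LENGTH = 4_000;

    public String filter(String modelAnswer) {
        String normalized = modelAnswer.toLowerCase();
        boolean containsBlockedTerm = BLOCKED_TERMS.stream().anyMatch(normalized::contains);

        if (containsBlockedTerm || modelAnswer.length() > MAX_RESPONSE_LENGTH) {
            return "This response was withheld by policy. Please rephrase your request.";
        }
        return modelAnswer;
    }
}
```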
4) Managing Model Costs and Calculating ROI
The cost of running GenAI models can quickly escalate, especially when they are hosted externally. Understanding and managing these expenses is crucial for achieving a positive return on investment (ROI). Tokens, the fragments of words a model processes, play a significant role in determining usage costs. While there have been recent cost reduction announcements for several pre-trained models, consumption-based pricing in general continues to create budget risk. Consequently, consider using an app platform that regulates access to GenAI models to prevent unintentional budget overruns during experimentation.
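A quick back-of-the-envelope estimate shows how token-based pricing adds up. All of the token counts, prices, and request volumes below are placeholders, not any vendor’s actual rates:

```java
// Hypothetical cost estimate for a token-priced model. Every number here is a
// placeholder chosen to illustrate the arithmetic, not real pricing.
public class TokenCostEstimator {

    public static void main(String[] args) {
        long promptTokens = 1_200;        // tokens sent to the model per request
        long completionTokens = 400;      // tokens generated per response
        double inputPricePer1k = 0.0005;  // $ per 1K input tokens (placeholder)
        double outputPricePer1k = 0.0015; // $ per 1K output tokens (placeholder)
        long requestsPerDay = 50_000;

        double costPerRequest = (promptTokens / 1_000.0) * inputPricePer1k
                + (completionTokens / 1_000.0) * outputPricePer1k;

        System.out.printf("Per request: $%.5f, per day: $%.2f, per 30 days: $%.2f%n",
                costPerRequest,
                costPerRequest * requestsPerDay,
                costPerRequest * requestsPerDay * 30);
    }
}
```

Even small per-request costs multiply quickly at production traffic levels, which is why platform-level access controls during experimentation matter.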
Once the app has been pushed to production, it’s important to focus on optimizing model spend. Organizations need to establish governance policies that define usage patterns and access controls, and app platforms should enforce consumption limits to protect against spikes in usage. In addition, models can be vulnerable to adversarial attacks that drive up usage and incur unexpected fees; in such scenarios, rate limiting through an app platform can mitigate ROI risks. By strategically managing model operating costs, organizations can maximize the value of their GenAI apps in production while maintaining financial sustainability.
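For illustration, here is a minimal fixed-window rate limiter for model calls. In practice this enforcement belongs in the platform’s gateway or AI middleware rather than in application code, and the limits shown are arbitrary:

```java
// Illustrative fixed-window rate limiter: allow at most maxCallsPerWindow model
// calls per time window, rejecting the rest. Thresholds are placeholders.
public class ModelCallLimiter {

    private final int maxCallsPerWindow;
    private final long windowMillis;
    private long windowStart = System.currentTimeMillis();
    private int callsInWindow = 0;

    public ModelCallLimiter(int maxCallsPerWindow, long windowMillis) {
        this.maxCallsPerWindow = maxCallsPerWindow;
        this.windowMillis = windowMillis;
    }

    public synchronized boolean tryAcquire() {
        long now = System.currentTimeMillis();
        if (now - windowStart >= windowMillis) {
            windowStart = now;   // start a fresh window
            callsInWindow = 0;
        }
        if (callsInWindow >= maxCallsPerWindow) {
            return false;        // reject: caller should back off or return a fallback
        }
        callsInWindow++;
        return true;
    }

    public static void main(String[] args) {
        ModelCallLimiter limiter = new ModelCallLimiter(100, 60_000); // 100 calls per minute
        System.out.println("Allowed: " + limiter.tryAcquire());
    }
}
```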
5) Optimizing GPU Usage for Compute Efficiency
GPUs are the backbone of many GenAI applications, providing the computational power needed to process complex models. However, efficiently utilizing these resources presents a challenge for platform engineering roles such as Site Reliability Engineers (SREs) and Infrastructure & Operations (I&O) leaders.
To optimize GPU usage, it’s essential to leverage high-performance CPU chipsets for experimentation phases while reserving GPU resources for production environments. This approach not only reduces costs, but also allows for thorough experimentation without overburdening GPU resources. By carefully managing compute resources, organizations can ensure optimal performance and scalability of their GenAI applications.
Minimize GenAI App Delivery Risks with Tanzu Platform
Delivering intelligent applications presents a unique set of challenges for leaders to consider, from bridging skills gaps to optimizing resources. However, these challenges also offer opportunities for innovation and growth. Discover how Tanzu AI Solutions can help leaders safely harness the differentiating value of generative AI to propel business growth while mitigating the associated risks.