
How to Connect your VMware Private AI Services Agents to OpenWeb UI

This article was written by Vincent La, Senior R&D Engineer in the Broadcom engineering team on VMware Private AI and edited by Justin Murray, Marketing Product Engineer in the VMware Cloud Foundation business unit within Broadcom.

Goals

Our overall goal with VMware Private AI Foundation with NVIDIA is to make developing and deploying AI applications easier for everyone on the VCF platform. We do this by giving you (1) a Model Store for governance of your on-premises models, (2) a Model Runtime Service for bringing one or more of those models to life, (3) a Data Indexing and Retrieval Service for placing your private data into a knowledge base backed by a vector database, and finally (4) an Agent Builder Service that ties all of these together into a running agent, which separate application logic calls to handle the AI part of the application. Once you have built your agent using the Private AI services, the next question is how to use that agent via its APIs. That question is the subject of this article, at the code level.

Pre-requisites

  • Your supervisor has Private AI Services (PAIS) installed
  • Your supervisor namespace has the PAISConfiguration Custom Resource Definition (CRD) running, and the API Server & UI are reachable from your network.
  • Your supervisor namespace has a completions Model Endpoint deployed. This is done using the Model Runtime Service within the set of Private AI Services.

Creating an Agent

We’ll start by creating an agent in the Private AI Services Agent Builder UI. Here we can specify the underlying model supporting the agent, set system instructions, enable tooling such as RAG, and define session information. For the purposes of this demo, we will create a simple agent that responds as if it were William Shakespeare.

First navigate to Agents in your Agent Builder and click on Create Agent.

Provide a Name and select the completion Model Endpoint from the Model endpoint dropdown. This will be the underlying model generating the responses.

Next, toggle Add Instructions and add the following instruction to the agent:

Respond to all requests as if you were the second coming of William Shakespeare

Add a Chat history max length of 10000 and click on Create.

After saving, you can test your agent’s responses in the chat window on the right before deciding to deploy it to an upstream service. Once you’re satisfied with the results, scroll to the bottom and you’ll see the Chat Interaction Sample Code, which can be used by the upstream service.


Connecting your Agent to an Upstream Service

We will be leveraging Openweb UI, an open source framework for interacting with local or externally hosted models, as our upstream service. This will be installed as a pod in our namespace with an nginx proxy to the Private AI OpenAI API endpoints.

Setting up Openweb UI

Note: Before proceeding, you will need access to the internal PAIS Services and certificate secrets.

Here we first set some environment variables and then generate two yaml files.

export PAIS_NAME=$(kubectl get paisconfiguration -o jsonpath='{.items[0].metadata.name}')
export PAIS_UID=$(kubectl get paisconfiguration -o jsonpath='{.items[0].metadata.uid}')
export PAIS_NGINX_IMAGE=$(kubectl get deployment pais-api-$PAIS_UID -o jsonpath='{.spec.template.spec.containers[1].image}')

export PAIS_SERVICE_TYPE=ClusterIP
export PAIS_STORAGE_CLASS_NAME=$(kubectl get storageclass -o jsonpath='{.items[0].metadata.name}')

The first file contains a deployment running Openweb UI alongside the nginx proxy. The second file contains an Ingress that exposes the deployed service.

Save the file above as /tmp/open-webui-components.yaml and apply it to your namespace with kubectl apply -f /tmp/open-webui-components.yaml.

Next, expose the service via ingress: create a file named /tmp/open-webui-ingress.yaml with the following contents.

Now apply it to your namespace with kubectl apply -f /tmp/open-webui-ingress.yaml.

Your Openweb UI should now be reachable at the defined ingress https://open-webui.local.

Proceed with creating an administrator account in Openweb UI. You now have Openweb UI deployed in your namespace, with a proxy at /mtls-proxy/ pointing to the Private AI Services OpenAI API endpoints.

Setting up the Pipe Function to work with Private AI Service Agent

Note: If you don’t want to customize your pipe, you can download a prebuilt one here.

Out of the box, Openweb UI doesn’t work with the OpenAI Assistants API; it works directly with models. However, you can extend Openweb UI’s capabilities through Functions, which are plugins for Openweb UI. These Functions are built in and run within the Openweb UI environment. We’ll write a Pipe Function to direct our completion requests to our Private AI Agent.

From Openweb UI’s website:

“A Pipe Function is how you create custom agents/models or integrations, which then appear in the interface as if they were standalone models.”

We will start off by creating a Pipe Function. Log in to your Openweb UI as the administrator, click the user icon in the left corner, and select “Admin Panel”.

From the Admin Panel screen, select the “Functions” tab.

Click the “+” icon and select “New Function”.

It will generate a Python file for you, which you can modify in the browser. It’ll look like this:

The generated example is a Filter Function, but what we need is a Pipe Function, so we’ll make a few modifications to suit our needs.

Start by declaring the Python packages we need. You can declare the packages your Pipe Function needs by adding a “requirements” line inside the multi-line comment at the top of the file; packages are comma separated. For example, to install the httpx and numpy packages we’d add this to the multi-line comment:

requirements: httpx, numpy

We can remove the inlet and outlet functions from the class and rename the class to Pipe. We’ll also add two new methods, pipe and pipes, to the renamed Pipe class. In the end it should look something like this.
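As a rough illustration, here is one way the reshaped Function could look. This is a hedged sketch, not the exact file from Openweb UI or the PAIS sample code: the proxy base path, the /assistants endpoint, and the response shape are assumptions based on the OpenAI-compatible proxy described above.

```python
"""
title: Private AI Agent Pipe
"""
# Hypothetical sketch of the reshaped Function. The proxy base path and
# the /assistants endpoint are assumptions, not documented PAIS paths.
import json
import urllib.request

PROXY_BASE = "http://localhost/mtls-proxy/v1"  # in-pod nginx proxy (assumed)


class Pipe:
    def __init__(self):
        self.assistants = []  # filled in by pipes()

    def _get(self, path):
        # Minimal GET helper against the in-pod proxy
        with urllib.request.urlopen(PROXY_BASE + path) as resp:
            return json.load(resp)

    def pipes(self):
        # Openweb UI calls this to discover the "models" this Function
        # exposes; we return the Agent Builder assistants.
        self.assistants = self._get("/assistants").get("data", [])
        return [{"id": a["id"], "name": a.get("name") or a["id"]}
                for a in self.assistants]

    def pipe(self, body: dict):
        # Chat requests for a selected agent land here; this is where the
        # Chat Interaction Sample Code from Agent Builder goes.
        raise NotImplementedError
```

The inlet/outlet methods of the generated Filter Function are gone; pipes advertises the agents, and pipe handles the actual chat traffic.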

Next we’re going to support multiple Private AI Agents in Openweb UI by defining them in the pipes method of the Pipe class.

When the Pipe Function is activated, Openweb UI calls the pipes function, which makes a request to the Private AI Services proxy to fetch the list of assistants from the Private AI Services Agent Builder and saves it to the assistants property of the Pipe class.

By default, chats with models in Openweb UI direct requests to the /completions endpoint, but in our case we want to direct chat requests to the agent instead. We do this by defining a pipe method on the Pipe class.

You’ll see that in this section of the pipe function we’re adding the Chat Interaction Sample Code we got from the Agent Builder service, with a few modifications.

For the model, we read it from the assistants list we saved earlier when the pipes function was called. For the messages, we read the Openweb UI chat window contents, which are accessible from the body parameter.
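To make that mapping concrete, here is a small, hedged sketch of the routing logic as standalone helpers. The “function id dot assistant id” model format reflects how Openweb UI labels pipe-provided models and may vary by version; the payload shape is purely illustrative, since the real request body comes from your Chat Interaction Sample Code.

```python
# Hypothetical helpers for the pipe() method; the names and the payload
# shape are illustrative, not the PAIS API.

def extract_assistant_id(model_field: str) -> str:
    # Openweb UI passes the selected model as "<function id>.<assistant id>";
    # keep the part after the last dot.
    return model_field.rsplit(".", 1)[-1]


def build_agent_request(body: dict) -> dict:
    # Model: resolved from the assistants list saved by pipes();
    # messages: taken from the chat window, available on the body parameter.
    return {
        "assistant_id": extract_assistant_id(body["model"]),
        "messages": body.get("messages", []),
    }
```

Inside pipe, you would build this request from body and send it to your agent endpoint in place of the default /completions call.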

After saving the file, make sure to activate the Pipe Function by toggling it on in the Functions tab of the Admin Panel.

Once activated, you are now able to select your Agents from the models dropdown when chatting with a model in Openweb UI.
