Purdue GenAI Studio
Purdue GenAI Studio is a GenAI service that makes open-source models like LLaMA accessible to Purdue researchers. It is hosted entirely on on-premises resources at Purdue, providing not only democratized access but also an additional layer of control compared to commercial services. Chats, documents, and models are not shared between users or used for training.
There are two modalities for interacting with GenAI Studio: a web UI and an API. Additional functionality for both modalities is under active development. The system is integrated with a PostgreSQL vector database in the backend to enable retrieval-augmented generation (RAG).
Note that this service is a pilot and provides only limited safety measures. Models may hallucinate or generate offensive content. GenAI Studio should not be used for any illegal, harmful, or violent purposes.
Do not input, by any method, any data into these systems that your research institution would consider sensitive or proprietary. Do not input, by any method, any data into these systems that is regulated by State or Federal Law. This includes, but is not limited to, HIPAA data, Export Controlled data, personal identification numbers (e.g., SSNs), and biometric data.
Access
Navigate to https://genai.rcac.purdue.edu/ and log in using CILogon and your standard Purdue SSO account.
User Interface
Chat Interface
Model Selection
The chat interface allows you to select from a list of available models to interact with. This list includes both base models available to all users and any custom models you have created (covered below).
You may also select multiple models to compare their output for any prompt. In this case, your prompt will be sent to all selected models and the results will be displayed side by side.
If there are additional open-source models you would like to use that are not available in the drop-down, please submit a ticket request. We are able to provide access to most open-source models.
Sending a Message
Sending a prompt is as simple as typing text into the message bar. By clicking the microphone or headset buttons, you can also use a speech-to-text service or have a "call" with the model, where you speak and the model speaks back.
You can also add a document to give the model specific context for responding to the prompt. If you would like this context to persist, upload it to Documents or add it to a custom model as discussed below. To add a previously uploaded document to the context of a chat, type '#' followed by the name of the document before entering your prompt.
When you get a model response, you can take various actions, including editing, copying, or reading the response aloud, and you can view statistics about the generation using the options at the bottom of the response.
Other Controls
From the top left of the screen, you can access controls that let you tweak the parameters and system prompt for the current chat. To persist these settings, use a custom model. From these options you can also download chats and change other settings related to the UI.
On the left of the screen you can create new chats, access your chat history, and access the workspace where you can upload documents and create custom models.
Workspace
From the workspace, you can upload documents and create custom models with RAG functionality.
Documents
In the Documents tab, use the '+' button to upload documents. Select documents using the upload functionality, then optionally add them to a collection by specifying a name. Collections can be used to create different groupings of documents, for example to build custom RAG setups focused on different tasks. Documents will not be shared across users, but please do not upload documents that contain sensitive information or are subject to regulations.
Models
To create a custom model, navigate to the Models tab in the workspace and click “Create a model”.
From here, you can customize a model by specifying the base model, system prompt, and any context documents or collections of documents, among other customizations. Documents/collections must be uploaded in the Documents tab before they will be accessible here. Once the model has been created, it will show up in the model list in the chat interface.
API
To use the API, you will first need to find your API key, which is accessible from the UI. Click on your user avatar and navigate to Settings, then to the Account page. There you will see an API Keys section; expand it and either copy the JWT token or create an API key.
To connect to the API, there are a few endpoints you can use depending on your use case. The primary way these endpoints differ is in the format of the response.
The Ollama endpoint is https://genai.rcac.purdue.edu/ollama/api/chat.
For a streaming response, this will return results formatted like:
{"model":"llama3.1:latest","created_at":"2024-11-15T18:19:55.588019874Z","message":{"role":"assistant","content":"I"},"done":false}
For an OpenAI API-compatible response, use https://genai.rcac.purdue.edu/api/chat/completions.
For a streaming response, this will return results formatted like:
data: {"id":"chatcmpl-808","object":"chat.completion.chunk","created":1731694445,"model":"llama3.1:latest","system_fingerprint":"fp_ollama","choices":[{"index":0,"delta":{"role":"assistant","content":"I"},"finish_reason":null}]}
To use either endpoint, insert your chosen endpoint and API key into the example Python code below:
import requests

# Fill in your chosen endpoint and key below.
url = "{insert chosen endpoint here}"
jwt_token_or_api_key = "{insert your JWT token or API key here}"

headers = {
    "Authorization": f"Bearer {jwt_token_or_api_key}",
    "Content-Type": "application/json"
}
body = {
    "model": "llama3.1:latest",
    "messages": [
        {
            "role": "user",
            "content": "What is your name?"
        }
    ],
    "stream": True
}

response = requests.post(url, headers=headers, json=body)
if response.status_code == 200:
    print(response.text)
else:
    raise Exception(f"Error: {response.status_code}, {response.text}")
This will return output in a JSON format along with metadata.
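Note that even with "stream": True in the body, the example above buffers the whole response before printing it. To display tokens as they arrive from the OpenAI-compatible endpoint instead, the data:-prefixed lines can be decoded one at a time. The following is a minimal sketch, reusing url (set to the OpenAI-compatible endpoint), headers, and body from the example above; the [DONE] end-of-stream sentinel is assumed here, following the usual OpenAI streaming convention:

import json
import requests

# Pass stream=True to requests so lines can be read as they arrive.
with requests.post(url, headers=headers, json=body, stream=True) as response:
    response.raise_for_status()
    for line in response.iter_lines():
        if not line:
            continue
        payload = line.decode("utf-8")
        if not payload.startswith("data: "):
            continue
        payload = payload[len("data: "):]
        if payload == "[DONE]":  # assumed end-of-stream sentinel (OpenAI convention)
            break
        chunk = json.loads(payload)
        delta = chunk["choices"][0]["delta"]
        print(delta.get("content", ""), end="", flush=True)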