By default, all functions deployed on Cerebrium are a REST API that are accessible through an authenticated POST request. We have made all these endpoints OpenAI-compatible, whether they useDocumentation Index
Fetch the complete documentation index at: https://cerebrium-assembly-ai.mintlify.app/llms.txt
Use this file to discover all available pages before exploring further.
/chat/completions or /embedding. Below we show you a very basic implementation of a streaming OpenAI-compatible endpoint.
We recommend you check out a full example of how to deploy an OpenAI-compatible endpoint using vLLM here.
To create a streaming-compatible endpoint, we need to make sure our Cerebrium function:
- Specifies all the parameters that OpenAI sends in the function signature.
- Returns
yield data, whereyieldsignifies we are streaming anddatais the JSON-serializable object that we are returning to our user.