OpenAI API Overview: ChatGPT, DALL-E, Whisper and more.

Lachlan (Admin), edited June 2023 in Working with APIs

Table of Contents

  • How to Leverage OpenAI with Xano
    • What is OpenAI?
    • What does OpenAI do?
    • How can I utilize OpenAI with Xano?
    • API Pricing
    • Authentication / Headers
      • Requesting organization
    • Making requests
    • Models
      • Chat Models
      • Vector Representations and Embeddings
        • What are Embeddings?
        • What are Tokens?
        • Token Limits
      • Audio Models
        • Transcribe
        • Translate
      • Image Models
        • Generate
        • Edit
        • Variation
      • Completions
      • Edits

What is OpenAI?

OpenAI is an AI company based in San Francisco. They are best known for developing GPT-3, a powerful LLM trained on billions of words from the Internet. GPT-3 shows how LLMs can be applied in many creative ways without needing to code. With a bit of guidance, these models can build simple apps, generate content, converse, and more.

OpenAI Overview Slides:
https://photos.app.goo.gl/aAaryNb6DBG7TnUaA

What does OpenAI do?

OpenAI has developed several powerful AI models that are available to the public, including GPT, a natural language processing model. GPT can write coherent and convincing text, answer questions, and even generate code. Other models developed by OpenAI include DALL-E, which generates images from textual descriptions, and Whisper, which transcribes audio files to text. They also offer an embeddings API that converts text and documents into vector representations, which are useful for adding context to prompts when working with larger datasets.

How can I utilize OpenAI with Xano?

OpenAI has made their models accessible via API, meaning we can interact with them directly from Xano. These models are mostly driven by natural-language text inputs, which makes them extremely accessible and easy to work with. With a small amount of effort, huge results can be achieved.

Getting Started

To get started, familiarize yourself with OpenAI's API reference docs here: API Reference

If you don't know what you're looking at, that's OK; we'll explain how to use the reference and the models throughout this article.

OpenAI’s API Pricing

Unlike ChatGPT's web app, using OpenAI's API is a paid service. You can find a full breakdown of the API pricing here: OpenAI Pricing

API Cost Warning

Models such as GPT-4 can be expensive. Providing GPT-4 with its largest possible input prompt (32k tokens) can cost $1.92 USD for the input processing, and if the output reaches its maximum size (32k tokens) it will cost a further $3.84 for the response, totaling $5.76 USD for a single API call.


Be sure to understand the costs of the model prior to using it.
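As a sanity check, the arithmetic behind that warning can be sketched in a few lines of Python. The per-1K-token rates below are the ones implied by the figures above and are illustrative only; always confirm against OpenAI's pricing page.

```python
# Rough cost estimate for a large model call, using per-1K-token rates.
# Rates here ($0.06 input / $0.12 output) are illustrative assumptions
# matching the GPT-4-32k figures quoted above -- check the pricing page.

def estimate_cost(input_tokens, output_tokens,
                  input_rate=0.06, output_rate=0.12):
    """Return the estimated USD cost given token counts and per-1K rates."""
    return (input_tokens / 1000) * input_rate + (output_tokens / 1000) * output_rate

# A maximal 32k-in / 32k-out call:
print(round(estimate_cost(32000, 32000), 2))  # 5.76
```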

Authentication / Headers

The OpenAI API uses API keys for authentication. Visit your API Keys page to retrieve the API key you'll use in your requests. You’ll need to register for an OpenAI account if you haven’t already.

Create an environment variable via the Settings section in Xano and add your OpenAI API Key.

All OpenAI API requests should include your API key in an Authorization HTTP header as follows:

Authorization: Bearer OPENAI_API_KEY


How can you do this in Xano?

When adding an external API request to your function stack you will notice there is an input field for headers.

The headers input is an array, so to add something to the headers we use the PUSH filter, which appends an item to the end of an array.

We need to push our Authorization string through in order to authenticate our requests. You can paste in the example: Authorization: Bearer $OPENAI_API_KEY

However, we want to update the string to dynamically include our OpenAI API key which we stored as an environment variable. We can do this using the REPLACE filter.
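Conceptually, the PUSH and REPLACE steps amount to a string substitution plus an array append. Here is a minimal Python sketch of the same idea outside Xano, reading the key from an environment variable (the `sk-...` fallback is just a placeholder):

```python
import os

# Sketch of what Xano's PUSH and REPLACE filters do conceptually:
# substitute the environment variable into the Authorization template,
# then push the result onto the headers array.

headers = []  # Xano's headers input is an array

template = "Authorization: Bearer $OPENAI_API_KEY"

# REPLACE: swap the placeholder for the real key (here read from the
# environment; in Xano it comes from your environment variables)
auth_header = template.replace(
    "$OPENAI_API_KEY", os.environ.get("OPENAI_API_KEY", "sk-placeholder"))

headers.append(auth_header)  # PUSH: add to the end of the array
print(headers[0])
```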

It will look like this:

Don't forget to save your changes!

Requesting organization

For users who belong to multiple organizations, you can pass a header to specify which organization is used for an API request. Usage from these API requests will count against the specified organization's subscription quota. (Skip this if you are only part of one organization).

Example curl command:

curl https://api.openai.com/v1/models \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -H "OpenAI-Organization: org-c0vZYfhzt6L7XJqSl9ZysuSL"


Making requests

You can make requests to OpenAI's API endpoints using Xano. For example:

curl https://api.openai.com/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -d '{
     "model": "gpt-3.5-turbo",
     "messages": [{"role": "user", "content": "Say this is a test!"}],
     "temperature": 0.7
   }'


These example curl requests can be found throughout the API reference docs and can be copied and imported into Xano, saving a heap of time. Start by copying the example curl request for the model you would like to use:

When adding an external API request to your function stack you’ll see the IMPORT CURL button in the top right-hand corner. Pasting in the curl will populate the required input fields as per the API docs specifications. You will just need to update the prompt input and add your API key via the steps shown above.

Try it out for yourself using the example curl provided above. 💪🔥🤖

continued..

Comments

  • Lachlan (Admin), edited June 2023

    Models

    List models

    GET https://api.openai.com/v1/models

    Lists the currently available models, and provides basic information about each one such as the owner and availability.

    Example curl:

    curl https://api.openai.com/v1/models \
      -H "Authorization: Bearer $OPENAI_API_KEY"
    
    

    Chat Models

    ChatGPT 3.5 & 4

    The OpenAI Chat API is a powerful tool that allows developers to integrate AI-powered conversational capabilities into their applications. The API uses models like gpt-3.5-turbo to generate responses in a chat-like format.

    Here's a breakdown of how to use the API:

    1. Endpoint: The endpoint to create a chat completion is POST https://api.openai.com/v1/chat/completions.
    2. Body Parameters

      Required:
      • model: The ID of the model to use; "gpt-3.5-turbo" is a recommended model.
      • messages: An array of message objects that describe the conversation so far. Each message object should have a role (either "system", "user", or "assistant") and content (the content of the message).

        Optional:
      • temperature and top_p: These parameters control the randomness of the model's output. You generally should alter one or the other, but not both.
      • n: The number of chat completion choices to generate for each input message.
      • stream: If set to true, partial message deltas will be sent as they become available. (Not supported with Xano currently)
      • stop: Sequences where the API will stop generating further tokens.
      • max_tokens: The maximum number of tokens to generate in the chat completion.
      • presence_penalty and frequency_penalty: These parameters can be used to control the model's tendency to introduce new topics or repeat itself.
      • logit_bias: This allows you to modify the likelihood of specified tokens appearing in the completion.
      • user: A unique identifier representing your end-user.
    3. Response: The response from the API will include the ID of the chat completion, the created timestamp, the generated message from the assistant, and usage information.

    Here's an example curl to try yourself:

    curl https://api.openai.com/v1/chat/completions \
      -H "Content-Type: application/json" \
      -H "Authorization: Bearer $OPENAI_API_KEY" \
      -d '{
        "model": "gpt-3.5-turbo",
        "messages": [{"role": "system", "content": "You are a helpful assistant."}, {"role": "user", "content": "Hello!"}]
      }'
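Once the request succeeds, the assistant's reply sits inside the choices array. Here is a short Python sketch of pulling it out; the response values are made up, and only the structure mirrors the API docs:

```python
import json

# Illustrative chat completion response (values invented for the
# example -- only the structure matters here).
sample = '''{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1686000000,
  "model": "gpt-3.5-turbo",
  "choices": [
    {"index": 0,
     "message": {"role": "assistant", "content": "Hello! How can I help you today?"},
     "finish_reason": "stop"}
  ],
  "usage": {"prompt_tokens": 19, "completion_tokens": 10, "total_tokens": 29}
}'''

response = json.loads(sample)

# The assistant's reply lives under choices[0].message.content
reply = response["choices"][0]["message"]["content"]
print(reply)  # Hello! How can I help you today?
```

In Xano you would reach the same value by drilling into the external API request's result variable along the path choices.0.message.content.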

    Here are four sample applications that could be built using this API:

    1. Customer Support Chatbot: You can create a chatbot that can handle customer inquiries, provide information about products or services, and help resolve common issues.
    2. Virtual Assistant: You can build a virtual assistant that can help users with tasks like setting reminders, sending emails, or finding information online.
    3. Interactive Storytelling: You can create an interactive storytelling application where the user can have a back-and-forth conversation with characters in the story.
    4. Language Learning App: You can build an app where users can practice conversing in a new language with an AI assistant.

    Tutorial Video:

    Vector Representations (Embeddings)

    Vector representations, or embeddings, are a way to represent words or pieces of text as mathematical vectors that capture their semantic meaning. OpenAI offers an embeddings API that allows you to generate vector representations for input texts. These embeddings can then be used to add context to prompts for OpenAI's models or stored in a vector database for later use.

    For example, you could generate embeddings for your product documentation or knowledge base using OpenAI's API. Then you could store those embeddings in a vector database like Pinecone and use Xano to query the database to create a chatbot trained on your own data. The chatbot could handle questions about your products and documentation using the semantic information captured in the embeddings.

    Why are Embeddings needed?

    The reason embeddings are important is the limit on how much text can be given to a model, or the "prompt size". Each model has a token limit which defines how large the prompt can be.

    What are tokens?

    Tokens can be thought of as pieces of words. Before the API processes the prompts, the input is broken down into tokens. These tokens are not cut up exactly where the words start or end - tokens can include trailing spaces and even sub-words. Here are some helpful rules of thumb for understanding tokens in terms of lengths:

    • 1 token ~= 4 chars in English
    • 1 token ~= ¾ words
    • 100 tokens ~= 75 words

    Or

    • 1-2 sentences ~= 30 tokens
    • 1 paragraph ~= 100 tokens
    • 1,500 words ~= 2048 tokens
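Those rules of thumb are enough for a rough pre-flight check. Below is a tiny Python sketch using the ~4 characters per token heuristic; for exact counts, use OpenAI's tiktoken library.

```python
# Quick-and-dirty token estimate using the "1 token ~= 4 characters"
# rule of thumb above. This is only for ballpark checks against a
# model's limit -- use OpenAI's tiktoken library for exact counts.

def estimate_tokens(text: str) -> int:
    return max(1, len(text) // 4)

def fits_in_model(text: str, token_limit: int = 4096) -> bool:
    """Rough check that a prompt fits within a model's token limit."""
    return estimate_tokens(text) <= token_limit

prompt = "Say this is a test!"   # 19 characters
print(estimate_tokens(prompt))   # 4
print(fits_in_model(prompt))     # True
```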

    Model Token Limits

    gpt-3.5-turbo - 4,096 tokens - (Approx 3072 Words)

    gpt-4 - 8,192 tokens (Approx 6144 Words)

    gpt-4-32k - 32,768 tokens (Approx 24576 Words)

    Embeddings can extend the memory capabilities of models

    Embeddings, and the semantic search they enable, are for when you need to work with data that exceeds a model's token limit. Instead of sending everything in the prompt, you search a vector database for related content (context) and return only the relevant components of your dataset, effectively extending the memory capabilities of the model.
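The search step itself boils down to ranking stored vectors by cosine similarity to the query's embedding. Here is a toy Python sketch with made-up 3-dimensional vectors; real text-embedding-ada-002 embeddings have 1536 dimensions, and a vector database like Pinecone performs this search for you at scale.

```python
import math

# Toy illustration of semantic search: rank stored embedding vectors
# by cosine similarity to a query vector. The vectors and document
# names below are invented stand-ins for real embeddings.

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b))
    return dot / norm

query = [0.9, 0.1, 0.0]
documents = {
    "refund policy":  [0.8, 0.2, 0.1],
    "shipping times": [0.1, 0.9, 0.2],
    "installation":   [0.0, 0.2, 0.9],
}

# Rank documents by similarity to the query, best match first
ranked = sorted(documents,
                key=lambda d: cosine_similarity(query, documents[d]),
                reverse=True)
print(ranked[0])  # refund policy
```

The top-ranked chunks are then pasted into the chat prompt as context, which is exactly the knowledge-base workflow described below.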

    Create embeddings

    POST https://api.openai.com/v1/embeddings

    Creates an embedding vector representing the input text.

    Request body

    model: ID of the model to use.

    input: Input text to embed, encoded as a string or array of tokens.

    Other parameters.

    user: A unique identifier for the end-user.

    Example Curl:

    curl https://api.openai.com/v1/embeddings \
      -H "Authorization: Bearer $OPENAI_API_KEY" \
      -H "Content-Type: application/json" \
      -d '{
        "input": "The food was delicious and the waiter...",
        "model": "text-embedding-ada-002"
      }'
    
    

    Response

    The response will include the embedding vector for the input text.

    You can store this vector in a vector database such as Pinecone.

    Example App Workflow - Knowledge Base FAQ Bot

    You could generate embeddings for your product documentation or knowledge base using OpenAI's API. Then you could store those embeddings in a vector database like Pinecone and use Xano to query the database to create a chatbot trained on your own data.

    Step one: generate embeddings and store them in Pinecone

    You would then be able to create a chatbot workflow that leverages OpenAI's Embeddings and Chat Completions APIs by querying Pinecone for related information.

    continued…

  • Lachlan (Admin), edited June 2023

    Audio Models

    Whisper Audio Translation Model by OpenAI

    • What is Whisper?
      • Whisper is a neural net developed by OpenAI that provides high-level accuracy in English speech recognition.
      • Whisper is trained on 680,000 hours of multilingual and multitask supervised data collected from the web, allowing it to be robust against various accents, background noises, and technical language.
      • The architecture of Whisper is a simple end-to-end approach, implemented as an encoder-decoder Transformer. It can handle tasks such as language identification, phrase-level timestamps, multilingual speech transcription, and to-English speech translation.
    • Example Applications:
      • Whisper could be used to develop robust voice interfaces for applications in various industries.
      • It can be utilized in transcribing multilingual speeches, making it useful in global conferences, online classes, and more.

    Create transcription

    POST https://api.openai.com/v1/audio/transcriptions

    Transcribes audio into the input language.

    Request body

    file: The audio file object to transcribe.

    model: ID of the model to use. Only whisper-1 is currently available.

    Other parameters to control the model's output.
    language: The language of the input audio.

    Response

    The response will include the transcription of the audio file.

    Example Request:

    curl https://api.openai.com/v1/audio/transcriptions \
      -H "Authorization: Bearer $OPENAI_API_KEY" \
      -H "Content-Type: multipart/form-data" \
      -F file="@/path/to/file/audio.mp3" \
      -F model="whisper-1"
    

    An example application using the transcription endpoint

    You could build an audio transcription service.

    Workflow Overview:

    Now, let's go over the step-by-step instructions on how to build this application:

    1. Set up your Xano account: Sign up for a Xano account and create a new project.
    2. Create the API endpoint: Use Xano's visual API builder to create a new endpoint /transcribe. This endpoint should accept POST requests and the request body (input) should be an audio file.
    3. You can import the following curl via your external API request, being sure to update your API key and to change the file path to an audio file/file resource input:

      curl https://api.openai.com/v1/audio/transcriptions \
        -H "Authorization: Bearer $OPENAI_API_KEY" \
        -H "Content-Type: multipart/form-data" \
        -F file="@/path/to/file/audio.mp3" \
        -F model="whisper-1"



    4. Transcribe the audio: When a request is received at the /transcribe endpoint, pass the audio file to the Whisper model for transcription. The Whisper model will convert the speech in the audio file to text.
    5. Return the transcribed text: The text output from the Whisper model should be returned in the response from the /transcribe endpoint.
    6. Set up the database: Create a database table in Xano to store the audio files and their corresponding transcriptions. You will need to create a table with columns for the audio file and the transcribed text.
    7. Store the audio and transcriptions: After the audio has been transcribed, store both the audio file and the transcribed text in the database.
    8. Test the application: Finally, test the application by sending an audio file to the /transcribe endpoint and checking that the transcribed text is returned in the response and stored in the database.

    Create translation

    POST https://api.openai.com/v1/audio/translations

    Translates audio into English.

    Request body

    file: The audio file object to translate.

    model: ID of the model to use. Only whisper-1 is currently available.

    Response

    The response will include the English translation of the audio file.

    You can use Whisper to develop voice interfaces or transcribe multilingual speech.

    Image Generation

    What is DALL-E?

    • DALL-E is a neural network developed by OpenAI that generates images from text descriptions.
    • It is a 12-billion parameter version of GPT-3 trained to generate images from text descriptions, using a dataset of text–image pairs.
    • DALL-E has capabilities like creating anthropomorphized versions of animals and objects, combining unrelated concepts in plausible ways, rendering text, and applying transformations to existing images.

    Create image

    POST https://api.openai.com/v1/images/generations

    Creates an image given a prompt.

    Request body

    prompt: A text description of the desired image(s).

    n: The number of images to generate. Must be between 1 and 10.

    size: The size of the generated images. Must be "256x256", "512x512", or "1024x1024".

    Other parameters.
    user: A unique identifier for the end-user.

    Example Curl - (Generate 2 images)

    curl https://api.openai.com/v1/images/generations \
      -H "Content-Type: application/json" \
      -H "Authorization: Bearer $OPENAI_API_KEY" \
      -d '{
        "prompt": "A cute baby sea otter",
        "n": 2,
        "size": "1024x1024"
      }'

    Be sure to update your input prompt and API key.

    Response

    The response will include URLs to the two generated images.

    You can use DALL-E to generate images for advertising, digital art, education, and more.

    Example Workflow:

    Image Edit

    Image Edit endpoint: POST https://api.openai.com/v1/images/edits

    This endpoint accepts an original image and a prompt and generates an edited version of the image based on the prompt. For example, you could provide a picture of a red ball and a prompt of "change the ball to blue" and get back an edited image with a blue ball.

    Example Application: An ecommerce product customizer.

    A user could upload a product image like a t-shirt or phone case and enter a text prompt to customize the design, color, or look of the product. The image edit endpoint would generate an edited version of the product image with the customizations, allowing the user to preview the changes before purchasing.

    Example Request:

    curl https://api.openai.com/v1/images/edits \
      -H "Authorization: Bearer $OPENAI_API_KEY" \
      -F image="@otter.png" \
      -F mask="@mask.png" \
      -F prompt="A cute baby sea otter wearing a beret" \
      -F n=2 \
      -F size="1024x1024"
    

    Image Variation

    Image Variation endpoint: POST https://api.openai.com/v1/images/variations

    This endpoint takes an input image and generates stylistic variations of that image. For example, you could provide a landscape photo and get back multiple variations that adjust the brightness, color palette, cropping, etc. The output images are creatively adapted versions of the original photo.

    Example Application: A social media content generator.

    With an image variation API, you could build an app to generate curated social media content for influencers or brands. The user would provide a photo they want to post, and the image variation endpoint would return multiple variations of that photo with different stylings. The user could then select the variation they like best to auto-post to their social media profiles, saving time and ensuring high quality, unique content. This type of application could work with photos of products, lifestyle shots, portraits, food, etc. The image variation endpoint is able to creatively adapt images in many domains.

    Example Request

    curl https://api.openai.com/v1/images/variations \
      -H "Authorization: Bearer $OPENAI_API_KEY" \
      -F image="@otter.png" \
      -F n=2 \
      -F size="1024x1024"
    

  • Lachlan (Admin), edited June 2023

    reserved for additional content

  • Baloshi69 (Member)

    You are awesome, just skimmed it, but surely make a bookmark, and return to it, to get all.

  • Gepeto (Member), edited June 2023

    Great post, thanks! Bookmarked.

    Maybe fine-tuning will be part of an upcoming content, but I wondered if you could share insights.

    I've built a dataset.jsonl file. I've imported the Curl request to my Xano endpoint.

    curl https://api.openai.com/v1/files 
    -H "Authorization: Bearer $OPENAI_API_KEY" 
    -F purpose="fine-tune" 
    -F file="@mydata.jsonl"
    

    But I have no clue how to use my file which sits on my desktop now. I can't put such file in your File Manager and I can't save its content inside your text editor since the file must be a jsonl. Any insights appreciated!

  • Chris Coleman (Admin)

    Hi, @Gepeto — as far as I'm aware, you should be able to upload this to your Xano instance as long as you're on a paid plan. Is this not the case?

  • Gepeto (Member), edited June 2023

    Got it, thanks. I'll upgrade - although it's quite early in my learning process since I'm a rather new account… You're a good salesman 😉

    Then, how do I bind to the file saved in the file manager?

  • Chris Coleman (Admin)

    @Gepeto You'll need to use a Create File Resource step and provide the URL to the file you've uploaded, and then use a Get File Resource Data on the variable output from the previous step. This will let you access the raw file data inside of the function stack.

  • Gepeto (Member), edited June 2023

    Getting close… Now I get this response from https://api.openai.com/v1/files

    {message: Additional properties are not allowed ('file[data]', 'file[mime]', 'file[name]', 'file[size]' were unexpected),
     type: invalid_request_error,param: null,code: null}
    

    How can I remove the metadata and keep the lines of objects only?

    {
    name: dataset.jsonl,
    size: 571653,
    mime: application/octet-stream,
    data: {"prompt": "Use…"}
    {"prompt": "blah…"}
    }
    

  • Chris Coleman (Admin)

    @Gepeto When you use Get File Resource Data, this provides a JSON object which includes file metadata. The raw file data is located under the path 'data', so try appending '.data' to wherever you're referencing the output of this function.

  • Gepeto (Member)

    Nope. No matter how hard I try, I don't understand the Get File Resource Data function. The API request to https://api.openai.com/v1/files won't accept .data appended to the output of that function.

  • jmotz (Member), edited June 2023

    Hey there, @Lachlan and @Chris Coleman!

    I just wanted to say that this thread is absolutely fantastic, and I appreciate both of your contributions.

    I was wondering if either of you, or anyone from the community, could share some recommendations for Vector DB or insights regarding Pinecone, Supabase Vector, and Weaviate. I'm evaluating the options and would love to hear your thoughts or experiences with these platforms.

    Thanks again for your input!

  • Lachlan (Admin)

    Hi @jmotz I'm actually working on a video at the moment explaining how to use Xano & Pinecone together. Pinecone is a purpose-built vector database, so if extremely high performance or huge datasets are required then Pinecone might provide advantages; however, Supabase is likely a suitable solution for most use cases. If you'd like a sneak peek or a copy of a snippet that can help you get started, feel free to let me know.

    We will also be exploring further how we can bring these into Xano directly, but there aren't any firm timelines on this that I can offer at this stage.

  • Lachlan (Admin)

    @Gepeto I haven't yet played with file management with OpenAI, I'll look to explore this soon and provide an overview of how this can be done including the file upload management.

  • Inayet (Member)

    Hi @Lachlan McPherson @Lachlan saw your latest how to video regarding ChatGPT, and where I noticed my weak spot was your awesome use of database, would love to see more advanced topics starting from not knowing anything to advanced enterprise use case of database design and use with Xano including add-ons, referencing data from other tables, and so on.
    Right now to compensate for my limited knowledge am using JSON datatype a lot to store information that I need to access and use. I do realize that it is not the best use of database fully.
    Keep up the great work and hopefully you can do a modern fresh update on what @Prakash and @Michael have done in the past.

    Pinged you on two usernames as was not sure which one was active.

  • mustafaejaz (Member)

    Amazing work @Lachlan!

  • alobato (Member)

    I feel so blessed to be part of this community. Thank you very much @Lachlan for this.

  • NicelyPutEllie

    This is incredible info. I'd love to see the info on using Pinecone!

  • Lachlan (Admin)

    Thanks @mustafaejaz, @alobato & @NicelyPutEllie!

    NicelyPutEllie - I have some further information available that I'll send through as a dm :)

  • prestonmiller

    Hi @Lachlan , I'd love to receive any further info you have on this as well. I duplicated the api's you created in the video below which was great and have added and been looking at this snippet….but if you have any more material that would help toward building a chat bot trained on custom data that would be much appreciated.

    https://www.xano.com/snippet/x3pSSEJD/

  • forgelab (Member)

    @Lachlan incredible job and awesome resource. 🙌

    i have our pinecone integration in place and working. was curious if you have any recommendations on how to count token length and then split appropriately. facilitating document uploads (embedding) and in-house ChatGPT build using Xano and need to tackle this next to avoid hitting the limit.

  • NicelyPutEllie

    I thought I'd ask here in case it's a useful question for others who are interested, given this is such a fantastic resource for people working with OpenAI.

    If we're using ChatGPT as a function within an app with multiple users, how we do safely store and use their API keys in Xano? Is there a 'how to' video that will put me/others on the right track?

  • Lachlan (Admin)

    Hey @NicelyPutEllie there is a video on encrypting database fields here →


    Applying encryption like this for your API keys would be recommended to store your API keys safely.