QUERY: What are the main models you offer?
Models - OpenAI API
The content lists various API endpoints and their corresponding latest models:
-...[/]
Models - OpenAI API
The Moderation models are designed to check whether content co...[/]
The content describes various models provided by OpenAI, focusing on moderation models and GPT base ...[/]
REPLY:
The main models we offer include:
- For completions: gpt-3.5-turbo-instruct, babbage-002, and davinci-002.
- For embeddings: text-embedding-3-small, text-embedding-3-large, and text-embedding-ada-002.
- For fine-tuning jobs: gpt-3.5-turbo, babbage-002, and davinci-002.
- For moderations: text-moderation-stable and text-moderation-latest.
Additionally, we have the latest models like gpt-3.5-turbo-16k and fine-tuned versions of gpt-3.5-turbo.
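To check programmatically which models are available to your key, here is a minimal illustrative sketch (assuming the official openai Python package, v1 client, with OPENAI_API_KEY set in the environment):

    # List the model IDs available to this API key.
    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment

    for model in client.models.list():
        print(model.id)  # e.g. gpt-3.5-turbo, text-embedding-3-small, ...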
--------------
QUERY: Do you have a speech recognition model?
The content describes various models related to text-to-speech, speech recognition, embeddings, and ...[/]
Models - OpenAI API
MODEL | DESCRIPTION
tts-1 | New Text-to-speech 1. The latest tex...[/]
Models - OpenAI API
ENDPOINT | DATA USED FOR TRAINING | DEFAULT RETENTION | ELIGIBLE FO...[/]
REPLY:
Yes. Whisper is a general-purpose speech recognition model capable of multilingual speech recognition, speech
translation, and language identification. The v2-large model, available through the API as "whisper-1", is
optimized for faster performance.
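If you want to try it, here is a minimal sketch assuming the official openai Python package (v1 client); the audio file path is a placeholder:

    # Transcribe an audio file with the whisper-1 model.
    from openai import OpenAI

    client = OpenAI()

    with open("meeting.mp3", "rb") as audio_file:  # placeholder path
        transcript = client.audio.transcriptions.create(
            model="whisper-1",
            file=audio_file,
        )
    print(transcript.text)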
--------------
QUERY: Which embedding model should I use for non-English use cases?
The content describes various models related to text-to-speech, speech recognition, embeddings, and ...[/]
Models - OpenAI API
MODEL | DESCRIPTION
tts-1 | New Text-to-speech 1. The latest tex...[/]
Models - OpenAI API
Multilingual capabilities
GPT-4 outperforms both previous larg...[/]
REPLY:
For non-English use cases, use the text-embedding-3-large ("V3 large") embedding model. It is described as the most
capable model for both English and non-English tasks, with an output dimension of 3,072.
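As an illustrative sketch (same assumption of the official openai Python package; the input string is just an example):

    # Embed a non-English string with text-embedding-3-large.
    from openai import OpenAI

    client = OpenAI()

    response = client.embeddings.create(
        model="text-embedding-3-large",
        input="Bonjour tout le monde",  # example input; any language works
    )
    vector = response.data[0].embedding
    print(len(vector))  # 3072 dimensions by default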
--------------
QUERY: Can I introduce new knowledge in my LLM app using RAG?
What is RAG
Retrieve information to Augment the model’s knowledge and Generate the output
“What is y...[/]
When to use RAG
Good for ✅ / Not good for ❌
● Introducing new information to the model
● Teaching ...[/]
Technical patterns
Data preparation: augmenting content
What does “Augmenting content” mean?
Augmenti...[/]
REPLY:
Yes, you can introduce new knowledge in your LLM app using RAG by retrieving information from a knowledge base or
external sources to augment the model's knowledge and generate outputs relevant to the queries posed.
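As a minimal sketch of the pattern, not a production implementation: the toy docs knowledge base, the embed and answer helpers, and the model choices below are illustrative assumptions.

    # Minimal RAG sketch: embed a toy knowledge base, retrieve the closest
    # document for a query, and pass it to the model as context.
    import numpy as np
    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment
    docs = [
        "Our refund window is 30 days from delivery.",
        "Support is available 9am-5pm CET on weekdays.",
    ]  # illustrative knowledge base

    def embed(texts):
        response = client.embeddings.create(model="text-embedding-3-small", input=texts)
        return np.array([item.embedding for item in response.data])

    doc_vectors = embed(docs)

    def answer(question):
        query_vector = embed([question])[0]
        # OpenAI embeddings are unit-length, so a dot product is cosine similarity.
        context = docs[int(np.argmax(doc_vectors @ query_vector))]
        completion = client.chat.completions.create(
            model="gpt-3.5-turbo",
            messages=[
                {"role": "system", "content": f"Answer using this context: {context}"},
                {"role": "user", "content": question},
            ],
        )
        return completion.choices[0].message.content

    print(answer("How long do I have to return a product?"))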
--------------
QUERY: How many examples do I need to fine-tune a model?
What is Fine-tuning
[Diagram: Public Model + Training data → Training → Fine-tuned model]
Fine-tuning a model consists...[/]
When to fine-tune
Fine-tuning is good for:
- Following a given format or tone for the output
- Proce...[/]
Overview
Fine-tuning involves adjusting the parameters of pre-trained models on a specific dataset or t...[/]
REPLY:
We recommend 50-100 examples for fine-tuning a model, although the minimum required is 10.
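A minimal sketch of starting a job with the official openai Python package; train.jsonl is a placeholder file of chat-formatted examples:

    # Upload a JSONL training file and start a fine-tuning job.
    from openai import OpenAI

    client = OpenAI()

    training_file = client.files.create(
        file=open("train.jsonl", "rb"),  # placeholder: >= 10 chat-format examples
        purpose="fine-tune",
    )
    job = client.fine_tuning.jobs.create(
        training_file=training_file.id,
        model="gpt-3.5-turbo",
    )
    print(job.id, job.status)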
--------------
QUERY: Which metric can I use to evaluate a summarization task?
Technical patterns
Metric-based evaluations
ROUGE is a common metric for evaluating machine summariz...[/]
Technical patterns
Metric-based evaluations
Component evaluations
Subjective evaluations
● Compari...[/]
Technical patterns
Metric-based evaluations
BLEU score is another standard metric, this time focusin...[/]
REPLY:
ROUGE is a common metric you can use to evaluate a summarization task.
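As a quick illustration using the rouge-score package (an assumption; any ROUGE implementation works), with toy reference and candidate strings:

    # Score a candidate summary against a reference summary with ROUGE.
    from rouge_score import rouge_scorer  # pip install rouge-score

    scorer = rouge_scorer.RougeScorer(["rouge1", "rougeL"], use_stemmer=True)
    reference = "The committee approved the budget after a two-hour debate."
    candidate = "After two hours of debate, the committee passed the budget."
    scores = scorer.score(reference, candidate)
    print(scores["rougeL"].fmeasure)  # F1 of the longest-common-subsequence overlap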
--------------
QUERY: Give me a detailed example for an evaluation process where we are looking for a clear answer to compare to a
ground truth.
What are evals
Example
Our ground truth matches the predicted answer, so the evaluation passes!
Eval...[/]
What are evals
Example
An evaluation contains a question and a correct answer. We call this the grou...[/]
Technical patterns
Metric-based evaluations
What they’re good for / What to be aware of
● A good sta...[/]
REPLY:
Here is a detailed example of an evaluation that compares a predicted answer to a ground truth:
An evaluation contains a question and a correct answer, called the ground truth. The question posed is "What is the
population of Canada?", and the ground truth is "The population of Canada in 2023 is 39,566,248 people." An
LLM-powered tool then searches for the answer and predicts "The current population of Canada is 39,566,248 as of
Tuesday, May 23, 2023." The predicted figure matches the ground truth, so the evaluation passes. Running this
comparison over many question-answer pairs verifies that the system retrieves accurate information.
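A minimal sketch of that pass/fail check; the evaluate helper and its substring-match rule are illustrative assumptions (real evals often normalize answers or use a model-based grader):

    # Toy eval harness: pass when the predicted answer contains the
    # ground-truth figure.
    def evaluate(predicted: str, ground_truth: str) -> bool:
        return ground_truth.lower() in predicted.lower()

    question = "What is the population of Canada?"
    ground_truth = "39,566,248"
    predicted = "The current population of Canada is 39,566,248 as of Tuesday, May 23, 2023."

    print("PASS" if evaluate(predicted, ground_truth) else "FAIL")  # -> PASS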
--------------