Generative AI providers and models
AIsuru supports multiple AI providers (OpenAI, Anthropic, Mistral) with language models of various sizes and capabilities. Choose the one that best fits your performance, budget, and feature requirements. This guide will help you navigate the available options and make an informed decision.
Provider overview
AIsuru integrates with several generative AI providers:
OpenAI: use your own API key to access GPT models (including fine-tuned ones);
Anthropic: use your own API key to access all models;
Mistral AI: use your own API key to access all models;
Microsoft Azure: use your own API key to access OpenAI's GPT models via Azure (including fine-tuned ones);
Amazon Bedrock (AWS): use your own API key to access Anthropic models via Amazon Bedrock;
Google Vertex AI: use your own API key to access Anthropic models via Vertex AI;
Custom (Ollama, LM Studio, ...): connect to custom providers or local servers that are compatible with the OpenAI protocol.
The choice of provider depends largely on which model you want to use: some models are only available through certain providers.
Model overview
Currently, all AIsuru users can access these models:
Vertex Anthropic
claude-3-7-sonnet-20250219
Large
200,000
Vertex Anthropic
Claude 4.5 Sonnet
Large
200,000 or 1M
Vertex Anthropic
Claude 4.6 Sonnet
Large
1M
Vertex Anthropic
claude-haiku-4-5-20251001
Small
200,000
Mistral
mistral-large-2407
Large
128,000
OpenAI
gpt-4o
Large
128,000
OpenAI
gpt-4o-mini
Small
128,000
OpenAI
gpt-5
Large
128,000
OpenAI
GPT-5.3 istant
Large
128,000
Reasoning models
Some models (such as claude-sonnet-4-20250514 and gpt-5 reasoning) have advanced reasoning capabilities. These models show their reasoning process only in the "Conversations" tab by default.
This default behavior is designed to protect sensitive information: the reasoning process can expose internal data, processing logic, or content that shouldn't be visible to end users.
You can make the reasoning visible to end users by modifying your sharing layout.
What is context
Context represents the maximum number of tokens that can be used in each request (or question) sent to the language model. To simplify: 1 token is roughly equal to 4 characters in English.
If you're not experimenting with particularly long instructions or functions, don't worry about this value.
How to choose a model
The choice of language model depends on several factors. This page provides some general guidance, but you'll need to test different models with your own Agent — that's the only way to make sure it behaves correctly across all your use cases.
Here's how to pick the right model for each configuration.
Q&A configuration and expert groups
For user interactions, consider the following:
Response complexity:
Complex, detailed responses: large models;
Simple, direct responses: small models.
Budget:
Tight budget: prefer small models;
Flexible budget: you can go with large models.
Response speed:
Immediate responses: small models;
Higher accuracy: large models.
Reasoning capability:
For tasks requiring complex reasoning: use models with reasoning (
claude-sonnet-4-20250514orgpt-5reasoning).
Import/export configuration
For document import, consider:
Document complexity:
Unstructured documents: you'll need to use larger models;
Well-structured documents: small models deliver excellent results.
Document volume: if you need to import a large number of documents, you might use a small model to keep costs down.
Deep Thinking configuration
For managing conversation memory, we always recommend using a large model.
Last updated