Generative AI providers and models

AIsuru supports multiple AI providers (OpenAI, Anthropic, Mistral) with language models of various sizes and capabilities. Choose the one that best fits your performance, budget, and feature requirements. This guide will help you navigate the available options and make an informed decision.

Provider overview

AIsuru integrates with several generative AI providers:

  • OpenAI: use your own API key to access GPT models (including fine-tuned ones);

  • Anthropic: use your own API key to access all models;

  • Mistral AI: use your own API key to access all models;

  • Microsoft Azure: use your own API key to access OpenAI's GPT models via Azure (including fine-tuned ones);

  • Amazon Bedrock (AWS): use your own API key to access Anthropic models via Amazon Bedrock;

  • Google Vertex AI: use your own API key to access Anthropic models via Vertex AI;

  • Custom (Ollama, LM Studio, ...): connect to custom providers or local servers that are compatible with the OpenAI protocol.

The choice of provider depends largely on which model you want to use: some models are only available through certain providers.

Model overview

Currently, all AIsuru users can access these models:

Provider
Model
Size

Vertex Anthropic

claude-3-7-sonnet-20250219

Large

200,000

Vertex Anthropic

Claude 4.5 Sonnet

Large

200,000 or 1M

Vertex Anthropic

Claude 4.6 Sonnet

Large

1M

Vertex Anthropic

claude-haiku-4-5-20251001

Small

200,000

Mistral

mistral-large-2407

Large

128,000

OpenAI

gpt-4o

Large

128,000

OpenAI

gpt-4o-mini

Small

128,000

OpenAI

gpt-5

Large

128,000

OpenAI

GPT-5.3 istant

Large

128,000

Reasoning models

Some models (such as claude-sonnet-4-20250514 and gpt-5 reasoning) have advanced reasoning capabilities. These models show their reasoning process only in the "Conversations" tab by default.

This default behavior is designed to protect sensitive information: the reasoning process can expose internal data, processing logic, or content that shouldn't be visible to end users.

You can make the reasoning visible to end users by modifying your sharing layout.

What is context

Context represents the maximum number of tokens that can be used in each request (or question) sent to the language model. To simplify: 1 token is roughly equal to 4 characters in English.

If you're not experimenting with particularly long instructions or functions, don't worry about this value.

How to choose a model

The choice of language model depends on several factors. This page provides some general guidance, but you'll need to test different models with your own Agent — that's the only way to make sure it behaves correctly across all your use cases.

Here's how to pick the right model for each configuration.

Q&A configuration and expert groups

For user interactions, consider the following:

  • Response complexity:

    • Complex, detailed responses: large models;

    • Simple, direct responses: small models.

  • Budget:

    • Tight budget: prefer small models;

    • Flexible budget: you can go with large models.

  • Response speed:

    • Immediate responses: small models;

    • Higher accuracy: large models.

    Reasoning capability:

    • For tasks requiring complex reasoning: use models with reasoning (claude-sonnet-4-20250514 or gpt-5 reasoning).

Import/export configuration

For document import, consider:

  • Document complexity:

    • Unstructured documents: you'll need to use larger models;

    • Well-structured documents: small models deliver excellent results.

  • Document volume: if you need to import a large number of documents, you might use a small model to keep costs down.

Deep Thinking configuration

For managing conversation memory, we always recommend using a large model.

Last updated