Prices & Costs
Transparent guidance for your AI project

AI projects can rarely be priced at a flat rate, because scope, technical requirements, and data foundations vary greatly from company to company. Our price overview therefore shows you which cost components you can generally expect and how a project is typically structured.
The total costs are made up of the appropriate software license, possible services for setup, customization, or data preparation, as well as usage-based fees for AI models, storage, or data access.
Which components are actually relevant depends on your specific use case.
This gives you clear guidance in advance, without forcing your project into a rigid package. We will prepare a binding quote after an initial conversation, once requirements, data situation, desired functions, and technical effort can be realistically assessed.
If you have any questions about our prices, please feel free to contact us at any time.
Our offer is aimed at business customers. All prices are therefore stated in euros (€), net, plus statutory taxes.
Software licenses
Which software license is right for your company depends primarily on how you want to use AI.
For a quick start, smaller teams, and clearly defined applications, Vimmera Now is usually the right choice. If you want to integrate AI permanently into business processes, use multiple assistants, knowledge bases, role models, usage statistics, or scalable extensions, Vimmera Studio is better suited. In a joint discussion, we will clarify which platform fits your goals, your number of users, your data, and your security requirements.
Vimmera Now
The entry into secure enterprise AI.
Vimmera Now is suitable for companies that want to get started quickly with AI assistants, knowledge bases, and simple AI tools. The platform includes the most important core functions of Vimmera Studio, but with a reduced scope of features.
€39.00 / month
Included:
- 1 included user
- Up to 10 users per company account
- Updates, server operation, and software maintenance
- Chat history up to 50 threads per user
- Chat history retained for a maximum of 30 days (statutory retention obligations are handled in the background and are not visible to the user)
- Up to 5 GB storage for chat history
- Enterprise GPT included (plus usage-based fees)
- Hosting in Europe or by individual arrangement
Optional extensions:
- Additional user: €39.00 / month
- Setup per user: €99.00 one-time
- Vimmera Cortex: €39.00 / month per Cortex
- Vimmera Assist: €39.00 / month per Assist
- Vimmera Tool: €39.00 / month per Tool
- Vimmera Webchat (chatbot for intranet or internet): €69.00 / month per Webchat
- Additional storage for chat history: €9.00 / month per 10 GB
- Additional Cortex storage: €9.00 / month per GB or part thereof
Note: When AI systems (e.g. Assist, Tools, Webchats) are used, usage-based fees apply (see below). These are not included in the license costs.
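As a rough illustration, the Vimmera Now list prices above can be added up as follows. This is a minimal sketch, not an official calculator: the function name and parameters are hypothetical, the one-time setup fee and all usage-based fees are left out, and it assumes that additional chat-history storage is sold in whole 10 GB blocks.

```python
import math

# Net list prices in euros per month, taken from the Vimmera Now overview.
BASE_PRICE = 39            # includes 1 user
EXTRA_USER = 39
MAX_USERS = 10
CORTEX = ASSIST = TOOL = 39
WEBCHAT = 69
HISTORY_STORAGE_PER_10_GB = 9
CORTEX_STORAGE_PER_GB = 9


def now_monthly_license(users=1, cortex=0, assist=0, tools=0, webchats=0,
                        extra_history_gb=0.0, extra_cortex_gb=0.0):
    """Estimate the monthly Vimmera Now license fee in euros (net)."""
    if not 1 <= users <= MAX_USERS:
        raise ValueError(f"Vimmera Now supports 1 to {MAX_USERS} users")
    total = BASE_PRICE + EXTRA_USER * (users - 1)
    total += CORTEX * cortex + ASSIST * assist + TOOL * tools
    total += WEBCHAT * webchats
    # Assumption: chat-history storage is sold in whole 10 GB blocks.
    total += HISTORY_STORAGE_PER_10_GB * math.ceil(extra_history_gb / 10)
    # Any started GB of Cortex storage counts as a full GB.
    total += CORTEX_STORAGE_PER_GB * math.ceil(extra_cortex_gb)
    return total


# Four users, one Cortex, two Assists, 1.5 GB of extra Cortex storage.
print(now_monthly_license(users=4, cortex=1, assist=2, extra_cortex_gb=1.5))  # 291
```

The example works out to €156 for four users, €117 for the Cortex and the two Assists, and €18 for two started GB of Cortex storage.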
Vimmera Studio
The scalable platform for custom AI systems, enterprise knowledge, and scalable AI processes.
Vimmera Studio is the central software platform for AI systems, assistants, knowledge bases, and tools from Vimmera AI. The platform is suitable for companies that need multiple users, structured knowledge work, custom AI assistants, usage statistics, role models, and expandable AI processes.
€499.00 / month
Included:
- 5 included users
- No upper limit on the number of users
- Updates, server operation, and software maintenance
- 1 Vimmera Cortex included (plus any customization and data preparation costs)
- 3 custom Vimmera Assist included (plus any customization costs)
- 3 custom Vimmera Tool included (plus any customization costs)
- Usage statistics included
- Minimum and maximum retention periods for chat history set according to legal requirements or individually configured
- 10 GB storage for chat history included
- Unlimited number of Vimmera Assist, Vimmera Cortex, and Vimmera Tools possible
- Enterprise GPT included (plus usage-based fees)
- Meeting assistant (Vimmera Meeting Tool) included (plus usage-based fees)
- Translation Studio (AI translations) included (plus usage-based fees)
- Transcribe (transcription tool) included (plus usage-based fees)
- Hosting in Europe or by individual arrangement
Optional extensions:
- Additional user: €29.00 / month
- Each additional Vimmera Cortex: €39.00 / month
- Each additional Vimmera Assist: €39.00 / month
- Each additional Vimmera Tool: €39.00 / month
- Vimmera Webchat (chatbot for intranet or internet): €69.00 / month per Webchat
- Additional storage for chat history: €9.00 / month per 10 GB
- Additional Cortex storage: €9.00 / month per GB or part thereof
Price tiers:
For companies with many users, assistants, tools, or Cortexes, we offer tiered pricing. We would be happy to discuss the options and exact pricing tiers with you.
Note: The software licenses do not include any customization or adjustments in the price. Adjustments, customizations, data preparation, or other services are billed separately.
The license for Vimmera Webchat (chatbot for intranet or internet) also includes the provision of the JavaScript code for a chatbot UI. Technical integration into the website, adjustments to CMS, consent or tracking systems, as well as design adjustments are not included and are offered separately.
Note: When AI systems (e.g. Assist, Tools, Webchats) are used, usage-based fees apply (see below). These are not included in the license costs.
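The Vimmera Studio list prices can be combined in the same way. The sketch below is illustrative only: the function name and parameters are hypothetical, tiered pricing for larger installations is not modeled, services and usage-based fees are excluded, and it assumes that additional chat-history storage is sold in whole 10 GB blocks.

```python
import math

# Net list prices in euros per month, taken from the Vimmera Studio overview.
BASE_PRICE = 499
INCLUDED = {"users": 5, "cortex": 1, "assist": 3, "tools": 3}
EXTRA_PRICE = {"users": 29, "cortex": 39, "assist": 39, "tools": 39}
WEBCHAT = 69
HISTORY_STORAGE_PER_10_GB = 9
CORTEX_STORAGE_PER_GB = 9


def studio_monthly_license(users=5, cortex=1, assist=3, tools=3, webchats=0,
                           extra_history_gb=0.0, extra_cortex_gb=0.0):
    """Estimate the monthly Vimmera Studio license fee in euros (net)."""
    requested = {"users": users, "cortex": cortex,
                 "assist": assist, "tools": tools}
    total = BASE_PRICE
    for item, count in requested.items():
        # Only units beyond the included allowance are charged.
        total += EXTRA_PRICE[item] * max(0, count - INCLUDED[item])
    total += WEBCHAT * webchats
    # Assumption: storage beyond the included 10 GB is sold in 10 GB blocks.
    total += HISTORY_STORAGE_PER_10_GB * math.ceil(extra_history_gb / 10)
    # Any started GB of Cortex storage counts as a full GB.
    total += CORTEX_STORAGE_PER_GB * math.ceil(extra_cortex_gb)
    return total


# Twelve users, two Cortexes, five Assists, three Tools, one Webchat.
print(studio_monthly_license(users=12, cortex=2, assist=5, tools=3, webchats=1))  # 888
```

In this example, €499 covers the base license, €203 the seven additional users, €117 the extra Cortex and two extra Assists, and €69 the Webchat.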
Comparison: Vimmera Now and Vimmera Studio
| Feature | Vimmera Now | Vimmera Studio |
|---|---|---|
| Monthly price | €39.00 | €499.00 |
| Included users | 1 | 5 |
| Maximum number of users | 10 | unlimited |
| Additional users | €39.00 / month | €29.00 / month |
| Usage statistics | not included | included |
| Private mode | not included | available as an option |
| Vimmera Cortex | optional | 1 included |
| Vimmera Assist | optional | 3 included |
| Webchat (chatbot for intranet or internet) | optional | optional |
| Tools | optional | 3 included |
| Chat history | limited | expanded according to specifications |
Note: When AI systems (e.g. Assist, Tools, Webchats) are used, usage-based fees apply (see below). These are not included in the license costs.
Services
For individual customizations, technical extensions, interfaces, data preparation, and project-related implementations, we charge based on time and effort.
| Service | Billing | Net price |
|---|---|---|
| UI adjustments, adjustments in Vimmera Studio and Vimmera Now, general customizations | per hour | €149.00 |
| Data preparation, data verification, interface programming, data digitization, programming of vector store structures, prompt engineering, customization of Vimmera Assist, Vimmera Webchat, Vimmera Tools and data maintenance | per hour | €189.00 |
Workshops & training
For the successful use of AI, careful preparation is crucial. In workshops, we analyze processes, identify sensible use cases, review existing data, and prepare the implementation.
In addition, we offer training and workshops on relevant framework conditions and areas of application, in particular on the GDPR, the EU AI Act, and the responsible and secure use of AI in companies.
| Content | Duration | Net price |
|---|---|---|
| Analysis meetings, training, data compilation and project preparation | 1 day / 8 hours | €1,650.00 |
| Analysis meetings, training, data compilation and project preparation | 1/2 day / 4 hours | €950.00 |
The price includes the necessary preparation and follow-up as well as documentation and materials, for example training materials or checklists.
Travel expenses are billed by individual agreement.
Usage fees for AI models, tools and data access
In addition to license and project costs, usage-based costs may arise. These depend in particular on the AI model used, the number of requests, the amount of data, image or audio processing, and the chosen infrastructure.
Typical usage-based costs:
- Text processing based on input, output and storage tokens (cached tokens)
- Access to Vimmera Cortex or vector stores
- Code interpreter sessions
- Image analysis and image generation
- Video generation
- Audio transcriptions
- Realtime speech input
- Caching of thread or context information
As a guideline, usage-based costs fall roughly in the following ranges, depending on the model:
| Usage | Guideline net |
|---|---|
| Input tokens | from €0.36 per 1 million tokens |
| Cached input tokens | from €0.10 per 1 million tokens |
| Output tokens | from €2.45 per 1 million tokens |
| Code interpreter (code execution in the background) | from €0.05 per session |
| Vector store access (or other tool calls) | €3.80 per 1,000 accesses (calls) |
| Audio upload / transcription | €0.02 per audio minute |
| Realtime audio input | €0.04 per audio minute |
| Image generation | from €0.10 to approx. €0.50 per image, depending on model and resolution |
We will agree on the specific model and infrastructure choice with you. The resulting costs are then determined individually. Token costs depend on the AI model selected or used (e.g. LLM) as well as on the hosting location, infrastructure, and other factors.
Price example for higher-priced LLMs: Very powerful LLMs with reasoning capabilities and very large context lengths are billed from approx. €1.82 per 1 million input tokens, from approx. €0.40 per 1 million cached input tokens, and from approx. €13.30 per 1 million output tokens.
The usage-based fees are billed monthly based on the usage determined via Vimmera Now or Vimmera Studio.
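To give a feel for the order of magnitude, the guideline prices above can be turned into a simple monthly estimate. The following is a minimal sketch under stated assumptions: all names are hypothetical, the "from" prices are treated as fixed rates (actual rates depend on the agreed model and infrastructure), and only tokens, vector store accesses, and audio transcription are covered.

```python
from dataclasses import dataclass
from decimal import Decimal

MILLION = Decimal(1_000_000)


@dataclass(frozen=True)
class TokenRates:
    """Net guideline prices in euros per 1 million tokens."""
    uncached_input: Decimal
    cached_input: Decimal
    output: Decimal


# "From" prices quoted in the overview, used here as fixed rates.
STANDARD = TokenRates(Decimal("0.36"), Decimal("0.10"), Decimal("2.45"))
HIGH_END = TokenRates(Decimal("1.82"), Decimal("0.40"), Decimal("13.30"))

VECTOR_STORE_PER_1000_CALLS = Decimal("3.80")
TRANSCRIPTION_PER_MINUTE = Decimal("0.02")


def estimate_usage_fee(rates, input_tokens=0, cached_tokens=0, output_tokens=0,
                       vector_calls=0, audio_minutes=0):
    """Estimate monthly usage fees in euros (net); all counts are integers."""
    tokens = (rates.uncached_input * input_tokens
              + rates.cached_input * cached_tokens
              + rates.output * output_tokens) / MILLION
    vector_store = VECTOR_STORE_PER_1000_CALLS * vector_calls / 1000
    audio = TRANSCRIPTION_PER_MINUTE * audio_minutes
    return (tokens + vector_store + audio).quantize(Decimal("0.01"))


# One month of hypothetical usage, priced with both rate sets.
usage = dict(input_tokens=20_000_000, cached_tokens=5_000_000,
             output_tokens=4_000_000, vector_calls=2_000, audio_minutes=300)
print(estimate_usage_fee(STANDARD, **usage))  # 31.10
print(estimate_usage_fee(HIGH_END, **usage))  # 105.20
```

Here `input_tokens` counts only uncached input tokens, and cached tokens are passed separately. `Decimal` is used instead of floating point so that cent amounts are not distorted by rounding errors.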
Explanations of the technical terms
To help you better understand the pricing overview, we briefly explain the most important components of our AI platforms.
The Code Interpreter is an execution environment for AI applications in which calculations, data analyses, and file-based tasks can be performed directly by the AI. It enables an AI assistant not only to formulate answers, but also to actively work with data, for example by evaluating tables, processing files, performing calculations, generating charts, or carrying out technical intermediate steps in a traceable way. This allows the AI assistant to handle more complex tasks more reliably and deliver results based on actual calculations or file contents.
A Session is a time-limited working environment in which an AI application can retain information, intermediate results, and the current processing status during use. Within a session, the AI assistant can work coherently, refer to previous inputs in the current process, and continue tasks step by step. This means information does not have to be transferred again with every single request, but remains available during the ongoing session. A session therefore helps to keep conversations, calculations, file processing, or other work steps together in a meaningful and consistent way.
Input Tokens are the units of information that are passed to an AI model so that it can understand and process a request. These include, for example, the user input, system instructions, conversation history, inserted texts, file contents, or additional context information from connected knowledge sources. The AI does not process these contents as whole words, but in smaller text units called tokens. The more information is passed to the model, the more input tokens are used. Input tokens therefore determine how much context an AI assistant can take into account when processing a request and also form an important basis for the technical processing and billing of AI requests.
Cached Input Tokens are input tokens that an AI model can reuse for a request because the same or very similar context components have already been processed previously. This applies, for example, to recurring system instructions, longer unchanged prompts, fixed context blocks, or content that is sent identically with multiple requests. Instead of processing these contents completely anew each time, they can be technically cached and taken into account as “cached.” This allows repeated requests to be processed more efficiently and usually more cheaply than completely new input tokens. Cached input tokens are therefore particularly relevant for AI applications that frequently work with the same basic instructions, document excerpts, or recurring context information.
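The effect of caching can be estimated with a small calculation. This sketch uses the guideline "from" prices for input and cached input tokens and makes a simplifying assumption: the first request pays the full price, and every later request reads the unchanged part of the prompt from the cache. Real cache behavior depends on the model and provider, and the function and parameter names are hypothetical.

```python
from decimal import Decimal

INPUT_RATE = Decimal("0.36")   # euros per 1M uncached input tokens (guideline)
CACHED_RATE = Decimal("0.10")  # euros per 1M cached input tokens (guideline)
MILLION = Decimal(1_000_000)


def input_cost(requests, fixed_tokens, variable_tokens, caching):
    """Input-token cost in euros for repeated requests sharing a fixed prefix."""
    if caching and requests > 0:
        # Assumption: only the first request sends the fixed prefix uncached.
        uncached = fixed_tokens + variable_tokens * requests
        cached = fixed_tokens * (requests - 1)
    else:
        uncached = (fixed_tokens + variable_tokens) * requests
        cached = 0
    return (INPUT_RATE * uncached + CACHED_RATE * cached) / MILLION


# 1,000 requests, each with 3,000 unchanged tokens and 200 new tokens.
print(input_cost(1000, 3000, 200, caching=False))  # 1.152
print(input_cost(1000, 3000, 200, caching=True))   # 0.37278
```

In this example, the input cost falls to roughly a third because most of each prompt consists of unchanged instructions.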
Output Tokens are the units of information that an AI model generates as a response. These include, for example, written texts, summaries, explanations, table contents, code, structured data, or other content output by the model. While input tokens describe the information passed to the model, output tokens describe the result returned by the model. The longer or more detailed a response is, the more output tokens are generated. Output tokens are therefore important for technical processing, possible response length, and the billing of AI requests.
Vector store accesses are queries to a knowledge or document database in which content is stored not only as normal text, but as so-called vectors. An AI system uses such accesses to find information matching a user question from stored documents, knowledge articles, product data, or other sources. The search is not based only on exact keywords, but on semantic similarity. This allows an AI assistant to retrieve relevant text passages or data records even if the question is phrased differently from the stored content. Vector store accesses are therefore particularly important when an AI application is intended to work with company-specific knowledge and generate answers based on suitable sources or document excerpts.
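The idea of searching by semantic similarity can be illustrated with cosine similarity, a common measure for comparing vectors. The sketch below is a generic illustration and not a description of how Vimmera Cortex works internally: the three-dimensional vectors are invented for the example, whereas real vectors come from an embedding model and have hundreds or thousands of dimensions.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical embeddings; the numbers are made up for illustration.
documents = {
    "Return policy: items can be sent back within 30 days": [0.9, 0.1, 0.2],
    "Office opening hours and holiday closures": [0.1, 0.8, 0.3],
    "How to reset your account password": [0.2, 0.2, 0.9],
}

# Stands in for the embedded question "Can I give a product back?"
query_vector = [0.8, 0.2, 0.1]


def search(query, docs, top_k=1):
    """Return the top_k document texts most similar to the query vector."""
    ranked = sorted(docs, key=lambda text: cosine_similarity(query, docs[text]),
                    reverse=True)
    return ranked[:top_k]


print(search(query_vector, documents))
```

The question shares no keyword with the return policy text, yet that document ranks first because its vector points in almost the same direction as the query vector.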
Private mode is a variant of an assistant/agent or tool that operates without any external access, viewing, or storage. Data is used only at the moment it is actually needed, and no history is created. This ensures maximum data protection, so even the most confidential data can be processed securely in the AI systems. For example, private mode can be used by development teams, NGOs, or works councils when only a specific person or group of people is allowed to access data and even administrators must not have access (no backdoors).
Realtime audio input refers to audio data that is transmitted directly and with virtually no delay to an AI system during ongoing use. Speech is not uploaded as a finished file, but processed continuously as an audio stream. This allows an AI assistant to recognize spoken input directly, evaluate it, and respond to it.
Audio upload for transcriptions refers to the ability to pass existing audio files to an AI system so that text is automatically generated from them. These can be, for example, recordings from a meeting assistant, interviews, phone notes, training sessions, voice memos, or external audio sources.
The AI processes the uploaded audio file and converts the spoken content into a written transcript. This transcript can then be reused, for example for summaries, minutes, task lists, knowledge articles, documentation, or structured further processing in an AI application. Audio uploads for transcriptions are therefore particularly useful when spoken information should be made retrospectively analyzable, searchable, and documentable.
Vimmera Cortex is the knowledge base for your AI applications. Company information, documents, product data, service knowledge, or other content can be stored there in a structured way and made usable for AI systems. A Cortex ensures that an AI assistant does not use only general knowledge, but can access relevant information from your company.
Vimmera Assist is a custom-configurable AI assistant for specific tasks, departments, or use cases. An Assist can, for example, answer internal questions, support employees, prepare service requests, create texts, or use information from a Cortex. Depending on the need, an Assist can be equipped with specific roles, rules, knowledge sources, and functions.
Examples of Assist:
- Company GPT
- Web search assistants
- First-level support
- Document editing / document creation
- Document research
- Knowledge management assistants
- Onboarding and offboarding assistants
Vimmera Tool is a special AI function or extension with which specific tasks can be automated or supported. Tools can, for example, analyze data, structure content, prepare documents, access interfaces, or facilitate certain recurring processes. They extend one or more AI assistants with additional capabilities.
Examples of tools:
- Meeting assistant (meeting recordings)
- Translation assistant (translations)
- Transcription assistant (knowledge management)
- Invoice booking assistant
Vimmera Webchat is a chatbot for intranet or internet that can be embedded on a website. It enables employees, customers, or other user groups to interact directly with an AI system via a chat interface. Depending on the configuration, a webchat can access company knowledge from a Cortex, be connected to an Assist, or use certain tools.
Examples of webchat:
- Mera on our homepage
Simply put: The Cortex provides knowledge, the Assist uses this knowledge in dialogue, a Tool extends the AI with specific functions or process steps, and the Webchat makes the AI usable via a website.
Data protection and hosting
Data protection and security are part of every solution. Depending on the project requirements, we individually agree on hosting location, access rights, role model, logging, and data processing.
Custom hosting models, hosting in Germany, or on-premise solutions are possible after technical review and by separate agreement.
Data preparation for better AI results
Company data can be stored directly in Vimmera Cortex and made usable for AI applications. However, with larger amounts of data or poorly readable data formats, mix-ups, inaccurate results, or conflated information can occur.
That is why Vimmera AI offers advanced data preparation. In this process, company knowledge is not only stored, but structured, linked, and optimized for AI systems. This allows assistants to use more relevant contexts and generate answers more reliably.
We particularly recommend this data preparation for complex documents, extensive product information, technical documentation, service knowledge, or large internal knowledge bases.
Data preparation services are billed based on time and effort (see services).
Contract term and billing
Unless otherwise agreed, the minimum contract term per product is 12 months.
Typical billing:
- Services, workshops and training based on time and effort
- Software licenses monthly
- Usage fees monthly according to actual usage
Which solution is right for you?
Vimmera Now is suitable for getting started, smaller teams, and clearly defined AI applications. Licenses for Vimmera Now can be converted into licenses for Vimmera Studio at any time.
Vimmera Studio is suitable for companies that want to integrate AI permanently into processes, knowledge, support, communication, or internal workflows.
In an initial conversation, we will work together to clarify which platform, which AI assistants, which data sources, and which security requirements make sense for your company.
Test access
Would you like to get to know Vimmera AI first? Upon request, in addition to detailed explanations, we will be happy to provide you with free, non-binding test access. This allows you to try selected functions, gather initial impressions, and check whether Vimmera Now or Vimmera Studio fits your requirements.
The test access is for orientation purposes only and is not associated with an order or contractual commitment.
The scope, duration, and available functions of the test access are agreed individually.
Notes on pricing and billing
Our offer is aimed exclusively at commercial customers.
All prices quoted are net plus statutory VAT. The prices are for guidance only and may vary depending on the scope of the project, data volume, data quality, desired infrastructure, hosting location, and the AI model used.
Only individual offers and contracts are binding. Unless otherwise agreed, payment is due within 14 days net from the invoice date. Subject to change and errors.
The General Terms and Conditions of Vimmera AI Solutions GmbH apply. Individual offers and contracts always take precedence.
Arrange a non-binding initial consultation
Would you like to know what costs will arise for your specific AI project?
Then talk to us. After an initial analysis meeting, we can assess which solution is suitable, which data is required, and what level of effort is realistic.