Sorry I keep posting about Mistral, but check out: https://chat.mistral.ai/chat

I dunno how they do it, but some of these answers are lightning fast:

Fast inference dramatically improves the user experience for chat and code generation – two of the most popular use-cases today. In the example above, Mistral Le Chat completes a coding prompt instantly while other popular AI assistants take up to 50 seconds to finish.

For this initial release, Cerebras will focus on serving text-based queries for the Mistral Large 2 model. When using Cerebras Inference, Le Chat will display a “Flash Answer ⚡” icon on the bottom left of the chat interface.

1 comment
[-] HenriVolney@sh.itjust.works 3 points 1 week ago

Is this a big deal? Definitely sounds like a big deal

this post was submitted on 08 Feb 2025
8 points (100.0% liked)

LocalLLaMA

2585 readers

Community to discuss LLaMA, the large language model created by Meta AI.

This is intended to be a replacement for r/LocalLLaMA on Reddit.

founded 2 years ago