Qwen2.5-Coder-7B (sh.itjust.works)

I've been using Qwen 2.5 Coder (bartowski/Qwen2.5.1-Coder-7B-Instruct-GGUF) for some time now, and it shows significant improvements over previous open-weights models.

Notably, this is the first open-weights model I've been able to use with Aider. It has also made real strides in editing files without requiring frequent retries to produce the correct edit format.
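For anyone wanting to try the same setup, here's a sketch of pointing Aider at a locally served copy of the model via Ollama. The model tag `qwen2.5-coder:7b` is an assumption; substitute whatever tag you pulled or created locally.

```shell
# Tell Aider where the local Ollama server lives (11434 is Ollama's default port).
export OLLAMA_API_BASE=http://127.0.0.1:11434

# Launch Aider against the local model.
# "qwen2.5-coder:7b" is an assumed tag; use your own local model tag.
aider --model ollama/qwen2.5-coder:7b
```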

Like most models, though, it struggles once the prompt grows long: above roughly 2,000 tokens it appears to lose track of the system prompt.

[-] brucethemoose@lemmy.world 1 points 5 days ago

That's because Ollama's default context length (num_ctx) is 2048, as far as I know.
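If that's the cause, the context window can be raised with a custom Modelfile. A minimal config sketch, assuming the model is already pulled under the tag `qwen2.5-coder:7b` (the tag is an assumption):

```
# Modelfile: raise the context window from Ollama's 2048-token default.
# The FROM tag is an assumption; match your locally pulled model.
FROM qwen2.5-coder:7b
PARAMETER num_ctx 8192
```

Then build and run the variant with `ollama create qwen2.5-coder-8k -f Modelfile` and `ollama run qwen2.5-coder-8k`. The same setting can also be passed per request as `"options": {"num_ctx": 8192}` in Ollama's API.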

this post was submitted on 24 Nov 2024
15 points (94.1% liked)

LocalLLaMA


Community to discuss LLaMA, the large language model created by Meta AI.

This is intended to be a replacement for r/LocalLLaMA on Reddit.
