overview for hok

Llama 3.3 70b - End of open-weight pretrained models from Meta or just a better Llama 3.1 405b finetune? by hok in c/localllama@sh.itjust.works

hok@lemmy.dbzer0.com 2 points 1 week ago

Thank you so much, that answers my question exactly: an official response (that guy works at Meta) confirming it's the same base model!

I was concerned mainly because the release notes strangely didn't mention it anywhere, and I thought it would have been important enough to include.

Llama 3.3 70b - End of open-weight pretrained models from Meta or just a better Llama 3.1 405b finetune? (lemmy.dbzer0.com)

submitted 1 week ago by hok@lemmy.dbzer0.com to c/localllama@sh.itjust.works

People are talking about the new Llama 3.3 70b release, which has generally better performance than Llama 3.1 (approaching 3.1's 405b performance): https://www.llama.com/docs/model-cards-and-prompt-formats/llama3_3

However, something to note:

Llama 3.3 70B is provided only as an instruction-tuned model; a pretrained version is not available.

Is this the end of open-weight pretrained models from Meta, or is Llama 3.3 70b instruct just a better-instruction-tuned version of a 3.1 pretrained model?

Comparing the model cards:
3.1: https://github.com/meta-llama/llama-models/blob/main/models/llama3_1/MODEL_CARD.md
3.3: https://github.com/meta-llama/llama-models/blob/main/models/llama3_3/MODEL_CARD.md

The same knowledge cutoff, same amount of training data, and same training time give me hope that it's just a better finetune of maybe Llama 3.1 405b.

Discussing jury nullification is against the ToS of lemmy.world and will get you banned, according to worldnews mod by hok in c/yepowertrippinbastards@lemmy.dbzer0.com

hok@lemmy.dbzer0.com 15 points 1 week ago

On Lemmy, everything is a bit leftist at the moment.