The problem is they have no idea about the internal structure of the tokens they use, except what's present in the data set. The model sees "Kenya" as 8473 299 = Ken ya or something, and how is it supposed to know token 8473, often used for the name of Barbie's boyfriend, starts with K?
Also they love to make up Fun Facts.
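
To make the token point concrete, here is a rough toy sketch. The vocabulary, the IDs (8473 and 299 are just the "or something" numbers from above) and the greedy longest-match splitting are all made up for illustration; real tokenizers use learned merge rules and different IDs.

```python
# Toy vocabulary: IDs and splits are hypothetical, not from any real tokenizer.
VOCAB = {"Ken": 8473, "ya": 299, "K": 42, "e": 68, "n": 77, "y": 88, "a": 64}
ID_TO_TEXT = {token_id: text for text, token_id in VOCAB.items()}


def encode(text):
    """Greedy longest-match tokenization against the toy vocabulary."""
    ids = []
    pos = 0
    while pos < len(text):
        # Try the longest candidate piece first, then shrink it.
        for end in range(len(text), pos, -1):
            piece = text[pos:end]
            if piece in VOCAB:
                ids.append(VOCAB[piece])
                pos = end
                break
        else:
            raise ValueError(f"no token covers {text[pos]!r}")
    return ids


def first_letter(token_id):
    """Answering this requires the lookup table, not just the ID."""
    return ID_TO_TEXT[token_id][0]


ids = encode("Kenya")
print(ids)                   # the bare integers are all the model is handed
print(first_letter(ids[0]))  # spelling only falls out of the vocabulary table
```

The integers carry no spelling information by themselves. Anything the model knows about the letters inside a token has to be picked up indirectly from the training data.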