LLMs have a strong bias against use of African American English
(arstechnica.com)
All the people here saying "well of course because they weren't trained on AAVE":
THAT'S THE WHOLE POINT
It's the same reason facial recognition and voice recognition software have a difficult time with anyone who isn't white or a speaker of perfect, uninflected Standard English. The bias is created by the developers, consciously or not, because they only train it on what's in their own bubble. If you don't have diverse teams behind the development and training, you will create this bias, whether you want to or not. This is well known.
There's also just the issue that there are significantly more books, articles, etc. written in Standard English than in AAVE, so that's gonna be a huge barrier to overcome regardless of how diverse the development and training teams are. That's not to say diversity isn't important, just that there are certain challenges around finding adequate amounts of high-quality training data, especially for less mainstream concepts. It's the same reason an AI couldn't give a summary of a book that has almost no info about it on the internet.