How do I poison PDFs against LLMs?
(lemmy.world)
I don't think any kind of "poisoning" actually works. It's well known by now that data quality matters more than data quantity, so nobody feeds in training data indiscriminately. At best it would hamper some FOSS AI researchers who don't have the resources to curate a dataset.
If you can't source a dataset, then you shouldn't be researching AI. It's the first and most important step of the entire process.