OpenAI transcribed over a million hours of YouTube videos to train GPT-4
(www.theverge.com)
There's a distinct difference between quotation and plagiarism. A search engine does the former, LLMs do the latter.
No. If you write a truly unique combination of words, an LLM is very unlikely to reproduce it.
An LLM is only likely to plagiarise you if your writing is similar to others.
[citation needed]
https://blog.gdeltproject.org/do-llms-truly-create-or-merely-arrange-just-how-much-of-an-llms-writing-is-original/
So plagiarism...
It only plagiarises you if you write something similar to lots of other people.
Write something original and, even if it is in their training dataset, LLMs are highly unlikely to reproduce it.