Stephen King: My Books Were Used to Train AI
One prominent author responds to the revelation that his writing is being used to train artificial intelligence.
Problem is, how are you going to run it? Meta has already open-sourced an LLM that rivals GPT-4 with only 65B parameters, but you can't come close to running it on a top-of-the-line GPU.
We don't have the legal framework for this type of thing, so people are going to disagree about how using training data for a commercial AI product should work.
I imagine Stephen King would argue they didn't have licenses or permission to use his books to train their AI, so he should be compensated or the AI deleted/retrained. He would argue that buying a copy of a book only lets it be used for humans to read, similar to how buying a CD doesn't allow you to put that song in your advert.
I would argue we do have a legal precedent for this sort of thing. Companies hire creatives all the time and ask them to do things in the style of other creatives. You can't copyright a style. You don't own what you inspire.
Lol. AI training is more like human awareness of a subject or style. LLMs are not artificial general intelligence. They have no persistent memory. They are just a complicated way of categorizing and associating subjects, mixed with a probability of what word comes next. The only thing they are really doing is answering what word comes next. Thar be no magic in them thar dragons.
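The "what word comes next" framing above can be sketched in a few lines. This is a toy bigram model, nowhere near a real LLM (no transformer, no embeddings, just raw counts), but it illustrates the idea of sampling the next word in proportion to learned probabilities. The corpus text here is made up for illustration:

```python
from collections import Counter, defaultdict
import random

# Toy training corpus (invented for illustration).
corpus = "the car crashed and the car burned and the night fell".split()

# "Training": count how often each word follows each other word.
counts = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    counts[prev][nxt] += 1

def next_word(prev):
    # Sample the next word with probability proportional to its count,
    # which is all "answering what word comes next" amounts to here.
    options = counts[prev]
    words = list(options)
    weights = [options[w] for w in words]
    return random.choices(words, weights=weights)[0]

print(next_word("the"))  # "car" about 2/3 of the time, else "night"
```

Real LLMs do the same job over tokens with billions of learned weights instead of a count table, but the output step is still a probability distribution over the next token.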
This is just another market hype article. AI can't reproduce a work or replace the author. It can write a few lines that may reflect a similar style just like any human also familiar with the work and style.
Unless you want to go back to the medieval era of thought policing, all of these questions about AI training are irrelevant.
It’s an interesting philosophical question to ask whether we humans, when writing something, based on the sum total of all the things we’ve seen, heard, read, etc., aren’t just also working out which is the most likely next word to make a good story. *
One question that could be worth asking though is whether this should have been done without permission. From experience talking with authors, that’s a bigger concern than whether they’ll be replaced.
*Totally agree with you that current LLMs are a long way from that. And humans don’t work at the word level either, so the abstraction is different, but the principle might be the same.
People believing this stuff is AGI also makes me think of my poor, illiterate, provincial grandmother, who, when she moved to live with us "in the big city," used to get really confused when she saw the same actor in more than one soap opera on TV: she confused the imitation of real life that is acting (soap opera acting, even, which is generally pretty bad) with actual real life.
You seem to imply that AI has perfect memory. It doesn't.
Stable Diffusion is a 4GB file of weights, and ChatGPT's model is of a similar size. It is mathematically impossible for it to store the entire internet in a few GB of data, just like it is physically impossible for one human brain to store the entire internet with its neural network.
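A quick back-of-envelope calculation makes the size argument concrete. The figures here are illustrative assumptions (4GB weight file as stated above; fp16 weights at 2 bytes per parameter; a training set in the hundreds-of-terabytes range, which is a commonly cited order of magnitude, not an official spec):

```python
# Assumed: 4 GB weight file, fp16 weights (2 bytes per parameter).
weight_file_bytes = 4 * 1024**3
bytes_per_param = 2
params = weight_file_bytes // bytes_per_param  # ~2 billion parameters

# Assumed illustrative training-set size: 100 TB.
training_data_bytes = 100 * 1024**4

# How many times smaller the weights are than the data they saw.
ratio = training_data_bytes / weight_file_bytes
print(f"{params:,} parameters, weights ~{ratio:,.0f}x smaller than the data")
```

Whatever the exact numbers, the weights are orders of magnitude smaller than the training corpus, so the model cannot be storing verbatim copies of everything it was trained on.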
Sure, if you want to see it like that. But if you try out Stable Diffusion, etc., you will notice that "imperfect memory" describes the AI as well. You can ask it for famous paintings and it will get the objects and colors generally correct, but only as well as a human artist would. The details will be severely lacking. And that's the best-case scenario for the AI, because famous paintings will be overrepresented in the training data.
So if your AI's responses are biased toward car crashes, now you'll know why.
Take a Stephen King book you have never read. Open a random page and point to a random paragraph. Do this 3x. You will find a car crash, a memory of a car crash, someone talking about a car crash, or someone concluding X happened because of a car crash.