In today’s issue of Command Line, I reported that ByteDance has been violating the developer license of both Microsoft and OpenAI by using GPT-generated data to train its own, competing model in China. After my report was published, OpenAI spokesperson Niko Felix sent the following statement confirm...
I don't know about that. Training your AI on someone else's AI feels a lot like drinking someone else's piss. I doubt you are going to extract much innovation out of that.
It works pretty well. You can create a good dataset for a fraction of the effort and cost of building it by hand, and the quality is similar. You just have to review each prompt so you don't train your model on bad data.