Most argue training with copyrighted data is fair use.
AI companies have all kinds of arguments against paying for copyrighted content: The companies building generative AI tools like ChatGPT say updated copyright laws could interfere with their ability to train capable AI models. Here are comments from OpenAI, StabilityAI, Meta, Google, Microsoft and more.
Both of your examples are governed by the same set of privacy laws, which talk about consent, purpose and necessity, but not about scale. Legislating around scale opens up the inevitable legal quagmires of "what scale is acceptable" and "should activity x be counted the same as activity y to meet the scale-level defined in the law".
Scale makes a difference, but it shouldn't change whether the activity is legal.
Intent is part of it as well. If you have too many people who want to use your service, you're not being attacked, you have an actual shortage of ability to service requests and need to adjust accordingly.
In this context I meant that it was the same person doing a "normal" thing at such a scale that it becomes illegal. Scale absolutely is something that can turn something from legal to illegal.