It’s about the training and tuning. If a model is very good at the few dozen things I’m likely to want my phone to do, and can recognize when it should ask a larger, remote model for help, that’s pretty excellent and could conceivably fit on a phone. Even better if the system uses my usage data to tune the model for me. For example: I ask it to build me a playlist a few times, but never ask it for recipes. Eventually the model gets retrained on that usage data to handle playlist building better (probably with RAG, because of how specific the data is) and drops whatever capacity it was spending on recipes, which it can always pass up the chain to the larger model.
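To make the idea concrete, here's a rough sketch of the routing and "retraining" loop I have in mind. Everything in it is made up for illustration: the `Router` class, the intent names, and the `keep_at_least` threshold are assumptions, and the `retune` step is just a stand-in for whatever real fine-tuning or RAG indexing would actually happen on the device.

```python
from __future__ import annotations

from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Router:
    # Hypothetical: intents the on-device model is currently tuned for.
    local_skills: set[str]
    # How many times this user has asked for each intent.
    usage: Counter = field(default_factory=Counter)

    def route(self, intent: str) -> str:
        """Record the request and decide who should answer it."""
        self.usage[intent] += 1
        return "local" if intent in self.local_skills else "remote"

    def retune(self, keep_at_least: int = 3) -> None:
        """Stand-in for retraining: keep only intents the user actually uses."""
        self.local_skills = {
            intent for intent, count in self.usage.items() if count >= keep_at_least
        }


if __name__ == "__main__":
    router = Router(local_skills={"playlist", "recipe"})
    for _ in range(3):
        router.route("playlist")   # asked a few times
    router.retune()                # "recipe" was never asked for, so it is dropped
    print(router.route("playlist"))  # local
    print(router.route("recipe"))    # remote
```

The point of the sketch is only the shape of the loop: count what I actually ask for, keep the on-device model focused on that, and send everything else up the chain.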