That would actually be insane. Right now, I still need my GPU and about 8-10 gigs of VRAM to run a 7B model tho, so idk how that's supposed to work on a phone. Still, being able to run a model that's as good as a 70B model but with the speed and memory usage of a 7B model would be huge.
I only need ~4 GB of RAM/VRAM for a 7B model; my GPU only has 6 GB of VRAM anyway. 7B models are smaller than you think, or you have a very inefficient setup.
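For what it's worth, both numbers can be right if the two setups load the weights at different precisions. Neither post says which quantization is in use, so that part is an assumption. Here's a rough back-of-the-envelope sketch that only counts the weights (`weight_memory_gb` is just an illustrative helper, not from any library):

```python
# Rough weight-only memory estimate for an LLM.
# Ignores KV cache, activations, and runtime overhead, so real usage is higher.

def weight_memory_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate size of the model weights in GB (1 GB = 1e9 bytes)."""
    return n_params * bits_per_weight / 8 / 1e9

N_PARAMS_7B = 7e9  # nominal count; real "7B" models vary a bit

for label, bits in [("fp16", 16), ("8-bit", 8), ("4-bit", 4)]:
    print(f"{label:>5}: ~{weight_memory_gb(N_PARAMS_7B, bits):.1f} GB")

# prints:
#  fp16: ~14.0 GB
# 8-bit: ~7.0 GB
# 4-bit: ~3.5 GB
```

So ~3.5 GB of weights plus some overhead lines up with the ~4 GB figure if the model is 4-bit quantized, and ~7 GB plus overhead would land in the 8-10 GB range for an 8-bit load. Actual usage also depends on context length and the runtime you use.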