This is a surprisingly difficult problem because different people are okay with different brand substitutions. Some people may want the cheapest butter regardless of brand, while others may only buy brand name.
For example, my wife is okay with generic Chex from some grocery stores but not others, but only likes brand-name Cheerios. Walmart, Aldi, and Meijer generic cheese is interchangeable, but brand-name and Kroger-brand cheese isn't acceptable.
Making a software system that can deal with all this is really hard. AI is probably the best bet, but it needs to be able to handle all this complexity to be usable, which is a lot of up-front work.
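To make the shape of the problem concrete, here is a minimal sketch of how per-shopper substitution rules like the ones above might be stored and applied. Everything in it is an assumption for illustration: the `SubstitutionPrefs` class, the `cheapest_acceptable` helper, the product keys, and the prices are all made up, and a real system would need far more nuance (per-store generics, sizes, flavors).

```python
from dataclasses import dataclass, field


@dataclass
class SubstitutionPrefs:
    """Hypothetical per-shopper brand rules (illustrative only)."""
    # (product, brand) pairs the shopper explicitly accepts.
    accepted: set[tuple[str, str]] = field(default_factory=set)
    # Products where any brand is fine, so the cheapest offer wins.
    any_brand: set[str] = field(default_factory=set)

    def allows(self, product: str, brand: str) -> bool:
        return product in self.any_brand or (product, brand) in self.accepted


def cheapest_acceptable(prefs, product, offers):
    """offers is a list of (brand, price); returns the cheapest allowed brand or None."""
    allowed = [(price, brand) for brand, price in offers if prefs.allows(product, brand)]
    return min(allowed)[1] if allowed else None


prefs = SubstitutionPrefs(
    accepted={
        ("cheese", "Walmart"), ("cheese", "Aldi"), ("cheese", "Meijer"),
        ("toasted oat cereal", "Cheerios"),
    },
    any_brand={"butter"},
)

# Made-up prices: the Kroger offer is cheapest but is not on the accepted list.
cheese_offers = [("Kroger", 1.99), ("Aldi", 2.29), ("Meijer", 2.49)]
cheese_pick = cheapest_acceptable(prefs, "cheese", cheese_offers)
```

Even this toy version shows why the up-front work is large: every product needs its own accepted list, and someone (or some model) has to fill it in.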
As long as the AI has access to their ongoing purchase histories, it's actually quite easy to handle this for day-to-day situations.
Where it would have difficulty is unexpected spikes in grocery usage, such as hosting a non-annual party.
In theory, as long as it was fine-tuned on aggregate histories it should be decent at identifying spikes (e.g. this person purchased 10x the normal amount of perishables this week, which is typically an outlier, and they'll be back to 1x next week), but anticipating the spikes ahead of time is pretty much impossible.
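As a rough illustration of the "identify the spike after the fact" part, here is a simple heuristic sketch. It is not the fine-tuned model described above, just a median-ratio check, and the `flag_spikes` name, the `ratio` threshold, and the weekly totals are all assumptions.

```python
from statistics import median


def flag_spikes(weekly_totals, ratio=3.0, min_history=4):
    """Return one boolean per week, True where the week looks like a one-off spike.

    A week is flagged when it exceeds `ratio` times the median of earlier
    non-spike weeks. Flagged weeks are kept out of the baseline so a party
    does not inflate what counts as "normal".
    """
    flags = []
    baseline = []
    for total in weekly_totals:
        if len(baseline) >= min_history and total > ratio * median(baseline):
            flags.append(True)
        else:
            flags.append(False)
            baseline.append(total)
    return flags


# Made-up weekly perishable spend; week 5 is roughly 10x the usual amount.
weeks = [20, 22, 19, 21, 210, 20, 23]
spike_flags = flag_spikes(weeks)
```

Note that this only labels the spike once it has happened. It does nothing to anticipate one, which is the hard part.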
Both of these problems could feasibly be solved by user input. If you had the ability to set rules for your personal experience, problems like that would only last as long as it takes the user to manually correct them.
Like, "AI, I bought groceries for a party on March 5th. Don't use that bill to predict what I need" or "stop recommending butter that isn't this specific brand."
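A minimal sketch of what those two user rules could look like once they've been parsed out of the conversation. The `UserRules` class, both helper functions, the receipt format, the brand names, and the year 2024 are placeholders I made up for illustration.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class UserRules:
    """Hypothetical store for manual corrections from the user."""
    excluded_dates: set[date] = field(default_factory=set)      # receipts to ignore
    pinned_brands: dict[str, str] = field(default_factory=dict)  # product -> only brand allowed


def training_receipts(receipts, rules):
    """Drop receipts the user said not to learn from."""
    return [r for r in receipts if r["date"] not in rules.excluded_dates]


def filter_recommendations(recs, rules):
    """Keep a (product, brand) suggestion unless it conflicts with a pinned brand."""
    kept = []
    for product, brand in recs:
        pinned = rules.pinned_brands.get(product)
        if pinned is None or pinned == brand:
            kept.append((product, brand))
    return kept


rules = UserRules(
    excluded_dates={date(2024, 3, 5)},       # "don't use the March 5th party bill" (year assumed)
    pinned_brands={"butter": "BrandA"},      # "only this specific butter brand"
)

receipts = [
    {"date": date(2024, 3, 5), "total": 310.0},   # the party
    {"date": date(2024, 3, 12), "total": 85.0},
]
usable = training_receipts(receipts, rules)
suggestions = filter_recommendations(
    [("butter", "BrandA"), ("butter", "BrandB"), ("milk", "StoreBrand")], rules
)
```

The rules themselves are the easy part. Turning a free-form sentence into `excluded_dates` and `pinned_brands` reliably is where the language model would have to earn its keep.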
It's also quite difficult from a vision perspective. There are tons of potential object classes, objects with no class (e.g., leftovers, homemade things), and potential occlusion if you are monitoring the refrigerator/cabinets. If the object is in a container, how do you measure the remaining volume of that substance? This is just scratching the surface, I imagine. These problems individually are maybe not crazy challenging, but they are quite hard all together.
But you actually need vision because purchase history is not indicative of my future purchases. Sometimes I buy butter, eat it in 3 days, and buy again. Sometimes I'm not in the mood and a chunk of butter sits in my fridge for 3 weeks. It's honestly totally random for a lot of things. It depends only on my mood at the moment.
You'd be surprised at how many of those things you think are random would actually emerge as a pattern in a long enough purchase history.
For example, it might be that there's a seasonality to your being in the mood. Or it might depend on other things you bought a week before, etc.
Over a decade ago, a model at Target looking only at purchase history was reportedly able to tell a teenage girl was pregnant before her family knew, just from things like switching to unscented lotion.
There's more modeled in that data than simply what's on the receipt.
I agree, in the context of the tweet, that purchase history is enough to build a working product that roughly meets user requirements (at least in terms of predicting consumed items). This assumes you can find enough purchase history for a given user. Even then, I have doubts about how robust such a strategy is. The sparsity in your dataset for certain items means you will either (a) be forced to remove those items from your prediction service or (b) frustrate your users with heavy prediction bias. Some items also simply won't work in this system: maybe the user only eats hot dogs in the summer. Maybe they only buy eggs with brownie mix. There will be many dependencies you are required to model to get a system like this working, and I don't believe there is any single model powerful enough to do this by itself. Directly quantifying the user's pantry via vision seems easy in comparison.
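To illustrate both the dependency problem and the sparsity problem, here is a small sketch that estimates how often item B shows up in a basket given that item A did. The `conditional_purchase_rates` function, the `min_support` cutoff, and the baskets are assumptions for illustration. It is plain co-occurrence counting, not a full dependency model.

```python
from collections import Counter
from itertools import permutations


def conditional_purchase_rates(baskets, min_support=2):
    """Estimate P(b in basket | a in basket) for every ordered pair (a, b).

    Pairs whose conditioning item `a` appears in fewer than `min_support`
    baskets are dropped, which is option (a) above: too little data, so the
    item is simply left out of the predictions.
    """
    item_counts = Counter()
    pair_counts = Counter()
    for basket in baskets:
        items = set(basket)
        item_counts.update(items)
        pair_counts.update(permutations(items, 2))
    return {
        (a, b): n / item_counts[a]
        for (a, b), n in pair_counts.items()
        if item_counts[a] >= min_support
    }


# Made-up history: eggs only ever appear alongside brownie mix.
baskets = [
    {"brownie mix", "eggs"},
    {"brownie mix", "eggs", "milk"},
    {"milk"},
    {"milk", "bread"},
]
rates = conditional_purchase_rates(baskets)
eggs_given_mix = rates[("brownie mix", "eggs")]
```

With only four baskets, bread gets dropped entirely, and the eggs estimate rests on two observations. That is the sparsity issue in miniature.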
Honestly, I would be perfectly happy with a service like this, even if I had to manually input what groceries I need. It's still an incredibly complex problem though. AI is probably better suited for it than anything else, since you can have iterative conversations with the latest generation of AIs. That is, if I tell it I need cereal, it looks at my purchase history, guesses what type of cereal I want this week, and adds it to my list. I can then tell it no, actually I want shredded mini wheats.
So it would probably have to be a combination of a very large database and information gathering system with a predictive engine and a large language model as the user interface.