The AI boom started last year, and it's the reason prices are already high. So I'm skeptical that they are going to get higher, seeing as this whole AI thing feels like a bubble.
People are skeptical about AI, but companies seem to be less so.
They see an opportunity to save money and they’re gonna go for it, with the ship steered by shareholders who like the “AI” buzzword.
Maybe it won’t have any effect on GPU prices though. Crypto was a much more accessible market for anyone who wanted to have a go at it, whereas you do need to actually have an idea to make use of AI.
AI needs GPUs with a huge amount of VRAM. Nvidia's move to put a criminally low amount of VRAM in its low- and mid-range gaming GPUs will ensure those GPUs won't be hoarded by companies for AI stuff. Whether gamers actually want to buy GPUs with such a low amount of VRAM is a different matter though.
AI cards need to be efficient, so they need TSMC 3n and 5n fabs. Desktop cards don't, so they could use Samsung or older, cheaper TSMC fabs. When we had the shortages before, it was the smaller components like VRM stages and capacitors that were in short supply. Those are now oversupplied. There is no reason for the price hikes other than Nvidia seeing what people paid to scalpers and wanting it for themselves.
Nvidia has continued to push for more clock speed on lower-end parts while charging higher prices. There is no reason a mid-range die should be clocked to 3 GHz on a super expensive PCB like the 4080 has, or a low-end die doing the same on the 4070. Both have PCBs that cost more than the die, and those dies would historically have gone into $150-400 parts when adjusted for inflation. They also both should be clocked around 1.8-2 GHz, as that gives a 60-70% reduction in their power consumption for a 30% performance loss (see the mobile parts for what those parts should be close to for their base SKU).
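A quick way to sanity-check the underclocking claim above is to turn it into performance per watt. This is a rough sketch using only the figures quoted in the post (a 60-70% power reduction for a 30% performance loss). The function name `perf_per_watt_gain` is made up for illustration, and the inputs are the poster's estimates, not measurements.

```python
def perf_per_watt_gain(perf_loss: float, power_reduction: float) -> float:
    """Perf-per-watt after underclocking, relative to stock (stock = 1.0).

    Both arguments are fractions of the stock value, e.g. 0.30 for a 30% loss.
    """
    if not 0 <= perf_loss < 1 or not 0 <= power_reduction < 1:
        raise ValueError("fractions must be in the range [0, 1)")
    remaining_perf = 1.0 - perf_loss
    remaining_power = 1.0 - power_reduction
    return remaining_perf / remaining_power


if __name__ == "__main__":
    # Figures taken from the post: 30% performance loss, 60-70% less power.
    for power_reduction in (0.60, 0.70):
        gain = perf_per_watt_gain(perf_loss=0.30, power_reduction=power_reduction)
        print(f"{power_reduction:.0%} less power: {gain:.2f}x perf per watt vs stock")
```

If those estimates hold, the underclocked card would do somewhere around 1.75x to 2.3x the work per watt, which is the trade the mobile parts are already making.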
Data centers care a lot about power. The AI products run around 2 GHz, in the sweet spot. Consumer cards target 3 GHz this gen and use 3-4x the power that they would at ~2 GHz. The die in the 4080 is a mid-range size. It is what used to be in things like the 60 series or maybe a 70 series card. They have been overclocking the snot out of them at stock and putting them on massively expensive PCBs instead of giving us the larger dies we used to get. That shifts the costs to the board partners and lets Nvidia get away with selling the dies at a huge profit compared to their older products.
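The 3-4x figure is roughly what a textbook first-order model predicts. Dynamic power in CMOS scales about as C·V²·f, and if you assume voltage has to rise roughly in proportion to clock speed near the top of the curve, power grows with about the cube of frequency. That cubic assumption is a simplification: it ignores leakage and the real voltage/frequency curve of any particular chip. `relative_dynamic_power` and its `voltage_exponent` parameter are hypothetical names for this sketch, not anything from a vendor tool.

```python
def relative_dynamic_power(
    freq_ghz: float, base_freq_ghz: float, voltage_exponent: float = 1.0
) -> float:
    """Power at freq_ghz relative to power at base_freq_ghz.

    Assumes P ~ V^2 * f and V ~ f ** voltage_exponent (an assumption, not
    measured data). voltage_exponent=1.0 gives the classic cubic rule;
    voltage_exponent=0.0 models a fixed voltage, where power is linear in f.
    """
    if freq_ghz <= 0 or base_freq_ghz <= 0:
        raise ValueError("frequencies must be positive")
    ratio = freq_ghz / base_freq_ghz
    return ratio ** (2 * voltage_exponent + 1)


if __name__ == "__main__":
    # Clocks quoted in the post: ~3 GHz consumer target vs ~2 GHz sweet spot.
    cubic = relative_dynamic_power(3.0, 2.0)
    linear = relative_dynamic_power(3.0, 2.0, voltage_exponent=0.0)
    print(f"cubic model: {cubic:.3f}x power, fixed-voltage model: {linear:.3f}x power")
```

Under the cubic assumption, going from 2 GHz to 3 GHz costs about 3.4x the power for at most 1.5x the clock, which lands inside the 3-4x range mentioned above.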
Back to data centers. You pay a lot for your spot based on power and location. Staying efficient and packing in lots of chips is the cheapest way to go over the life of the server. If you save 10 or 20% power by using a new node, that is worth a huge reduction in data center fees. On the consumer desktop side, they can overclock to double the power instead of using a larger, more expensive die and pocket the difference with no one really caring.
TSMC 4n is a 6nm process based on improving their 7nm. 3n and 5n are both experimental processes. 5n is smaller than 4n at lower density. The consumer cards are 7nm then 4n (the cheap ones). The data center cards are 5n and 3n (the high-end, expensive processes). Ordering more consumer cards and ordering more data center cards do not conflict with each other. Doing more of the workstation cards could, since they are full-feature consumer dies, but those are not the AI cards.
AI is much more taxing than gaming. Machine learning will peg a GPU at a constant 100% use, while gaming fluctuates up and down depending on what's going on on screen. So being more power efficient while running a card at 100% 24/7 saves money on power costs, and corporations love saving money.
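To put rough numbers on that, here is a back-of-the-envelope sketch of yearly electricity cost for one card. Every input is a made-up placeholder (a 400 W card, $0.10 per kWh, a gaming card averaging 60% load for 3 hours a day), and `annual_energy_cost` is just an illustrative helper. It treats power draw as proportional to utilization and ignores cooling and other data center overhead, so take the output as an order-of-magnitude guess only.

```python
DAYS_PER_YEAR = 365


def annual_energy_cost(
    board_power_w: float,
    avg_utilization: float,
    hours_per_day: float,
    price_per_kwh: float,
) -> float:
    """Yearly electricity cost, assuming draw = board_power_w * avg_utilization."""
    if not 0 <= avg_utilization <= 1:
        raise ValueError("avg_utilization must be between 0 and 1")
    if not 0 <= hours_per_day <= 24:
        raise ValueError("hours_per_day must be between 0 and 24")
    kwh_per_year = board_power_w / 1000 * avg_utilization * hours_per_day * DAYS_PER_YEAR
    return kwh_per_year * price_per_kwh


if __name__ == "__main__":
    # All figures below are hypothetical placeholders, not measurements.
    ml_cost = annual_energy_cost(400, 1.0, 24, 0.10)    # pegged at 100%, 24/7
    gaming_cost = annual_energy_cost(400, 0.6, 3, 0.10)  # a few hours of play a day
    print(f"ML card:     ${ml_cost:.2f} per year")
    print(f"Gaming card: ${gaming_cost:.2f} per year")
    # The 10-20% power savings mentioned earlier in the thread, per ML card:
    for saving in (0.10, 0.20):
        print(f"{saving:.0%} more efficient saves ${ml_cost * saving:.2f} per card per year")
```

A few tens of dollars per card does not sound like much until you multiply it by thousands of cards and add cooling on top, which is why the efficiency matters so much more in a data center than in a gaming PC.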