Intel's problem is that they keep pushing the extensions race, while AMD proved with their Ryzen series that if you keep your instruction set to a minimum, your CPUs will be energy efficient. Even ARM proved this by pushing extensions too far, like Intel, and ending up with overheating chips.
The overhead of additional instructions isn't the issue; the decoder usually translates those instructions into a smaller set of internal operations (micro-ops). It's not like there's a special circuit for every instruction: a lot of instructions translate to a sequence of micro-ops that run through shared, modular execution units.
The actual silicon will look a lot like an ARM core's, despite the very large difference in instruction set size.
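To make the decode point concrete, here's a rough toy model in Python. The instruction names, the four micro-op kinds, and the decompositions are all made up for illustration; this is not a real x86 decode table, just a sketch of how a growing list of instructions can map onto the same few building blocks.

```python
# Toy model: many architectural instructions decode into a few shared micro-op kinds.
# Everything in this table is illustrative, not taken from any real CPU.
MICRO_OPS = {"load", "store", "alu", "branch"}

DECODE_TABLE = {
    "add reg, reg": ["alu"],
    "add reg, [mem]": ["load", "alu"],
    "add [mem], reg": ["load", "alu", "store"],
    "push reg": ["alu", "store"],
    "call label": ["alu", "store", "branch"],
}


def decode(program):
    """Flatten a list of instructions into the micro-ops the back end would run."""
    uops = []
    for instr in program:
        uops.extend(DECODE_TABLE[instr])
    return uops


program = ["add reg, [mem]", "push reg", "call label"]
uops = decode(program)
print(uops)
print(len(DECODE_TABLE), "instructions map onto", len(MICRO_OPS), "micro-op kinds")
```

Adding another row to `DECODE_TABLE` grows the instruction set without adding a new kind of execution unit, which is the gist of the argument above.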
That depends on what you mean, but here are a few reasonable explanations:
Intel's chips are still on their Intel 7 process (roughly comparable to TSMC's 7nm process), whereas AMD is using TSMC's 4nm process, so AMD's CPUs are roughly 2 nodes ahead; a smaller process generally means more transistors in the same area, as well as lower power usage per clock
AMD's chiplet architecture makes it easier for them to move the CPU bits to a smaller process node, while the IO bits can stay on a cheaper node (e.g. AMD uses 4nm for the cores, 6nm for the IO die); this increases yields and dramatically reduces costs, so AMD can invest more in architectural improvements
ARM prioritizes battery life over performance, so performance per watt won't be great at the high end, but it'll probably win at the low end; they also don't make their own chips (just designs), so comparing process nodes is meaningless
AMD focuses on different aspects of computing than either Intel or ARM, so perhaps they've just done a better job optimizing for what you care about
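On the chiplet yield point, here's a back-of-the-envelope sketch using the simple Poisson yield model (yield = exp(-area × defect density)). The die sizes and the defect density are assumptions picked for illustration, not AMD's or TSMC's real numbers, and it ignores packaging cost and the separate IO die.

```python
import math


def poisson_yield(area_mm2, defects_per_cm2):
    """Fraction of dies with zero defects under the simple Poisson yield model."""
    area_cm2 = area_mm2 / 100.0
    return math.exp(-area_cm2 * defects_per_cm2)


def silicon_per_good_product(die_area_mm2, dies_per_product, defects_per_cm2):
    """Average wafer area (mm^2) used per working product.

    Assumes each die is tested on its own, so a defect only scraps that one die.
    """
    y = poisson_yield(die_area_mm2, defects_per_cm2)
    return dies_per_product * die_area_mm2 / y


D0 = 0.1  # assumed defect density in defects per cm^2 (hypothetical)

# Same 600 mm^2 of logic, built as one big die or as eight 75 mm^2 chiplets.
mono = silicon_per_good_product(600, 1, D0)
chiplets = silicon_per_good_product(75, 8, D0)

print(f"monolithic: {mono:.0f} mm^2 of wafer per good product")
print(f"chiplets:   {chiplets:.0f} mm^2 of wafer per good product")
```

The smaller dies waste much less silicon per defect, which is where the yield and cost advantage comes from in this simplified picture.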
And for AMD's 3D V-Cache chips, there's an enormous energy benefit, as fetching data from the (much larger than usual) cache is far more energy efficient than constantly going back and forth to RAM.
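A quick sketch of why a higher hit rate helps on energy. The per-access energy costs and the hit rates below are placeholder assumptions in arbitrary units, not measurements of any real chip; the model simply charges a hit the cache cost and a miss the DRAM cost.

```python
def avg_energy_per_access(hit_rate, cache_cost, dram_cost):
    """Expected energy per memory access: hits pay cache_cost, misses pay dram_cost."""
    return hit_rate * cache_cost + (1.0 - hit_rate) * dram_cost


CACHE_COST = 1.0   # assumed energy of a cache hit (arbitrary units)
DRAM_COST = 20.0   # assumed energy of a trip to RAM (arbitrary units)

small_cache = avg_energy_per_access(0.80, CACHE_COST, DRAM_COST)  # hypothetical hit rate
big_cache = avg_energy_per_access(0.95, CACHE_COST, DRAM_COST)    # hypothetical hit rate

print(f"smaller cache: {small_cache:.2f} units per access")
print(f"larger cache:  {big_cache:.2f} units per access")
```

Because the RAM trip dominates the cost, even a modest bump in hit rate cuts the average energy per access substantially in this model.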
Correction: Meteor Lake's (Intel's 14th-gen mobile parts, sold as Core Ultra) CPU tile is on the Intel 4 process (though admittedly that's a 7nm EUV process). They've also moved to a chiplet design, with the CPU, GPU and IO on 3 different processes.
Look up the GCC x86 options: https://gcc.gnu.org/onlinedocs/gcc/x86-Options.html. Intel has an instruction set twice as big, and with ARM's own expansion of instructions, the RISC architecture isn't saving it from overheating anymore, because of the aforementioned, now bloated, instruction set.