“When we walk down the street, we know what we need to pay attention to—and what we don’t. Robots, on the other hand, treat all the information they receive about their surroundings with equal importance. Driverless cars have to continuously analyze data about things around them whether or not they are relevant. This keeps drivers and pedestrians safe, but it draws on a lot of energy and computing power.”
Yes, but game engines also hold the entire world inside themselves. There's no guessing, no estimating, no need to confirm that what they're looking at is actually a human or a bush; the engine already knows.
The problem with making computer vision lazy is that it can't ignore something without understanding what it's looking at, and it can't understand what it's looking at without analyzing the data. It's a circular problem, and it will be ridiculously hard to solve. The crux of the issue is that we as people are analyzing that same data; we just don't realize it.
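One common way to attack that circularity is to split perception into a cheap first pass and an expensive second pass: a coarse, fast score (say, frame-to-frame change) decides *where* the heavy analysis is worth running at all. A minimal sketch, with entirely invented region data, scoring, and budget, might look like this:

```python
def cheap_saliency(region):
    """Fast, coarse score: e.g. how much a region changed since the last frame."""
    return abs(region["now"] - region["before"])

def expensive_classify(region):
    """Stand-in for a heavy model (detector/classifier) we want to run rarely."""
    return region["label"]

def lazy_pipeline(regions, budget=2):
    """Rank regions by the cheap score and only run the heavy model on the
    top `budget` candidates; everything else is skipped without ever being
    fully 'understood'."""
    ranked = sorted(regions, key=cheap_saliency, reverse=True)
    analyzed = {r["id"]: expensive_classify(r) for r in ranked[:budget]}
    skipped = [r["id"] for r in ranked[budget:]]
    return analyzed, skipped

# Toy scene: three regions with made-up "pixel change" values.
scene = [
    {"id": "A", "now": 0.9, "before": 0.10, "label": "human"},    # big change
    {"id": "B", "now": 0.5, "before": 0.48, "label": "bush"},     # nearly static
    {"id": "C", "now": 0.7, "before": 0.20, "label": "cyclist"},  # big change
]
analyzed, skipped = lazy_pipeline(scene)
print(analyzed)  # heavy model ran only on the two most salient regions
print(skipped)   # the static region was dismissed by the cheap pass
```

Of course, this just moves the problem: the cheap pass has to be good enough that the things it skips really were safe to skip, which is exactly the hard part.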
Humans are bad at it, too. If you've ever ridden a bike or motorcycle, you quickly learn that car and truck drivers simply aren't looking for two-wheelers, and therefore they don't see them. (I think this reinforces your point.)