Probably because it's a complicated 3D shape. The 2D projection of the hand on the photo can change a lot depending on the camera angle, position of the hand and what the person is doing.
Also, I've noticed that AI has difficulty when different features are close to one another, for example when someone crosses their legs or holds an object. Maybe the AI can draw each object well in isolation, but combining them is much more difficult. This is often the case with hands,