Actually, that is probably incorrect. My guess is that they fed a learning algorithm with no data thousands upon thousands of tagged photos and let the algorithm identify the qualities of the photos with similar tags. So, no one consciously said, “These are the qualities of gorillas.” Instead, there was just a bunch of photos with gorillas in them that also had the associated keyword “gorilla” and the algorithm saw the same qualities in this photo.
That’s how the algorithm is able to identify the window shot as “airplane” even though it is a fractional view of a plane.
Algorithms don’t lie. I guess that there is truth in mathematics.