How do you differentiate actual objects from shadows in a visual recognition system? You have to approach the problem without the built-in knowledge we all have of objects. A computer system processing a light sensor array only sees various colors at each of the sensor locations. There is no inherent image. The actual objects need to be differentiated from artifacts like shadows and glare.
Systems need not only to differentiate actual objects from artifacts; they also need to discount objects that they should not stop for. You don't want the car stopping because some leaves fall in front of it, or a trash bag blows across its path. What if it starts snowing or raining? So yes, there are real objects that are OK to run into. Determining what is actually present in the path of the car is a non-trivial task, but ultimately doable.
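To make the "only sees various colors" point concrete, here is a minimal sketch of one common heuristic for telling a shadow from an object: a shadow mostly darkens a surface while leaving its color balance (chromaticity) roughly unchanged, whereas an object usually changes both. This is only an illustration, not anyone's production method. It assumes a reference color for the bare road is already known, and the names `classify_pixel`, `min_ratio`, `max_ratio`, `chroma_tol`, and `same_tol`, along with their threshold values, are hypothetical.

```python
def chromaticity(pixel):
    """Return brightness-independent color fractions (r, g) of an RGB pixel."""
    r, g, b = pixel
    total = r + g + b
    if total == 0:
        return (0.0, 0.0)
    return (r / total, g / total)


def classify_pixel(background, current,
                   min_ratio=0.4, max_ratio=0.9,
                   chroma_tol=0.03, same_tol=0.1):
    """Label one pixel as 'background', 'shadow', or 'object'.

    background: assumed known RGB value of the bare road at this location.
    current:    RGB value the sensor reports now.
    All thresholds are illustrative guesses, not tuned values.
    """
    bg_brightness = sum(background)
    if bg_brightness == 0:
        return "unknown"  # no usable reference to compare against

    ratio = sum(current) / bg_brightness
    bg_c = chromaticity(background)
    cur_c = chromaticity(current)
    chroma_diff = max(abs(bg_c[0] - cur_c[0]), abs(bg_c[1] - cur_c[1]))

    # Same brightness and same color balance: nothing changed.
    if abs(ratio - 1.0) <= same_tol and chroma_diff <= chroma_tol:
        return "background"
    # Darker but same color balance: consistent with a shadow.
    if min_ratio <= ratio <= max_ratio and chroma_diff <= chroma_tol:
        return "shadow"
    # Anything else is treated as a possible real object.
    return "object"


road = (120, 120, 120)
print(classify_pixel(road, (72, 72, 72)))    # darker gray road
print(classify_pixel(road, (200, 30, 30)))   # strongly red patch
```

A real system would have to work on regions rather than single pixels, cope with a moving camera, and handle glare, which brightens rather than darkens. A leaf or a trash bag would still come out as "object" here, so this covers only the shadow half of the problem.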
Believe it or not, I'm a person who can actually tell you how to do that, but that's not really the point. If their system is not robust enough to do this task with a visual recognition system, then they should either do it with RADAR or LIDAR, or keep their piece of crap off the road.
The bottom line is that they should have and must be required to have a system that can detect objects in the road. If they insist on using cameras, but don't have sufficiently developed software to do it with cameras correctly, then they should not be allowed to put that contraption on public roads.
If they have RADAR and/or LIDAR, then a failure to detect an incoming signal should have caused the thing to come to a stop as a fail safe.
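The fail-safe idea can be sketched in a few lines. This assumes "failure to detect an incoming signal" means the range sensor has stopped delivering data, or the data has gone stale, as opposed to a healthy sensor reporting a clear road. The names `RangeReading` and `decide`, and the values of `max_age_s` and `stop_distance_m`, are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RangeReading:
    timestamp_s: float            # when the sensor produced this reading
    distance_m: Optional[float]   # None means "no echo within sensor range"


def decide(reading, now_s, max_age_s=0.2, stop_distance_m=30.0):
    """Return 'stop' or 'continue' for one control cycle.

    Missing or stale sensor data is treated as a fault, and the
    fail-safe response to a fault is to stop.
    """
    # No reading at all, or one that is too old: we are blind, so stop.
    if reading is None or now_s - reading.timestamp_s > max_age_s:
        return "stop"
    # Fresh reading with an echo inside the stopping distance: stop.
    if reading.distance_m is not None and reading.distance_m <= stop_distance_m:
        return "stop"
    # Fresh reading and nothing close enough to matter.
    return "continue"


print(decide(None, now_s=1.0))                        # sensor silent
print(decide(RangeReading(0.95, None), now_s=1.0))    # fresh, clear road
print(decide(RangeReading(0.95, 12.0), now_s=1.0))    # fresh, obstacle near
```

The design choice is that every uncertain case defaults to stopping, and the vehicle continues only on a fresh reading that shows nothing nearby.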
You have to approach the problem without the built-in knowledge we all have of objects. A computer system processing a light sensor array only sees various colors at each of the sensor locations. There is no inherent image. The actual objects need to be differentiated from artifacts like shadows and glare.
I know quite a lot about this topic, and have built my own hardware and software to do this sort of task. I know how image data looks to a computer, and I know what must be done to it to extract meaningful information from this sort of image data.
Systems need not only to differentiate actual objects from artifacts; they also need to discount objects that they should not stop for. You don't want the car stopping because some leaves fall in front of it, or a trash bag blows across its path.
What sort of RADAR echo or LIDAR echo would a leaf have? Again, if your video system won't handle the task, you should not be attempting to use your faulty video system to accomplish a task that must absolutely be accomplished.
It’s perhaps an issue of the complexity of the operating environment and the inability of an electronic processing system to formulate a working model of it.
Humans are not merely using just-in-time, high-level cognitive abilities to process a driving task. The senses, subconscious brain processes, spinal reflexes, instinctive behavior patterns, and intuition all play a part, such that humans can often perform a familiar but skilled task without continuous high-level mental concentration.