Alexei Efros: what artificial intelligence can teach us about sight

21 February 2024
5:30 pm

Computer scientist Alexei Efros transformed his vision problems into his superpower. Through an innovative approach, he has changed the current perspective on machine vision by focusing on its ability to adapt, rather than only on data.

WHO IS ALEXEI EFROS

Alexei Efros, a Russian-born computer scientist working at the Berkeley Artificial Intelligence Research Lab of the prestigious University of California, intends to revolutionize the world of computer vision, one data point at a time.

In 2016, Efros – who has defined his talent as a “superpower” – was awarded the Prize in Computing by the Association for Computing Machinery for his work in the field of synthetic images. Such talent probably gave him an advantage in his field of research, despite his vision problems.

Even though the last statement may seem paradoxical, Efros’ marked interest in human vision and image processing distinguishes his approach to artificial vision from that of most computer scientists.

DIFFERENCES BETWEEN ARTIFICIAL VISION AND HUMAN VISION

First of all, it should be stressed that machines and people “see” in a completely different way. Computers collect and process many images, but unlike people, they are unable to connect what they are “seeing” at that moment to what they have seen previously. Humans, on the other hand, give meaning to the set of colours and lights that surround them by comparing what they are looking at with their memories of similar images or experiences.

The importance of memory in the human understanding of images became immediately evident to Efros through his own very poor eyesight. He is often able to fill the gaps left by his vision problems with his memory, so much so that he can “function basically as [well] as a normal person”. The fact that many people around him were not aware of his vision problems further convinced him that the answer to improved machine vision lies in memory and not “in the pixels”.

Currently, computer scientists train the visual system of computers by having them examine billions of images taken from the internet, processing them one at a time.

Human beings, on the other hand, when faced with unprecedented situations, absorb the information around them not only to address the immediate issue, but also to predict similar situations in the future. “What you see now is very correlated to what you saw a few seconds ago. You can think of it as video. All of the video’s frames are correlated to each other.”
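The correlation described in the quote can be made concrete with a small sketch. It is only an illustration under stated assumptions: the "video" is synthetic (a bright block sliding over a fixed background), the "stills" are images of independent random pixels, and the helper names `pearson`, `random_image` and `video_frame` are invented for this example.

```python
import math
import random

def pearson(a, b):
    """Pearson correlation between two equally long lists of pixel values."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)

def random_image(rng, size=32):
    """An unrelated still image: every pixel is drawn independently."""
    return [rng.random() for _ in range(size * size)]

def video_frame(background, t, size=32, block=6):
    """Frame t of a toy video: a bright block moves one pixel per frame
    across a background that stays the same."""
    frame = list(background)
    for row in range(10, 10 + block):
        for col in range(t, t + block):
            frame[row * size + col] = 1.0
    return frame

rng = random.Random(0)
background = random_image(rng)

# Average correlation between consecutive frames of the toy video.
frames = [video_frame(background, t) for t in range(20)]
video_corr = sum(pearson(frames[t], frames[t + 1]) for t in range(19)) / 19

# Average correlation between consecutive unrelated stills.
stills = [random_image(rng) for _ in range(20)]
still_corr = sum(pearson(stills[t], stills[t + 1]) for t in range(19)) / 19

print(f"consecutive video frames: {video_corr:.2f}")
print(f"unrelated still images:   {still_corr:.2f}")
```

In this toy setting, consecutive frames share almost all of their content, so their correlation is close to 1, while unrelated stills are close to 0. A system that treats every image in isolation ignores that shared context.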

One might think that the problem is easily solved by feeding videos to the computer, instead of still images, but Efros’ goal is much more complex: the computer must “see” the new information, process it and learn from it.

HOW COMPUTER VISION CAN IMPROVE

A similar approach to that proposed by Efros is TTT (test-time training), which is often used to improve predictive systems. Usually, a computer is first trained on a large set of data for a certain time and only then deployed, at which point it simply applies what it has already learned. Instead, Efros suggests making the two phases coincide, so that the system keeps learning while it is being used.
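The difference between "train, then deploy" and letting the two phases coincide can be sketched with a toy example. This is not the test-time training algorithm itself, which updates a neural network at deployment; it is a minimal illustration, assuming a one-number "image" (its brightness), a two-class threshold classifier, and deployment conditions in which every input is brighter than anything seen in training. All function names and parameters are hypothetical.

```python
import random

def make_stream(n, shift, rng):
    """Return (value, label) pairs; `shift` models changed conditions."""
    stream = []
    for i in range(n):
        label = i % 2                          # alternate the two classes
        centre = 1.0 if label == 1 else -1.0
        stream.append((centre + shift + rng.gauss(0.0, 0.3), label))
    return stream

def accuracy_frozen(stream, mean):
    """Classic pipeline: the statistic learned in training never changes."""
    correct = sum((x - mean > 0) == (y == 1) for x, y in stream)
    return correct / len(stream)

def accuracy_adaptive(stream, mean, rate=0.05):
    """Phases coincide: each new input first updates the model
    (without using its label) and is then classified."""
    correct = 0
    for x, y in stream:
        mean += rate * (x - mean)              # learn from the input itself
        correct += (x - mean > 0) == (y == 1)
    return correct / len(stream)

rng = random.Random(0)

# Training phase: estimate the average brightness under normal conditions.
train = make_stream(200, shift=0.0, rng=rng)
train_mean = sum(x for x, _ in train) / len(train)

# Deployment: every input is now 3 units brighter than in training.
test = make_stream(400, shift=3.0, rng=rng)
frozen = accuracy_frozen(test, train_mean)
adaptive = accuracy_adaptive(test, train_mean)

print(f"frozen model:   {frozen:.2f}")
print(f"adaptive model: {adaptive:.2f}")
```

In this sketch the frozen model labels almost everything as the "bright" class once conditions change, while the adaptive one recovers after a short warm-up. Real visual data is far less forgiving than a single brightness value, which is why the problem remains open.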

The main problem remains “domain shift”, or “data set bias”: if the training data is very different from the data the system encounters once it is deployed, the computer will run into serious difficulties. Computer scientists are still working on this problem, as it is currently impossible for computers to mimic the adaptability of biological life forms.

The only existing solution is to provide the system with as much data as possible. However, without a human-like “learning” phase, rare events are still a problem.

WHAT THE FUTURE HOLDS FOR MAN

In conclusion, despite advances in artificial intelligence and computer vision, the road ahead is still a long one. In the future, Efros hopes not only for advancements in the use of computer vision for applications such as self-driving cars, but also for a greater understanding of “human visual intelligence”. He hopes that humans, instead of fearing increasingly intelligent algorithms, can find, in their interaction with AI, the impetus to progress and increasingly harness their own creative ingenuity.

Sources:

Quanta Magazine

AEON