Betrayed by your shadow? New MIT technique reconstructs on-screen video from shadows



Recently, MIT researchers pointed a camera at a pile of clutter in the corner of a room and recorded the shadows cast on it by a video or by human movement. From those shadows alone, they could roughly reconstruct the original images.

In the setup shown in the figure below, a video of a person playing with building blocks is shown on a screen. A pile of miscellaneous objects sits opposite the screen, and light from the screen falls on the pile. The researchers filmed the pile and recorded the shadows cast by the video.

The recorded footage is shown below (far left). To the naked eye these shadows are a jumble that conveys almost nothing, but the researchers can use neural networks to recover the video from them. The reconstruction is shown in the figure below (far right). Even colors are recovered.

The work builds on earlier research. Seven years ago, a team at MIT created an imaging system that used floors, doors and walls as mirrors to understand scenes outside the line of sight. It relied on special lasers to produce recognizable 3D images, and it opened up new possibilities for understanding what lies beyond our direct view.

Recently, a team of scientists at MIT's Computer Science and Artificial Intelligence Laboratory (CSAIL) took this research a step further, this time without any special equipment.

They developed a way to reconstruct video by looking at the subtle shadows and reflections on a pile of clutter. This means that a camera simply left running in a room can be used to reconstruct a video of an unseen corner of that room, even if the area is outside the camera's field of view.

By observing the interplay of shadows and geometry in the video, the team's algorithm predicts how light travels through the scene, which the researchers call "light transport." The system then uses this to estimate the hidden video from the observed shadows, and it can even reconstruct the silhouette of a live performance.

This type of image reconstruction could benefit many parts of society: self-driving cars could better understand what is about to emerge from around a corner, elder-care centers could improve the safety of their residents, and search-and-rescue teams could work more effectively in dangerous or obstructed areas.

The technique is passive, meaning that no lasers or other interventions are introduced into the scene. It currently takes about two hours to process a recording, but the researchers say it could eventually be used in the applications above to reconstruct scenes outside the normal line of sight.

Debris pile ≈ pinhole camera

"You can achieve quite a bit with non-line-of-sight imaging equipment such as lasers, but in our approach you only have the light that naturally reaches the camera, and you try to get as much as possible out of the scarce information in it," said Miika Aittala, a former CSAIL postdoc and current research scientist at NVIDIA, who led the research on the new technique. "Given the recent progress in neural networks, this seemed like a good time to tackle problems that were previously considered unsolvable in this field."

To capture this unseen information, the team used subtle, indirect lighting cues, such as the shadows and highlights on the clutter in the observed area.

In a way, a pile of clutter acts like a pinhole camera, similar to one you might have built in an elementary school science class: it blocks some light rays but lets others pass through, and those rays paint an image of the surroundings wherever they land. A pinhole camera, however, is designed to let through just the right rays to form a readable picture. An ordinary pile of clutter instead produces an image that the light transport has scrambled beyond recognition into a complex play of shadows and shading.

You can think of the clutter as a mirror that gives you a scrambled view of your surroundings, for example of a corner you can't see directly.

The algorithm

The team's algorithm takes on the problem of unscrambling these lighting cues and making sense of them. Specifically, its goal is to recover a human-readable video of the activity in the hidden scene. The footage recorded from the clutter is the product of two unknowns: the light transport and the hidden video.
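As a rough illustration of that product, the toy sketch below treats the hidden video and the light transport as small matrices and multiplies them to get the observed footage. The sizes, the values, and the names `hidden`, `light_transport` and `matmul` are assumptions made up for this example, not details from the MIT work.

```python
# Toy forward model: observed = light_transport x hidden.
# All sizes and values are invented for illustration.

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]

# Hidden video: 2 hidden pixels (rows) over 3 frames (columns).
hidden = [
    [1.0, 0.0, 0.5],
    [0.0, 1.0, 0.5],
]

# Light transport: how strongly each hidden pixel (column) lights
# each of 3 observed clutter pixels (row).
light_transport = [
    [0.5, 0.25],
    [0.25, 0.5],
    [0.125, 0.125],
]

# What the camera pointed at the clutter would record:
# 3 observed pixels (rows) over 3 frames (columns).
observed = matmul(light_transport, hidden)
for row in observed:
    print(row)
```

Each observed pixel is a weighted mix of all hidden pixels, which is why the raw footage does not look like the hidden video.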

Unscrambling these cues, however, is a classic chicken-or-egg problem. To work out the scrambling pattern, you would already need to know the hidden video; conversely, to work out the hidden video, you would need to know the scrambling pattern.

"Mathematically, it's like if I told you that I'm thinking of two numbers and their product is 80. Can you guess what they are? Maybe 40 and 2? Or 371.8 and 0.2152? In our problem, we face a similar situation at every pixel," Aittala said. "Almost any hidden video can be explained by a corresponding scrambling pattern, and vice versa. If we let the computer choose, it will just do the easy thing and give us a pile of random images that don't look like anything."
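The arithmetic in the quote can be checked directly. The short sketch below tests the two quoted pairs and builds a few more by picking an arbitrary first factor. The helper `explains` and the tolerance are assumptions for this example; 371.8 × 0.2152 is about 80.01, so it matches 80 only approximately.

```python
import math

TARGET = 80.0  # the product from the quote

def explains(a, b, target=TARGET, rel_tol=1e-3):
    """Return True if a * b reproduces the target within a tolerance."""
    return math.isclose(a * b, target, rel_tol=rel_tol)

# The two pairs from the quote, plus further pairs built by
# choosing any nonzero first factor and solving for the second.
pairs = [(40.0, 2.0), (371.8, 0.2152)]
pairs += [(a, TARGET / a) for a in (1.0, 8.0, 160.0)]

for a, b in pairs:
    print(f"{a:8.2f} * {b:8.4f} explains 80: {explains(a, b)}")
```

Every pair passes the check, so the product alone cannot tell them apart. Some extra preference is needed to pick one.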

With this in mind, the team focused on breaking the ambiguity. Their approach is to require, in the algorithm itself, a scrambling pattern that corresponds to plausible real-world shadowing and shading, so that the recovered hidden video contains edges and objects that move coherently.

The team also took advantage of the surprising fact that neural networks naturally prefer to express image-like content, even when they have never been trained to do so, which helps to break the ambiguity. The algorithm trains two neural networks at the same time, both specialized for the single target video only, using ideas from a machine learning concept called Deep Image Prior. One network generates the scrambling pattern, and the other estimates the hidden video. The networks are rewarded when the combination of these two factors reproduces the video recorded from the clutter, which drives them to explain the observations with plausible hidden data.
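The idea of fitting two factors at once so that their combination reproduces the recording can be sketched without neural networks. The example below is a deliberately simplified stand-in, not the team's method: it replaces the two networks with two plain vectors and uses alternating least-squares updates instead of Deep Image Prior training. All names (`reconstruct`, `loss`, `fit`) and all values are hypothetical.

```python
def reconstruct(transport, hidden):
    """Predicted observation for every (observed pixel, frame) pair."""
    return [[t * h for h in hidden] for t in transport]

def loss(observed, transport, hidden):
    """Sum of squared differences between prediction and observation."""
    pred = reconstruct(transport, hidden)
    return sum((p - o) ** 2
               for prow, orow in zip(pred, observed)
               for p, o in zip(prow, orow))

def fit(observed, steps=20):
    """Alternately update each factor with the other held fixed."""
    n_pixels, n_frames = len(observed), len(observed[0])
    transport = [1.0] * n_pixels
    hidden = [1.0] * n_frames
    for _ in range(steps):
        # Least-squares transport given the current hidden estimate.
        denom = sum(h * h for h in hidden)
        transport = [sum(o * h for o, h in zip(row, hidden)) / denom
                     for row in observed]
        # Least-squares hidden signal given the current transport.
        denom = sum(t * t for t in transport)
        hidden = [sum(observed[i][j] * transport[i]
                      for i in range(n_pixels)) / denom
                  for j in range(n_frames)]
    return transport, hidden

# Synthetic recording built from known factors (made-up values).
true_transport = [0.5, 0.25, 0.125]
true_hidden = [1.0, 0.0, 2.0, 4.0]
observed = reconstruct(true_transport, true_hidden)

transport, hidden = fit(observed)
print("reconstruction error:", loss(observed, transport, hidden))
print("hidden frame ratio (true value 4):", hidden[3] / hidden[0])
```

In this toy case the fitted factors reproduce the recording, but the hidden signal is recovered only up to an overall scale, which is the same ambiguity as in the product-of-80 example. According to the article, the real system relies on the image-like bias of the networks and on plausible shadowing to choose among such solutions.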

To test the system, the team first piled objects against one wall, then either projected a video onto the opposite wall or moved around in front of that wall themselves. From the recordings, they were able to reconstruct videos that give a general sense of the motion taking place in the hidden part of the room.

In the future, the team hopes to improve the overall resolution of the system and eventually to test the technique in an uncontrolled environment.

Original link: https://news.mit.edu/2019/using-computers-view-unseen-computational-mirrors-mit-csail-1206