A New Technique Allows Recording What is Typed in a Video Call

February 26, 2021, Matt Mills

Hackers constantly adapt to change in order to steal information and infect computers. We have many security tools to protect ourselves, but cybercriminals keep looking for ways around those barriers. In this article we report on a new attack that makes it possible to work out what a person is typing during a video call.

A new attack reveals what a user types in a video call

Video calls have become a widely used form of communication in recent months. The Covid-19 pandemic has brought important changes, and one of them is the growing use of this type of service by private users as well as by companies and organizations.

A newly described attack now lets an attacker work out what a user is typing during a video call. Essentially, it takes advantage of the video stream to correlate the person's body movements with the letters and words being typed.

The discovery was made by researchers at the University of Texas at San Antonio and the University of Oklahoma. They indicate that the attack could work not only in a normal video call, but also against live streams on YouTube and other similar platforms.

Of course, for this to be possible, the camera has to capture part of the user’s body: specifically the upper body, where arm movements can be detected and related to the keys being pressed.

They indicate that this type of attack can be used against different devices that have an integrated webcam. It would work not only with a computer, but also with other equipment such as tablets, mobile phones and the like.

The attacker's objective would be to record the words and text typed by the victim. This puts privacy at risk, and it could even expose the passwords that person enters when logging into a service.

Three stages of recording keystrokes

The researchers report that the attack requires three stages. Each has its own function, and the end result is a reconstruction of what the victim typed on their keyboard. Here is what each of these stages consists of.

Pre-processing: in this stage, the background of the video is removed, the video is converted to grayscale, and both arms are segmented relative to the person’s face.
Keystroke detection: this stage takes the segmented arm frames, quantifies the movement in them, and works out where the keystrokes occurred.
Word prediction: the last stage predicts the words that were typed. It extracts features of the movements before and after each keystroke and uses a dictionary-based algorithm to predict the words.
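
To make the three stages more concrete, here is a minimal Python sketch of the general idea. It is not the researchers' implementation: every function name, the frame-differencing motion measure, the threshold, and the left-hand/right-hand simplification used for word prediction are assumptions made purely for illustration. Background removal and arm segmentation are left out entirely.

```python
from typing import List, Sequence, Tuple

Frame = List[List[float]]  # grayscale image: rows of pixel intensities


def to_grayscale(rgb_frame: Sequence[Sequence[Tuple[float, float, float]]]) -> Frame:
    """Pre-processing (partial): convert (r, g, b) pixels to luminance.

    Uses the common ITU-R BT.601 weights. The real attack also removes the
    background and segments the arms, which this sketch does not attempt.
    """
    return [[0.299 * r + 0.587 * g + 0.114 * b for (r, g, b) in row]
            for row in rgb_frame]


def motion_energy(prev: Frame, curr: Frame) -> float:
    """Quantify movement as the mean absolute pixel difference (an assumption)."""
    total, count = 0.0, 0
    for row_prev, row_curr in zip(prev, curr):
        for p, c in zip(row_prev, row_curr):
            total += abs(c - p)
            count += 1
    return total / count if count else 0.0


def detect_keystrokes(energies: Sequence[float], threshold: float) -> List[int]:
    """Keystroke detection: indices of local peaks above a hypothetical threshold."""
    hits = []
    for i in range(1, len(energies) - 1):
        is_peak = energies[i] >= energies[i - 1] and energies[i] > energies[i + 1]
        if is_peak and energies[i] > threshold:
            hits.append(i)
    return hits


# Assumption: a touch typist presses these keys with the left hand.
LEFT_HAND_KEYS = set("qwertasdfgzxcvb")


def hand_pattern(word: str) -> str:
    """Encode a word as the sequence of hands (L or R) that would type it."""
    return "".join("L" if ch in LEFT_HAND_KEYS else "R" for ch in word.lower())


def predict_words(observed_pattern: str, dictionary: Sequence[str]) -> List[str]:
    """Word prediction: keep dictionary words matching the observed hand pattern."""
    return [word for word in dictionary if hand_pattern(word) == observed_pattern]


if __name__ == "__main__":
    # A made-up motion signal with two clear bursts of arm movement.
    signal = [0.1, 0.2, 5.0, 0.3, 0.1, 4.2, 0.2]
    print(detect_keystrokes(signal, threshold=1.0))

    # Three keystrokes observed as left, right, left.
    print(predict_words("LRL", ["cat", "dog", "him", "sit"]))
```

In this toy example the pattern "LRL" matches both "dog" and "sit". That kind of ambiguity is one plausible reason a dictionary-based approach would do better on ordinary words than on passwords, which rarely appear in any dictionary.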

The researchers ran tests with various groups of people using platforms such as Zoom, Hangouts and Skype. They found that the attack did not perform equally well with all webcams, and that accuracy also varied with the kind of text being typed. They correctly recovered 91.1% of usernames, 95.6% of email addresses and 66.7% of typed website addresses. For passwords, however, they were far less successful: 18.9% of the total.