TWAIN, the standard for document capture and digitization


Most of us have used a scanner to digitize a document at some point, and practically all of us have taken pictures with a mobile phone. Both processes are related to a standard called TWAIN, which has been around since 1992 and is supported across the major operating systems and architectures. What is it and what does it consist of?

The name TWAIN is popularly expanded as "Technology Without An Interesting Name", but what it actually refers to is the standard for imaging and scanning documents. A large number of manufacturers of scanners and digital cameras, along with the makers of the software used to manipulate those documents, take part in it. We explain how it works and where this technology comes from.

What is an image capture device?

To understand the TWAIN standard, we first have to define what an image capture device is: a device that captures a snapshot of a specific moment in the real world in digital format. The definition seems simple, but it hides a fairly complex process made up of two parts. The first is the capture of the snapshot; the second is its transformation into digital signals that a processor can manipulate.

If we start with the snapshot, the first thing that comes to mind is a photo. Today photographs are fully digital, because the CCD, the device that captures the image, already generates a digital version of it, which is then processed by a type of processor called an ISP. The veterans among us, and those who work today digitizing X-rays, will have used a scanner more than once: before the advent of digital photography, there was no option other than one of these image capture devices to digitize our photographs.

Data encoding matters too, since every system that has to work with images needs to understand the same code. The problem is that for a long time each device spoke a different language, that is, each scanner digitized images in its own way.

What is the TWAIN standard?


It can be said that the TWAIN standard is the counterpart to PostScript. This may sound confusing at first, but PostScript is a standard that describes how a printer has to print a document, and the TWAIN standard does the same for scanners, so it works in the opposite direction. That is, instead of generating a printed document from a digital file, it generates a digital file from a printed document.

And what carries out the conversion process? In principle, we could imagine special hardware responsible for creating the digital copy, which would make sense. In reality, a scanner does not take a complete snapshot at once the way a digital camera does, since it processes the image in parts, although both share a common element: a CCD sensor. Because images were captured slowly and sent over a parallel port that was itself slow compared to memory, the TWAIN standard has not required special processors to digitize images since its inception in 1992. The CPU was enough.
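To make the idea of capturing "in parts" concrete, here is a minimal sketch of a host CPU assembling an image from the strips a scanner sends. It is not TWAIN code: `scan_strips`, `assemble`, and the strip height are hypothetical names and values chosen only for illustration.

```python
from typing import Iterator, List

Row = List[int]  # one scanline of 8-bit grayscale samples


def scan_strips(page: List[Row], strip_height: int) -> Iterator[List[Row]]:
    """Hypothetical scanner: hands over the page a few rows at a time."""
    for top in range(0, len(page), strip_height):
        yield page[top:top + strip_height]


def assemble(strips: Iterator[List[Row]]) -> List[Row]:
    """Host side: the CPU appends each strip to the image as it arrives."""
    image: List[Row] = []
    for strip in strips:
        image.extend(strip)
    return image


# A made-up 10-row "page" standing in for what the sensor sees.
page = [[(x + y) % 256 for x in range(8)] for y in range(10)]
image = assemble(scan_strips(page, strip_height=4))
```

Since the strips arrive slowly, the work on the host is little more than copying rows into place, which is consistent with the point above that a general-purpose CPU was sufficient.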

The other point in common with PostScript is that digitization uses a common encoding that any program compliant with the TWAIN standard can understand.

Is TWAIN a type of program?


If by program we mean an executable that the user launches and controls, then the answer is no. But not all system software is meant for the user, and TWAIN falls into that group.

In many places you will read that TWAIN is the scanner driver, but it is rather a function within it. The driver simply allows applications to communicate, through the operating system, with the peripheral in question. Just as printers rely on PostScript to describe a document in a language that most printers understand, something similar happens here. TWAIN is not a standalone program, nor is it the driver itself; rather, it is a software component used exclusively by the driver of the image capture device.
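The layering described above can be sketched as follows. This is only an illustration of the idea of a shared interface sitting in front of vendor-specific drivers; `CaptureSource`, `VendorAScanner`, `VendorBScanner`, and `application_scan` are hypothetical names and do not correspond to the actual TWAIN API.

```python
from abc import ABC, abstractmethod
from typing import List

Image = List[List[int]]  # rows of 8-bit grayscale samples


class CaptureSource(ABC):
    """Hypothetical common interface that every driver agrees to expose."""

    @abstractmethod
    def acquire(self) -> Image:
        ...


class VendorAScanner(CaptureSource):
    def acquire(self) -> Image:
        # Assumption: this device returns a flat run of raw bytes.
        raw = bytes([0, 64, 128, 255])
        width = 2
        return [list(raw[i:i + width]) for i in range(0, len(raw), width)]


class VendorBScanner(CaptureSource):
    def acquire(self) -> Image:
        # Assumption: this device reports samples as floats from 0.0 to 1.0.
        raw = [[0.0, 0.2], [1.0, 0.6]]
        return [[round(v * 255) for v in row] for row in raw]


def application_scan(source: CaptureSource) -> Image:
    """The application only knows the shared interface, never the vendor."""
    return source.acquire()


image_a = application_scan(VendorAScanner())
image_b = application_scan(VendorBScanner())
```

Each driver hides its own native format, so the application receives the same kind of result regardless of which device produced it.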

Some more modern scanners, thanks to I/O interfaces much faster than those available when the standard was created and to tighter hardware integration, already include an ISP that carries out the digitization process internally, stores the result in a buffer or temporary memory, and transmits the image to our computer. How do they do that? With the same hardware as digital cameras and mobile phones.

Its importance in digital cameras


Today the digital camera par excellence is the phone we carry in our pockets. Before it came dedicated digital cameras, and long before those came film cameras, which we will not cover here for obvious reasons.

The hardware that makes a digital camera possible is the same as that of a scanner, that is, a CCD and the entire digitization system. One difference, and one of the advantages of digital photography, is the inclusion of an LCD screen that lets us see the result of our photographs. The digitization process inside the camera follows the TWAIN standard, and the reason is that image manipulation programs have to be able to work with the results.

This digitization work is carried out by the ISP, or Image Signal Processor. Over time, ISPs have been integrated into the processors of mobile phones, where they take up little space and today benefit from neural networks that improve the quality of captured photos, as well as from specialized accelerators that generate images in specific formats without involving the CPU.