Innosilicon Fantasy I: architecture and features

When we talk about PC GPUs, we usually name three American companies: AMD, NVIDIA and, to a lesser extent, Intel. But what if we told you that graphics cards are appearing in China whose GPUs use British technology while being assembled and manufactured in China? In this article we describe the architecture of the Innosilicon Fantasy I.

Talking about Imagination’s PowerVR architectures is almost like talking about a Greek tragedy. Across its generations we have seen it in systems as different as the SEGA Dreamcast, ST Micro’s KYRO graphics cards and even the PlayStation Vita. What do they have in common? They were all commercial failures despite the high quality of their GPUs. Imagination was, however, lucky enough to supply the graphics architecture for the processors in Apple devices, until Apple decided to go its own way and “design” its own graphics architecture for a while.

The period of disagreement between Apple and Imagination once again pushed the British company to license its graphics architectures to third parties. Today, if we look at the landscape in both smart devices and the PC world, Imagination and its PowerVR seem to have disappeared.

Other players have taken advantage of its absence in the Android world, such as ARM with Mali and Qualcomm with Adreno. This has pushed Imagination into other markets, and it was the Chinese manufacturer Innosilicon, known for its mining ASICs, that recently presented the Fantasy I. It is the first graphics card based on a PowerVR design since the Kyro cards of the early 2000s, but can it compete against NVIDIA and AMD in the PC space?

What is tiled rendering?

In the late 1990s, graphics card designers all had to contend with the same performance problem: a lack of memory bandwidth. Graphics processors were very simple compared to those of today. The first part of the 3D pipeline, everything before rasterization, was calculated by the CPU. The second part was carried out by the graphics card, and it required more bandwidth than the memory of the time could provide without costs skyrocketing.

The solution Imagination proposed was tile-based rendering, which remains the basis of its architecture. Even in the Fantasy I, once the GPU itself has calculated the geometry, there are additional stages compared to a conventional GPU. Just before rasterization, a tile renderer sorts the geometry in RAM according to its position in the scene, creating an individual display list for each tile. It then resolves the tiles one by one during rendering.
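To make the binning step more concrete, here is a minimal sketch in Python of how geometry could be sorted into per-tile display lists. It is only an illustration: the function name `bin_triangles`, the 32-pixel tile size and the bounding-box overlap test are assumptions made for this example, not details of Imagination's hardware.

```python
from collections import defaultdict

TILE = 32  # tile edge in pixels; an illustrative value, not a PowerVR figure


def bin_triangles(triangles, width, height, tile=TILE):
    """Build one display list per tile from screen-space triangles.

    Each triangle is a tuple of three (x, y) vertices. A triangle is added
    to every tile that its bounding box overlaps, which is a conservative
    test: it may list a tile the triangle does not actually touch.
    """
    max_tx = (width - 1) // tile   # index of the last tile column
    max_ty = (height - 1) // tile  # index of the last tile row
    display_lists = defaultdict(list)

    for index, tri in enumerate(triangles):
        xs = [v[0] for v in tri]
        ys = [v[1] for v in tri]

        # Skip triangles that lie entirely off screen.
        if max(xs) < 0 or max(ys) < 0 or min(xs) >= width or min(ys) >= height:
            continue

        # Clamp the bounding box to the grid of tiles.
        tx0 = max(0, int(min(xs)) // tile)
        tx1 = min(max_tx, int(max(xs)) // tile)
        ty0 = max(0, int(min(ys)) // tile)
        ty1 = min(max_ty, int(max(ys)) // tile)

        for ty in range(ty0, ty1 + 1):
            for tx in range(tx0, tx1 + 1):
                display_lists[(tx, ty)].append(index)

    return dict(display_lists)


if __name__ == "__main__":
    tris = [((2, 2), (10, 2), (2, 10)), ((30, 30), (40, 30), (30, 40))]
    for tile_xy, tri_ids in sorted(bin_triangles(tris, 64, 64).items()):
        print(tile_xy, tri_ids)
```

Each tile can then be rendered using only the triangles in its own list, which is what allows the work on a tile to stay in on-chip memory.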

Advantages

Because each block or tile is small, it can be resolved in internal memory without having to access VRAM. This also makes the approach well suited to deferred rendering, which often uses multiple image buffers to calculate the lighting of the scene. Another advantage concerns ray tracing: since knowing the position of the elements in the scene is essential for generating the spatial data structure that ray tracing relies on, it is easier to implement ray tracing on this type of architecture.

Disadvantages

However, this approach has two drawbacks. The first is that it requires more complex hardware than a conventional GPU to achieve the same performance, so a chip of the same size will always deliver lower performance. The second is that high-speed memory such as GDDR or HBM eliminates its advantage in a gaming PC. That is why this type of architecture has become standard in handheld devices, where memory bandwidth is limited for power consumption reasons.

PowerVR B-Series, the graphics architecture of the Fantasy I

To understand the architecture of Innosilicon’s Fantasy I graphics cards, and incidentally what is inside Apple’s processors, we have to take a tour of Imagination’s current architecture. Although the C-Series, also known as Photon, was presented recently, for now the most advanced devices use Imagination’s B-Series.

The core of the B-Series

Each of these cores is organized as follows:

Four USC (Unified Shading Cluster) blocks, each with up to 128 FP32 ALUs, for a total of 512 per core. Since each ALU can execute a multiply and an add in a single clock cycle, a core can perform 1024 floating-point operations per clock cycle.
8 texture units, each capable of producing 4 texels per clock cycle, for a total of 32.
16 ROPs.
1 tessellation unit.
1 raster unit.
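The per-core totals in this list follow from simple multiplication. The short Python sketch below reproduces them; the constant and function names are made up for this example, and counting one multiply plus one add as two operations is the usual convention assumed here.

```python
# Per-core figures quoted in the list above.
USC_PER_CORE = 4        # USC blocks per core
ALUS_PER_USC = 128      # FP32 ALUs per USC (the "up to" maximum)
OPS_PER_ALU_CLOCK = 2   # one multiply + one add per clock (assumed convention)
TEXTURE_UNITS = 8
TEXELS_PER_UNIT = 4


def core_throughput():
    """Return the per-clock totals of a single core."""
    alus = USC_PER_CORE * ALUS_PER_USC
    return {
        "fp32_alus": alus,
        "fp32_ops_per_clock": alus * OPS_PER_ALU_CLOCK,
        "texels_per_clock": TEXTURE_UNITS * TEXELS_PER_UNIT,
    }


if __name__ == "__main__":
    print(core_throughput())
```

These are peak figures per clock cycle; actual throughput depends on the clock speed and on how well the workload keeps the ALUs busy.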

Each core is exclusively responsible for one tile or block of the screen, independently of the others, which is why each has its own raster and tessellation units. Each core also carries a small internal memory in which it resolves the image buffer, reducing the impact on external memory. However, this memory is used exclusively by the ROPs, and because of the huge texture maps used today, the GPU still has to access VRAM to fetch texture data.

Fantasy I, the first chiplet GPU

The great novelty of the Imagination B-Series used in the Fantasy I is that it is the first GPU made up of chiplets, that is, separate chips that work together as a single processor. To do this, the display list is sent to the first of the four chiplets that make up the GPU, while the other three are subordinate to it. It is a solution very similar to the one AMD has described in patents related to RDNA 3, and it will surely become common in GPUs of this type in the future.

However, this solution differs on one specific point: it uses tile-based rendering to perform a pre-rendering pass, so that several display lists are available from the beginning of the 3D pipeline rather than just before rasterization. The idea is to render the scene without shaders or textures of any kind, using the compute pipeline instead of the graphics pipeline. This makes it possible to organize multiple command lists instead of a single one, which in turn lets the pre-rendering pass exploit the large number of cores. The process is carried out automatically once the command processor of the first chiplet has read the display list.

This gives us several display lists for the same scene, which can be distributed among the different cores. With a configuration of 2 chiplets, each one handles half of the screen; with 4, each handles a quarter.
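As a rough illustration of that division of labor, the Python sketch below splits a screen into one region per chiplet. The exact layout is an assumption: the text only says halves and quarters, so the side-by-side split for 2 chiplets and the 2x2 grid for 4 are choices made for this example, as is the function name `split_screen`.

```python
def split_screen(width, height, chiplets):
    """Return one (x, y, w, h) screen region per chiplet.

    Assumed layouts: 2 chiplets split the screen into left and right halves,
    4 chiplets into a 2x2 grid. Width and height are assumed to divide
    evenly by the number of columns and rows.
    """
    layouts = {1: (1, 1), 2: (2, 1), 4: (2, 2)}  # chiplets -> (columns, rows)
    if chiplets not in layouts:
        raise ValueError("this sketch only handles 1, 2 or 4 chiplets")

    cols, rows = layouts[chiplets]
    w, h = width // cols, height // rows
    return [(c * w, r * h, w, h) for r in range(rows) for c in range(cols)]


if __name__ == "__main__":
    for count in (2, 4):
        print(count, split_screen(1920, 1080, count))
```

A real scheduler would likely balance regions by workload rather than by area, since one part of the screen can contain far more geometry than another.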

What has Innosilicon brought to its graphics card?

However, Imagination has not done all the work: Innosilicon designed the rest of the graphics card, including the PCB, and chose the remaining components. What stands out most is the use of GDDR6 or GDDR6X memory depending on the model, support for DisplayPort 1.4 and HDMI 2.1, and above all its Innolink technology, which was designed to interconnect the four chiplets that make up the GPU.

Specifically, there are two variants. The so-called Type A reaches 5 TFLOPS in FP32 and has a 128-bit memory interface to GDDR6X VRAM at 19 Gbps, for a bandwidth of 304 GB/s. Type B cards, on the other hand, have two complete GPUs and are therefore made up of 8 chiplets in total, doubling those numbers.
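The bandwidth figure can be checked with a line of arithmetic: bus width in bits times the per-pin data rate, divided by 8 bits per byte. The Python sketch below does that and also estimates the clock speed the 5 TFLOPS figure would imply. That second number is only a back-of-the-envelope estimate, and it rests on an assumption not confirmed in the text: that each of the four chiplets contributes one core doing 1024 operations per clock. The function names are made up for this example.

```python
def memory_bandwidth_gb_s(bus_width_bits, data_rate_gbps):
    """Peak bandwidth in GB/s: bus width (bits) x per-pin rate (Gbps) / 8."""
    return bus_width_bits * data_rate_gbps / 8


def implied_clock_ghz(tflops, cores, ops_per_clock_per_core):
    """Clock speed in GHz that a peak TFLOPS figure would imply."""
    ops_per_clock = cores * ops_per_clock_per_core
    return tflops * 1e12 / ops_per_clock / 1e9


if __name__ == "__main__":
    # Type A: 128-bit GDDR6X at 19 Gbps.
    print(memory_bandwidth_gb_s(128, 19), "GB/s")

    # Assumption: 4 chiplets, one 1024 ops/clock core each.
    print(round(implied_clock_ghz(5, 4, 1024), 2), "GHz (estimate)")
```

Under that assumption the clock works out to roughly 1.2 GHz, which should be read as a sanity check on the quoted numbers rather than as a specification.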

The Innosilicon Fantasy I is not for your PC

The reality is that you will not be able to buy Innosilicon’s Fantasy I graphics cards for your gaming PC, nor would you want to. Imagination designs its architectures for handheld devices, where neither Windows nor DirectX is dominant, so on the PC we find a series of shortcomings. It makes no sense to add functionality to your hardware that your client is not going to use, and the biggest client for these GPUs, albeit covertly, is Apple, specifically with its Metal API.

Ironically, PowerVR is so closely tied to Metal, the API used in iOS, macOS and the rest of Apple’s operating systems, that Tim Cook’s company ended up signing an agreement with Imagination so that it could continue developing the GPUs integrated into its processors. So what is inside the current Apple A15, the M1 and its Pro and Max variants is a PowerVR. The flip side is that Apple has fostered the general idea that it is so all-powerful that it can create all the hardware in a system and compete for resources against the whole world. The reality is very different.

It may be surprising that a GPU made up of 4 chiplets reaches only 5 TFLOPS when entry-level PC cards already reach that figure, but we must bear in mind that this is a design intended for mobile processors and scaled up with cloud computing as its goal, not use in a gaming PC.

Designed for data centers and cloud computing

Let’s not forget that servers commonly use several processors, and that more and more servers are based on smartphone-class processors. Nor can we forget the trend of virtualizing a graphics card in the cloud for several clients. By its nature, the Fantasy I does not require virtualization, since each of its chiplets can work as a small GPU.

So we have an architecture that derives from mobile devices and scales up to data centers without passing through the PC. This means it lacks a series of features that are essential for PC games today. That is why, even though the Fantasy I may look like a gaming GPU, or may not look serious with those colors, these cards really are meant for cloud computing, even if this is a first generation. Are we facing a future in which the graphics card is not in the hands of the user, but in the server?

In any case, China, as a rival superpower to the United States, needs to be fully independent from a technological point of view, and this means creating its own solutions apart from the classic ones from NVIDIA, Intel and AMD, which, remember, are American companies.