PlayStation 5 Pro hardware leak shows GPU is RDNA 3/4 hybrid design with major improvements

Following rumors that Sony will unveil its own upscaling technology in the upcoming PlayStation 5 Pro, we now have a wealth of information about what the new console's hardware will look like. The CPU will still use the same Zen 2 cores, paired with slightly faster RAM. The GPU, however, will be a completely new design, and it looks like a funky blend of RDNA 3 and RDNA 4.

The leak in question comes from Insider Gaming, and while the claims are unsubstantiated, the details seem credible enough to me. For example, the PS5 Pro's custom AMD APU is said to use the same Zen 2 CPU architecture as the original PlayStation 5. That means it still has 8 cores and 16 threads, but that's fine for most games.

It will not run any faster by default, but there is apparently an additional operating mode that feeds the CPU more power for a 10% increase in clock speed. In this mode, the GPU's power budget is reduced, though its performance reportedly drops by only 1%. Why stick with the same CPU? The answer is simple: backwards compatibility. All PS5 games, and older games running on the PS5 platform, are designed for a CPU running at 3.5 GHz at best. Any major change to this could screw up a lot of things.

The same goes for system RAM. Every PS5 has 16GB of moderately fast GDDR6, shared between the CPU and GPU. Running at 14 Gbps on a 256-bit aggregate memory bus, it delivers a fairly modest 448 GB/s of total bandwidth. The PS5 Pro is said to have 576 GB/s of bandwidth and, assuming the bus width is unchanged, that equates to a speed of 18 Gbps.

To put these numbers in perspective, the Radeon RX 6800 and RX 7800 XT both have 256-bit buses, but the former uses 16 Gbps GDDR6, while the latter's memory runs at 19.5 Gbps, providing 624 GB/s of bandwidth. That makes the PS5 Pro look especially good on paper, as the 7800 XT is no slouch, but that GPU also has 64 MB of Level 3 cache (aka Infinity Cache), which takes some of the load off the VRAM.
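The bandwidth figures above all follow from one piece of arithmetic: bus width in bytes multiplied by the per-pin data rate. A minimal Python sketch of that calculation is shown here; the function names are purely illustrative, and the PS5 Pro line assumes an unchanged 256-bit bus, which the leak does not confirm.

```python
def bandwidth_gb_per_s(bus_width_bits: int, rate_gbps: float) -> float:
    """Peak memory bandwidth in GB/s for a given bus width and per-pin rate."""
    # bus_width_bits / 8 is the number of bytes moved per transfer.
    return bus_width_bits / 8 * rate_gbps


def implied_rate_gbps(bus_width_bits: int, bandwidth: float) -> float:
    """Per-pin data rate (Gbps) implied by a peak bandwidth in GB/s."""
    return bandwidth * 8 / bus_width_bits


if __name__ == "__main__":
    print("PS5:        ", bandwidth_gb_per_s(256, 14.0), "GB/s")
    print("RX 6800:    ", bandwidth_gb_per_s(256, 16.0), "GB/s")
    print("RX 7800 XT: ", bandwidth_gb_per_s(256, 19.5), "GB/s")
    # Assumption: the PS5 Pro keeps the PS5's 256-bit bus.
    print("PS5 Pro rate:", implied_rate_gbps(256, 576.0), "Gbps")
```

If the bus were wider than 256 bits, the same 576 GB/s would imply a lower data rate, so the 18 Gbps figure is only as good as that assumption.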

There is no indication that the PS5 Pro's APU will have an Infinity Cache, as AMD and Sony need to keep the chip as small as possible to hold manufacturing costs down. And since the CPU and GPU share the same pool of RAM, that 576 GB/s of bandwidth will come under heavy pressure. This is especially true given the claimed changes to the graphics processor.

In the original PS5, the GPU is close to an RDNA 2 design, with 36 CUs (compute units) grouped in pairs into 18 WGPs (workgroup processors). Inside each CU are two banks of 32 ALUs that handle all shader work. The PS5 Pro rumors claim that the new GPU will have 30 WGPs, the same as the RX 7800 XT, with a peak FP32 throughput of 33.5 TFLOPS.

If both of these numbers are correct, the only way to reach them is with CUs based on the RDNA 3 architecture, i.e., with two banks of 64 ALUs each. A quick back-of-the-envelope calculation gives a boost clock of about 2,180 MHz, which is lower than the PS5's, but the additional shaders more than make up for it.
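That back-of-the-envelope calculation can be sketched in a few lines of Python. It assumes each ALU retires one fused multiply-add (two FLOPs) per clock, which is the usual convention for peak FP32 figures, and it takes the leaked 30 WGPs to mean 60 CUs. The PS5's 2.23 GHz peak GPU clock is Sony's published figure rather than something from the leak, and the function names are illustrative only.

```python
FLOPS_PER_ALU_PER_CLOCK = 2  # one fused multiply-add counts as two FLOPs


def fp32_tflops(cus: int, alus_per_cu: int, clock_ghz: float) -> float:
    """Peak FP32 throughput in TFLOPS."""
    return cus * alus_per_cu * FLOPS_PER_ALU_PER_CLOCK * clock_ghz / 1000


def implied_clock_mhz(tflops: float, cus: int, alus_per_cu: int) -> float:
    """Clock speed (MHz) needed to reach a given peak FP32 figure."""
    return tflops * 1e6 / (cus * alus_per_cu * FLOPS_PER_ALU_PER_CLOCK)


if __name__ == "__main__":
    # Original PS5: 36 CUs, 2 x 32 ALUs per CU, 2.23 GHz peak clock.
    print(f"PS5: {fp32_tflops(36, 64, 2.23):.2f} TFLOPS")
    # Rumored PS5 Pro: 60 CUs and 33.5 TFLOPS.
    print(f"RDNA 3-style CU (128 ALUs): {implied_clock_mhz(33.5, 60, 128):.0f} MHz")
    print(f"RDNA 2-style CU (64 ALUs):  {implied_clock_mhz(33.5, 60, 64):.0f} MHz")
```

With RDNA 2-style CUs the same throughput would need a clock above 4.3 GHz, which is why the 128-ALU layout is the only plausible reading of the two leaked numbers.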

However, the ray accelerators in the CUs reportedly support BVH8 traversal shaders. A BVH (bounding volume hierarchy) is a data structure used to speed up working out which objects a ray intersects. In RDNA 3, the shader works with four BVH children (aka BVH4) per node.

Increasing this to eight means that, in theory, the PS5 Pro GPU will be able to handle traversal processing much faster than before. It is not as simple as "the number doubled, so the performance doubled", but a substantial speed-up is at least possible. While there is no indication that AMD has moved these operations from the CU to a dedicated hardware unit (as Nvidia does with its GPUs), the traversal shader change tells us that the new GPU is an RDNA 3/4 hybrid rather than a pure RDNA 3.
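To see why doubling the branching factor does not simply double performance, here is a rough Python sketch that counts how many levels a ray descends in a BVH4 versus a BVH8. It is an idealized model, not a description of AMD's hardware: the tree is assumed to be perfectly balanced with one triangle per leaf, the ray is assumed to follow a single root-to-leaf path, and the one-million-triangle scene is an arbitrary example.

```python
def levels_needed(num_leaves: int, branching: int) -> int:
    """Levels of a full, balanced tree needed to hold num_leaves leaves."""
    levels = 0
    capacity = 1
    while capacity < num_leaves:
        capacity *= branching
        levels += 1
    return levels


def box_tests_on_one_path(num_leaves: int, branching: int) -> int:
    """Child bounding boxes tested along a single root-to-leaf descent."""
    return levels_needed(num_leaves, branching) * branching


if __name__ == "__main__":
    triangles = 1_000_000  # arbitrary example scene size
    for width in (4, 8):
        print(
            f"BVH{width}: {levels_needed(triangles, width)} levels, "
            f"{box_tests_on_one_path(triangles, width)} box tests per path"
        )
```

In this toy model the BVH8 needs fewer node visits (7 instead of 10) but more box tests at each one, so the real gain depends on how many boxes the hardware can test per step and how much memory latency each avoided node visit saves.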

It is unlikely to be a full RDNA 4 design, since it apparently lacks an Infinity Cache, but if all the numbers are to be believed, it is closer to RDNA 4 than 3. The clue is the AI performance: the PS5 Pro chip is said to reach 300 TOPS in INT8 mode and 67 TFLOPS in FP16 mode. For reference, the RDNA 3-based Radeon RX 7800 XT can perform 512 INT8/FP16 operations per clock per CU on its AI accelerators, which means peaks of 75 TOPS and 75 TFLOPS, respectively. By comparison, the peak INT8 throughput of the Tensor cores in Nvidia's GeForce RTX 3090 Ti is 320 TOPS.
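The same kind of arithmetic shows how far the leaked AI figures sit from RDNA 3. The Python sketch below works out the per-CU rate each figure implies. It assumes 60 CUs for both chips, the RX 7800 XT's advertised 2.43 GHz boost clock (AMD's figure, not from the leak), and the roughly 2.18 GHz PS5 Pro clock derived earlier from the FP32 number, so treat the results as estimates; the function names are illustrative.

```python
def peak_tera_ops(cus: int, ops_per_clock_per_cu: int, clock_ghz: float) -> float:
    """Peak throughput in TOPS (or TFLOPS) for a given per-CU rate."""
    return cus * ops_per_clock_per_cu * clock_ghz / 1000


def implied_ops_per_clock_per_cu(tera_ops: float, cus: int, clock_ghz: float) -> float:
    """Per-CU, per-clock operation count implied by a peak throughput figure."""
    return tera_ops * 1000 / (cus * clock_ghz)


if __name__ == "__main__":
    # RX 7800 XT: 60 CUs, 512 ops/clock/CU, 2.43 GHz boost (assumed).
    print(f"RX 7800 XT: {peak_tera_ops(60, 512, 2.43):.1f} TOPS")
    # Rumored PS5 Pro: 60 CUs at an assumed 2.18 GHz.
    print(f"PS5 Pro FP16: {implied_ops_per_clock_per_cu(67, 60, 2.18):.0f} ops/clock/CU")
    print(f"PS5 Pro INT8: {implied_ops_per_clock_per_cu(300, 60, 2.18):.0f} ops/clock/CU")
```

Under these assumptions the FP16 figure lines up with RDNA 3's 512 operations per clock per CU, while the INT8 figure implies roughly 4.5 times that rate, which is the part that points beyond RDNA 3.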

What would all that AI throughput be used for? Upscaling, or any other task where machine learning can be applied.

At this stage, it is all ifs and buts, and there is absolutely no guarantee that this is true. However, machine learning and ray tracing performance have been RDNA's Achilles heel in the past, so it makes a lot of sense for AMD to significantly improve them in RDNA 4, given how important machine learning workloads are these days.

With a perfectly good PS5 already in use at my house and numerous gaming PCs, I doubt I'll be buying a PlayStation 5 Pro when it comes out, but from a tech writer's perspective, it looks pretty promising.

And from a GPU enthusiast's perspective, RDNA 4 looks very interesting. It may not top the performance charts, but it looks to have all the features you want.
