
Why did Sony stay with 36CUs for the PS5?

Bitmap Frogs

Mr. Community
Really, that's just weird. How come MS does not do that and still has better backwards compatibility?

I get the idea that Microsoft, being how they are, mandated that everything go through the DirectX APIs with no low-level access, which means BC is way, way easier to get going. That's how it's done on PC. Sony probably followed the old tradition of letting devs do dirty, tricky things that make BC more challenging. Keep in mind that, so far, every time Sony has had BC it was because they shipped the original silicon in the new console.
 

SF Kosmo

Al Jazeera Special Reporter
I think we were all surprised that Sony stuck with the same number of compute units in the PS5 as they did in the PS4 Pro.
It has been said that it was to help with back compat for PS4 games, but I'm not 100% sold on that alone.
I'm not a dev, so could there be other reasons, such as making it easier to develop for by keeping the same number of CUs?
Like, how much would you have to change in the dev tools by increasing the number of CUs?
Would keeping the same 36 CUs mean their internal studios' engines would be quicker to adapt to the PS5?

Or was it simply that, like with the PS4 -> Pro, they would have needed either 36 CUs or a butterfly GPU with 72 CUs, and as 72 was unfeasible it had to be 36?
They prioritized clock speed over CU count, believing it has performance benefits above and beyond what the raw TFLOP number suggests.

PS5's approach to balancing power, compute, and thermals is pretty interesting. I'm not knowledgeable enough to say if it'll pay off but I'm interested to see how it pans out.
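For reference, the paper TFLOP figures everyone quotes fall straight out of CU count × clock. A rough back-of-the-envelope sketch in Python, using the publicly stated CU counts and clocks and the usual 64 shaders / 2 FLOPs-per-clock per CU, shows where both consoles' numbers come from and what a wider-but-slower configuration (like the 48-CU case Cerny mentioned as a thought experiment) would have needed:

```python
# Peak FP32 throughput for an RDNA-style GPU:
#   TFLOPs = CUs * 64 shaders/CU * 2 FLOPs/clock * clock (GHz) / 1000
def peak_tflops(cus, clock_ghz):
    return cus * 64 * 2 * clock_ghz / 1000

print(peak_tflops(36, 2.23))   # PS5:  ~10.28 TF (36 CUs @ up to 2.23 GHz)
print(peak_tflops(52, 1.825))  # XSX:  ~12.15 TF (52 CUs @ a fixed 1.825 GHz)

# The 48-CU thought experiment: the clock 48 CUs would need
# to hit the same ~10.28 TF (purely illustrative).
target = peak_tflops(36, 2.23)
print(target / (48 * 64 * 2 / 1000))   # ~1.67 GHz
```

Same formula, different CU/clock mixes; which mix is better in practice is exactly what the rest of the thread argues about.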
 

Neo_game

Member
Sony said they could have had 48 CUs with a lower clock speed and still hit the 10 TF mark, but they were able to achieve that with 36 CUs, so they saved some money for other customizations they maybe wanted to do, like the hardware I/O? As long as thermals and noise are in check, which according to them are much better compared to the PS4 Pro. Maybe it was not the worst thing to do, but time will tell.
 

Entroyp

Member
Sony said they could have had 48 CUs with a lower clock speed and still hit the 10 TF mark, but they were able to achieve that with 36 CUs, so they saved some money for other customizations they maybe wanted to do, like the hardware I/O? As long as thermals and noise are in check, which according to them are much better compared to the PS4 Pro. Maybe it was not the worst thing to do, but time will tell.

That’s not what Cerny said. The 48 CU comparison was just to illustrate higher clock vs more CUs. Cerny even said it was a thought experiment and not to take it seriously.
 

Cacadookie

Neo Member
With the compute units that the Xbox Series X has, you can run a game at the same render target as the PS5 and still have a little extra headroom to push GPGPU physics on the leftover shaders, no? Like maybe run some machine-learned physics on the spare 1.8-2 TF.
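Rough numbers for that "spare" compute, using the two consoles' quoted peak figures and the simplifying assumption that the whole TFLOP gap is actually free for async GPGPU work (in practice it wouldn't be):

```python
# Quoted peak figures; treating the entire gap as free headroom is a simplification.
ps5_tf = 10.28
xsx_tf = 12.15

headroom_tf = xsx_tf - ps5_tf
print(headroom_tf)                       # ~1.87 TF, the "spare 1.8-2 TF"

# At 60 fps, that headroom corresponds to roughly this much GPU time per frame:
frame_ms = 1000 / 60
print(frame_ms * headroom_tf / xsx_tf)   # ~2.6 ms/frame for physics/ML work
```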
 

onQ123

Member
With the compute units that the Xbox Series X has, you can run a game at the same render target as the PS5 and still have a little extra headroom to push GPGPU physics on the leftover shaders, no? Like maybe run some machine-learned physics on the spare 1.8-2 TF.


The only problem with that is that the PS5 will have better geometry rendering, pixel fill rate, and some other advantages, so it's not as simple as the Xbox SX being able to run the game at the same render target as the PS5 while having headroom.
 
where the fuck did you get that from? that's simply not the case... I am really confused where this is coming from tho. first time hearing that tbh and not quite sure what it's supposed to mean either.
I'd like to read about that if you don't mind.

I misread it, guys. I got it from Mark Cerny's quote below, but I believe he was just saying that not all CUs are the same, and that the PS5 CUs are larger than before. As another poster has already pointed out above, the CUs in both systems are of RDNA 2 size.

“We’ve built a GPU with 36 CUs,” Mark Cerny says during a PS5 tech deep dive. “Mind you, RDNA 2 CUs are large, each has 62% more transistors than the CUs that we were using on PlayStation 4. So if we compare transistor counts, 36 RDNA 2 CUs, equates to roughly 58 PS4 CUs. It is a fairly sizeable GPU. Then we went with the variable frequency strategy, which is to say we continuously run the CPU and GPU in boost mode."
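The "roughly 58 PS4 CUs" figure in that quote is just the 62% transistor scaling applied to the CU count, as a quick sanity check shows:

```python
# Cerny: each RDNA 2 CU has ~62% more transistors than a PS4 (GCN) CU,
# so by transistor count 36 RDNA 2 CUs are "worth" roughly 58 PS4 CUs.
ps5_cus = 36
transistor_ratio = 1.62            # +62% transistors per CU, from the quote
print(ps5_cus * transistor_ratio)  # ~58.3
```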
 

Xplainin

Banned
Sony said they could have had 48 CUs with a lower clock speed and still hit the 10 TF mark, but they were able to achieve that with 36 CUs, so they saved some money for other customizations they maybe wanted to do, like the hardware I/O? As long as thermals and noise are in check, which according to them are much better compared to the PS4 Pro. Maybe it was not the worst thing to do, but time will tell.
Obviously the 36CU number is for a reason. Why not 40, or 42? A couple of extra CUs isn't going to cost that much more.
So my question is why 36?
I get possibly BC, but is there another reason, such as tools and game engines?
 
Obviously the 36CU number is for a reason. Why not 40, or 42? A couple of extra CUs isn't going to cost that much more.
So my question is why 36?
I get possibly BC, but is there another reason, such as tools and game engines?

The likely answer is that, with the rest of their on-die customization, that was the limit without moving up to a larger die size; even if it were just a matter of paying for 2 more CUs in yields, moving up to a whole new die size class was not a good cost/heat/performance balance. It's like asking why the PS3 didn't have 12 SPEs, or 1 GB of memory instead of wireless controllers... it's a deep, deep rabbit hole to start down.
 
I misread it, guys. I got it from Mark Cerny's quote below, but I believe he was just saying that not all CUs are the same, and that the PS5 CUs are larger than before. As another poster has already pointed out above, the CUs in both systems are of RDNA 2 size.

“We’ve built a GPU with 36 CUs,” Mark Cerny says during a PS5 tech deep dive. “Mind you, RDNA 2 CUs are large, each has 62% more transistors than the CUs that we were using on PlayStation 4. So if we compare transistor counts, 36 RDNA 2 CUs, equates to roughly 58 PS4 CUs. It is a fairly sizeable GPU. Then we went with the variable frequency strategy, which is to say we continuously run the CPU and GPU in boost mode."

Ok no problem.

I read that before from him. Just when you said compared to the XSX CUs I thought it was new information.

Happy that you clarified it.
 
You want me to link to where both PS5 and XSX are RDNA 2?
I misread it, guys. I got it from Mark Cerny's quote below, but I believe he was just saying that not all CUs are the same, and that the PS5 CUs are larger than before. As another poster has already pointed out above, the CUs in both systems are of RDNA 2 size.

“We’ve built a GPU with 36 CUs,” Mark Cerny says during a PS5 tech deep dive. “Mind you, RDNA 2 CUs are large, each has 62% more transistors than the CUs that we were using on PlayStation 4. So if we compare transistor counts, 36 RDNA 2 CUs, equates to roughly 58 PS4 CUs. It is a fairly sizeable GPU. Then we went with the variable frequency strategy, which is to say we continuously run the CPU and GPU in boost mode."

Apparently the PS5 has RDNA2 CUs. I know I'm stating the obvious but the CUs are only a part of the GPU.
 

Neo_game

Member
Obviously the 36CU number is for a reason. Why not 40, or 42? A couple of extra CUs isn't going to cost that much more.
So my question is why 36?
I get possibly BC, but is there another reason, such as tools and game engines?

It is to save money. The 5700 XT has 40 CUs, and the PS5 also has 40 CUs with 4 disabled. There was also a rumor that the PS5 was supposed to release in 2019 and that it is RDNA 1.5, etc.

That’s not what Cerny said. The 48 CU comparison was just to illustrate higher clock vs more CUs. Cerny even said it was a thought experiment and not to take it seriously.

I guess he was justifying his decision to go with 36 CUs, but I think it was clear their target was 10 TF and he wanted to emphasize that higher clock speed has its advantages. Something like Intel vs AMD, where Intel has fewer cores/threads but better performance in games due to single-core performance and clock speeds.
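On the yield point above (a 40-CU die with 4 CUs disabled): a rough sketch of why leaving spare CUs on the die helps. The per-CU defect probability below is a made-up illustrative number, not anything Sony or AMD have published:

```python
from math import comb

p_defect   = 0.02   # hypothetical probability a given CU is defective (illustrative only)
n_physical = 40     # CUs on the die, like Navi 10 / 5700 XT
n_needed   = 36     # CUs that must work for a sellable PS5 chip

# Probability that at least n_needed of n_physical CUs are defect-free.
def usable_fraction(n_physical, n_needed, p_defect):
    p_good = 1 - p_defect
    return sum(comb(n_physical, k) * p_good**k * p_defect**(n_physical - k)
               for k in range(n_needed, n_physical + 1))

print(usable_fraction(40, 40, p_defect))  # all 40 must be good:  ~0.45
print(usable_fraction(40, 36, p_defect))  # 4 spares allowed:     ~0.999
```

With those made-up numbers, requiring only 36 of 40 working CUs turns a chip you could sell less than half the time into one you can sell almost always, which is the cost argument for not squeezing in "just a couple more" active CUs.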
 
Let's be honest, you are going to sell what you have. I take that with a grain of salt, and the tests have shown better performance with more CUs and lower clocks.

Those tests were for PC games, which are designed to run on a wide range of hardware. The only thing you can say about that test is that when a PC game is designed to run on a wide range of GPU hardware, more CUs tend to increase performance more than frequency does. It says absolutely nothing about console games coded close to the metal and designed specifically to make use of the PS5's faster clocks, where all aspects of the GPU are sped up.

In the PC test, performance was likely left on the table in higher frequency GPUs simply because the graphics algorithms were designed to scale better with more CUs since that's the biggest differentiator in PC GPU capabilities.
 

93xfan

Banned
We don't really know if it's "better" yet.
MSFT PR is usually too optimistic.
Anyway, MSFT uses DirectX, which is a HW abstraction layer.
Sony allows low-level access; all the libraries are there just for convenience.

Considering all Sony does is sometimes give you the higher clocks, and they’ve stated nothing else, while MS has full BC, can do 2 more generations of BC, has Auto HDR, and always runs at full clocks, I’m going out on a limb and going to say the Xbox will have better BC.

Also, Sony’s method sometimes needs input from the developers.

I’ll admit, it’s entirely possible Digital Foundry brainwashed me into believing these facts count for something.
 

Amiga

Member
Really, that's just weird. How come MS does not do that and still has better backwards compatibility?
PlayStation uses a lower-level API; it gets more performance from the hardware but lowers compatibility with a different set of hardware, which is why Sony needs to run the PS5 hardware in a "PS4 mode". AFAIK the MS DX API is software based: it doesn't grant deep access to the hardware, but it can better transfer software to different hardware.

But when hardware advances significantly, software can emulate the deep API that was on older hardware, like the PS3 emulator that now exists.
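A toy sketch of why the abstraction layer matters for BC. This is purely conceptual (neither console's actual SDK looks like this): a game that only talks to an API can be carried forward by re-implementing that API on new silicon, while code that pokes chip-specific behaviour directly needs that exact behaviour reproduced or emulated.

```python
# Conceptual illustration only -- not a real console API.

class GraphicsAPI:
    """Abstraction layer (DirectX-style): the game only ever sees these calls."""
    def draw(self, mesh): ...
    def dispatch_compute(self, shader): ...

class OldGPUBackend(GraphicsAPI):
    def draw(self, mesh):
        print("drawing on old silicon")

class NewGPUBackend(GraphicsAPI):
    # BC comes "for free": the platform re-implements the same calls on new hardware.
    def draw(self, mesh):
        print("drawing on new silicon")

def well_behaved_game(api: GraphicsAPI):
    api.draw("level_geometry")   # runs unchanged on either backend

def to_the_metal_game(gpu_registers):
    # Depends on the timing/behaviour of one specific chip; a new console
    # needs the original silicon or a very accurate emulation of it.
    gpu_registers[0x1234] = 0xDEADBEEF

well_behaved_game(NewGPUBackend())   # old game, new hardware, no changes
```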
 

All_Might

Member
That is theoretically true, but once again, PC tests show that for modern engines that is simply not the case... they easily use all available cores of a GPU, which is why you see an RTX 2080 Ti outperform an RTX 2080 at the same clock speeds. If games didn't use the power of the additional hardware, the games would run the same, which they don't.



You run the same game on 2 GPUs, one running fewer active CUs at higher clocks and one running more CUs at lower clocks; both have the same peak TF performance. If high clocks benefited the game engine, the games would run better on the lower-CU, higher-clock GPU, but they don't.
What you are saying is true, but that is because the frequencies the PC parts ship at are already pretty near to being bottlenecked by one of the GPU's components. Let's take a look at the RX 5700 and RX 5700 XT, which overclock to over 2 GHz. Doing so only gives you a marginal performance increase. The further you push, the less of an increase you get; at some point performance even seems to decrease, and power does not scale anywhere close to linearly. Digital Foundry has a video on it. The main reason is that the architecture has reached a limiting factor, or bottleneck if you will. So for Sony to push to such a high frequency must mean that these bottlenecks have been addressed by AMD, and a limiting factor should only appear when pushing the hardware even further than that 2.23 GHz. Ideally the chosen frequency is the sweet spot for the architecture, and performance up to that point could increase nearly linearly. That is what Sony should have pushed for in the best-case scenario.
Regarding higher CU counts, I do agree that parallel workloads get increasingly harder to scale at higher CU counts, but nowhere close to as hard as on CPUs. The example you give between the RTX 2080 and the Ti is good, but that is with high-level hardware access. Consoles give low-level hardware access, meaning that as a developer you can pretty much write your code to that specific hardware and debug it with a PS5/XSX dev kit. The higher CU count MS has gone for should not hinder anyone from doing excellent work on it. In fact, the further we go into the next generation, the more both next-generation systems will be able to show their true potential.
So what I am trying to say is that neither Sony nor Microsoft should be at a point where their frequency or GPU size is bottlenecked by one of the sub-components, since that is a very inefficient design and any uplift in frequency or size costs them a lot of money.

As to why that frequency is so high? The reason is still unknown, but I assume they planned for specific workloads to run faster. Certain workloads benefit heavily from an increase in frequency and faster operations rather than parallel capability. I work on mobile devices, and those are a bit different, but the same principles apply. On smartphones, especially Android devices, clock speeds are never sustainable, meaning they throttle in order not to overheat the entire system. This downclock in frequency depends on many factors, ambient temperature being one example.
So our process is like this: we look at a certain device, which we agree is the baseline for our performance goal. Then we look at what its general performance is like, how much it throttles under load in bad conditions, and how well it handles specific scenes, especially harder ones, and then optimise our code to that performance level. This is why benchmarks or metrics like teraflops are pretty useless to me, since they do not give guaranteed performance indications on a frame-by-frame basis. They also measure a specific type of workload; different architectures give different results depending on the type of workload you are running.
The PlayStation 5 seems to run at that frequency almost all the time, which is a good thing, and unlike mobile devices it throttles based on a power budget in order to keep a certain temperature under any type of workload. If there are no bottlenecks inside the GPU architecture preventing the PS5 from running at such a high frequency, the benefits could be really high, but that has yet to be seen. I am looking forward to when we get to play on these new machines.
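A rough way to picture the "sweet spot" point: dynamic power scales roughly with voltage² × frequency, and voltage has to rise as you chase higher clocks, so power grows much faster than performance near the top of the curve. The voltage/frequency pairs below are made up purely to illustrate the shape of the curve, not measured PS5 or RDNA 2 data:

```python
# Illustrative only: hypothetical voltage/frequency points, not real silicon data.
# Dynamic power ~ C * V^2 * f; performance (at a fixed CU count) ~ f.
points = [   # (clock in GHz, hypothetical voltage needed to hold that clock)
    (1.8, 0.90),
    (2.0, 1.00),
    (2.2, 1.10),
    (2.4, 1.25),
]

base_clock, base_v = points[0]
base_power = base_v**2 * base_clock

for clock, v in points:
    perf_gain  = clock / base_clock - 1
    power_gain = (v**2 * clock) / base_power - 1
    print(f"{clock:.1f} GHz: +{perf_gain:5.1%} perf for +{power_gain:6.1%} power")
```

Each extra 200 MHz buys the same performance step but a bigger and bigger power step, which is one reason a fixed power budget (rather than a fixed clock) lets you sit close to the knee of that curve without blowing past it.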
 
Considering all Sony does is sometimes give you the higher clocks, and they’ve stated nothing else, while MS has full BC, can do 2 more generations of BC, has Auto HDR, and always runs at full clocks, I’m going out on a limb and going to say the Xbox will have better BC.

Also, Sony’s method sometimes needs input from the developers.

I’ll admit, it’s entirely possible Digital Foundry brainwashed me into believing these facts count for something.

You had me until you mentioned Digital Foundry.
 

Aladin

Member
No doubt, the PS5 has the better console architecture design. If they are unable to demonstrate significant fidelity benefits, they will lose the value war.
 

njean777

Member
No doubt, the PS5 has the better console architecture design. If they are unable to demonstrate significant fidelity benefits, they will lose the value war.

They have already lost the power war; the real question is whether MS will undercut them so that they also lose the value war at the same time. MS has way more money and can take a loss if they want, so it should be interesting to see whether they decide to do so or just keep on as they are.
 

bitbydeath

Member
Let's be honest, you are going to sell what you have. I take that with a grain of salt, and the tests have shown better performance with more CUs and lower clocks.

I am actually looking for other reasons. Like, if they stick with the same number of CUs, is there a benefit to their tools and engines?
Are Sony engines built around their GPU CU count, such that keeping the same number means less work on that?

Or was it totally BC?

The logic isn’t wrong, though, and not only have we seen numerous devs praising it, but it has so far taken the lead on next-gen graphics.
 
Cost. They made the decision to spend more on their storage capabilities, which was the wise decision. As the generation goes on, the Xbox and PS5 will be outclassed by PC GPUs, but the PS5’s storage will hold up much better, as it’s the best consumer storage solution available anywhere, whereas the XSX storage is already outclassed by PC.

Sony can always release a PS5 Pro with double the CUs.
 

MrLove

Banned
Crytek

You talked about the CUs. The PlayStation 5 has 36 CUs, and the Xbox Series X has 52 CUs available to the developer. What is the difference?


The main difference is that the CUs in the PlayStation 5 work at a much higher frequency. That's why, despite the difference in CU count, the two consoles’ performance is almost the same. An interesting analogy from an IGN reporter was that the Xbox Series X GPU is like an 8-cylinder engine, and the PlayStation 5 is like a turbocharged 6-cylinder engine. Raising the clock speed on the PlayStation 5 seems to me to have a number of benefits, such as memory management, rasterization, and other elements of the GPU whose performance is related to frequency rather than CU count. So in some scenarios the PlayStation 5's GPU works faster than the Series X's. That's what lets the console's GPU spend more time near its announced peak of 10.28 teraflops. But for the Series X, because the rest of the elements are slower, it will probably not reach its 12 teraflops most of the time, and will only reach 12 teraflops in highly ideal conditions.

Sony says the smaller the number of CUs, the more you can integrate the tasks. What does Sony's claim mean?

It costs resources to use all the CUs at the same time, because CUs need resources that are allocated by the GPU when they want to run code. If the GPU fails to distribute enough resources across all the CUs to execute the code, it will be forced to drop the number of CUs in use: for example, instead of 52 it might use 20 of them, because the GPU doesn't have enough resources for all CUs at all times.

Aware of this, Sony has used a faster GPU instead of a larger GPU to reduce allocation costs. A more striking example of this is in CPUs. AMD has had high-core-count CPUs for a long time; Intel, on the other hand, has used fewer but faster cores, and Intel CPUs with fewer but faster cores perform better in gaming. Clearly, a 16- or 32-core CPU has a higher number of teraflops, but a CPU with faster cores will often do a better job, because it's hard for game programmers to use all the cores all the time, so they prefer fewer but faster cores.
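To put numbers on the "frequency-scaled parts" argument in that interview: fixed-function throughput such as pixel fill rate is ROPs × clock, so it tracks clock speed rather than CU count. The 64-ROP figure for both GPUs is the commonly reported spec rather than something stated in the interview, so treat it as an assumption:

```python
# Pixel fill rate = ROPs * clock. 64 ROPs per GPU is the commonly reported
# figure for both consoles (an assumption here, not an official spec sheet).
def fill_rate_gpix_s(rops, clock_ghz):
    return rops * clock_ghz

print(fill_rate_gpix_s(64, 2.23))    # PS5:  ~142.7 Gpixels/s (at max clock)
print(fill_rate_gpix_s(64, 1.825))   # XSX:  ~116.8 Gpixels/s
```

So despite having roughly 15% fewer paper TFLOPs, the clock-bound parts of the PS5's front end run about 22% faster, which is the interviewee's point about some workloads favouring the higher clock.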

 

On Demand

Banned
It’s because of backwards compatibility. Any other reasoning is false imo. What kind of coincidence would it be to use the exact same CU count as the Pro for anything other than BC?

To mimic the PS4, you have double the CUs and shut 18 of them off. For the PS5 they would have had to use 54 CUs. Why didn’t they? Cost? Die space? Incompatibility with the architecture they used? The PS5 would have had more CUs than the SX.
 

jose4gg

Member
It’s because of backwards compatibility. Any other reasoning is false imo. What kind of coincidence would it be to use the exact same CU count as the Pro for anything other than BC?

To mimic the PS4, you have double the CUs and shut 18 of them off. For the PS5 they would have had to use 54 CUs. Why didn’t they? Cost? Die space? Incompatibility with the architecture they used? The PS5 would have had more CUs than the SX.

You can disable CUs using software...
 
Tests done with RDNA PC graphics cards, one running at lower clocks but with more CUs and one running a lower CU count but higher clocks, show the performance difference is basically not measurable in almost any game, and actually point towards more CUs having a positive effect in some.
So there is no evidence of this actually having any positive impact on modern game engines on RDNA PC parts.

That doesn't have much to do with custom RDNA 2 in a closed environment. Only people who have no tech knowledge still stick to that "argument".

The PC world is more of a brute-force world than an optimization world, so is it a surprise that there are no huge differences in standard tests/games?

It's just a poor analogy to use.
 

draliko

Member
Because they were able to nearly match a bigger chip's performance with lower costs, and maybe better yields.

Also, they invested in a state-of-the-art SSD and I/O controllers, plus a feature-packed controller. They managed a fantastic package.
Better yields at that high frequency I'm not so sure about; getting all the CUs to go that high is no easy task. And Sony has always been bad at cooling solutions, so let's hope they got it right this time. After the PS4 Pro I'm seriously thinking about waiting for the first console revision.
 

Deleted member 775630

Unconfirmed Member
Because of money, $399 is a great price point
 