
Matrix Awakens Power Consumption Comparison. XSS vs XSX vs PS5.

Sosokrates

Report me if I continue to console war
I guess the next UE5 test case will be stalker 2
Unreal 4 vs 5. Not the same thing my man

True, but it still shows that when the XSX GPU is pushed hard it requires about 200 watts, which is a lot more stable than what we are seeing in the Matrix demo.

Even the Series S does not have these fluctuations in power draw.

It's also interesting that the fps drops happen when the power draw dips.
 
Interesting, the XSX version seems unoptimised; it should not be getting such drastic dips.

We can see here that Gears 5 is a lot more stable and uses more power: in gameplay it doesn't go below 190W, and during battle it's around 200-210W.


That's the case during "gameplay" here. I have seen whole sections with an average of about 195W on XSX. I'd say they pretty much push the system to its max here. During the cinematic, power consumption also varies a lot on the PS5.
 

Sosokrates

Report me if I continue to console war
That's the case during "gameplay" here. I have seen whole sections with an average of about 195W on XSX. I'd say they pretty much push the system to its max here. During the cinematic, power consumption also varies a lot on the PS5.

Gameplay is where the XSX has issues in the Matrix demo.
 

Fafalada

Fafracer forever
It's also interesting that the fps drops happen when the power draw dips.
FPS drops introduce wait-times in the frame-delivery pipeline, thus they correlate to utilization drops.
Basically it's not curious at all /dumbledore.

I hope this and the games have finally shut that up.
I think this will just generate endless amounts of concern trolling of every developer that has their release subjected to power-draw analysis. Because now games that aren't hitting 220W on SX will be considered "poorly optimized" - you can already see it starting in this thread.
 

Sosokrates

Report me if I continue to console war
FPS drops introduce wait-times in the frame-delivery pipeline, thus they correlate to utilization drops.
Basically it's not curious at all /dumbledore.


I think this will just generate endless amounts of concern trolling of every developer that has their release subjected to power-draw analysis. Because now games that aren't hitting 220W on SX will be considered "poorly optimized" - you can already see it starting in this thread.
Then why are the fps drops happening?
 

Sosokrates

Report me if I continue to console war
Yes, notably during crossroad traversals. But it's probably I/O related, not GPU: the big frame-time spikes indicate this should be caused by I/O. It seems the PS5 could be handling everything I/O-related better?

I doubt it; DF already explained that it streams at 30MB/s.

A PC with a SATA SSD and an appropriate GPU could probably perform better.
 
I doubt it; DF already explained that it streams at 30MB/s.

A PC with a SATA SSD and an appropriate GPU could probably perform better.
Don't listen to DF. They are ignorant. Not everything I/O-related is about fetching stuff from the SSD. And bandwidth is just one part of the problem. The PS5 has been designed around I/O, and its SSD is just a small part of the solution.
 

GasTheWeebs

Banned
PS5...
How Dare You Greta GIF
 

Fafalada

Fafracer forever
I doubt it; DF already explained that it streams at 30MB/s.
I/O transfers don't work the way people think they do. E.g. 99% of slow loading times in the PS2 era were not caused by transfer speeds (or anything to do with drive limitations, for that matter). That doesn't change the physical reality of slow loads in the end, though.
The speeds of these SSDs are great, but they're still orders of magnitude removed from CPU/GPU ms budgets. It should be 'improbable' that I/O is interfering with framerate given the goals of Nanite, but we have no way to be sure without seeing profiler data from the code running.
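To put rough numbers on that, a back-of-envelope sketch (the 30MB/s figure is the one quoted above; the drive speeds are ballpark, illustrative values):

```python
# Back-of-envelope: how much raw transfer time does the quoted
# streaming rate actually cost per frame? All figures illustrative.
stream_rate_mb_s = 30.0          # the streaming figure quoted from DF
fps = 30.0
per_frame_mb = stream_rate_mb_s / fps            # ~1 MB per frame

for name, drive_mb_s in [("SATA SSD", 550.0), ("PS5 raw", 5500.0)]:
    transfer_ms = per_frame_mb / drive_mb_s * 1000.0
    print(f"{name}: ~{per_frame_mb:.1f} MB/frame -> ~{transfer_ms:.2f} ms raw transfer")
# Raw bandwidth is a tiny slice of a 33 ms frame either way; latency,
# decompression and scheduling are where the real differences would be.
```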

Then why are the fps drops happening?
If I could reliably answer that just by looking at YT-videos, I'd be a retired millionaire already. Maybe a billionaire.
 

intbal

Member
Interesting. It's also crazy to see consoles deliver this stuff at this performance considering they consume way less power than a gaming PC. They are super well optimized, and it makes me wonder what is going on with PC manufacturers.

I mean, if the GPU, CPU and memory are pretty similar in terms of technology and performance between these consoles and a PC, why do similar PCs consume way more?

You think all this awesomeness powers itself???

 

M1chl

Currently Gif and Meme Champion
Are people still unaware nearly a year later? Microsoft went with a larger chip at lower clocks; Sony went with a smaller chip and a higher-clocked GPU; the natural result is that the PS5 uses a bit more power to get where it is. 52 CUs vs 36, 1.825GHz vs up to 2.23GHz. Higher clocks increase power use more than a wider design does, and the PS5 uses an instruction-mix-based rather than a thermal-based variable clock system: the power controller looks at the instruction mix and adjusts the clock based on the power draw expected for that mix. That adjustment can last for just milliseconds within a frame, and the system can generally run at peak. The design is meant to avoid having to build the system for the worst-case instruction mix: instead of designing the cooling and everything else for that, they can just drop the clocks for milliseconds when those rare worst-case power situations come up.

"Using it better" and "Stressing it more" are both bad takes, this would have been exactly what we expected since we knew the specs.

Slightly different designs, but so far as I see the results are trading blows and largely very similar.
Both of those are absolutely valid takes, though. The software on Xbox has far bigger overhead, due to running in a virtualised environment and behind DirectX. True, that's not that big of a deal on the GPU, but on the CPU + RAM side it absolutely is. Also, you are not able to do some asm coding if you want to do something really quick, and so on. So I think those are both valid things to say.

However, the console is never going to be anything else. Thus, on a surface level, this conversation does not carry much weight.
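On the variable-clock point in the quoted post above, here's a toy sketch of the idea. The numbers, thresholds and scaling model are all made up for illustration; this is not Sony's actual controller:

```python
# Toy model of an instruction-mix/activity-based clock limiter:
# estimate power from what the chip is about to execute and cap the
# clock so the estimate stays under a fixed budget. Numbers made up.
POWER_BUDGET_W = 180.0      # hypothetical GPU power budget
PEAK_CLOCK_GHZ = 2.23

def clock_for_mix(estimated_power_at_peak_w: float) -> float:
    """Return the clock (GHz) that keeps estimated power within budget.

    Power scales roughly with the cube of clock once voltage rises too;
    a simple cube-root scaling is used here purely for illustration.
    """
    if estimated_power_at_peak_w <= POWER_BUDGET_W:
        return PEAK_CLOCK_GHZ                      # common case: run at peak
    scale = (POWER_BUDGET_W / estimated_power_at_peak_w) ** (1 / 3)
    return PEAK_CLOCK_GHZ * scale                  # brief dip for a heavy mix

print(clock_for_mix(150.0))   # typical mix -> 2.23
print(clock_for_mix(200.0))   # rare worst-case mix -> ~2.15 for a few ms
```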
 

Riky

$MSFT
Then why are the fps drops happening?

The frame drops happen on both consoles; in fact, the most consistent framerate is actually on the Series S version. As NXgamer said in his breakdown, I wouldn't read too much into performance because it's variable: you can get a different set of results on each run-through.
It's also not an actual game as such, just a demo, so I doubt it's had the amount of optimization that a real paid-for product will have.
We'll see how STALKER 2 runs in a few months and how Hellblade 2 along with the next Coalition game fare.
We know there is more performance to come as The Coalition have stated they are working with Epic to get support for Tier 2 VRS into the engine for launch.
 

PaintTinJr

Member
The power figure I was referring to was using a 6 core Zen 3 CPU and a 3060/3060 Ti. Should be fine with DirectStorage.
But that is never going to match the instantaneous transfer rate of loading and decompression that the PS5 can and will do. So even if it isn't constantly streaming gigabytes, the benchmark to match - to compare power draw - is the hardware needed in a PC to match (or beat) it on every spec, no?

I was allowing 4 desktop PC cores to match the APU's 8 Zen 2 mobile cores for gameplay, and then 8 more PC cores - the I/O-equivalent figure stated by Cerny - to match the real-time decompression capabilities of the I/O complex, even when it's transferring payloads of hundreds of megabytes.

My PC has a 12-core Xeon with an RTX 3060 and a Samsung 980 Pro, and it is better and worse than my PS5 in certain capabilities. It definitely draws around 550 watts under a good gaming load - so much so that it showed up two (shitty) EVGA 700BQ PSUs that were a buying constraint of getting an RTX 3060. When I switched to the EVGA because it was new - in place of the SLI-capable 650-watt PSU I was using - the system killed it the first time it saw a gaming workload, and killed the identical replacement in the same way after install.

My old 650-watt PSU was promised to a nephew's build, so I bought a Fractal 860P for now, to be used in my eventual rebuild in 12 months. But matching PS5 performance on PC in a fair way takes closer to twice the PS5's power draw IMHO, and if you only put in a PSU rated as low as you claim - with your hardware and a good game workload - I'd be surprised if it didn't trip the PSU's overpower protection.
 

PaintTinJr

Member
I also remember people saying that smartshift would enable using less power for the same results as the XSX. It seems that's not true either.
I certainly said something along those lines more than once, but obviously it was implicit that the workload would be heavily nanite/lumen sdf lighting for the 90% geometry Epic stated, like we saw in the first and second UE5 demos.

This demo doesn't have geometry to kitbash, it is better suited to hw acceleration IMO so the hardware workout is still with one foot in last-gen bottlenecks and the power draw and visual comparison reflects the strength's of both new consoles at different times in the demo.
 

FireFly

Member
But that is never going to match the instantaneous transfer rate of loading and decompression that the PS5 can and will do. So even if it isn't constantly streaming gigabytes, the benchmark to match - to compare power draw - is the hardware needed in a PC to match (or beat) it on every spec, no?
According to Nvidia slides, RTX IO supports up to 14 GB/s read bandwidth.


My old 650-watt PSU was promised to a nephew's build, so I bought a Fractal 860P for now, to be used in my eventual rebuild in 12 months. But matching PS5 performance on PC in a fair way takes closer to twice the PS5's power draw IMHO, and if you only put in a PSU rated as low as you claim - with your hardware and a good game workload - I'd be surprised if it didn't trip the PSU's overpower protection.
Power draw is not the same as total power supply capacity (the PS5's PSU is apparently rated at 350W). In the review below, the 3060 Ti system reached 372 watts with a 10900K, so 350 watts with a 5600X should be possible. You could further reduce power consumption by downclocking, since the 3060 Ti has plenty of performance to spare and the laptop version targets 2080 performance in 150W for the GPU.
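As a rough sanity check on that 350W figure, here's a ballpark component tally. These are TDP/TGP-class numbers, not measurements; the 5600X and "rest of system" figures in particular are assumptions:

```python
# Ballpark sum of component power for a 5600X + 3060 Ti build under a
# gaming load. Rough public TDP/TGP-class numbers, not measurements
# from the review being discussed.
components_w = {
    "RTX 3060 Ti (TGP)": 200,
    "Ryzen 5 5600X (package, gaming, assumed)": 75,
    "Motherboard, RAM, SSD, fans (assumed)": 50,
}
total = sum(components_w.values())
print(f"Estimated load: ~{total} W")            # ~325 W
print("PS5 PSU rating for comparison: 350 W")
# Even before downclocking the GPU, the whole build lands in the same
# range as the PS5's PSU rating (which is capacity, not typical draw).
```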

 

Sosokrates

Report me if I continue to console war
Same as people believing "most powerful console" because MS said so. Just don't mind and enjoy whatever game you like.
It ain't the same. By all the metrics we had prior to this gen, the XSX was more powerful.

I personally think Sony's APIs and dev environment have the most impact in making up the difference or outperforming the Series X. However, the higher clock speed is overhyped; sure, it will be more performant than a 10TFLOP GPU with 44 CUs @ 1825MHz, but in the end compute is the most important metric.
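For reference, the compute figures being thrown around come straight from CU count and clock (RDNA 2: 64 FP32 lanes per CU, 2 ops per clock), so the 44 CU example above is deliberately a PS5-equivalent TFLOP number:

```python
# RDNA 2 peak FP32 throughput: CUs * 64 shaders * 2 ops/clock * clock.
def tflops(cus: int, clock_ghz: float) -> float:
    return cus * 64 * 2 * clock_ghz / 1000

print(tflops(36, 2.23))    # PS5:                       ~10.3 TF
print(tflops(52, 1.825))   # Series X:                  ~12.1 TF
print(tflops(44, 1.825))   # the hypothetical wider, slower ~10 TF GPU above
```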
 

PaintTinJr

Member
According to Nvidia slides, RTX IO supports up to 14 GB/s read bandwidth.



Power draw is not the same as total power supply capacity (the PS5's PSU is apparently rated at 350W). In the review below, the 3060 Ti system reached 372 watts with a 10900K, so 350 watts with a 5600X should be possible. You could further reduce power consumption by downclocking, since the 3060 Ti has plenty of performance to spare and the laptop version targets 2080 performance in 150W for the GPU.

What performance to spare? It seems like you are going by TFLOPs rather than comparing actual things - like the gigapixel rate, which Nanite heavily depends on - between the 2080 and the mobile 3060 Ti.

Lowering clocks will lower fill-rate - probably below the PS5's level - and will increase cache latency in the 3060 Ti. And we haven't factored in the latency from the RTX IO card to the 3060 Ti, or the compute the 3060 Ti needs to use for that decompression rate, or actually checked whether it can offer the low-latency I/O of the PS5 for sporadic and mid-size transfers. Then there's the small matter of the power draw of the RTX IO card, before we even begin to talk about lowering GPU clocks. No matter which way you try to align with current hardware - if you are being fair - I think you'll struggle to get under double the PS5's power draw.

Good point about the PSU rating in the PS5. Yeah, the same 100-watt clearance on his lower-rated PSU would take it to 450 watts, but even then I suspect the system would struggle not to trip the overpower protection when the RTX IO card is used - if not with the 12-core CPU instead.
 

PaintTinJr

Member
It ain't the same. By all the metrics we had prior to this gen, the XSX was more powerful.

I personally think Sony's APIs and dev environment have the most impact in making up the difference or outperforming the Series X. However, the higher clock speed is overhyped; sure, it will be more performant than a 10TFLOP GPU with 44 CUs @ 1825MHz, but in the end compute is the most important metric.
I don't know why people keep saying that, when more often than not it has been Nvidia's pixel-rate advantage over the years that has kept them in first place in GPU comparisons - because unless you have the fill-rate to throw the results of the compute at the screen to visualise them, you didn't need to do the computation.
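For anyone who wants actual numbers for "pixel-rate": peak fill rate is just ROPs × clock. These are theoretical ceilings only; real throughput also depends on blending, formats and memory bandwidth. A quick sketch using public spec figures:

```python
# Peak pixel fill rate = ROPs * clock. Theoretical ceilings only.
gpus = {
    "PS5":        (64, 2.23),    # 64 ROPs @ up to 2.23 GHz
    "Series X":   (64, 1.825),   # 64 ROPs @ 1.825 GHz
    "RTX 3080":   (96, 1.71),    # matches the 164.2 GPixel/s figure later in the thread
    "RX 6800 XT": (128, 2.25),   # matches the 288 GPixel/s figure
}
for name, (rops, clock_ghz) in gpus.items():
    print(f"{name}: {rops * clock_ghz:.1f} GPixel/s peak")
```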
 

Sosokrates

Report me if I continue to console war
I don't know why people keep saying that, when more often than not it has been Nvidia's pixel-rate advantage over the years that has kept them in first place in GPU comparisons - because unless you have the fill-rate to throw the results of the compute at the screen to visualise them, you didn't need to do the computation.

Because the PS4 had 1.84TF and GDDR5, and everyone claimed that because of this the PS4 was more powerful, and they were correct.

The same with the Xbox One X.
 
Because the PS4 had 1.84TF and GDDR5, and everyone claimed that because of this the PS4 was more powerful, and they were correct.

The same with the Xbox One X.

Well the One X was far ahead of the Pro. Not the case with the PS5 and the XSX where each system has their strengths. Definitely not a repeat of the Pro vs One X days. The comparisons prove that.
 

Sosokrates

Report me if I continue to console war
Well the One X was far ahead of the Pro. Not the case with the PS5 and the XSX where each system has their strengths. Definitely not a repeat of the Pro vs One X days. The comparisons prove that.

We were talking about performance indicators, not claiming any parallels between the One X and Pro and the XSX and PS5.
 

Hoddi

Member
What performance to spare? It seems like you are going by TFLOPs rather than comparing actual things - like the gigapixel rate, which Nanite heavily depends on - between the 2080 and the mobile 3060 Ti.

Lowering clocks will lower fill-rate - probably below the PS5's level - and will increase cache latency in the 3060 Ti. And we haven't factored in the latency from the RTX IO card to the 3060 Ti, or the compute the 3060 Ti needs to use for that decompression rate, or actually checked whether it can offer the low-latency I/O of the PS5 for sporadic and mid-size transfers. Then there's the small matter of the power draw of the RTX IO card, before we even begin to talk about lowering GPU clocks. No matter which way you try to align with current hardware - if you are being fair - I think you'll struggle to get under double the PS5's power draw.

Good point about the PSU rating in the PS5. Yeah, the same 100-watt clearance on his lower-rated PSU would take it to 450 watts, but even then I suspect the system would struggle not to trip the overpower protection when the RTX IO card is used - if not with the 12-core CPU instead.
Nanite doesn't rely on pixel fillrate at all. It's rasterized in software and runs in compute shaders.

The actual Nanite passes aren't even the performance-intensive part of UE5. They're only ~8ms out of ~38ms at 1440p in the Valley of the Ancient demo on my 2080 Ti.
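For anyone wondering how "rasterized in software" sidesteps the ROPs: in Epic's Nanite presentation the micro-triangles are scan-converted in a compute shader and written into a visibility buffer with 64-bit atomics (depth in the high bits, IDs in the low bits), so the fixed-function raster/ROP path never touches them. A very rough CPU-side sketch of the write path, not Epic's code:

```python
# Toy visibility-buffer rasterizer in the spirit of Nanite's compute
# path: pack (depth, triangle id) into one 64-bit value and keep the
# max per pixel, which is what the GPU does with a 64-bit atomic max.
# Triangle setup/clipping is omitted; this is just the write path.
import numpy as np

W, H = 16, 16
visbuf = np.zeros((H, W), dtype=np.uint64)   # 0 == far plane / empty

def write_pixel(x: int, y: int, depth: float, tri_id: int) -> None:
    # Larger packed value wins => closer depth wins (reverse-Z style).
    packed = (np.uint64(int(depth * 0xFFFFFFFF)) << np.uint64(32)) | np.uint64(tri_id)
    visbuf[y, x] = max(visbuf[y, x], packed)  # stand-in for InterlockedMax

# Two "triangles" covering the same pixel; the nearer one survives.
write_pixel(4, 4, depth=0.25, tri_id=7)
write_pixel(4, 4, depth=0.60, tri_id=9)
print(int(visbuf[4, 4]) & 0xFFFFFFFF)        # -> 9 (the nearer triangle's id)
```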
 

PaintTinJr

Member
Because the PS4 had 1.84TF and GDDR5, and everyone claimed that because of this the PS4 was more powerful, and they were correct.

The same with the Xbox One X.
The Xbox One X even had a paper at GDC on how to use its additional compute to increase rendering IIRC - something that certainly wouldn't be needed for a console that had superior hardware specs in memory, memory bandwidth, and CPU clock if pixel fill-rate wasn't a bottleneck, no?

edit:
The PS4 didn't just have more compute than the X1; it had a far superior, forward-looking design, with its focus on async compute, its unified memory, and the way that enabled software solutions where CPUs could deterministically snoop other chiplets' caches. It was such a good design that the X1X pretty much borrowed the broad strokes of it as a new Xbox architecture at the refresh, 1-2 years after the Pro had been a complement to the limitations of the PS4's success. In a world of mostly 30fps games on PS4 and X1, the Pro focused on rendering faster - in half the time, for 60fps modes and PSVR - at the expense of resolution, while the X1X typically rendered at a higher resolution to complement the deferred rendering solutions used for 30fps, making it easier to exploit the additional compute and additional memory, as that gives more room to keep deferred outputs around for longer, saving on fill-rate. But in 60fps modes the X1X's advantages weren't easy wins all round, IIRC.
 

PaintTinJr

Member
Nanite doesn't rely on pixel fillrate at all. It's rasterized in software and runs in compute shaders.

The actual Nanite passes aren't even the performance-intensive part of UE5. They're only ~8ms out of ~38ms at 1440p in the Valley of the Ancient demo on my 2080 Ti.
It runs in the geometry and fragment shaders, IIRC from what they said in the presentations, and kitbashing - which Valley of the Ancient shows the bottlenecks of on inadequate hardware (the 1060 has really good fill-rate for an old card, by the way) - is at its simplest form overdraw. So I'm struggling to see why that software shader isn't pixel-rate bound. The 3ms figure is for the PS5 only; TC's figures said Nanite killed performance when they used it heavily - as it is used in the first two demos.

The pixel-rate in a £400-500 PS5 console is exceptional, and the texture rate doesn't have any caveats - like not being usable while using BVH acceleration too - AFAIK.
 

FireFly

Member
What performance to spare? It seems like you are going by TFLOPs rather than comparing actual things - like the gigapixel rate, which Nanite heavily depends on - between the 2080 and the mobile 3060 Ti.

Lowering clocks will lower fill-rate - probably below the PS5's level - and will increase cache latency in the 3060 Ti. And we haven't factored in the latency from the RTX IO card to the 3060 Ti, or the compute the 3060 Ti needs to use for that decompression rate, or actually checked whether it can offer the low-latency I/O of the PS5 for sporadic and mid-size transfers. Then there's the small matter of the power draw of the RTX IO card, before we even begin to talk about lowering GPU clocks. No matter which way you try to align with current hardware - if you are being fair - I think you'll struggle to get under double the PS5's power draw.

Good point about the PSU rating in the PS5. Yeah, the same 100-watt clearance on his lower-rated PSU would take it to 450 watts, but even then I suspect the system would struggle not to trip the overpower protection when the RTX IO card is used - if not with the 12-core CPU instead.
The 6600 XT is slightly ahead of the PS5 in terms of compute and texture rate, and has a 16% higher fill rate. Yet the 3060 Ti is ~14% faster at 1080p. That's what I mean by performance to spare.

Perhaps UE5 titles will be more fill-rate bound than current games, but in the Matrix demo the XSX and PS5 deliver very similar performance, despite the XSX having a significantly lower fill rate. So that indicates you can compensate with additional compute, of which the 3060 Ti has plenty. And hardware-accelerated Lumen uses ray tracing, which RTX cards are significantly faster at.

But ignoring this, the fact remains that you should be able to get 3060 Ti performance in 350W, which was all Zathalus was claiming.

I don't know why people keep saying that, when more often than not it has been Nvidia's pixel-rate advantage over the years that has kept them in first place in GPU comparisons - because unless you have the fill-rate to throw the results of the compute at the screen to visualise them, you didn't need to do the computation.
The 3080 has a peak fill rate of 164.2 GPixel/s, while the 6800 XT has a peak fill rate of 288.0 GPixel/s. Interesting that from Turing to Ampere, Nvidia doubled the amount of peak compute, while leaving fill rate barely changed.
 

Hoddi

Member
It runs in the geometry and fragment shaders, IIRC from what they said in the presentations, and kitbashing - which Valley of the Ancient shows the bottlenecks of on inadequate hardware (the 1060 has really good fill-rate for an old card, by the way) - is at its simplest form overdraw. So I'm struggling to see why that software shader isn't pixel-rate bound. The 3ms figure is for the PS5 only; TC's figures said Nanite killed performance when they used it heavily - as it is used in the first two demos.

The pixel-rate in a £400-500 PS5 console is exceptional, and the texture rate doesn't have any caveats - like not being usable while using BVH acceleration too - AFAIK.
I don't know which presentation you're referring to, but it's easy enough to run the demo through Nsight. I've highlighted the main Nanite passes in the screenshot below, and it doesn't show the raster units and ROPs being pushed to any significant degree - especially not Z + color, which are barely even used.

[Nsight screenshot: Nanite passes highlighted]
 
I actually think the XSX is leaving performance on the table here. It also goes up to 220 watts, but not often enough. 1.8GHz clocks are probably too conservative for an RDNA 2 GPU in a console box. This is just one game, but when she's walking the XSX gives around 195W while the PS5 goes up to 210 watts. It's possible they are hitting the 30fps cap here, but I really don't care if the consoles go over 200W if it means better performance. We know the XSX can dip in this game, so seeing it average lower power consumption tells me they could've pushed the clocks more or gone with a variable clock system of their own.
MS already has issues getting enough chips; pushing 20 more watts through the thing will not help with that.

But we are listening.
 

PaintTinJr

Member
I don't know which presentation you're referring to, but it's easy enough to run the demo through Nsight. I've highlighted the main Nanite passes in the screenshot below, and it doesn't show the raster units and ROPs being pushed to any significant degree - especially not Z + color, which are barely even used.

[Nsight screenshot: Nanite passes highlighted]
I'm confused. Why would you selectively choose to profile a scene - one that may or may not be showing the typical kitbashing overdraw Epic showed would hurt performance in that demo - when I already accepted your 8ms out of your ~33.3ms (30fps)? ~38.8ms (24fps)? numbers running on a 2080 Ti - numbers which I assumed you were quoting from Epic?

Your original numbers still seem good, and 8ms out of 33ms is ~25% of the frame time, so your profiling at around 25% would align with what you already stated. 8ms out of a 16.6ms frame (at 60fps), on the other hand, doesn't look good, when SDF Lumen lighting is going to need three times that amount for PS5-level rendering, according to Epic's info. And as I already stated, TC said Nanite hurt performance when they tried to use it heavily in their Series X research project.

So I stand by my point that pixel-rate typically ends up being the limiting factor of perceived (frame-rate) performance for hardware - not compute IMO.
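To spell out the frame-budget arithmetic above (using the figures quoted in this exchange, which are themselves disputed):

```python
# Frame-budget arithmetic with the times quoted in the thread.
nanite_ms = 8.0            # Hoddi's Nsight figure for the Nanite passes
frame_30fps_ms = 1000 / 30 # ~33.3 ms budget
frame_60fps_ms = 1000 / 60 # ~16.7 ms budget
lumen_ms = 3 * nanite_ms   # the "three times that amount" claim for SDF Lumen

print(f"Nanite share at 30fps: {nanite_ms / frame_30fps_ms:.0%}")    # ~24%
print(f"Nanite share at 60fps: {nanite_ms / frame_60fps_ms:.0%}")    # ~48%
print(f"Nanite + claimed Lumen cost: {nanite_ms + lumen_ms:.0f} ms") # 32 ms
# 32 ms alone already blows a 16.7 ms (60fps) budget, which is the
# point being argued; whether the 8 ms and 3x figures transfer across
# scenes and hardware is exactly what's in dispute here.
```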
 

Hoddi

Member
I'm confused. Why would you selectively choose to profile a scene - one that may or may not be showing the typical kitbashing overdraw Epic showed would hurt performance in that demo - when I already accepted your 8ms out of your ~33.3ms (30fps)? ~38.8ms (24fps)? numbers running on a 2080 Ti - numbers which I assumed you were quoting from Epic?

Your original numbers still seem good, and 8ms out of 33ms is ~25% of the frame time, so your profiling at around 25% would align with what you already stated. 8ms out of a 16.6ms frame (at 60fps), on the other hand, doesn't look good, when SDF Lumen lighting is going to need three times that amount for PS5-level rendering, according to Epic's info. And as I already stated, TC said Nanite hurt performance when they tried to use it heavily in their Series X research project.

So I stand by my point that pixel-rate typically ends up being the limiting factor of perceived (frame-rate) performance for hardware - not compute IMO.
I posted that screenshot to show which GPU units were being used to process the Nanite passes in particular. Look at the statistics in the bottom half and the summary on the right - and note the ZROP (depth) and CROP (color) headings as they're hardly being utilized.

Here are the full-frame unit throughputs for reference. I don't see how it's fillrate bound though I'd still want to see that presentation if you have it.

[Nsight screenshot: full-frame unit throughputs]
 
I posted that screenshot to show which GPU units were being used to process the Nanite passes in particular. Look at the statistics in the bottom half and the summary on the right - and note the ZROP (depth) and CROP (color) headings as they're hardly being utilized.

Here are the full-frame unit throughputs for reference. I don't see how it's fillrate bound though I'd still want to see that presentation if you have it.

[Nsight screenshot: full-frame unit throughputs]
Very interesting. It's actually compute bound. But note that the most important memory after shader memory is the L1 cache. This is where the PS5 has a notable advantage over the XSX. This is what I was actually expecting.

We already saw with Death Stranding on PC how important the L1 cache is.
 