
Hardware AMD RDNA 3 GPUs To Be More Power Efficient Than NVIDIA Ada Lovelace GPUs, Navi 31 & Navi 33 Tape Out Later This Year

FireFly

Member
Aug 5, 2007
1,832
1,356
1,440
Of course CUs don't scale performance linearly, but more of them still improve it, especially at higher resolutions: more pixels require more parallelisation. Same with bandwidth, so I guess that's the reason the 3070 Ti is only 5% faster than the 3070 at 1440p: the 3070 doesn't lack bandwidth, it already has enough of it.
I was comparing the 3090 with the 6900 XT at 4K, btw.
The point is that Nvidia didn't increase the number of CUs (SMs) with the 3080/3070. The 3080 has 68 SMs, just like the 2080 Ti; it just has double the number of FP32 ALUs per SM. So you really can't use TFLOPS to compare the two architectures, because the ratio of compute (FP32) to texture rate, fill rate and integer rate is different.
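To put rough numbers on that, here is a minimal sketch of how the paper TFLOPS figure and the texture rate fall out of the unit counts. The 68 SMs come from the post; the 64 vs 128 FP32 ALUs per SM, the 4 TMUs per SM and the boost clocks are assumed reference figures, so treat the output as illustrative rather than as a benchmark.

```python
def fp32_tflops(sm_count, fp32_alus_per_sm, clock_ghz):
    """Peak FP32 rate: every ALU is counted as one fused multiply-add (2 FLOPs) per clock."""
    return sm_count * fp32_alus_per_sm * 2 * clock_ghz / 1000.0


def texel_rate_gtexels(sm_count, tmus_per_sm, clock_ghz):
    """Peak texture rate in GTexels/s: one texel per TMU per clock."""
    return sm_count * tmus_per_sm * clock_ghz


# Assumed reference specs (not taken from the thread, apart from the SM count).
cards = {
    "RTX 2080 Ti": {"sms": 68, "fp32_per_sm": 64, "tmus_per_sm": 4, "clock_ghz": 1.545},
    "RTX 3080": {"sms": 68, "fp32_per_sm": 128, "tmus_per_sm": 4, "clock_ghz": 1.710},
}

results = {}
for name, c in cards.items():
    tflops = fp32_tflops(c["sms"], c["fp32_per_sm"], c["clock_ghz"])
    gtexels = texel_rate_gtexels(c["sms"], c["tmus_per_sm"], c["clock_ghz"])
    # FLOPs available per texel fetched: the compute-to-texture ratio.
    flops_per_texel = tflops * 1000.0 / gtexels
    results[name] = (tflops, gtexels, flops_per_texel)
    print(f"{name}: {tflops:.1f} TFLOPS, {gtexels:.0f} GTexels/s, "
          f"{flops_per_texel:.0f} FLOPs per texel")
```

Under these assumptions the TFLOPS figure more than doubles while the texture rate only moves with the clock, which is why TFLOPS alone overstates the gap between the two architectures.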
 

PaintTinJr

Member
Jan 30, 2020
1,219
2,699
475
Oxfordshire, England
Not sure where the random ps4 games come into the topic, but I'll bite. What RDNA2 games impressed you on the other hand? It seems you are trying to imply something, without explicitly stating it.
No, there are no games exploiting the benefits of RDNA2 so far, AFAIK, so I’m going completely on UE5’s Nanite bypassing hardware rasterization and outperforming it 2 or 3 to 1 with shader implementations, which favour higher clocks and larger, faster cache subsystems, with AMD having that edge.

Then you have UE5 Lumen. If you split the frustum into a cascade of different-sized frusta slices, the first cascade is HW RT and the rest is SW RT, with the SW RT split further into LoD/workload cascades into the scene, as UE5 does. AFAIK the vast bulk of the pixels lit by the engine are handled by SW RT, not HW RT, which only has a depth of around 5-10 metres IIRC. So IMO all the new AAA games, the majority using UE5, will rely mostly on GPU-accelerated software rasterization for geometry and more general-purpose GPU compute, like AMD’s, for GI lighting.

IMO AMD, Epic and PlayStation have gone all in on this strategy for some years. Lumen’s spec requirements suggest a clear division for gaming at the top. Last-gen wins with DLSS and HW RT for Nvidia won’t carry over IMO. With the IQ of Nanite/Lumen being so high, DLSS’s wins on obviously undersampled items (triangles spanning many pixels, low-quality textures, and faceted or aliased graphical artefacts) are going to be slim pickings in UE5 games, is my take.

Nvidia’s hardware currently holds its own running UE5 tools, but as SW RT advances with optimisations to the likes of SW Lumen, I suspect RDNA2 will continue to cope while the lower RTX 2000/3000 series cards will lack the flexibility, having hedged their bets on a more even balance towards HW RT/HW rasterization, versus AMD’s long-time push for async compute and a less specific solution to HW RT in favour of more versatile GPU compute.
 

Turk1993

Member
Jan 13, 2018
1,068
2,929
425
No one is tripping: TFLOP for TFLOP, AMD > Nvidia at rasterization and compute.
The 6900 XT is just 23 TFLOPS and in 2nd place, while the RTX 3090 is 35 TFLOPS in 1st place.
Lol, you guys said that it performed better at rasterization when it's clearly behind. And idgaf about TFLOPS; the Nvidia Ampere line is performing better, that's my point. It doesn't matter if it's 100 TFLOPS vs 5 TFLOPS: if a 3090 performs better than a 6900 XT then it's better, and if a 3080 performs better than a 6800 XT then it is better. Fun fact: the 3080 is closer to the 6900 XT than the 6900 XT is to the 3090, and it falls even further behind with DLSS and RT.

Lol 4K only when a lot of people still got 1440p monitors where 6900XT is better
Lol at 4K? ok dude game on 720p then. And BTW




And all of this without RT or DLSS and yall act like AMD did something revolutionary smh.
 

Rickyiez

Member
Jan 20, 2020
978
1,447
515
Nvidia: laughs in DLSS

Rasterization in RDNA2 is already better than any NV card, so what? The battlefield has changed while AMD seems to be still stuck in the past. Hopefully RDNA3 will implement the ML features from XSX and FSR will be tweaked to use it in a similar fashion as DLSS works.



Yeah, I'm still waiting to see 3080 tank in performance due to its 10GB VRAM while RX 6000 cards will "age like a fine wine" ;)
Age like a fine wine with the slow-ass memory bandwidth, yeaaaa right. Tell me why the 6000 cards fall short at 4K, the only situation where VRAM matters.
 

Marlenus

Member
Jul 29, 2013
1,923
780
670
UK
Age like a fine wine with the slow-ass memory bandwidth, yeaaaa right. Tell me why the 6000 cards fall short at 4K, the only situation where VRAM matters.

They don't fall short. Scaling from the 2080 Ti is bang on. Further memory overclocking has zero performance impact, while core overclocking does improve performance, which suggests bandwidth is not the bottleneck.

Ampere scales well to high resolutions where all the shaders can get used.

EDIT: So it is not that RDNA2 falls behind, but that Ampere pulls ahead.
 

Marlenus

Member
Jul 29, 2013
1,923
780
670
UK
Lol, you guys said that it performed better at rasterization when it's clearly behind. And idgaf about TFLOPS; the Nvidia Ampere line is performing better, that's my point. It doesn't matter if it's 100 TFLOPS vs 5 TFLOPS: if a 3090 performs better than a 6900 XT then it's better, and if a 3080 performs better than a 6800 XT then it is better. Fun fact: the 3080 is closer to the 6900 XT than the 6900 XT is to the 3090, and it falls even further behind with DLSS and RT.


Lol at 4K? ok dude game on 720p then. And BTW




And all of this without RT or DLSS and yall act like AMD did something revolutionary smh.

In the most up-to-date TPU game suite the results are:



A bit different to the launch review.
 

Turk1993

Member
Jan 13, 2018
1,068
2,929
425
In the most up-to-date TPU game suite the results are:



A bit different to the launch review.
So the gap between the 3090 and the 6900 XT at QHD is bigger now, OK, thanks. And at 4K the difference is even bigger with the "latest benchmark", and even the 3080 Ti surpasses the 6900 XT. But it's nice to see the 6800 XT getting some shine in newer games. This is still without RT or DLSS, though; turn those two on and you don't need a benchmark to see the difference between AMD and Nvidia.
 
  • Like
Reactions: MightySquirrel

Darius87

Member
Jul 16, 2018
1,092
2,637
525
The point is that Nvidia didn't increase number of CUs (SMs) with the 3080/3070. The 3080 has 68 SMs, just like the 2080 Ti. It just has double the number of FP32 ALUs per SM. So you really can't use TFLOPS to compare the two architectures, because the ratio of compute (FP32) to texture rate/fill rate/integer rate, is different.
You make it sound like I have to ignore half of the shader units because they don't come with extra texture units, even though that has nothing to do with how TFLOPS are calculated.
And the 2080 Ti is less powerful than the 3080, so I don't know what your point is in bringing it up.
 

Marlenus

Member
Jul 29, 2013
1,923
780
670
UK
How about RDNA 3's performance on DLSS and Raytracing?

It can't do DLSS because that is NV proprietary tech. They might have an FSR 2 that is similar to DLSS and uses motion vector and temporal data, but there are no details at all, so that is 100% speculation.

RT vs NV no idea but I do expect RT to be improved vs RDNA2.
 
  • Like
Reactions: Kazekage1981

Darius87

Member
Jul 16, 2018
1,092
2,637
525
Lol, you guys said that it performed better at rasterization when it's clearly behind. And idgaf about TFLOPS; the Nvidia Ampere line is performing better, that's my point. It doesn't matter if it's 100 TFLOPS vs 5 TFLOPS: if a 3090 performs better than a 6900 XT then it's better, and if a 3080 performs better than a 6800 XT then it is better. Fun fact: the 3080 is closer to the 6900 XT than the 6900 XT is to the 3090, and it falls even further behind with DLSS and RT.


Lol at 4K? ok dude game on 720p then. And BTW




And all of this without RT or DLSS and yall act like AMD did something revolutionary smh.
Like I said, an RDNA2 TFLOP is better (more efficient) than an Ampere TFLOP, and you can clearly see that in the graphs. I really don't care if you compare a more powerful Nvidia card to a less powerful RDNA2 card; that doesn't make sense. I could connect 2 AMD cards in SLI and say it's better than 1 Nvidia card.
And don't bring DLSS or RT into this; that's not what I'm comparing.
 

FireFly

Member
Aug 5, 2007
1,832
1,356
1,440
You make it sound like I have to ignore half of the shader units because they don't come with extra texture units, even though that has nothing to do with how TFLOPS are calculated.
And the 2080 Ti is less powerful than the 3080, so I don't know what your point is in bringing it up.
You don't ignore anything – that's the whole point. When you want to know how Ampere is expected to perform in games vs Turing or RDNA 2 even, you need to know not just the amount of compute but also the texture rate, fillrate and integer rate. It only makes sense to ignore these things if they have a fixed relationship to the amount of compute. And that only holds within the same architecture.
 

rnlval

Member
Jun 26, 2017
1,375
1,123
460
Sector 001
gpucuriosity.wordpress.com
I’m already familiar with the benchmarks but they’re also synthetic so they don’t mean much as of yet. Also no graphics engines are currently using mesh/primitive shaders.

RDNA 3 is a completely different architecture to RDNA 2, so much so in fact that AMD even consider RDNA 3 and 4 a part of the new GFX 11 family, they only do such things when big changes happen.

Back to Mesh/Primitive Shader performance, AMD have a number of patents for geometry handling and workload distribution, all filed in the past year or two and all in line for RDNA 3. So the geometry performance of RDNA 3 should be something to look out for since AMD have made zero changes to their Geometry Engine since Vega.

EDIT: also forgot to add, a few leakers did mention early on that RDNA 3 would bring multiple improvements with it one of which being “drastically improved geometry handling”.

But this is all just speculation for now lol

AMD needs to improve the geometry and raytracing performance by more than 2X.
 
  • Like
Reactions: Insane Metal

rnlval

Member
Jun 26, 2017
1,375
1,123
460
Sector 001
gpucuriosity.wordpress.com
Like I said, an RDNA2 TFLOP is better (more efficient) than an Ampere TFLOP, and you can clearly see that in the graphs. I really don't care if you compare a more powerful Nvidia card to a less powerful RDNA2 card; that doesn't make sense. I could connect 2 AMD cards in SLI and say it's better than 1 Nvidia card.
And don't bring DLSS or RT into this; that's not what I'm comparing.
Hint: Ampere RTX's excess TIOPS/TFLOPS compute vs rasterization and Direct Storage's GpGPU decompression.
 

rnlval

Member
Jun 26, 2017
1,375
1,123
460
Sector 001
gpucuriosity.wordpress.com
The point is that Nvidia didn't increase number of CUs (SMs) with the 3080/3070. The 3080 has 68 SMs, just like the 2080 Ti. It just has double the number of FP32 ALUs per SM. So you really can't use TFLOPS to compare the two architectures, because the ratio of compute (FP32) to texture rate/fill rate/integer rate, is different.
For Ampere, NVIDIA evolved the Turing SM's INT CUDA cores to also handle FP32 workloads.
 

Setsuna Mudou

Member
May 29, 2020
1,059
4,762
490
I know it is but the amount of bullshit rumors in favor of AMD is always ridiculous.
AMD has just been kicking ass and doing everything right for a number of years now. Especially when they were pretty much on the brink of bankruptcy, fighting 2 tech giants at the same time (Intel and Nvidia).

A few years later and they have both Nvidia and Intel by the balls.

That woman (Dr. Lisa Su) did something absolutely incredible steering an almost sinking ship into a place of dominance.
 

Marlenus

Member
Jul 29, 2013
1,923
780
670
UK
I know it is but the amount of bullshit rumors in favor of AMD is always ridiculous.

People thought Navi 21 being 2x Navi 10 performance was BS and that was not.

If AMD are gunning for 2.5x performance over the 6900XT then it may be possible with a wide MCM design that has a 500W TDP. This would be the equivalent of a 295X2-type card, only it would look like a single GPU and work without requiring game support. In fact, 66% more TDP combined with 50% more perf/watt gets you 2.5x performance (1.66 × 1.5 ≈ 2.49).

Is it doable? Maybe. AMD have done 500W cards in the past, so it is not out of their scope, but I am not sure you could compare it directly to the 6900XT; it would probably sit in an entirely new tier if they hit that target.
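For anyone who wants to check that arithmetic, a quick sketch: performance scales as power times performance per watt, so the two multipliers compound. The 300W baseline for the 6900XT is an assumed reference board power, not a figure from the thread, and the function name is made up for illustration.

```python
def scaled_performance(power_scale, perf_per_watt_scale):
    """Relative performance = relative power x relative efficiency."""
    return power_scale * perf_per_watt_scale


baseline_power_w = 300.0      # assumed 6900XT reference board power
power_scale = 1.66            # "66% more TDP" from the post
perf_per_watt_scale = 1.50    # "50% more perf / watt" from the post

new_power_w = baseline_power_w * power_scale
speedup = scaled_performance(power_scale, perf_per_watt_scale)

print(f"power: {new_power_w:.0f} W")
print(f"performance vs baseline: {speedup:.2f}x")
```

With those inputs the power lands just under 500W and the speedup just under 2.5x, which matches the rounded figures in the post.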
 

Turk1993

Member
Jan 13, 2018
1,068
2,929
425
Like I said, an RDNA2 TFLOP is better (more efficient) than an Ampere TFLOP, and you can clearly see that in the graphs. I really don't care if you compare a more powerful Nvidia card to a less powerful RDNA2 card; that doesn't make sense. I could connect 2 AMD cards in SLI and say it's better than 1 Nvidia card.
And don't bring DLSS or RT into this; that's not what I'm comparing.
"Less powerful card" lol, now you just accept it, OK nice. First it was AMD had better performance, and now it's not fair to compare the top-of-the-line Nvidia card with the top-of-the-line AMD card :messenger_tears_of_joy: . And you can SLI any AMD card you want, Nvidia will still come out on top. Just accept it and move on.
I know it is but the amount of bullshit rumors in favor of AMD is always ridiculous.
Literally this, always overhyping their stuff and acting like Nvidia is doomed and can't compete with them.
AMD has just been kicking ass and doing everything right for a number of years now. Especially when they were pretty much on the brink of bankruptcy, fighting 2 tech giants at the same time (Intel and Nvidia).

A few years later and they have both Nvidia and Intel by the balls.

That woman (Dr. Lisa Su) did something absolutely incredible steering an almost sinking ship into a place of dominance.
What they did with CPUs is amazing, but it was also Intel's fault. This is not the case with Nvidia; they are not playing the waiting game.
 

PaintTinJr

Member
Jan 30, 2020
1,219
2,699
475
Oxfordshire, England
People thought Navi 21 being 2x Navi 10 performance was BS and that was not.

If AMD are gunning for 2.5x performance over the 6900XT then it may be possible with a wide MCM design that has a 500W TDP. This would be the equivalent to a 295X2 type card only it would look like a single GPU and work without requiring game support. Infact 66% more TDP + 50% more perf / watt gets you 2.5x performance.

Is it doable, maybe and AMD have done 500W cards in the past so it is not out of their scope but I am not sure you could compare it directly to the 6900XT, it would probably sit in an entirely new tier if they hit that target.
If they are gunning for 2.5x the performance, I think the context of that 2.5x is different than before because of UE5 nanite/lumen effectively deprecating existing hw geometry and rasterization engines, making them legacy technology in gaming terms.

If both AMD and Nvidia refocus on performance in terms of UE5's Nanite and Lumen, then 2.5x the performance, when Nanite outperforms HW at 2 or 3 to 1 for rasterizing, might mean proportionally less silicon is needed to achieve that gain.

I would speculate if RDNA3 is a multi chip setup, the legacy hw engines might be in one chip only, and the 2nd chip might be entirely for accelerating general purpose SW rendering, like nanite and lumen algorithms, meaning simpler design, and better power efficiency - and probably zero improvement over a 6900XT for games using UE4.

Anyone who can remember playing 3D games before 3D hardware accelerators like the 3dfx Voodoo will appreciate that great programmers like Carmack and Sweeney were able to iterate on their renderers by working smarter, rather than just throwing faster and faster CPUs at them. With GPU hardware where it is now with RDNA2 and the UE5 prerelease, and with RDNA3/RTX 40xx coming, that type of SW renderer revolution, going full circle, is hopefully coming to GPUs, finally.
 

Setsuna Mudou

Member
May 29, 2020
1,059
4,762
490
What they did with CPUs is amazing, but it was also Intel's fault. This is not the case with Nvidia; they are not playing the waiting game.
They already did. RTX 2000 was a fucking joke and the IPC on the 3000 series per TFLOP is even worse than Pascal (and the 2000 series as well). And this is coming from someone that had a 1080 and a 3080.
I went with it because the VR performance was a bit steadier with the 3080 vs the 6800, but i kinda fucked myself because in 1440p the 6800 dominates the 3080. Anyway, good enough for an upgrade, but I expect much better late next year.
 
Jan 7, 2018
2,643
3,237
605
They already did. RTX 2000 was a fucking joke and the IPC on the 3000 series per TFLOP is even worse than Pascal (and the 2000 series as well). And this is coming from someone that had a 1080 and a 3080.
I went with it because the VR performance was a bit steadier with the 3080 vs the 6800, but i kinda fucked myself because in 1440p the 6800 dominates the 3080. Anyway, good enough for an upgrade, but I expect much better late next year.

?

Maybe you're thinking about some outlier game. Perf from diff archs can vary greatly from game to game and that's the reason why you bench multiple games when comparing gpu's.


 

Turk1993

Member
Jan 13, 2018
1,068
2,929
425
They already did. RTX 2000 was a fucking joke and the IPC on the 3000 series per TFLOP is even worse than Pascal (and the 2000 series as well). And this is coming from someone that had a 1080 and a 3080.
I went with it because the VR performance was a bit steadier with the 3080 vs the 6800, but i kinda fucked myself because in 1440p the 6800 dominates the 3080. Anyway, good enough for an upgrade, but I expect much better late next year.
How is that the case when the top 5 cards back then were literally all from Nvidia? As for VR, I can't say anything, I never looked into it. And again, performance at QHD is the same between the 6800 XT and the RTX 3080; it literally changes from game to game whether AMD or Nvidia is favoured. And when there is support for DLSS and RT, it just gets dominated by Nvidia. Of course next year's cards from Nvidia and AMD are going to outperform anything that's available right now from both brands. They are making nice progress and both are needed. I don't want AMD to get crushed by Nvidia, believe me. It's never good for consumers (us) to give full power to one company; it will slow down progress and fuck up prices even more. But I'm saying that AMD is still behind, and I want them to push harder and be more competitive in all categories: RT, DLSS, ...
 

Buggy Loop

Member
Jun 9, 2004
6,503
3,460
1,735
Quebec, canada
Nobody is getting caught with their pants down because of MCM; every developer of high-performance silicon has evaluated or is evaluating it. Apple, HiSilicon, Qualcomm, Nvidia, Broadcom, AMD, Intel...

It's been looked at and researched for a long, long time, but like any technology there is an optimal time of entry. The cost of the more complex communication and memory arrangements, weighed against monolithic chips that also keep improving, might lead one company to enter the market with MCM before another. But again, it's a tradeoff driven by yields on monolithic dies, and TSMC seems to be killing it on 5nm with Apple, so a well-optimized monolithic die is in no way a problem. MCM is interesting and inevitable, but it's not going to be a 1st gen of "slaughter".

Worse yields on large monolithic 5nm die = more costly
Better yields at the foundry on MCM but with a lot more complex communication bus system = more costly
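The yield half of that tradeoff can be sketched with the textbook Poisson yield model, where the chance that a die has no killer defect is exp(-defect density × area). The defect density and die sizes below are hypothetical placeholders, not foundry data, and the sketch deliberately leaves out the packaging and interconnect costs that the second line is about.

```python
import math


def poisson_yield(die_area_mm2, defects_per_cm2):
    """Fraction of dies with zero killer defects under a simple Poisson model."""
    return math.exp(-defects_per_cm2 * die_area_mm2 / 100.0)


def silicon_per_good_gpu(die_area_mm2, dies_per_gpu, defects_per_cm2):
    """Wafer area (mm^2) spent per working GPU, assuming dies are tested before assembly."""
    return dies_per_gpu * die_area_mm2 / poisson_yield(die_area_mm2, defects_per_cm2)


D0 = 0.1  # hypothetical defect density in defects per cm^2

monolithic = silicon_per_good_gpu(die_area_mm2=500, dies_per_gpu=1, defects_per_cm2=D0)
chiplets = silicon_per_good_gpu(die_area_mm2=250, dies_per_gpu=2, defects_per_cm2=D0)

print(f"monolithic 500 mm^2: yield {poisson_yield(500, D0):.0%}, "
      f"{monolithic:.0f} mm^2 of wafer per good GPU")
print(f"2 x 250 mm^2 chiplets: yield {poisson_yield(250, D0):.0%} per die, "
      f"{chiplets:.0f} mm^2 of wafer per good GPU")
```

The chiplet option wastes less silicon in this toy model, and the saving shrinks as the defect density drops, which is the point about a mature 5nm process making a big monolithic die perfectly viable.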

Also the CPU comparison that keeps popping up with Ryzen...

As a first-gen Ryzen buyer (1600) and now latest-gen (5600X), you're fucking delusional if you think that was an easy win. It took years for them to even get their head above water, only to raise prices and become worse bang for the buck than some of Intel's products with the latest line. This was achieved with a sleeping Intel sitting on their 14nm++++++++ throne and stubbornly refusing to go to TSMC or any 3rd-party foundry for so many years. Intel fucked up and it's why a CEO is gone. But don't think for a minute, and this is coming from an eternal AMD CPU supporter (even fucking Phenom), that Intel is done for. They managed to squeeze so much out of 14nm that they're practically unbeatable in design densities. Give them TSMC 5nm and you've just woken a sleeping dragon.

Nvidia ain't Intel, like at all. They are involved everywhere, something AMD has been slacking on for years. With CUDA, since 2007, they are in all the universities; they publish a ton of papers alongside researchers on all kinds of subjects (do a quick search on Nvidia VR patents, it's wild); they are the reference for anything related to deep learning/ML acceleration. They basically led the project and gave MS the keys to implementing ray tracing and ML in DirectX 12. They're no Intel.

But hey, the more competition the better. But as a huge ATI/AMD fan since the mid-1990s myself, I've learned with time to lower my expectations.
 

rnlval

Member
Jun 26, 2017
1,375
1,123
460
Sector 001
gpucuriosity.wordpress.com
They already did. RTX 2000 was a fucking joke and the IPC on the 3000 series per TFLOP is even worse than Pascal (and the 2000 series as well). And this is coming from someone that had a 1080 and a 3080.
I went with it because the VR performance was a bit steadier with the 3080 vs the 6800, but i kinda fucked myself because in 1440p the 6800 dominates the 3080. Anyway, good enough for an upgrade, but I expect much better late next year.
Ampere's TFLOPS compute power shows its strength with GpGPU compute. Normal raster games use ROPS as the primary read/write units while Async compute uses TMUs as its read/write units.

The same method for Radeon VII (strong with Async compute path vs raster ROPS path) also applies for NVIDIA Ampere.

MS's non-raster path compute examples

  • MS DirectStorage PC version uses GpGPU compute for decompression.
  • MS DirectML also uses GpGPU compute or Tensor cores for processing.
 

rnlval

Member
Jun 26, 2017
1,375
1,123
460
Sector 001
gpucuriosity.wordpress.com
Nobody is getting their pants caught down because of MCM, every developer of high-performance silicon evaluated/is evaluating it. Apple, Hi-Silicon, Qualcomm, Nvidia, Broadcom, AMD, Intel...

It's been looked at / researched for a long long time, but like any technology, the optimization of the best time of entry, based on the cost of these more complex communication and memory arrangements vs monolithic chips that also keep improving, might determine one company's entry into the market with MCM maybe before another. But again, it's a tradeoff of a symptom of yields on monolithic, but TSMC seems to be killing it on 5nm with Apple, so a well optimized die on monolithic is in no way a problem. MCM is interesting and inevitable, but it's not going to be a 1st gen of "slaughter".

Worse yields on large monolithic 5nm die = more costly
Better yields at the foundry on MCM but with a lot more complex communication bus system = more costly

Also the CPU comparison that keeps poping up with Ryzen...

As a first gen Ryzen buyer (1600) and now latest gen (5600X), you're fucking delusional if you think that was an easy win. It took years for them to even get their head out of the water, only to raise the price and become a worse bang for the buck than some Intels products with the latest line. This was achieved with a sleeping Intel sitting on their 14nm++++++++ throne and being stubborn to not go to TSMC or any 3rd party foundry for so many years. Intel fucked up and it's why a CEO is gone. But don't think for a minute, and this is coming from an eternal AMD CPU supporter (even fucking Phenom), that Intel is done for. They managed to squeeze so much out of 14nm, they're practically unbeatable in design densities. Give them TSMC 5nm and you've just woken a sleeping dragon.

Nvidia ain't Intel, like at all. They are implicated everywhere, something that AMD has been slacking for years. CUDA, since 2007, they are in all universities, they publish a ton of papers along researchers on all kinds of subjects (make a quick search on Nvidia VR patents, it's wild), they are the reference to anything related to deep learning/ML acceleration. They basically leaded the project and give MS the keys to implementing ray tracing and ML in DirectX 12. They're no Intel.

But hey, more competition the better. But as myself i was a huge ATI/AMD fan since mid 1990's, i learned with time to lower expectations.

Intel May Rename its 7nm Process Node to 5nm to Highlight Similarity w/ TSMC’s 5nm EUV Process.

TSMC’s 5nm EUV process has a density of 171M/mm2 while Intel’s 7nm node has a peak density of 200-250M/mm2. As you can conclude from these observations, Intel’s process nodes are much denser than the corresponding TSMC nodes, and it’d be fair to say that the chipmaker’s 10nm and 7nm processes are comparable to TSMC’s 7nm and 5nm, respectively

Estimates put peak densities of Intel and TSMC nodes (million transistors per mm2) at:
  • TSMC 10nm: 52.5
  • TSMC N7: 91 <------ AMD
  • TSMC N5: 171 <------ Apple
  • TSMC N3: 290

  • Intel 14nm: 37.5
  • Intel 10nm: 101 <----- Intel Tiger Lake, Intel Alder Lake.
  • Intel 7nm: 200-250

PS; TSMC also has N6 for "6 nm" e.g. AMD Rembrandt APU.
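To make the "comparable nodes" claim concrete, a small sketch that divides the estimated Intel densities by the TSMC node each is being compared to. The numbers are the estimates listed above (million transistors per mm2); the pairing of nodes follows the quoted article, and the variable names are just for illustration.

```python
# Estimated peak densities from the list above, in million transistors per mm^2.
tsmc = {"N10": 52.5, "N7": 91.0, "N5": 171.0, "N3": 290.0}
# Intel entries are (low, high) so the 7nm range can be carried through.
intel = {"14nm": (37.5, 37.5), "10nm": (101.0, 101.0), "7nm": (200.0, 250.0)}

# Node pairings as claimed in the quoted article.
pairings = [("10nm", "N7"), ("7nm", "N5")]


def density_ratio(intel_node, tsmc_node):
    """Return (low, high) ratio of Intel density to TSMC density for one pairing."""
    low, high = intel[intel_node]
    return low / tsmc[tsmc_node], high / tsmc[tsmc_node]


ratios = {}
for intel_node, tsmc_node in pairings:
    ratios[(intel_node, tsmc_node)] = density_ratio(intel_node, tsmc_node)
    low, high = ratios[(intel_node, tsmc_node)]
    print(f"Intel {intel_node} vs TSMC {tsmc_node}: {low:.2f}x to {high:.2f}x")
```

On these estimates each Intel node comes out somewhat denser than its TSMC counterpart, with the 7nm figure depending heavily on which end of the 200-250 range turns out to be right.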
 

rnlval

Member
Jun 26, 2017
1,375
1,123
460
Sector 001
gpucuriosity.wordpress.com
If they are gunning for 2.5x the performance, I think the context of that 2.5x is different than before because of UE5 nanite/lumen effectively deprecating existing hw geometry and rasterization engines, making them legacy technology in gaming terms.

If both AMD and nvidia refocus on performance in terms of UE5's nanite and lumen, then 2.5x the performance, when nanite out performs hw at 2 or 3 to 1 for rasterizing, might mean proportionally less silicon needed, to achieve that gain.

I would speculate if RDNA3 is a multi chip setup, the legacy hw engines might be in one chip only, and the 2nd chip might be entirely for accelerating general purpose SW rendering, like nanite and lumen algorithms, meaning simpler design, and better power efficiency - and probably zero improvement over a 6900XT for games using UE4.

Anyone that can remember playing 3D games prior to 3D hw accelerators like the 3dfx Voodoo back in the day, will appreciate that great programmers like Carmack and Sweeney were able to iterate their renderers by working them smarter, rather than just throwing faster and faster CPUs at them, and hopefully with GPU hardware where it is now with RDNA2 and UE5 prelease, and with RDNA3/RTX 40xx coming, that type of SW renderer revolution - going full circle - is hopefully coming to GPUs, finally.

UE5 Nanite/Lumen did not deprecate rasterization (floating-point geometry to integer pixels) or raster ops (color and Z-buffer/depth buffer read/write units). These are the fundamental hardware areas that distinguish a GPU from a DSP.

Hint: RX 6800 XT has 128 ROPS expansion which enabled NAVI 21 to be competitive with classic raster ops game workloads. RX 6800 XT 128 ROPS expansion lessens the need for Async compute/TMU read-write path and notice AMD has lessened Async compute PR marketing with NAVI 21's release.
 
Mar 23, 2013
2,629
1,023
810
This is perfect confirmation bias supporting my decision to buy more AMD shares in the morning before earnings on Tuesday.
 

rnlval

Member
Jun 26, 2017
1,375
1,123
460
Sector 001
gpucuriosity.wordpress.com
?

Maybe you're thinking about some outlier game. Perf from diff archs can vary greatly from game to game and that's the reason why you bench multiple games when comparing gpu's.
Resizable BAR (a.k.a. AMD SAM) also benefits RTX 3080s.
 

rnlval

Member
Jun 26, 2017
1,375
1,123
460
Sector 001
gpucuriosity.wordpress.com
Nobody is getting their pants caught down because of MCM, every developer of high-performance silicon evaluated/is evaluating it. Apple, Hi-Silicon, Qualcomm, Nvidia, Broadcom, AMD, Intel...

It's been looked at / researched for a long long time, but like any technology, the optimization of the best time of entry, based on the cost of these more complex communication and memory arrangements vs monolithic chips that also keep improving, might determine one company's entry into the market with MCM maybe before another. But again, it's a tradeoff of a symptom of yields on monolithic, but TSMC seems to be killing it on 5nm with Apple, so a well optimized die on monolithic is in no way a problem. MCM is interesting and inevitable, but it's not going to be a 1st gen of "slaughter".

Worse yields on large monolithic 5nm die = more costly
Better yields at the foundry on MCM but with a lot more complex communication bus system = more costly

Also the CPU comparison that keeps poping up with Ryzen...

As a first gen Ryzen buyer (1600) and now latest gen (5600X), you're fucking delusional if you think that was an easy win. It took years for them to even get their head out of the water, only to raise the price and become a worse bang for the buck than some Intels products with the latest line. This was achieved with a sleeping Intel sitting on their 14nm++++++++ throne and being stubborn to not go to TSMC or any 3rd party foundry for so many years. Intel fucked up and it's why a CEO is gone. But don't think for a minute, and this is coming from an eternal AMD CPU supporter (even fucking Phenom), that Intel is done for. They managed to squeeze so much out of 14nm, they're practically unbeatable in design densities. Give them TSMC 5nm and you've just woken a sleeping dragon.

Nvidia ain't Intel, like at all.
They are implicated everywhere, something that AMD has been slacking for years. CUDA, since 2007, they are in all universities, they publish a ton of papers along researchers on all kinds of subjects (make a quick search on Nvidia VR patents, it's wild), they are the reference to anything related to deep learning/ML acceleration. They basically leaded the project and give MS the keys to implementing ray tracing and ML in DirectX 12. They're no Intel.

But hey, more competition the better. But as myself i was a huge ATI/AMD fan since mid 1990's, i learned with time to lower expectations.
Tesla dumped NVIDIA and created their own ML math processor. Reason: Nvidia Drive PX2 computing platform's cost per performance issues.
 

PaintTinJr

Member
Jan 30, 2020
1,219
2,699
475
Oxfordshire, England
UE5 Nanite/Lumen did not deprecate rasterization (floating-point geometry to integer pixels) or raster ops (color and Z-buffer/depth buffer read/write units). These are the fundamental hardware areas that distinguish a GPU from a DSP.

Hint: RX 6800 XT has 128 ROPS expansion which enabled NAVI 21 to be competitive with classic raster ops game workloads. RX 6800 XT 128 ROPS expansion lessens the need for Async compute/TMU read-write path and notice AMD has lessened Async compute PR marketing with NAVI 21's release.
I don't follow your response about specific OPs. The Engines (vertex and geometry) are completely bypassed by nanite AFAIK, because they reimplement it in a SW async compute shader - further bypassing blending and other fixed cost aspects of hw accelerated features.

I'm not even entirely sure it uses the Z-buffer in the same way a game rendering through the vertex, geometry and fragment shader pipelines does, so historical AMD features like Early-Z will be bypassed as well, as I understand it. But I'm not entirely up to date on whether any of the 10+ year old features are actually still fixed silicon, and how much is now implemented in the GPU firmware/driver. So your point could well be true, if all those features are just implemented in firmware/driver.
 

psn

Member
Jun 23, 2013
1,482
55
575
Germany
?

Maybe you're thinking about some outlier game. Perf on different archs can vary greatly from game to game, and that's the reason why you bench multiple games when comparing GPUs.


You can bench multiple games, yet you can still force an outcome of your choice, depending on the games you bench.

If you check some of the newest releases for example - even at 4k - the performance difference is not really there:


And you can add many older games where Nvidia really shines. I mean, the 3090 is also a beast, and depending on the games you play it performs better. But keep in mind that it is 50% more expensive as well.

I couldn't care less about DLSS; it always looked and played like shit in motion on my 3080 FE. I mean, you buy an expensive monitor with close to zero overshoot and ghosting at 240Hz, then activate DLSS and it introduces that weird ghosting and those artifacts. That might be fine for single-player games, but in multiplayer shooters that is just a no-go for me. It might be fixed in the future, but up to this point DLSS is not perfect either.
Luckily, it doesn't matter with the power of a 3080 / 6800 XT or above right now.
 
Jan 7, 2018
2,643
3,237
605
You can bench multiple games, yet you can still force an outcome of your choice, depending on the games you bench.

If you check some of the newest releases for example - even at 4k - the performance difference is not really there:


And you can add many older games where Nvidia really shines. I mean, the 3090 is also a beast, and depending on the games you play it performs better. But keep in mind that it is 50% more expensive as well.

I couldn't care less about DLSS; it always looked and played like shit in motion on my 3080 FE. I mean, you buy an expensive monitor with close to zero overshoot and ghosting at 240Hz, then activate DLSS and it introduces that weird ghosting and those artifacts. That might be fine for single-player games, but in multiplayer shooters that is just a no-go for me. It might be fixed in the future, but up to this point DLSS is not perfect either.
Luckily, it doesn't matter with the power of a 3080 / 6800 XT or above right now.
True.

Here's a good example: you could select only the top 3 games from this graph for benchmarking and claim that the 6800 XT is 20% faster, or select the bottom 3 and say the 3080 is 15% faster, and on both occasions you'd be right!

That's what fanboys mostly do: they select only the games that their specific card's architecture performs better on. I just don't understand why outlets do not benchmark with RT enabled, or completely exclude games with RT from their bench suites. It's not Nvidia's fault that AMD's cards perform inadequately.
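
To put numbers on the cherry-picking point, here is a small sketch. The per-game ratios below are made up for illustration (they are not read off the graph), and `lead_percent` is a hypothetical helper; the only point is how far the "average" moves depending on which games go into it.

```python
from statistics import geometric_mean

# Hypothetical 6800 XT / 3080 frame-rate ratios per game (invented values,
# sorted from most AMD-friendly to most Nvidia-friendly).
ratios = {
    "game_a": 1.22, "game_b": 1.20, "game_c": 1.18,
    "game_d": 1.03, "game_e": 1.00, "game_f": 0.98,
    "game_g": 0.87, "game_h": 0.85, "game_i": 0.83,
}

def lead_percent(games):
    """6800 XT lead over the 3080 in percent, as a geometric mean."""
    values = [ratios[g] for g in games]
    return (geometric_mean(values) - 1.0) * 100.0

ordered = sorted(ratios, key=ratios.get, reverse=True)
top3 = lead_percent(ordered[:3])      # about +20%: "6800 XT is 20% faster"
bottom3 = lead_percent(ordered[-3:])  # about -15%: "3080 is 15% faster"
everything = lead_percent(ordered)    # under +1%: basically a tie
```

Same cards, same data set, three very different headlines. A geometric mean is used here because the inputs are ratios; a plain arithmetic mean would give a similar picture with these numbers.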

 

Genx3

Member
Jun 23, 2019
974
1,391
400
I’m already familiar with the benchmarks but they’re also synthetic so they don’t mean much as of yet. Also no graphics engines are currently using mesh/primitive shaders.

RDNA 3 is a completely different architecture to RDNA 2, so much so in fact that AMD even consider RDNA 3 and 4 a part of the new GFX 11 family, they only do such things when big changes happen.

Back to Mesh/Primitive Shader performance, AMD have a number of patents for geometry handling and workload distribution, all filed in the past year or two and all in line for RDNA 3. So the geometry performance of RDNA 3 should be something to look out for since AMD have made zero changes to their Geometry Engine since Vega.

EDIT: also forgot to add, a few leakers did mention early on that RDNA 3 would bring multiple improvements with it one of which being “drastically improved geometry handling”.

But this is all just speculation for now lol
Mesh shaders are NOT primitive shaders.
Stop pretending they are the same thing when they are not.

"What are the differences between PS5 and Xbox Series X? Difficult (and unfair) to make a complete comparison in a nutshell, but now we can point out at least one big difference: according to what Digital Foundry reported, PS5 uses AMD’s RDNA 1 architecture Primitive Shaders, while Xbox Series X uses RDNA 2’s Mesh Shaders.

The Primitive Shaders were presented in June 2019 for RDNA 1 and were shown by Sony in the “Road to PS5” presentation. Xbox Series X, on the other hand, has opted for the latest technologies and, not surprisingly, Microsoft has pointed this out in its marketing campaign in recent months, probably knowing that the direct competitor would not be able to offer the same level of technology."

Digital Foundry say that Series X|S has a more complete programmable front end (Mesh Shaders/VRS/SFS) than the old PS5 primitive shaders.
Primitive Shaders were presented by AMD in June 2019 for RDNA1 and shown by Sony on “Road to PS5” presentation.

If you want to see the differences, your best bet is to look up the patents for Primitive and Mesh Shaders; they are different.
Primitive Shaders are the precursor to Mesh Shaders. Mesh Shaders are more advanced, essentially the 2.0 version.
 

MikeM

Member
May 21, 2021
539
766
295
Can someone break this down for stupid people like me? I am assuming this is good, but will it be better than Nvidia?
 

Marlenus

Member
Jul 29, 2013
1,923
780
670
UK
True.

Here's a good example: you could select only the top 3 games from this graph for benchmarking and claim that the 6800 XT is 20% faster, or select the bottom 3 and say the 3080 is 15% faster, and on both occasions you'd be right!

That's what fanboys mostly do: they select only the games that their specific card's architecture performs better on. I just don't understand why outlets do not benchmark with RT enabled, or completely exclude games with RT from their bench suites. It's not Nvidia's fault that AMD's cards perform inadequately.


If you did RT only, you could bench Dirt 5 and WoW Shadowlands and come out with RDNA2 > Ampere. Unless NV fixed the issue in Shadowlands, that is.

Also, in some scenes in RE Village at 4K with RT it eats up the 8GB buffer, so the 12GB 3060 bests the 3060 Ti and 3070. As does the 6700 XT.

Guru 3d

So, as you rightly point out, you can cherry-pick anything, even 30+ game suites. Personally, if the games are popular then the performance is the performance. I also think a few left-field picks using various engines/APIs are a good idea, to see how well less popular games with less driver support fare.
 

Darius87

Member
Jul 16, 2018
1,092
2,637
525
"less powerfull card" lol now you just accept it ok nice. First it was Amd had better performance and now its not fair to compare top of the line Nvidia card with top of the line Amd card :messenger_tears_of_joy: . And you can SLI any Amd card you want Nvidia will still come on top. Just accept it and move on.
Accept what? That the top Nvidia card is 35 TFLOPS starting at $1500 and the top AMD card is 23 TFLOPS starting at $999 :messenger_grinning_smiling: Of course Nvidia will be on top, I'm not arguing that. If you had figured out that context is important, you would have read my first comments, where I'm arguing that RDNA2 has better raster and better per-TFLOP performance than Ampere. But deluded people always bring up the most expensive Nvidia card, which AMD didn't even try to compete with.
The 3090 beats the 6900 XT by just 5 to 7% with 12 more TFLOPS. Now imagine a 6900 XT with 35 TFLOPS: all that would be left of the 3090 at raster is a wet fart. It is clear that AMD offers the best value for performance:
The 6800 XT has 10 fewer TFLOPS than the 3080, and the latter wins by just 3%.
The 6800 beats the 3070 Ti by 3% with 5 fewer TFLOPS.
https://www.techpowerup.com/
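
The per-TFLOP arithmetic behind this claim can be sketched as below, using only the figures quoted in this post (23 vs 35 TFLOPS, and a 3090 lead taken as 6%, the middle of the 5 to 7% range). The helper name `perf_per_tflop_ratio` is hypothetical. Keep in mind the caveat raised earlier in the thread: Ampere doubled the FP32 ALUs per SM, so paper TFLOPS are not a like-for-like unit across the two architectures.

```python
def perf_per_tflop_ratio(tflops_a, tflops_b, b_lead_pct):
    """How many times more performance per TFLOP card A delivers than card B,
    given that card B is b_lead_pct percent faster overall."""
    perf_a = 1.0                         # card A is the baseline
    perf_b = 1.0 + b_lead_pct / 100.0    # card B relative to card A
    return (perf_a / tflops_a) / (perf_b / tflops_b)

# 6900 XT (23 TFLOPS) vs 3090 (35 TFLOPS), 3090 assumed 6% faster at raster.
ratio = perf_per_tflop_ratio(23.0, 35.0, 6.0)  # about 1.44
```

So by these numbers the 6900 XT gets roughly 44% more raster performance out of each paper TFLOP. That says something about how the two vendors count FLOPS, and nothing certain about how a hypothetical 35 TFLOPS RDNA2 part would scale.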
 
Dec 24, 2020
468
2,687
340
Mesh shaders are NOT primitive shaders.
Stop pretending they are the same thing when they are not.

"What are the differences between PS5 and Xbox Series X? Difficult (and unfair) to make a complete comparison in a nutshell, but now we can point out at least one big difference: according to what Digital Foundry reported, PS5 uses AMD’s RDNA 1 architecture Primitive Shaders, while Xbox Series X uses RDNA 2’s Mesh Shaders.

The Primitive Shaders were presented in June 2019 for RDNA 1 and were shown by Sony in the “Road to PS5” presentation. Xbox Series X, on the other hand, has opted for the latest technologies and, not surprisingly, Microsoft has pointed this out in its marketing campaign in recent months, probably knowing that the direct competitor would not be able to offer the same level of technology."



If you want to see the differences, your best bet is to look up the patents for Primitive and Mesh Shaders; they are different.
Primitive Shaders are the precursor to Mesh Shaders. Mesh Shaders are more advanced, essentially the 2.0 version.
I never said Mesh Shaders were the same as Primitive Shaders. You also seem to be misunderstanding what Mesh/Primitive Shaders do. The purpose of both is to bring compute shader functionality into the graphics pipeline so they rely less on the CPU and RAM. The hardware for Mesh/Primitive Shaders is EXACTLY the same across RDNA 2 cards, PS5 and Series X/S, and AMD have made zero changes to their geometry engine since Vega (refer to the whitepapers).

Now, Primitive Shaders modify the traditional graphics pipeline to achieve this, whilst Mesh Shaders introduce a brand new pipeline. Which is better? It’s hard to say, but both offer the same performance when optimised and, from what I’ve been told, tessellation can be inserted more easily with Mesh Shaders, whereas Primitive Shaders require a different code path, but that’s about it.

Let’s not pretend Mesh Shaders are a more advanced version, it’s simply not true. AMD have still been filing patents for Primitive Shaders within the past year. In fact all RDNA 2 cards convert Mesh Shaders into Primitive Shaders in code, and it’s very likely Series X/S is doing this as well.

Here are some tweets from LeviathanGamer2 who has covered this topic extensively


 

Rikkori

Member
May 9, 2020
2,704
5,122
470
Can someone break this down for stupid people like me? I am assuming this is good, but will it be better than Nvidia?
It's too early to tell any details like that for a comparison. All we know is that the next gen's top end cards will be monstrously performant with a bigger leap than ever before (and likely exorbitantly priced). Bodes well for raytracing performance due to the sheer number of cores.
 

Buggy Loop

Member
Jun 9, 2004
6,503
3,460
1,735
Quebec, canada
If you did RT only, you could bench Dirt 5 and WoW Shadowlands and come out with RDNA2 > Ampere. Unless NV fixed the issue in Shadowlands, that is.

Sure you/we could, and it has been done repeatedly, but everyone ignores or wants to forget the logic behind the discrepancies, and review sites did a shit job overall of presenting what is going on.

Let's go back in time, to when the reviews were dropping:

Dirt 5 without RT : 6800XT 30% ahead of the 3080 (and of course, taken at 1440p rather than 4k, it's techspot ..)
Dirt 5 with RT : 6800XT 34% ahead of the 3080



Which kind of makes sense for light RT games that only use RT for shadows, as RDNA 2 will simply have a couple of CUs disabled for light RT tasks, while Nvidia's ASIC philosophy behind RT is not really being worked to its full potential. So roughly a +5% difference in raw RT performance. I think anyone in either camp can understand that optimization can and will make this kind of difference. But the foundation of this game was rotten at launch.

The main cause of this drastic difference has nothing to do with RT; it's this game's variable rate shading, which is a custom algorithm for RDNA 2 and was borked on Nvidia (until there's a patch...). But more importantly, not a single tech site/YouTuber thought it was worth setting the game aside for a minute until we figured out why a 30% difference was happening, contacting the developers, and leaving it out of the averages. Of course not. The wave of reviews is now frozen in time, and these benchmarks have kept perpetuating lousy arguments about RDNA 2 performance on the internet since last fall.

Let's now look what happens when a patch hits?


Well well well..

1440p : +9% without RT, +6% with RT for 6800XT vs 3080
4k : -6% without RT, +1% with RT for 6800XT vs 3080

WoW Shadowlands (again, also with FidelityFX VRS):


You seeing this shit?

The 2080 Ti at the same performance as the 3090 did not raise any red flags? The scaling does not make any fucking sense here. I ain't wasting more time analyzing such a title when the basics of it are so broken to begin with.

Also, in some scenes in RE Village at 4K with RT it eats up the 8GB buffer, so the 12GB 3060 bests the 3060 Ti and 3070. As does the 6700 XT.

Guru 3d

Big difference in memory usage..:pie_thinking:


But more importantly, the other sites do not show the drop that Guru3D showed for the 3060 Ti and 3070. Same for KitGuru, but the image is just way too big..


Those VRAM numbers alone are so different between Guru3D and TechPowerUp.
Guru3D likely let the game run for a long time, because there's a memory leak in this game (found later). The bigger the memory, the longer it takes to hit a performance wall, which would also explain Guru3D's huge memory difference from other sites. Taking a snapshot of that moment for a benchmark review is lazy reporting.
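
A back-of-the-envelope sketch of that "bigger memory, longer until the wall" reasoning, assuming a constant leak rate. The baseline usage and the leak rate below are invented for illustration and are not measurements of RE Village; `minutes_until_vram_full` is a hypothetical helper.

```python
def minutes_until_vram_full(capacity_gb, baseline_gb, leak_gb_per_min):
    """Minutes of play before a steadily leaking game fills the card's VRAM."""
    if leak_gb_per_min <= 0:
        return float("inf")  # no leak, the wall is never reached
    return max(0.0, (capacity_gb - baseline_gb) / leak_gb_per_min)

# Assumed: the game needs 7 GB up front and leaks 0.1 GB per minute.
wall_8gb = minutes_until_vram_full(8.0, 7.0, 0.1)    # 8 GB card (3070 class)
wall_12gb = minutes_until_vram_full(12.0, 7.0, 0.1)  # 12 GB card (3060 class)
```

Under those assumptions the 8 GB card hits the wall five times sooner than the 12 GB card, so a benchmark pass taken late in a long session would punish it while a short, fresh run would not.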

So, as you rightly point out, you can cherry-pick anything, even 30+ game suites. Personally, if the games are popular then the performance is the performance. I also think a few left-field picks using various engines/APIs are a good idea, to see how well less popular games with less driver support fare.

I think we can all live with and accept a few % difference based on game sponsors; it's expected and ultimately barely anyone would care, right? Ideally, there's more choice in the same performance range so that prices drop and we have more on offer in the market. What I find aggravating is the lack of critical thinking on almost all modern tech sites when they end up with a result that completely diverges from the norm and they don't fucking come back to it for a correction later! It's pretty shit all around, on both sides to be honest, and this "fanboy ammo" keeps turning up for way too long, sometimes well beyond a patch or a driver fix.

So these tech sites, rather than questioning the results and asking the devs questions, are now basically doing the job that an automated script could do for them, just churning out numbers with little or no discussion. I actually think an AI would write better articles than most tech sites.
 

Turk1993

Member
Jan 13, 2018
1,068
2,929
425
Accept what? That the top Nvidia card is 35 TFLOPS starting at $1500 and the top AMD card is 23 TFLOPS starting at $999 :messenger_grinning_smiling: Of course Nvidia will be on top, I'm not arguing that. If you had figured out that context is important, you would have read my first comments, where I'm arguing that RDNA2 has better raster and better per-TFLOP performance than Ampere. But deluded people always bring up the most expensive Nvidia card, which AMD didn't even try to compete with.
The 3090 beats the 6900 XT by just 5 to 7% with 12 more TFLOPS. Now imagine a 6900 XT with 35 TFLOPS: all that would be left of the 3090 at raster is a wet fart. It is clear that AMD offers the best value for performance:
The 6800 XT has 10 fewer TFLOPS than the 3080, and the latter wins by just 3%.
The 6800 beats the 3070 Ti by 3% with 5 fewer TFLOPS.
https://www.techpowerup.com/
The 3080 costs $729 vs $999 for the 6900 XT, and the 3080 is closer to the 6900 XT than the 6900 XT is to the 3090, so there goes your $999 vs $1500 price bullshit. And AMD themselves compared the 6900 XT with the 3090 and the 6800 XT with the 3080, so you are delusional if you think they are not direct competitors. And the 3080 destroys the 6900 XT with RT and DLSS, so a $729 card from Nvidia destroys a $999 card from AMD, right? You are a joke!
 

BattleScar

Member
Jul 29, 2016
1,130
2,357
485
Game development is in transition towards next-generation game console hardware.
What does this have to do with anything? Consoles use RDNA2 and Zen 2 now....?

That benchmark shows the 3070 beating the 6900 XT, but in actual games the 6900 XT is substantially faster. Synthetics are never representative of anything.
 

Darius87

Member
Jul 16, 2018
1,092
2,637
525
The 3080 costs $729 vs $999 for the 6900 XT, and the 3080 is closer to the 6900 XT than the 6900 XT is to the 3090, so there goes your $999 vs $1500 price bullshit. And AMD themselves compared the 6900 XT with the 3090 and the 6800 XT with the 3080, so you are delusional if you think they are not direct competitors. And the 3080 destroys the 6900 XT with RT and DLSS, so a $729 card from Nvidia destroys a $999 card from AMD, right? You are a joke!
And the 6900 XT is better than the 3080 while having 6 fewer TFLOPS, and is just 6-7% worse than the top Nvidia cards at $1199 and $1499. It just proves the point I was making from the start: an RDNA2 TFLOP is better than an Ampere TFLOP. And of course you're the deluded one who has to bring up DLSS and RT crap while I mention rasterisation performance in almost every post. Congrats again on ignoring context.
 
Jan 7, 2018
2,643
3,237
605
If you did RT only, you could bench Dirt 5 and WoW Shadowlands and come out with RDNA2 > Ampere. Unless NV fixed the issue in Shadowlands, that is.

Also, in some scenes in RE Village at 4K with RT it eats up the 8GB buffer, so the 12GB 3060 bests the 3060 Ti and 3070. As does the 6700 XT.

Guru 3d

So, as you rightly point out, you can cherry-pick anything, even 30+ game suites. Personally, if the games are popular then the performance is the performance. I also think a few left-field picks using various engines/APIs are a good idea, to see how well less popular games with less driver support fare.

That would be a very rough sell, when we usually see a 25-50% uplift for Nvidia in games with RT, and that's without the magical alien tech that is DLSS.



Oh, and RE Village has been patched and is now smooth sailing.

 