
Kraken benchmarks, perf analysis (with and without RDO). What PS5 hardware will be capable of.

UnNamed

Banned

TL;DR

Dataset 1 (Texture data BC1,3,4,5, and 7. Mix of diffuse, normals, etc.)
Kraken ratio: 1.76:1 (PS5 perf will be: 5.5GB/sec * 1.76 = 9.68GB/sec)
Kraken + RDO ratio: 3.13:1 (PS5 perf will be: 5.5GB/sec * 3.13 = 17.2GB/sec)

Dataset 2 (Texture data BC6 and 7. Mix of diffuse, normals, etc. Very much what MSFT expects of BCpack data)
Kraken ratio: 1.85:1 (PS5 perf will be: 5.5GB/sec * 1.85 = 10.1GB/sec)
Kraken + RDO ratio: 3.99:1 (PS5 perf will be: 5.5GB/sec * 3.99 = 21.9GB/sec)
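
The "PS5 perf" figures above are simply raw SSD throughput multiplied by the measured compression ratio. A minimal sketch of that arithmetic, assuming the 5.5 GB/s raw figure and taking 1.85 and 3.99 for Dataset 2 (the values consistent with the 10.1 and 21.9 GB/s results); the function name is made up for illustration:

```python
RAW_SSD_GBPS = 5.5  # raw PS5 SSD read speed quoted above, in GB/s

def effective_gbps(ratio, raw_gbps=RAW_SSD_GBPS):
    """Uncompressed data delivered per second when the stored data
    is compressed at `ratio`:1 and read at `raw_gbps`."""
    return raw_gbps * ratio

# Ratios taken from the two datasets above.
ratios = {
    "Dataset 1, Kraken": 1.76,
    "Dataset 1, Kraken + RDO": 3.13,
    "Dataset 2, Kraken": 1.85,
    "Dataset 2, Kraken + RDO": 3.99,
}

for name, ratio in ratios.items():
    print(f"{name}: {effective_gbps(ratio):.2f} GB/s")
```

This treats the ratio as constant across everything read from the drive, which is a best-case simplification.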

 
Have Sony clarified what data "compresses particularly well" enough to actually utilize the (essentially) 4:1 compression ratio yielding the 22 GB/s figure? I know it would be lossy data regardless, but that still leaves a wide range.

I know video files would be able to handle that type of compression rate very well without any significant quality loss, for example.

No, but at lambda 40 it's "mostly" like lossless. Above that the loss will be more perceptible, but the data will also be more compressed. How much? It will probably vary per game: some games may prefer slightly lossy textures, others may prefer a lower lambda for higher quality.

for other data there is this


InfiltratorDemo-WindowsNoEditor.pak

No compression : 2,536,094,378

No Oodle Data (Zlib), no Oodle Texture : 1,175,375,893

Yes Oodle Data, no Oodle Texture : 969,205,688

No Oodle Data (Zlib), yes Oodle Texture : 948,127,728

Oodle Data + Oodle Texture lambda=40 : 759,825,164
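
Turning those byte counts into ratios makes them easier to compare with the texture-only numbers in the OP. A quick sketch using only the sizes listed above (the dictionary labels and the helper name are shorthand, not anything from Oodle's tooling):

```python
# Sizes in bytes of InfiltratorDemo-WindowsNoEditor.pak, as listed above.
UNCOMPRESSED = 2_536_094_378
sizes = {
    "Zlib, no Oodle Texture": 1_175_375_893,
    "Oodle Data, no Oodle Texture": 969_205_688,
    "Zlib + Oodle Texture": 948_127_728,
    "Oodle Data + Oodle Texture (lambda=40)": 759_825_164,
}

def ratio(compressed, original=UNCOMPRESSED):
    """Compression ratio expressed as original size / compressed size."""
    return original / compressed

for name, size in sizes.items():
    saved = 100 * (1 - size / UNCOMPRESSED)
    print(f"{name}: {ratio(size):.2f}:1 ({saved:.1f}% smaller)")
```

On these numbers the whole package lands at roughly 3.3:1 with both Oodle Data and Oodle Texture, against roughly 2.2:1 for Zlib alone. This is a full game package, not just textures, so it is expected to sit below the 3.99:1 texture figure.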



Very good for storage too.

It's discussed on Beyond3D.

 

Tomeru

Member
I thought flux capacitor pipeline prevents degrading of anti mesh projections, but this is beyond anything I expected!
 

psorcerer

Banned
actually utilize the (essentially) 4:1 compression ratio yielding the 22 GB/s figure?

According to the tests in the OP, BC6/7 texture formats are capable of 4:1 if the quality reduction setting (lambda) is at 40 (30 being "no noticeable quality loss").
Unfortunately we don't have much data in the OP article: what's the x86 CPU usage at these speeds? What is the real quality (some examples would be nice)? And so on...
 

FranXico

Member
The same people who were cheering the Velocity Architecture "2.5x multiplier on average" are now complaining about the authenticity and real use applicability of these results. LMFAO.

Yeah, take these as a best case scenario, since they were measured on specific datasets of different characteristics.
Still promising, no matter how much you want to doubt their "real world impact".
 

RaySoft

Member

TL;DR

Dataset 1 (Texture data BC1,3,4,5, and 7. Mix of diffuse, normals, etc.)
Kraken ratio: 1.76:1 (PS5 perf will be: 5.5GB/sec * 1.76 = 9.68GB/sec)
Kraken + RDO ratio: 3.13:1 (PS5 perf will be: 5.5GB/sec * 3.13 = 17.2GB/sec)

Dataset 2 (Texture data BC6 and 7. Mix of diffuse, normals, etc. Very much what MSFT expects of BCpack data)
Kraken ratio: 1.85:1 (PS5 perf will be: 5.5GB/sec * 1.85 = 10.1GB/sec)
Kraken + RDO ratio: 3.99:1 (PS5 perf will be: 5.5GB/sec * 3.99 = 21.9GB/sec)

Fixed;-)
 
But doesn't the PS5's SSD transfer at a maximum of 9 GB/s? Now it's 22 GB/s?
The drive itself transfers up to 5.5GB/s.

The data on the drive can be compressed. Depending on what is being transferred, the unpacked data can expand to anywhere from 9GB/s to 22GB/s (this was all in the reveal with Cerny). This is what their hardware decompression chip can handle, and their APIs make taking advantage of the compression method (Kraken) more or less automatic, as far as I understand.

This is an awful lot of data and people speculate on the actual real world benefits of the technology.
 
The same people who were cheering the Velocity Architecture "2.5x multiplier on average" are now complaining about the authenticity and real use applicability of these results. LMFAO.

Yeah, take these as a best case scenario, since they were measured on specific datasets of different characteristics.
Still promising, no matter how much you want to doubt their "real world impact".
I'm a huge PS fan and I wonder about the actual benefits.
Does it even matter when the system only has 16 GB of RAM?
This is why it matters: games can keep only what they need in active memory, so a lot of space ends up being saved.
 

geordiemp

Member
I'm a huge PS fan and I wonder about the actual benefits.

This is why it matters: games can keep only what they need in active memory, so a lot of space ends up being saved.

It's more than just freeing up memory, it will impact cycles and frame times...

What's important here IMO is that RAM is shared, so when IO is writing to RAM, there is likely contention in some cycles where the CPU and GPU cannot use it.

There is no scenario where you would not want the number of cycles spent streaming to RAM to be as small as possible.

When doing a 16 ms frame with the hardware contending for a shared resource, it is best when every process runs as fast as possible and takes as few cycles as possible.
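
For a rough sense of scale, here is how much data the IO path could write into shared RAM within one 16 ms frame at the rates discussed in this thread. This is only a back-of-the-envelope sketch: it assumes a perfectly sustained rate and decimal units, and it says nothing about how much contention those writes actually cause.

```python
def gb_per_frame(bandwidth_gbps, frame_ms=16.0):
    """Data written to RAM during one frame at a sustained bandwidth."""
    return bandwidth_gbps * frame_ms / 1000.0

rates = [
    ("raw SSD", 5.5),
    ("Kraken, dataset 1", 9.68),
    ("hardware decoder max", 22.0),
]

for label, bandwidth in rates:
    megabytes = gb_per_frame(bandwidth) * 1000  # decimal MB
    print(f"{label}: {megabytes:.0f} MB per 16 ms frame")
```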
 
Yeah - Sony's next gen. Xbox folks will play inflated Xbox One games...
Why this is actually happening I laid out here:
That was hilarious... I bookmarked your post to read it again and again and again 👍👍👍
 
Honestly, this is probably the greatest performance difference between the XSX and the PS5. But I think only PS5 exclusives can really take advantage of it.
Read GreyHand23's post: it's for all third-party developers and it's a free update... so no.
 

Raonak

Banned
Wooo that's fucking fast.

For reference, games will use about 12-14 GB of RAM, so if you can load 14-20 GB a second, then yeah, load times are done for.
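
As a quick sanity check on that, the time to fill a given amount of RAM is just size divided by effective bandwidth. A sketch using 14 GB and the bandwidth figures from the OP (it ignores seek overhead, CPU-side setup, and anything else a real loader does, so treat it as a lower bound):

```python
def fill_time_seconds(ram_gb, bandwidth_gbps):
    """Ideal time to stream `ram_gb` of uncompressed data at `bandwidth_gbps`."""
    return ram_gb / bandwidth_gbps

# Raw SSD speed, then the effective rates quoted in the OP.
for bandwidth in (5.5, 9.68, 17.2, 21.9):
    seconds = fill_time_seconds(14, bandwidth)
    print(f"{bandwidth:5.2f} GB/s -> {seconds:.2f} s to load 14 GB")
```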
Have Sony clarified what data "compresses particularly well" enough to actually utilize the (essentially) 4:1 compression ratio yielding the 22 GB/s figure? I know it would be lossy data regardless, but that still leaves a wide range.

I know video files would be able to handle that type of compression rate very well without any significant quality loss, for example.
Compression varies with the data used. Generally, if the data contains a lot of similar patterns, it compresses really nicely.
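
That is easy to see with any general-purpose compressor. The sketch below uses Python's built-in zlib purely as a stand-in (it is not Kraken, and the absolute numbers will differ) to compare highly repetitive bytes with pseudo-random bytes of the same length:

```python
import random
import zlib

def compression_ratio(data):
    """Original size divided by zlib-compressed size (level 9)."""
    return len(data) / len(zlib.compress(data, 9))

n = 30_000
repetitive = b"abc" * (n // 3)            # one short pattern repeated
rng = random.Random(0)                    # seeded for repeatable output
noisy = bytes(rng.getrandbits(8) for _ in range(n))  # no usable patterns

print(f"repetitive: {compression_ratio(repetitive):.1f}:1")
print(f"random:     {compression_ratio(noisy):.2f}:1")
```

The repetitive buffer shrinks enormously, while the random one does not shrink at all. Real game data falls somewhere in between.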
 

GreyHand23

Member
Wasn't Oodle incorporated/available this gen as well?

So before Oodle Texture was announced, the comparison was like this:

PS5 - Kraken for all data compression

XSX - BCPack for textures, ZLIB for all other data and further compresses the BCPack compressed textures

Kraken compresses data about 10% better than ZLIB, but also decompresses much faster when using similar hardware resources. BCPack resulted in XSX having a higher compression ratio overall, but still falling behind because the PS5 starts with a much higher raw SSD speed to begin with. Oodle Texture is a BCPack equivalent in terms of the pipeline. So now the comparison looks like this:

PS5 - Oodle Texture for textures, Kraken for all other data and further compresses the Oodle Texture compressed textures

XSX - BCPack for textures, ZLIB for all other data and further compresses the BCPack compressed textures

If you go and look at the lambda 40 samples (the highest texture compression ratio for Oodle Texture) on Oodle's website, you will see that although it is lossy compression, to the naked eye the difference is nearly imperceptible, especially compared to other lossy compressors. Given this, I expect developers will routinely use either lambda 30 or 40.

What does this mean for PS5 games? It means that if a developer chooses to go with the max level of compression possible using Oodle Texture and Kraken, they can rely on a system built to take full advantage of it, with hardware decompression that can keep up because it can decode data at up to 22 GB/s.

Another benefit is that game sizes can be smaller, although this is somewhat counteracted by the need for an increase in quality and quantity of assets. I suspect that bigger and better-looking games will ultimately land at around the same size as they are currently, because of higher compression and the elimination of the need to duplicate data on the SSD for faster access times.

As for how the speed directly benefits games, I suspect it will take a while for game engines and developers to implement ways to use the full speed available here, especially as Oodle Texture was only just released. At a minimum we should see faster loading, little to no pop-in, much higher levels of texture detail, new gameplay options, and, with time, games that ultimately surpass the UE5 demo's level of fidelity. Exciting times ahead!
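
One way to picture the "22 GB/s max" point: the delivered rate is the raw rate times the compression ratio, but never more than what the decompressor can output. A minimal sketch, assuming the decoder's output limit is the only cap (a simplification) and using a hypothetical 5:1 ratio just to show the ceiling:

```python
RAW_SSD_GBPS = 5.5       # raw SSD read speed
DECODER_MAX_GBPS = 22.0  # maximum decompressor output quoted above

def delivered_gbps(ratio, raw=RAW_SSD_GBPS, decoder_max=DECODER_MAX_GBPS):
    """Uncompressed throughput, capped by the decompressor's output limit."""
    return min(raw * ratio, decoder_max)

# 1.85 and 3.99 come from the OP; 5.0 is a hypothetical ratio.
for ratio in (1.85, 3.99, 5.0):
    print(f"{ratio}:1 -> {delivered_gbps(ratio):.1f} GB/s")
```

With these figures the cap is reached at exactly 4:1 (22 / 5.5), which is why the 3.99:1 result sits right at the edge of what the hardware can keep up with.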
 

JonnyMP3

Member
As long as these 100GB games that we're seeing these days can be compressed into a lot smaller size, hopefully that 825GB SSD won't be a worry for me anymore. Storage space is probably the one thing that had me worrying about the PS5.
 

sircaw

Banned
This is like being in a canoe without a paddle.

I am going nowhere with this info.

What does it all mean? :lollipop_disappointed:
 
As long as these 100GB games that we're seeing these days can be compressed into a lot smaller size, hopefully that 825GB SSD won't be a worry for me anymore. Storage space is probably the one thing that had me worrying about the PS5.
With an SSD, you won't have duplicates, which means games should shrink in size, while texture sizes will take up some space. So I guess game sizes will not get bigger this gen.
 

GreyHand23

Member
Can BCPack compete with this

BCPack is a lossy RDO texture compressor like Oodle Texture. Unfortunately we don't have any information on exactly how much it compresses or how the texture quality compares to a similar level of compression with Oodle Texture. What I'm comfortable saying is that in terms of compression alone, both solutions are likely equal, with a chance for Sony to be ahead because Kraken is better than Zlib and because, with the data we currently have, it appears that Sony built more headroom into their hardware decompressor than Microsoft did.
 

JonnyMP3

Member
With an SSD, you won't have duplicates, which means games should shrink in size, while texture sizes will take up some space. So I guess game sizes will not get bigger this gen.
Yeah, not having to duplicate common objects across the HDD for quick access is definitely a bonus for file size, as is being able to compress this well. But as mentioned, the flip side could be higher fidelity assets across the board, which then take up more data again. That's the trade-off. But fingers crossed that, with this much compression and space saved, when games do go to 100GB again it's because a game has literally designed an entire planet to play on.
 

Panajev2001a

GAF's Pleasant Genius
Can BCPack compete with this

Yes, and it is a bit better than that in pure compression ratio terms from what I understand, but I might be wrong as I am basing it off the data MS and Sony gave when they presented the consoles and gave compressed equivalent bandwidth numbers. BCPack being better by a few percentage points does not do much to change a 2x baseline difference (2.4 GB/s vs 5.5 GB/s).

AFAICS, you have these scenarios (from best to worst, relatively, in terms of compressing and then decoding native GPU texture data):
  • BCPack and Oodle Texture + Oodle BC7Prep + Kraken (the latter requires extra decompression at runtime on the CPU or in an optimised fast async compute shader, reference decoder provided by Oodle)... I suspect MS built a HW decoder that essentially implements what Oodle does to decode those two pre-processing steps
  • Oodle Texture + Kraken
  • Kraken
  • zlib
 

GreyHand23

Member
Yes, and it is a bit better than that in pure compression ratio terms from what I understand, but I might be wrong as I am basing it off the data MS and Sony gave when they presented the consoles and gave compressed equivalent bandwidth numbers. BCPack being better by a few percentage points does not do much to change a 2x baseline difference (2.4 GB/s vs 5.5 GB/s).

AFAICS, you have these scenarios (from best to worst relatively):
  • BCPack + zlib and Oodle Texture + Oodle BC7Prep + Kraken (the latter requires extra decompression at runtime on the CPU or in an optimised fast async compute shader, reference decoder provided by Oodle)... I suspect MS built a HW decoder that essentially implements what Oodle does to decode those two pre-processing steps
  • Oodle Texture + Kraken
  • Kraken
  • zlib

I don't think BC7Prep is needed for Oodle Texture + Kraken to compete with BCPack + zlib. It's just an extra bit of compression that developers can use if they think they really need it. Having to use GPU or CPU resources to do so will probably dissuade most, though.
 

Panajev2001a

GAF's Pleasant Genius
So before Oodle Texture was announced, the comparison was like this:

PS5 - Kraken for all data compression

XSX - BCPack for textures, ZLIB for all other data and further compresses the BCPack compressed textures

Kraken compresses data about 10% better than ZLIB, but also decompresses much faster when using similar hardware resources. BCPack resulted in XSX having a higher compression ratio overall, but still falling behind because the PS5 starts with a much higher raw SSD speed to begin with. Oodle Texture is a BCPack equivalent in terms of the pipeline. So now the comparison looks like this:

PS5 - Oodle Texture for textures, Kraken for all other data and further compresses the Oodle Texture compressed textures

XSX - BCPack for textures, ZLIB for all other data and further compresses the BCPack compressed textures

If you go and look at the lambda 40 samples (the highest texture compression ratio for Oodle Texture) on Oodle's website, you will see that although it is lossy compression, to the naked eye the difference is nearly imperceptible, especially compared to other lossy compressors. Given this, I expect developers will routinely use either lambda 30 or 40.

What does this mean for PS5 games? It means that if a developer chooses to go with the max level of compression possible using Oodle Texture and Kraken, they can rely on a system built to take full advantage of it, with hardware decompression that can keep up because it can decode data at up to 22 GB/s.

Another benefit is that game sizes can be smaller, although this is somewhat counteracted by the need for an increase in quality and quantity of assets. I suspect that bigger and better-looking games will ultimately land at around the same size as they are currently, because of higher compression and the elimination of the need to duplicate data on the SSD for faster access times.

As for how the speed directly benefits games, I suspect it will take a while for game engines and developers to implement ways to use the full speed available here, especially as Oodle Texture was only just released. At a minimum we should see faster loading, little to no pop-in, much higher levels of texture detail, new gameplay options, and, with time, games that ultimately surpass the UE5 demo's level of fidelity. Exciting times ahead!

On top of that, if you are willing to do some async compute decoding on the GPU, you can layer the BC7Prep step and increase the compression ratio even further: http://cbloomrants.blogspot.com/2020/06/oodle-texture-bc7prep-data-flow.html
 

Panajev2001a

GAF's Pleasant Genius
I don't think BC7Prep is needed for Oodle Texture + Kraken to compete with BCPack + zlib. It's just an extra bit of compression that developers can use if they think they really need it. Having to use GPU or CPU resources to do so will probably dissuade most, though.

Some games may have enough async compute power left and may want to compress data further (saves on GPU <-> main RAM bandwidth too).
Also, it is only a fast decoding step you need to run on the GPU, it does not make it entirely software based decompression and decoding: http://cbloomrants.blogspot.com/2020/06/oodle-texture-bc7prep-data-flow.html

I think the BCPack decoder is essentially something like BC7Prep + Oodle Texture (RDO) + Kraken (well, zlib), but with that extra step decoded in HW instead of using the GPU or CPU to decode it.
 

GreyHand23

Member
Some games may have enough async compute power left and may want to compress data further (saves on GPU <-> main RAM bandwidth too).
Also, it is only a fast decoding step you need to run on the GPU, it does not make it entirely software based decompression and decoding: http://cbloomrants.blogspot.com/2020/06/oodle-texture-bc7prep-data-flow.html

I think the BCPack decoder is essentially something like BC7Prep + Oodle Texture (RDO) + Kraken, but with that extra step decoded in HW instead of using the GPU or CPU to decode it.

It's possible that BCPack is doing such a thing, but where's the evidence that it is? I think at this time Microsoft hasn't revealed enough information about BCPack for us to make any concrete claims other than it is a very good RDO texture compressor.
 

Panajev2001a

GAF's Pleasant Genius
It's possible that BCPack is doing such a thing, but where's the evidence that it is? I think at this time Microsoft hasn't revealed enough information about BCPack for us to make any concrete claims other than it is a very good RDO texture compressor.

I assumed both MS and Sony were quoting their compressed equivalent bandwidth numbers using the best-case scenario (meaning Sony was also using Oodle Texture, though without BC7Prep as that requires an extra software decoding step, while MS was only using BCPack) and judging the average compression ratios based on that. This still leaves 4.8 GB/s vs 8-9 GB/s of equivalent compressed bandwidth.

If Sony had not included the free-to-decode Oodle Texture RDO step in their numbers, I would expect them to revise their numbers now that it is all public... if they do not, it makes me think they were already counting on that then-unannounced solution.
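
For what it's worth, the average ratios implied by those quoted figures can be backed out by dividing each compressed-equivalent number by the raw speed. A sketch using only the 2.4 / 4.8 GB/s and 5.5 / 8-9 GB/s numbers already mentioned in this thread (the helper name is made up):

```python
def implied_ratio(compressed_equiv_gbps, raw_gbps):
    """Average compression ratio implied by a quoted equivalent bandwidth."""
    return compressed_equiv_gbps / raw_gbps

xsx = implied_ratio(4.8, 2.4)       # XSX: 2.4 GB/s raw, 4.8 GB/s quoted
ps5_low = implied_ratio(8.0, 5.5)   # PS5: 5.5 GB/s raw, 8-9 GB/s quoted
ps5_high = implied_ratio(9.0, 5.5)

# Ratio the XSX would need to match the low end of the PS5 range.
needed = implied_ratio(8.0, 2.4)

print(f"XSX implied ratio: {xsx:.2f}:1")
print(f"PS5 implied ratio: {ps5_low:.2f}:1 to {ps5_high:.2f}:1")
print(f"XSX ratio needed to reach 8 GB/s: {needed:.2f}:1")
```

So the quoted numbers imply a higher average ratio on the XSX side, yet it would take well over 3:1 there just to reach the low end of the PS5 range.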
 

GreyHand23

Member
I assumed both MS and Sony were quoting their compressed equivalent bandwidth numbers using the best-case scenario (meaning Sony was also using Oodle Texture, though without BC7Prep as that requires an extra software decoding step, while MS was only using BCPack) and judging the average compression ratios based on that. This still leaves 4.8 GB/s vs 8-9 GB/s of equivalent compressed bandwidth.

If Sony had not included the free-to-decode Oodle Texture RDO step in their numbers, I would expect them to revise their numbers now that it is all public... if they do not, it makes me think they were already counting on that then-unannounced solution.

Fabian, who works for RAD Game Tools, already stated that Sony did all of their testing without Oodle Texture.



Another interesting tidbit.

 

Journey

Banned

TL;DR

Dataset 1 (Texture data BC1,3,4,5, and 7. Mix of diffuse, normals, etc.)
Kraken ratio: 1.76:1 (PS5 perf will be: 5.5GB/sec * 1.76 = 9.68GB/sec)
Kraken + RDO ratio: 3.13:1 (PS5 perf will be: 5.5GB/sec * 3.13 = 17.2GB/sec)

Dataset 2 (Texture data BC6 and 7. Mix of diffuse, normals, etc. Very much what MSFT expects of BCpack data)
Kraken ratio: 1.85:1 (PS5 perf will be: 5.5GB/sec * 1.85 = 10.1GB/sec)
Kraken + RDO ratio: 3.99:1 (PS5 perf will be: 5.5GB/sec * 3.99 = 21.9GB/sec)


BCpack is more efficient at texture compression than Kraken alone. Kraken + RDO is a different story. Some developers argue that the most important factor, the thing that takes up the most space by far, is texture data, even more so going into next gen, which is why BCpack is a dark horse. Time will tell.
 