
Xbox Series X’s BCPack Texture Compression Technique 'might be' better than the PS5’s Kraken

jroc74

Phone reception is more important to me than human rights
Reading through the Kraken blog that's been posted in here more closely, it seems Sony and Cerny were being modest with the average compression figures, or the Kraken devs just made that much of an improvement between the PS5 reveal and now.

I understand now where the 8-9 GB/s comes from. The 17 GB/s is easier to see.

It makes sense to set it lower and just be surprised at the better results.
 
I was talking about PRT and addressing your comment about PRT being (potentially) less efficient than SFS and why I was agreeing with you (with some qualifications, which I pointed out)!!! Feedback from the GPU is the bloody point of Sampler Feedback (Streaming). I was saying the exact opposite of what you read :p.

My mistake, must have clearly misread lol.
 
It will definitely help Series X games. I think the multiplier will be quite a bit lower than 3x in real usage scenarios, but I'm sure the difference will be quite noticeable versus not using the tech. This generation of consoles really brings a lot of new and exciting stuff to the table, especially compared to last gen, which was a weak hardware upgrade over the one before.

Microsoft's 2.5x is apparently accounting for the fact that textures aren't the only things being copied in, so apparently when all was said and done they determined 2.5x was the safe figure to give. At least that's what one of their engineers said even though they've seen higher.
 

Corndog

Banned
People did ask him (and devs here). Stanard states it in plain English, I think: I have not seen anybody claiming that SFS grants you a 2-3x efficiency advantage on top of best-in-class PRT-based virtual texturing solutions. Stanard’s comments fully qualify the early 2-3x multiplier:
2t2PZ9s.jpg

9Ajga0Z.jpg

vkkodl1.jpg
I’m not seeing anything definitive there. Maybe I’m just missing it. Isn’t it more than PRT? Isn’t it also about which textures are not even used in the frame?
 

Lethal01

Member
You're missing the entire point. I don't care how it compares to PS5, and most shouldn't either. I care about what it will do for Xbox Series X games. If it makes you feel better: yes, if the PS5 is using this feature it will always have the faster I/O.


I simply wanted to know if it's something specific about Xbox's SFS that gives a 2.5x boost, or if it's just a technique that anything with a good enough SSD could do. Don't know why you've been so aggressive about it. I'm not aiming to go "HA, PS5 is faster." I just keep hearing 2.5-3x faster but never hearing faster than what exact scenario, and since it felt like the question kept getting avoided, I thought it would be better not to get my hopes up and to assume it's 2.5x faster than a system with no form of PRT at all.

Wish we could get some numbers on how much overhead gets cut by using it.

I'd actually be kinda happy if PS5 couldn't do it because I'd be able to get excited for what a PC with a PS5 level SSD and PRT could do.
 

MrLove

Banned
Traditional mip streaming in today's AAA games market almost certainly means some form of virtualized texture paging system aka PRT. What would be the point in Microsoft building their console around a technique that has absolutely zero real benefit over what was already being used largely across Xbox consoles?

Microsoft built their system around technology and features designed to greatly outperform what was being used in Xbox One generation titles. Enter Sampler Feedback Streaming. I highly doubt a Microsoft engineer working on it would be calling it a brand new feature or capability if it weren't one. I equally doubt the Metro Exodus technical director would say they plan to use it in future titles if they had already been using exactly that for years.

And let's look at what the coalition has to say. Clearly this isn't something that was already in use on Xbox.


If the below statement by The Coalition doesn't make it crystal clear that this is new functionality for Xbox game developers, then nothing will.



So The Coalition thinks SFS is a game changer and says future titles will use it. In other words, Gears 5 and every other title they've ever made makes absolutely no use of anything remotely like it. They believe it will let them increase texture detail beyond what they can fit into memory, and change loading to just before the data is needed instead of the more traditional level-loading approach.

The only XSX unique selling point is supposedly that the texture filters handle the special case better when a certain mip level is not in VRAM. In practice, that presumably means an implicit check of the residency map and a clamp of the mip level to the ones that are resident. That probably doesn't knock people off their feet.

The fine-grained loading/unloading of textures works very well with Tiled Resources (the granularity is a tile = page). Strictly speaking, the description there is simply wrong. With sampler feedback you get more and better information about which parts of a texture are actually used, and this can be used to optimize your texture streaming. And that works even without exclusive features. Nobody there has mentioned anything that would not work with plain, well-known sampler feedback.
And if you don't believe me, you can read Microsoft's description of the feature for DirectX 12. They also describe how you can use sampler feedback for texture streaming and call it "Sampler Feedback for Streaming", or SFS for short. Apparently not an exclusive feature of the XBSX, but also available on the desktop: https://github.com/microsoft/DirectX-Specs/blob/master/d3d/SamplerFeedback.md

It is best to picture it roughly like the virtual memory management of CPUs with page tables. A page (a memory block, often 4 kB for CPUs and 64 kB for GPUs, though in principle the size is variable) is either resident in memory or not. The mapping from virtual address to physical address, along with the information about whether the block is in memory, lives in the page table. With the granularity of a page, parts of a texture can be selectively present in memory or not.
With sampler feedback you can now obtain a "map" which, in principle, tells you for each individual texel (at each mip level) which parts of a texture were actually used (in practice at reduced granularity, which is still huge). This is of course great for learning when the already loaded (resident) parts of a texture are about to run short, so that you have to stream in more tiles (and it even tells you which ones), and what you can evict from memory. Without sampler feedback that involves a fair amount of guesswork. And of course the fact that not everything has to be resident in memory is golden as well.
Oh yes, trilinear filtering usually provides the smooth transition between mip levels.

That is the true game changer. Sony removed all the bottlenecks. MS, sadly, did nothing here.

PSA5-100-Mal-705x409.jpg





The PS5 is several times faster, and this was before Oodle.
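The quoted page-table analogy and the "clamp to a resident mip" behavior can be sketched roughly like this. This is illustrative Python, not real D3D12 or console code, and every name in it is hypothetical; it assumes tile-aligned, power-of-two mips.

```python
# Illustrative sketch: a per-mip tile residency map plus the "clamp the mip
# level to whatever is resident" fallback described in the quote above.
# All class and method names are made up for illustration.

class ResidencyMap:
    def __init__(self, mip_count):
        # One set of resident (tile_x, tile_y) pairs per mip level.
        self.resident = [set() for _ in range(mip_count)]
        self.mip_count = mip_count

    def make_resident(self, mip, tile):
        self.resident[mip].add(tile)

    def sample_mip(self, wanted_mip, tile):
        """Return the finest resident mip >= wanted_mip covering this tile,
        mimicking the implicit residency check + mip clamp."""
        for mip in range(wanted_mip, self.mip_count):
            # Each coarser mip halves the tile grid; shift the coordinates.
            shift = mip - wanted_mip
            t = (tile[0] >> shift, tile[1] >> shift)
            if t in self.resident[mip]:
                return mip
        return self.mip_count - 1  # the coarsest mip is assumed always loaded

rm = ResidencyMap(mip_count=4)
rm.make_resident(3, (0, 0))      # coarsest mip always resident
rm.make_resident(1, (2, 3))      # one streamed-in tile at mip 1
print(rm.sample_mip(0, (4, 6)))  # wanted mip 0 missing -> clamps to 1
print(rm.sample_mip(0, (0, 0)))  # nothing finer resident -> clamps to 3
```

The point of the sketch is only that the fallback is a lookup plus a clamp, which is why some posters argue it alone is not a dramatic hardware advantage.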
 

Corndog

Banned
The only XSX unique selling point is supposedly that the texture filters handle the special case better when a certain mip level is not in VRAM. In practice, that presumably means an implicit check of the residency map and a clamp of the mip level to the ones that are resident. That probably doesn't knock people off their feet.

The fine-grained loading/unloading of textures works very well with Tiled Resources (the granularity is a tile = page). Strictly speaking, the description there is simply wrong. With sampler feedback you get more and better information about which parts of a texture are actually used, and this can be used to optimize your texture streaming. And that works even without exclusive features. Nobody there has mentioned anything that would not work with plain, well-known sampler feedback.
And if you don't believe me, you can read Microsoft's description of the feature for DirectX 12. They also describe how you can use sampler feedback for texture streaming and call it "Sampler Feedback for Streaming", or SFS for short. Apparently not an exclusive feature of the XBSX, but also available on the desktop: https://github.com/microsoft/DirectX-Specs/blob/master/d3d/SamplerFeedback.md

It is best to picture it roughly like the virtual memory management of CPUs with page tables. A page (a memory block, often 4 kB for CPUs and 64 kB for GPUs, though in principle the size is variable) is either resident in memory or not. The mapping from virtual address to physical address, along with the information about whether the block is in memory, lives in the page table. With the granularity of a page, parts of a texture can be selectively present in memory or not.
With sampler feedback you can now obtain a "map" which, in principle, tells you for each individual texel (at each mip level) which parts of a texture were actually used (in practice at reduced granularity, which is still huge). This is of course great for learning when the already loaded (resident) parts of a texture are about to run short, so that you have to stream in more tiles (and it even tells you which ones), and what you can evict from memory. Without sampler feedback that involves a fair amount of guesswork. And of course the fact that not everything has to be resident in memory is golden as well.
Oh yes, trilinear filtering usually provides the smooth transition between mip levels.

That is the true game changer. Sony removed all the bottlenecks. MS, sadly, did nothing here.

PSA5-100-Mal-705x409.jpg





The PS5 is several times faster, and this was before Oodle.

Show me the math that shows this.
 

Panajev2001a

GAF's Pleasant Genius
I’m not seeing anything definitive there. Maybe I’m just missing it. Isn’t it more than PRT? Isn’t it also about which textures are not even used in the frame?
The point of SF is that the HW keeps score of what is visible and what is not / what the GPU tried to access that was not available, and helps the developer decide what texture data should be fetched next.

Not many people’s evening reading maybe, but there are SFS/SF docs here (MS has the best docs, as per usual): https://microsoft.github.io/DirectX-Specs/d3d/SamplerFeedback.html. I had completely missed an earlier instruction, added before SF support, that lets you do a simple check of whether a texture sample actually hit resident data (loaded in memory) or not, for example:
Independently of whether sampler feedback is available, Direct3D12-based applications have, through tiled resources, the ability to progressively load the mip chain. There’s also the status bit available through any sample which can be plugged in to CheckAccessFullyMapped so that the app can detect tile residency, and opportunities for streaming in more texture data.

However, the strategy of tiled resources + CheckAccessFullyMapped or other residency determination mechanisms, to detect and load non-resident tiles on demand has room for improvement. That’s where sampler feedback comes in. Sampler feedback streamlines the process of writing out “ideal mip levels”, since the detection of the “ideal mip level” and writing-it-out can be done all in one step, allowing the driver to optimize this process.
… but again nobody was arguing SFS did not improve anything at all on top of PRT and earlier capabilities.
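For illustration, the contrast those docs draw between the older CheckAccessFullyMapped-style residency test and sampler feedback might be sketched like this. These are hypothetical Python stand-ins, not the real D3D12 API.

```python
# Rough sketch of the two residency-detection strategies the quoted docs
# contrast. DummySampler and both functions are made-up stand-ins.

class DummySampler:
    """Stands in for a texture sampler over partially resident mips."""
    def __init__(self, resident_mips):
        self.resident = resident_mips  # set of mip levels with data loaded

    def sample_with_status(self, coord, mip):
        # Older route: the sample also yields a status bit that a
        # CheckAccessFullyMapped-style test can inspect.
        ok = mip in self.resident
        return (0.5 if ok else 0.0, ok)

    def sample(self, coord, mip):
        return 0.5 if mip in self.resident else 0.0

def old_style(sampler, requests, coord, mip):
    # Sample, test residency, queue a tile load on a miss: two steps the
    # app has to stitch together itself.
    value, ok = sampler.sample_with_status(coord, mip)
    if not ok:
        requests.append((coord, mip))
    return value

def feedback_style(sampler, feedback_map, coord, mip):
    # With sampler feedback the "ideal mip" is recorded as a side effect of
    # sampling; the streamer reads the feedback map later in one pass.
    value = sampler.sample(coord, mip)
    feedback_map[coord] = min(feedback_map.get(coord, mip), mip)
    return value

s = DummySampler(resident_mips={2, 3})
requests, feedback = [], {}
old_style(s, requests, (10, 20), 0)       # miss -> a load request is queued
feedback_style(s, feedback, (10, 20), 0)  # miss recorded in the feedback map
print(requests, feedback)
```

The end result is similar either way; the docs' claim is just that the one-step feedback path is easier to get right and easier for the driver to optimize.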
 

Corndog

Banned
The point of SF is that the HW keeps score of what is visible and what is not / what the GPU tried to access that was not available, and helps the developer decide what texture data should be fetched next.

Not many people’s evening reading maybe, but there are SFS/SF docs here (MS has the best docs, as per usual): https://microsoft.github.io/DirectX-Specs/d3d/SamplerFeedback.html. I had completely missed an earlier instruction, added before SF support, that lets you do a simple check of whether a texture sample actually hit resident data (loaded in memory) or not, for example:

… but again nobody was arguing SFS did not improve anything at all on top of PRT and earlier capabilities.
So the argument is how much improvement. And you are saying it’s not 2.5x. What would be your best guess?
 

Panajev2001a

GAF's Pleasant Genius
So the argument is how much improvement. And you are saying it’s not 2.5x. What would be your best guess?
I am not sure; there are many things that make me doubt the 2.5-3x average figure (it is immense for starters, and curiously it matches exactly how much loading only portions of a texture via PRT saved in memory bandwidth and storage costs). It depends on the dev and what they are trying to do. I do not have a good number to give you, as we are not comparing two equally documented HW implementations, but rather a HW solution (very accurate) vs something a developer would write (partially helped by other instructions in the GPU).
It could range from negligible (beside some extra computation in the shader) up to a 10x difference against a completely botched implementation (their own docs state such scenarios are specially crafted corner cases to show a BAD implementation of PRT… SFS helps you avoid many easy-to-make mistakes).
 

Dodkrake

Banned
Yes, and some people also completely forget that PRT+ is a thing, which is PRT and SF combined, which coincidentally is exactly what SFS is, except anyone is free to use it. The difference is that it’s not built into the hardware. So if you apply the SFS multiplier to the XSX, with some mental gymnastics you should do the same for the PS5. So let’s see: 17 GB/s times 3 equals 51 GB/s. Or no, let’s take the 22 GB/s: times 3 equals 66 GB/s. Oh my, look at that. The gap only widens in the PS5’s favor.


Multipliers sound better. With multipliers you can always pull any number out of your ass and it will look better. R&C shows Sony walking the walk.
 

quest

Not Banned from OT
The point of SF is that the HW keeps score of what is visible and what is not / what the GPU tried to access that was not available, and helps the developer decide what texture data should be fetched next.

Not many people’s evening reading maybe, but there are SFS/SF docs here (MS has the best docs, as per usual): https://microsoft.github.io/DirectX-Specs/d3d/SamplerFeedback.html. I had completely missed an earlier instruction, added before SF support, that lets you do a simple check of whether a texture sample actually hit resident data (loaded in memory) or not, for example:

… but again nobody was arguing SFS did not improve anything at all on top of PRT and earlier capabilities.
You were not, but there are plenty saying it is only PR, i.e. no improvements.
 

Boglin

Member
You were not, but there are plenty saying it is only PR, i.e. no improvements.
I think trolls are so abundant on either side that constructive research has been drowned out for a long time on this forum. Unfortunately to a non-techy, what a troll says and what a genuine person says might not appear much different so they'll follow their bias.

The past few days have been surprisingly good compared to the past with not too many disingenuous people shitting up the conversation.
 
The only XSX unique selling point is supposedly that the texture filters handle the special case better when a certain mip level is not in VRAM. In practice, that presumably means an implicit check of the residency map and a clamp of the mip level to the ones that are resident. That probably doesn't knock people off their feet.

The fine-grained loading/unloading of textures works very well with Tiled Resources (the granularity is a tile = page). Strictly speaking, the description there is simply wrong. With sampler feedback you get more and better information about which parts of a texture are actually used, and this can be used to optimize your texture streaming. And that works even without exclusive features. Nobody there has mentioned anything that would not work with plain, well-known sampler feedback.
And if you don't believe me, you can read Microsoft's description of the feature for DirectX 12. They also describe how you can use sampler feedback for texture streaming and call it "Sampler Feedback for Streaming", or SFS for short. Apparently not an exclusive feature of the XBSX, but also available on the desktop: https://github.com/microsoft/DirectX-Specs/blob/master/d3d/SamplerFeedback.md

It is best to picture it roughly like the virtual memory management of CPUs with page tables. A page (a memory block, often 4 kB for CPUs and 64 kB for GPUs, though in principle the size is variable) is either resident in memory or not. The mapping from virtual address to physical address, along with the information about whether the block is in memory, lives in the page table. With the granularity of a page, parts of a texture can be selectively present in memory or not.
With sampler feedback you can now obtain a "map" which, in principle, tells you for each individual texel (at each mip level) which parts of a texture were actually used (in practice at reduced granularity, which is still huge). This is of course great for learning when the already loaded (resident) parts of a texture are about to run short, so that you have to stream in more tiles (and it even tells you which ones), and what you can evict from memory. Without sampler feedback that involves a fair amount of guesswork. And of course the fact that not everything has to be resident in memory is golden as well.
Oh yes, trilinear filtering usually provides the smooth transition between mip levels.

That is the true game changer. Sony removed all the bottlenecks. MS, sadly, did nothing here.

PSA5-100-Mal-705x409.jpg





The PS5 is several times faster, and this was before Oodle.


You're missing a larger point. There's one thing better than a faster SSD, and it's more available RAM. This isn't meant as a comparison to the PS5, but you fail to see where Microsoft chose to focus, and how their asset streaming strategy, compared with how Sony have publicly said they target asset streaming, could turn out to be pretty fucking smart.

Microsoft placed a premium on getting much better than normal system memory efficiency, to the point where the end result is that their SSD can get away with doing 2.5x less work yet still get the same exact results onto screen. It can be slower and still not matter if it's doing less work anyway. That's not me saying what Sony is doing isn't smart, but in this particular case it isn't as cut and dry as Sony has a faster SSD, so nothing Microsoft does matters.

If it were a foot race Microsoft isn't concerned about outrunning anybody, but thinking about how they can make it so they can still finish faster or close to the same time as the others while running a shorter distance.
 
I think trolls are so abundant on either side that constructive research has been drowned out for a long time on this forum. Unfortunately to a non-techy, what a troll says and what a genuine person says might not appear much different so they'll follow their bias.

The past few days have been surprisingly good compared to the past with not too many disingenuous people shitting up the conversation.

Yes, it has been refreshing. I'm starting to see gif wars though, so that means it's heating up. :messenger_grinning_sweat:
 
My take is: who would know better how PRT operates across Xbox One than Microsoft? Maybe most PRT solutions aren't top drawer, and apparently even some of the best still end up over-streaming to be safe, which might make something like SFS that much more important. Microsoft built special monitoring hardware into Xbox One X specifically for the purpose of monitoring how games use memory.

They kinda already have all the information they need, and after gathering it, they will have seen good PRT, bad PRT, no PRT, and probably everything in between across the entirety of Xbox One X games.


"However, these larger mipmaps require a significant amount of memory compared to the lower resolution mips that can be used if the object is further away in the scene. Today, developers must load an entire mip level in memory even in cases where they may only sample a very small portion of the overall texture. Through specialized hardware added to the Xbox One X, we were able to analyze texture memory usage by the GPU and we discovered that the GPU often accesses less than 1/3 of the texture data required to be loaded in memory. A single scene often includes thousands of different textures resulting in a significant loss in effective memory and I/O bandwidth utilization due to inefficient usage. With this insight, we were able to create and add new capabilities to the Xbox Series X GPU which enables it to only load the sub portions of a mip level into memory, on demand, just in time for when the GPU requires the data. This innovation results in approximately 2.5x the effective I/O throughput and memory usage above and beyond the raw hardware capabilities on average. SFS provides an effective multiplier on available system memory and I/O bandwidth, resulting in significantly more memory and I/O throughput available to make your game richer and more immersive."
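For what it's worth, the arithmetic behind that quoted claim is simple. The only inputs below are figures Microsoft has stated publicly (2.4 GB/s raw, 4.8 GB/s typical after decompression, ~2.5x from SFS); everything derived from them is just multiplication.

```python
# Back-of-envelope arithmetic behind the quoted 2.5x claim. The three input
# figures are Microsoft's publicly stated numbers; the rest is derived.

raw_ssd = 2.4          # GB/s, Series X raw SSD read speed
compressed = 4.8       # GB/s, typical rate after hardware decompression
sfs_multiplier = 2.5   # claimed average effective multiplier from SFS

effective_io = compressed * sfs_multiplier
print(f"effective I/O: {effective_io:.1f} GB/s")  # 12.0

# The same multiplier applies to memory: if the GPU touches only ~1/2.5 of
# the texture data a traditional loader brings in, a "12 GB" workload needs
# roughly this much actually resident:
print(f"resident data needed: {12.0 / sfs_multiplier:.1f} GB")  # 4.8
```

Whether 2.5x is the right multiplier against a good PRT baseline is exactly what the thread is arguing about; the math itself is not in dispute.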

Figured these tweets were useful, so just collected them in one place. Love when these guys engage about this stuff.


hxYMT0v.jpg


nXaQC8Q.jpg


vk36eEr.jpg

Ys8BLZ6.jpg
 

Lethal01

Member
You're missing a larger point. There's one thing better than a faster SSD, and it's more available RAM. This isn't meant as a comparison to the PS5, but you fail to see where Microsoft chose to focus, and how their asset streaming strategy, compared with how Sony have publicly said they target asset streaming, could turn out to be pretty fucking smart.

Microsoft placed a premium on getting much better than normal system memory efficiency, to the point where the end result is that their SSD can get away with doing 2.5x less work yet still get the same exact results onto screen. It can be slower and still not matter if it's doing less work anyway. That's not me saying what Sony is doing isn't smart, but in this particular case it isn't as cut and dry as Sony has a faster SSD, so nothing Microsoft does matters.

If it were a foot race Microsoft isn't concerned about outrunning anybody, but thinking about how they can make it so they can still finish faster or close to the same time as the others while running a shorter distance.

This is what I keep hearing and the thing I want some hard statements on: how much better than "normal" is the Series X, and what exactly do they consider "normal"? Is "normal" actually referring to a software implementation of PRT, or to none at all?

I'd love nothing more than to hear them confirm that they are comparing against previous PRT implementations, but from the demo you keep posting, it would seem they are comparing against a system that loads in full textures. That could also seem to imply they didn't really choose to focus in a different direction from Sony, since we know the PS5 wouldn't be loading the full texture when it only needs a portion of it.

Unless maybe PRT creates absolutely terrible-looking artifacts if you don't have the filters that they have implemented? Or maybe the overhead from doing it with software is gigantic? Or is the difference that the old PRT would load in the whole texture then discard what you don't need which makes the memory footprint the same in the end but would have a far longer load time? Or is PRT really so imprecise with special hardware that it loads in 3x more than what's needed?
 
This is what I keep hearing and the thing I want some hard statements on: how much better than "normal" is the Series X, and what exactly do they consider "normal"? Is "normal" actually referring to a software implementation of PRT, or to none at all?

I'd love nothing more than to hear them confirm that they are comparing against previous PRT implementations, but from the demo you keep posting, it would seem they are comparing against a system that loads in full textures. That could also seem to imply they didn't really choose to focus in a different direction from Sony, since we know the PS5 wouldn't be loading the full texture when it only needs a portion of it.

Unless maybe PRT creates absolutely terrible-looking artifacts if you don't have the filters that they have implemented? Or maybe the overhead from doing it with software is gigantic? Or is the difference that the old PRT would load in the whole texture then discard what you don't need which makes the memory footprint the same in the end but would have a far longer load time? Or is PRT really so imprecise with special hardware that it loads in 3x more than what's needed?

All I know is Microsoft analyzed what games were doing on Xbox One X, they put in special monitoring hardware specifically to analyze how games were using memory for textures, and they said they noticed that the GPU quite often utilized less than 1/3rd of all the textures in memory over extended periods of time. They keep stating that compared to what Xbox devs were using up until now, SFS will get developers an average 2.5x multiplier effect on system memory and I/O bandwidth.

In the real-time demo they've shown, they had a bar simulating how texture streaming worked on Xbox One X, and then another bar representing a 9th gen console running XVA with an SSD but without SFS. Pretty much it was the Series X bar without SFS. They seem to be claiming confidently that they will get a 2.5x multiplier effect on both system RAM capacity and effective I/O performance through the SSD.

They also went so far as to say that despite their demo for SFS being relatively simple, they still say confidently that an AAA title with many more objects and complex surfaces or textures will still prove their 2.5x multiplier true. So that's the number they are confidently sticking to and what all their engineers and Microsoft Advanced Technology Group engineers keep telling people. They've been saying this number since March of last year, then reiterated it with DF in their full spec reveal, and have since followed it up at Hotchips and other major tech events they've attended.
 

Lethal01

Member
All I know is Microsoft analyzed what games were doing on Xbox One X, they put in special monitoring hardware specifically to analyze how games were using memory for textures, and they said they noticed that the GPU quite often utilized less than 1/3rd of all the textures in memory over extended periods of time. They keep stating that compared to what Xbox devs were using up until now, SFS will get developers an average 2.5x multiplier effect on system memory and I/O bandwidth.

In the real-time demo they've shown, they had a bar simulating how texture streaming worked on Xbox One X, and then another bar representing a 9th gen console running XVA with an SSD but without SFS. Pretty much it was the Series X bar without SFS. They seem to be claiming confidently that they will get a 2.5x multiplier effect on both system RAM capacity and effective I/O performance through the SSD.

They also went so far as to say that despite their demo for SFS being relatively simple, they still say confidently that an AAA title with many more objects and complex surfaces or textures will still prove their 2.5x multiplier true. So that's the number they are confidently sticking to and what all their engineers and Microsoft Advanced Technology Group engineers keep telling people. They've been saying this number since March of last year, then reiterated it with DF in their full spec reveal, and have since followed it up at Hotchips and other major tech events they've attended.

That's fair, and I've heard that too; I guess I'm just asking for information we don't really have yet, so for now I won't get my expectations up. But a lot of the talk going on made me feel like I was missing info everyone else had, which is why people were saying Microsoft's recent improvements are revolutionary, rather than PRT getting wide adoption being revolutionary.
 

IntentionalPun

Ask me about my wife's perfect butthole
Yes, and some people also completely forget that PRT+ is a thing, which is PRT and SF combined, which coincidentally is exactly what SFS is, except anyone is free to use it. The difference is that it’s not built into the hardware. So if you apply the SFS multiplier to the XSX, with some mental gymnastics you should do the same for the PS5. So let’s see: 17 GB/s times 3 equals 51 GB/s. Or no, let’s take the 22 GB/s: times 3 equals 66 GB/s. Oh my, look at that. The gap only widens in the PS5’s favor.

Got any links to documentation of any other PRT+ implementations?

The only thing that comes back is Microsoft's own reference doc for SFS.

Where are these games using PRT+ or hardware w/ SFS?

I'm not saying it doesn't exist.. but I don't think it's some prevalent thing.
 

Heisenberg007

Gold Journalism
Hence the use of the word "effective".

If you request 12 GB of texture data for your game, but SFS means that to get the same visual result on screen you only need 1/2.5 of that 12 GB, you are left with only 4.8 GB of data to load thanks to the efficiency of SFS. What would you call that kind of speed?

The SSD speed isn't changing from 2.4 GB/s, and neither is the compressed speed changing from 4.8 GB/s, but you know what has changed? You've gotten 2.5x more effective work done in far less time, it's on screen, and it only took the Series X one second to do it with SFS.

Microsoft actually HAS said this, and has been saying it from the start, but nobody seemed to take them seriously when they said it. They've said from the start that Sampler Feedback Streaming gives a multiplier effect on SSD bandwidth and I/O.

This guy below works for Microsoft's Advanced Technology Group. He's Graphics R&D at Microsoft and is personally working on Xbox's Texture Compression stuff and other graphics technologies, such as Sampler Feedback Streaming. So he personally has said this.



Here are more examples from Microsoft. Jason Ronald said this last year July.




The Coalition's technical director said this.



Here is Jason Ronald saying it's a massive boost to I/O in the Inside Xbox Series S video revealing the Series S. What do we think he means when he says "massive leap in IO and memory efficiency"?






Here again, they said it as early as March of last year. They've been saying it all along, but people were so caught up in how impressive Sony's solution is and how much faster their SSD is compared to the Series X's that they laughed off Microsoft's talk of Sampler Feedback Streaming as just marketing, PR, or "power of the cloud" talk. But it's the engineers and technical people who have been saying this stuff and shouting it from the rooftops, not just the PR folks.



GTxqvTM.jpg

I don't disagree with this. But I think you'll also agree with me that it's misleading to use figures like 12 GB/s, especially in comparisons with the PS5, because on Xbox it's not literally 12 GB/s of data streaming.

SFS allows carefully picking the right data to stream -- which is excellent. But after that, it still streams at 4.8 GB/s at best.

The reason that number is misleading is that the PS5 can actually stream at more than 12 GB/s if the dev has used Kraken + Oodle; the SSD's theoretical peak is 22 GB/s.

I think the right way to phrase this is that SFS helps select the right data with 3x more efficiency, and then Xbox's SSD and I/O, with the help of all the XVA components, stream data at up to 4.8 GB/s.
 

Panajev2001a

GAF's Pleasant Genius
I don't disagree with this. But I think you'll also agree with me that it's misleading to use figures like 12 GB/s, especially in comparisons with the PS5, because on Xbox it's not literally 12 GB/s of data streaming.

SFS allows carefully picking the right data to stream -- which is excellent. But after that, it still streams at 4.8 GB/s at best.

The reason that number is misleading is that the PS5 can actually stream at more than 12 GB/s if the dev has used Kraken + Oodle; the SSD's theoretical peak is 22 GB/s.

I think the right way to phrase this is that SFS helps select the right data with 3x more efficiency, and then Xbox's SSD and I/O, with the help of all the XVA components, stream data at up to 4.8 GB/s.
On both PS5 with PRT (whatever enhancement they have for it) and XSX|S with SFS you talk about equivalent bandwidth. I am not finding it misleading, but sure some caveats should be mentioned IMHO.

PS5’s SSD reads at 5.5 GB/s, XSX|S’s SSD at 2.4 GB/s. Each may decompress the data in the custom I/O block paired with the SSD and inflate it (the average output of the I/O decoder block is 8-9 GB/s without Oodle Texture preprocessing on PS5 [with Oodle Texture preprocessing it can boost compression rates, and thus the output rate, to 12-16 GB/s and more] and the maximum output is 22 GB/s, while on XSX|S the average output after decompression is 4.8 GB/s with a maximum of over 6 GB/s).

PRT and SFS tell you how efficiently you can put those I/O decoder units’ GB/s of output to use, or rather the equivalent bandwidth you would need in order to transfer the data you need in the same interval of time as PRT/SFS-based solutions, which avoid transferring unnecessary data. SFS|PRT also helps to minimise the RAM used to load textures the GPU does not need to render.
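To make the "equivalent bandwidth" idea above concrete, here is a small arithmetic sketch in Python. The raw SSD rates, compression ratios and the ~2.5x SFS multiplier are the figures quoted in this thread, not measured or official spec-sheet values:

```python
# Hedged sketch: all figures below are the numbers quoted in this thread.

def effective_bandwidth(raw_gbps, compression_ratio, streaming_multiplier=1.0):
    """Equivalent bandwidth you'd need without compression or partial-texture
    streaming to move the same useful data in the same time."""
    return raw_gbps * compression_ratio * streaming_multiplier

# XSX: 2.4 GB/s raw, ~2x average compression -> ~4.8 GB/s decompressed,
# then the ~2.5x SFS multiplier claimed by Microsoft -> ~12 GB/s equivalent.
xsx = effective_bandwidth(2.4, 2.0, 2.5)

# PS5: 5.5 GB/s raw; Kraken + Oodle Texture brings decompressed output to
# roughly 12-16 GB/s before any PRT-style multiplier is applied.
ps5_low  = effective_bandwidth(5.5, 12 / 5.5)
ps5_high = effective_bandwidth(5.5, 16 / 5.5)

print(f"XSX equivalent: {xsx:.1f} GB/s")
print(f"PS5 decompressed: {ps5_low:.1f}-{ps5_high:.1f} GB/s")
```

This is exactly why the thread argues the comparison only becomes apples-to-apples when the same kind of multiplier (compression, and optionally partial-texture streaming) is applied to both machines.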
 
Last edited:

Heisenberg007

Gold Journalism
On both PS5 with PRT (whatever enhancement they have for it) and XSX|S with SFS you talk about equivalent bandwidth. I am not finding it misleading, but sure some caveats should be mentioned IMHO.

PS5’s SSD reads at 5.5 GB/s, XSX|S’s SSD at 2.4 GB/s. Each may decompress the data in the custom I/O block paired with the SSD and inflate it (the average output of the I/O decoder block is 8-9 GB/s without Oodle Texture preprocessing on PS5 [with Oodle Texture preprocessing it can boost compression rates, and thus the output rate, to 12-16 GB/s and more] and the maximum output is 22 GB/s, while on XSX|S the average output after decompression is 4.8 GB/s with a maximum of over 6 GB/s).

PRT and SFS tell you how efficiently you can put those I/O decoder units’ GB/s of output to use, or rather the equivalent bandwidth you would need in order to transfer the data you need in the same interval of time as PRT/SFS-based solutions, which avoid transferring unnecessary data. SFS|PRT also helps to minimise the RAM used to load textures the GPU does not need to render.
I understand, but it becomes a lot more problematic when we ignore said caveats during discussions.

If someone mentions that "the PS5 SSD and I/O can stream up to 12-16 GB/s", someone would come along and say, "Well, because of SFS, Xbox can also effectively stream data at 12 GB/s." You would agree that this is completely misleading and wrong.

In addition, the liberal use of multipliers makes it even more complicated. The 2.5x multiplier is applied at will on top of anything and everything. I saw calculations of up to 33 GB/s of data streaming on Xbox. :messenger_expressionless::messenger_tears_of_joy:
 

Panajev2001a

GAF's Pleasant Genius
You're missing a larger point. There's one thing better than a faster SSD, and it's more available RAM. This isn't meant to compare to the PS5, but you fail to see where Microsoft chose to focus and how what they targeted with their asset streaming strategy compared to how Sony have publicly told us they've targeted asset streaming could turn out to be pretty fucking smart.

Microsoft placed a premium on getting much better than normal system memory efficiency, to the point where the end result is that their SSD can get away with doing 2.5x less work yet still get the same exact results onto screen. It can be slower and still not matter if it's doing less work anyway. That's not me saying what Sony is doing isn't smart, but in this particular case it isn't as cut and dry as Sony has a faster SSD, so nothing Microsoft does matters.

If it were a foot race Microsoft isn't concerned about outrunning anybody, but thinking about how they can make it so they can still finish faster or close to the same time as the others while running a shorter distance.
I am not saying what MS is doing is not smart and that the have chosen lesser solutions, just that they do not close the gap with PS5 as they were not designed to.
I personally do not think MS agreed with Sony that they had to set such a high target, nor that third-party devs would have an easy time taking advantage of it in the first few years before MS plans a new HW refresh (SFS lowers the burden of efficient virtual texturing, and it is very important and difficult to make things both easier and more efficient). So they invested the money Sony spent on the custom SSD + custom SSD controller + custom I/O block elsewhere, and avoided having to ramp the GPU frequency up that high (so they had an easier time achieving their cooling plans and a super silent console). But there is no data suggesting it gains as much efficiency as you are implying… although I do agree with your angle if you restrict it to XSX vs XOX, exactly as you said it; in that case it is not overselling it IMHO.

You keep saying that you are comparing it to the XOX (which XVA will allow it to run circles around despite a tiny RAM increase) and then slowly drift back to how it actually matches or beats the PS5, which is not something that offends me; it is just something I think is inaccurate and grossly oversells XVA for some reason (the I/O gap between the two consoles is still there). But sure, for all the devs that have a hard time implementing virtual texturing correctly or efficiently, SFS could definitely help… again, we should not assume that harder means impossible to get near the same improvements.
 

Panajev2001a

GAF's Pleasant Genius
PRT getting wide adoption being revolutionary.
Which is IMHO the case. SFS meaning more devs can make use of PRT textures without having to do as much work as before also contributes to the wide adoption of the tech, and that is a worthwhile goal (in many, many cases in tech you choose between easy vs fast; here it seems like a tragedy to say that getting easy + fast is awesome enough :) … it is not a slight to say they are not getting easy + tons faster than before).
 
  • Like
Reactions: Rea

IntentionalPun

Ask me about my wife's perfect butthole
Microsoft was not comparing to games w/ PRT on Xbox One.

You can tell that by looking at the slide right before the demo of SFS.. where they outright state PRT was rare last generation.

Which I believe it was. The idea that most games were doing PRT last gen seems to be a bit of a straw man. The tech was developed and in the chips for last gen, and in PC GPUs, but the I/O limitations held it back. iD was one of the only companies to do anything decent with it, as they had experience with that tech even in the PS3 gen, but the same I/O limitations existed between PS3/PS4, and Rage still had fairly apparent pop-in issues.

Look at the slide that appears RIGHT before the SFS demo:

fg1Nf5M.png
 
Last edited:

Panajev2001a

GAF's Pleasant Genius
If someone mentions that "the PS5 SSD and I/O can stream up to 12-16 GB/s", someone would come along and say, "Well, because of SFS, Xbox can also effectively stream data at 12 GB/s."
… and if people really want to play a game like that, then you take that 12-16 GB/s number and multiply it by ~2.5x too, as you are using partially resident textures as well and thus not transferring whole textures, only the regions that the game needs to have in memory. Now you are comparing apples to apples.

We have also seen (well, this is the argument Senjutsu brought forward) that SFS is not automatic and needs developers to support it, and in his opinion few or none have it in production right now. So you either compare both numbers taking only compression into account (nobody is talking about latency, which can be a much bigger difference maker too), or you add a very similar multiplication factor for partial texture streaming to both (use a slightly lower multiplier for PS5 if you really want to, not that I think you must, but you still have a very large gap).
 
Last edited:

Heisenberg007

Gold Journalism
… and if people really want to play a game like that, then you take that 12-16 GB/s number and multiply it by ~2.5x too, as you are using partially resident textures as well and thus not transferring whole textures, only the regions that the game needs to have in memory. Now you are comparing apples to apples.

We have also seen (well, this is the argument Senjutsu brought forward) that SFS is not automatic and needs developers to support it, and in his opinion few or none have it in production right now. So you either compare both numbers taking only compression into account (nobody is talking about latency, which can be a much bigger difference maker too), or you add a very similar multiplication factor for partial texture streaming to both (use a slightly lower multiplier for PS5 if you really want to, not that I think you must, but you still have a very large gap).
I just think even that would be disingenuous, as adding a 2.5x multiplier to that 12-16 GB/s on PS5 would take it to 30-40 GB/s, but that would be the speed of data streaming in its purest sense.
 
That's fair, and I've heard that too; I guess I'm just asking for information we don't really have yet, so for now I won't be expecting much. But a lot of the talk going on made me feel like I was missing info that everyone else had, which is why they were saying that Microsoft's recent improvements are revolutionary, rather than PRT getting wide adoption being revolutionary.

The foundation is in PRT, and that's totally fine because it has its use. Microsoft clearly had a plan, or something they wanted to study some more. They laid the groundwork with the Xbox One X, when they installed monitoring hardware on its GPU to analyze texture usage in games. They gathered what had to be a metric shit ton of useful data across a wide swath of titles, and clearly felt they needed to work on a solution. So they went ahead and planned and built in the hardware and functionality designed to solve the problem they wanted to solve, and in doing so it would ease adoption and help studios make better games more easily.

Part of an existing or known technique, an upcoming new technique that would have been standard across newer GPUs anyway, who knows. But they obviously took what they learned, built their console around it, added unique customizations not found elsewhere, built an API to manage it all and expose it to game developers, and built a pretty impressive demo to showcase all the possible benefits.
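For anyone wondering what "feedback from the GPU" buys you in practice, here is a toy sketch of the idea behind sampler-feedback-driven streaming: record which (mip, tile) pairs the GPU actually sampled during a frame, then load only those from disk. The class, texture and tile names are made up for illustration; this is not Microsoft's actual API.

```python
from collections import defaultdict

class FeedbackStreamer:
    """Toy model: keeps track of which texture tiles are resident and,
    given per-frame feedback, returns only the missing tiles to load."""

    def __init__(self):
        self.resident = set()   # (texture, mip, tile) currently in memory

    def record_sample(self, feedback_map, texture, mip, tile):
        # In real SFS the GPU writes this feedback itself during rendering.
        feedback_map[texture].add((mip, tile))

    def stream(self, feedback_map):
        # After the frame, request only tiles the GPU actually sampled
        # and which are not already resident.
        wanted = {(t, m, tile)
                  for t, samples in feedback_map.items()
                  for (m, tile) in samples}
        to_load = wanted - self.resident
        self.resident |= wanted
        return sorted(to_load)

streamer = FeedbackStreamer()
fb = defaultdict(set)
streamer.record_sample(fb, "rock_albedo", 2, (5, 7))
streamer.record_sample(fb, "rock_albedo", 2, (5, 8))
print(streamer.stream(fb))   # only two small tiles requested, not the whole texture
```

The bandwidth saving in the thread's 2.5x figure comes from exactly this: `to_load` is a small subset of the full texture, so the SSD moves far fewer bytes for the same rendered result.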
 

BeardGawd

Banned
Got any links to documentation of any other PRT+ implementations?

The only thing that comes back is Microsoft's own reference doc for SFS..

Where are these games using PRT+ or hardware w/ SFS?

I'm not saying it doesn't exist.. but I don't think it's some prevalent thing.

This of course will go ignored.

There aren't many games using PRT. And the ones that did weren't using Sampler Feedback, and thus weren't as efficient.

Last gen I/O was orders of magnitude slower, so you would think there'd be even more incentive to use more efficient streaming... but there is a reason why hardly any game used PRT, and that's because it wasn't easy or practical to use. With SFS that will change for Xbox and PC, as MS has taken the time and money to develop a competent solution. If PlayStation devs want to use PRT and Sampler Feedback, they will have to build their own solution just like last gen, unless Sony developed their own PRT+ equivalent. Personally I don't think it's outside the realm of reason to assume Sony thought their I/O stack was already fast enough as is.
 

muteZX

Banned
This of course will go ignored.

There aren't many games using PRT. And the ones that did weren't using Sampler Feedback, and thus weren't as efficient.

Last gen I/O was orders of magnitude slower, so you would think there'd be even more incentive to use more efficient streaming... but there is a reason why hardly any game used PRT, and that's because it wasn't easy or practical to use. With SFS that will change for Xbox and PC, as MS has taken the time and money to develop a competent solution. If PlayStation devs want to use PRT and Sampler Feedback, they will have to build their own solution just like last gen, unless Sony developed their own PRT+ equivalent. Personally I don't think it's outside the realm of reason to assume Sony thought their I/O stack was already fast enough as is.

Granite SDK is the fastest, most efficient and most complete texture streaming middleware available. It is designed to integrate easily into any 3D engine. Granite breaks down texture mip maps into small tiles, automatically detects what tiles are visible for every frame and loads only those tiles into memory.

PS4 Games Could See Increased Visual Fidelity, Lesser Load Times, Granite’s Texture Streaming Detailed.


John Carmack has said that he will implement Partially Resident Textures in Doom 4. Both the PlayStation 4 and Xbox One support this technology via hardware.
 

Panajev2001a

GAF's Pleasant Genius
I just think even that would be disingenuous, as adding a 2.5x multiplier to that 12-16 GB/s on PS5 would take it to 30-40 GB/s, but that would be the speed of data streaming in its purest sense.
I get that: it is not the real transfer speed from the SSD, nor the decompression rate of the I/O decoder units (zlib/BCPack vs Kraken), but it is useful for understanding the equivalent bandwidth you would need, without such features, to transfer the same amount of texture data.

PRT and SFS lower the RAM needed for textures by not requiring you to store unnecessary data which of course means you do not need to transfer it either. This is an effective bandwidth boost.

It is only as disingenuous (well, unless taken and explained with a pinch of salt) as discussing compression ratios, in a way.
 

IntentionalPun

Ask me about my wife's perfect butthole
Granite SDK is the fastest, most efficient and most complete texture streaming middleware available. It is designed to integrate easily into any 3D engine. Granite breaks down texture mip maps into small tiles, automatically detects what tiles are visible for every frame and loads only those tiles into memory.

PS4 Games Could See Increased Visual Fidelity, Lesser Load Times, Granite’s Texture Streaming Detailed.


John Carmack has said that he will implement Partially Resident Textures in Doom 4. Both the PlayStation 4 and Xbox One support this technology via hardware.

iD's tech was one of the only ones trying it.. you just quoted a speculative article about an SDK that didn't get wide use otherwise... I mean, look at their massive client list:


And some of these uses were probably about using PC main memory as a pool for PRT from GPU memory, not streaming from totally random I/O speeds.

PRT was all the rage in 2012 when AMD added support for it to their chips, which ended up in last-gen consoles..

That's not evidence of wide support for it.
 
Last edited:

Panajev2001a

GAF's Pleasant Genius
This of course will go ignored.

There aren't many games using PRT. And the ones that did weren't using Sampler Feedback, and thus weren't as efficient.

Last gen I/O was orders of magnitude slower, so you would think there'd be even more incentive to use more efficient streaming... but there is a reason why hardly any game used PRT, and that's because it wasn't easy or practical to use.
That mainly comes down to RAM requirements being served a lot better by the older consoles (the XSX has only 2 GB more RAM than the XOX and 8 GB more than the Xbox One) and the fact that you had such slow HDDs, with very high-latency I/O operations too, so there was less incentive to wrestle with many requests for small files all over the place and spend shader cycles on top of it.

High latency (high seek times too) would encourage bigger batches / larger transfers, not very fine-grained texture streaming from disk: you would necessarily load a LOT more data than you need for the current frame. That is the whole point of the “seconds of gameplay stored in RAM” comment Cerny made.
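The seek-time point is easy to put numbers on. A rough sketch, assuming an illustrative 100 MB/s sequential HDD with ~10 ms per seek (round numbers for illustration, not measurements from any real drive):

```python
# Back-of-envelope: why HDD-era streaming favoured large batches over
# fine-grained per-tile reads.

def effective_throughput(read_size_mb, seq_mbps=100.0, seek_ms=10.0):
    """MB/s actually achieved when each read pays one seek up front."""
    transfer_s = read_size_mb / seq_mbps
    return read_size_mb / (transfer_s + seek_ms / 1000.0)

tiny  = effective_throughput(0.064)   # 64 KB tile reads: seeks dominate
batch = effective_throughput(16.0)    # 16 MB batches: near sequential speed
print(f"64 KB reads: {tiny:.1f} MB/s, 16 MB reads: {batch:.1f} MB/s")
```

With these assumptions, per-tile reads achieve only a few MB/s while big batches stay near the drive's sequential rate, which is why last-gen engines buffered "seconds of gameplay" instead of streaming individual texture tiles. SSDs collapse the seek term, making fine-grained PRT/SFS-style requests practical.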

Aside from a few developers targeting the highest of the high-end tech, id was enamoured with a single fully unified virtual texturing solution they took to the extreme, but few third parties and only some first parties had art budgets that required extracting every possible ounce of performance and RAM for unique texture data on screen.

With SFS that will change for Xbox and PC, as MS has taken the time and money to develop a competent solution. If PlayStation devs want to use PRT and Sampler Feedback, they will have to build their own solution just like last gen, unless Sony developed their own PRT+ equivalent. Personally I don't think it's outside the realm of reason to assume Sony thought their I/O stack was already fast enough as is.
Possibly they thought that PRT (much more practical and necessary now) plus whatever the HW supports was enough to allow most devs to implement equally efficient virtual texturing, and that without PRT it would still be fast enough… but this assumes that:
1.) using PRT to deliver very similar bandwidth improvements was impossible or a daunting task for devs… or that the consoles HAVE to be equal somehow

and

2.) Sony was not as concerned about reducing memory consumption with unnecessary data instead of maximising RAM use for what is on screen (contradicted by statements made by Cerny and other devs)… Both XSX and PS5 suffer from high RAM prices and receive the smallest generational RAM increase yet: I think that reducing waste/overhead was at the forefront of their mind and even if there were some practical problems in PS4’s 2013 PRT implementation it is reasonable to think they addressed it too as they went through the I/O end to end pipeline with a very fine tooth comb.

MS made it possibly even easier for devs without sacrificing performance and should be praised for that achievement, but that does not mean that we have to add additional multipliers on top to make them close the gap without evidence.
I do not think we can safely assume GPUs had stopped improving the access they give to tiled resources/PRT capabilities until MS unveiled SF in their DX12U specs more than 10 years later (which is a multi-vendor API) :).
 

muteZX

Banned
iD's tech was one of the only ones trying it.. you just quoted a speculative article about an SDK that didn't get wide use otherwise... I mean, look at their massive client list:


And some of these uses were probably about using PC main memory as a pool for PRT from GPU memory, not streaming from totally random I/O speeds.

PRT was all the rage in 2012 when AMD added support for it to their chips, which ended up in last-gen consoles..

That's not evidence of wide support for it.

The Unreal and Unity plugins are functional and ready ..

"For an HD screen, Granite requires roughly 650MB of VRAM at any given time, no matter how much texture content you use. You can have 20GB of textures for one single character or thousands of 4K PBR assets. Years of optimization work ensures that Granite has a limited and predictable impact on your framerate."


8wl1rua.png



UE5 demo - 768 MB streaming pool ..



Let's say that Sony and Epic are not idiots and the situation around effective "PRT-like" streaming is fully rectified ..
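The fixed-footprint behaviour Granite's blurb describes ("roughly 650MB of VRAM at any given time, no matter how much texture content you use") can be sketched as a tile cache with a hard budget and LRU eviction. The budget, tile IDs and tile contents here are stand-ins for illustration, not Granite's internals:

```python
from collections import OrderedDict

class TilePool:
    """Toy fixed-budget streaming pool: memory use is bounded by the budget,
    regardless of how many unique tiles the scene touches."""

    def __init__(self, budget_tiles):
        self.budget = budget_tiles
        self.cache = OrderedDict()   # tile_id -> tile data, ordered by recency

    def request(self, tile_id):
        if tile_id in self.cache:
            self.cache.move_to_end(tile_id)       # mark as recently used
        else:
            if len(self.cache) >= self.budget:
                self.cache.popitem(last=False)    # evict least recently used
            self.cache[tile_id] = f"pixels:{tile_id}"   # stand-in for a disk load
        return self.cache[tile_id]

pool = TilePool(budget_tiles=4)
for t in range(100):          # touch 100 unique tiles...
    pool.request(t)
print(len(pool.cache))        # 4 -- resident set never exceeds the budget
```

The design point is that VRAM cost scales with what is on screen, not with the total texture library, which is the same argument made above for why PRT/SFS-style streaming matters on consoles with modest RAM increases.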
 

Darius87

Member
The Unreal and Unity plugins are functional and ready ..

"For an HD screen, Granite requires roughly 650MB of VRAM at any given time, no matter how much texture content you use. You can have 20GB of textures for one single character or thousands of 4K PBR assets. Years of optimization work ensures that Granite has a limited and predictable impact on your framerate."


8wl1rua.png



UE5 demo - 768 MB streaming pool ..


Let's say that Sony and Epic are not idiots and the situation around effective "PRT-like" streaming is fully rectified ..
that's just for geometry; it doesn't tell the whole story.
 

muteZX

Banned
that's just for geometry; it doesn't tell the whole story.

I will show you something .. watch ..

0010101011001010101101101000110

0111011011100010000101101110111

One of those lines is for textures, another for geometry .. see .. for the SSD IOP/streaming engine it's just a bunch of bits. It doesn't matter.
 

Darius87

Member
I will show you something .. watch ..

0010101011001010101101101000110

0111011011100010000101101110111

One of those lines is for textures, another for geometry .. see .. for the SSD IOP/streaming engine it's just a bunch of bits. It doesn't matter.
you aware that it says Nanite, right?
 

muteZX

Banned
768 MB for UE5 is just for streaming geometry (Nanite), get it? Add textures and you'll have many gigs more, get it? :messenger_grinning_smiling:
ahaaaaaaaaaaa .. but no .. streaming pool is a streaming pool .. just a little bigger /textures, audio/.

I don't expect a future PS5 game to swap 8 GB of textures in main RAM every 1/60th of a second.

KUtFJoh.png


Killzone PS4 ..
 
Last edited: