
Forspoken Load Times Are Barely A Second Long On PS5, Has Multiple Graphics Modes

S0ULZB0URNE

Member
Can't you enhance the nipples with the power of the cloud?
Tecmo wants no part of Xbox these days.

So no jiggle fun on there
 

VFXVeteran

Banned
The render equation in almost anything 3D can be replaced with simple or complex pre-calculation and lookups IMO, like lightmaps, shadowmaps, cubemaps, and maybe even BVH structures. If the foreground frustum cascade chunk can be reduced to a base BVH, with a smaller diff BVH for the positions/orientations of the dynamically moving parts, that can also be pre-calculated for streaming. And if the camera position and orientation can be known in advance, then even the result of the BVH traversal could be pre-calculated and streamed in too, as if it were dynamic.
You are still focusing on setup. I'm not talking about that. I'm talking about the light loop. There is nothing streamed in for iterating through lights and evaluating shader materials. That is, by far, the most expensive code in a render (realtime or offline).

for (each light)
{
    if (ray tracing)
    {
        compute random ray direction from a hemisphere equation; // this cannot be streamed
        for (each ray computed)
        {
            cast a ray into the scene and determine: any constants + diffuse color + (specular color * fresnel factor) * normalization term; // this cannot be streamed
            if (global illumination)
            {
                cast a ray from the previous ray hit; // this cannot be streamed
                compute random ray direction for occlusion and bounced light; // this cannot be streamed
                compute: ambient occlusion + RT GI bounce; // this cannot be streamed
            }
        }
    }
}

This very simple algorithm (without any refraction, caustics, hair materials, procedural textures, etc.) has to be computed at a pixel/sub-pixel level after all the streaming is in memory.
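A minimal runnable sketch of the loop above, purely to show its per-pixel structure. The hemisphere sampling is genuine, but the diffuse/specular/GI terms are made-up stand-in scalars (with Schlick's approximation standing in for the Fresnel factor), not a real BRDF.

```python
import random

def shade_pixel(lights, num_rays=4, gi=False, seed=0):
    """Toy version of the light loop above; the shading terms are
    stand-in scalars, not a physically based BRDF."""
    rng = random.Random(seed)
    radiance = 0.0
    for light in lights:
        for _ in range(num_rays):
            # "compute random ray direction from a hemisphere equation"
            cos_theta = rng.random()                        # cosine of polar angle
            # stand-in for "cast a ray into the scene and determine ..."
            fresnel = 0.04 + 0.96 * (1.0 - cos_theta) ** 5  # Schlick approximation
            radiance += light["diffuse"] * cos_theta + light["specular"] * fresnel
            if gi:
                # stand-in for the bounce ray: a cheap AO/GI contribution
                radiance += 0.1 * light["diffuse"] * rng.random()
    return radiance / num_rays

lights = [{"diffuse": 1.0, "specular": 0.5}]
per_pixel = shade_pixel(lights, gi=True)
```

Multiply that by every light, every ray, and every pixel, every frame: that per-pixel cost is the part no amount of streaming removes.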
 

VFXVeteran

Banned
Didn't Insomniac clearly claim that R&C is only possible on PS5 hardware? They even replied to that exact question on Twitter, adding that anyone who said otherwise just had no clue how it really works. Forgive me, but I'm more inclined to believe them than some tech guys theorizing on the net.
You are completely missing the point. I was talking about PC gamers not needing to worry about having an SSD for any games due to slow level-load speeds. Loads will be fast enough that people won't need to complain. If the PS5 can load a level in 1 second and PC players get load times of 20 seconds, no PC gamer is going to care when, after that level is loaded, they are getting better FPS, better-quality graphics, and higher resolutions.

In the case of R&C, they can easily put 2-3 levels in PC memory to allow instant teleporting back and forth. The game only teleports in one direction (i.e. 2 very small linear levels back and forth).
 

PaintTinJr

Member
You are still focusing on setup. I'm not talking about that. I'm talking about the light loop. There is nothing streamed in for iterating through lights and evaluating shader materials. That is, by far, the most expensive code in a render (realtime or offline).

for (each light)
{
    if (ray tracing)
    {
        compute random ray direction from a hemisphere equation; // this cannot be streamed
        for (each ray computed)
        {
            cast a ray into the scene and determine: any constants + diffuse color + (specular color * fresnel factor) * normalization term; // this cannot be streamed
            if (global illumination)
            {
                cast a ray from the previous ray hit; // this cannot be streamed
                compute random ray direction for occlusion and bounced light; // this cannot be streamed
                compute: ambient occlusion + RT GI bounce; // this cannot be streamed
            }
        }
    }
}

This very simple algorithm (without any refraction, caustics, hair materials, procedural textures, etc.) has to be computed at a pixel/sub-pixel level after all the streaming is in memory.
I fully appreciate you writing the pseudo code out to illustrate the exact part of the algorithm you believe can't be pre-calculated, but even after looking through it and understanding your point, I would still argue that even the "random" parts can be faked/cheated with pre-calculated cyclical lookups, say using a Mandelbrot/Julia set image as the source of the random values. Even well-seeded generators in computer programs aren't random at all, deep down, because computation is fully deterministic if you just unroll the algorithm.

Cheating well enough for the untrained gamer eye has been a staple of 3D games since the start, where discrete samples from pre-calculated trig tables were used instead of trig functions, despite the user's viewpoint orientation and direction needing to appear to come from any possible random quaternion.

I'm not saying I currently have the answers to pre-calculate parts or all of your algorithm's computational hotspots, but the industry will find ways to use 2-3 frame check-in, massive IO with massive decompression, and pre-calculation to cheat in ways we don't expect, if the history of gaming is anything to go by.

Edit:
If you come back and say that the amount of pre-calculated data needed to fake a certain part of an algorithm, even when heavily compressed, would exceed 200 GB, then I would concede that this generation's IO isn't suited to substituting for that computation. But even then I still wouldn't rule out refactoring that data into smaller pre-calculated intermediate data plus some computation to replace the highly complex computation.
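As a concrete miniature of the deterministic-lookup argument, here is the classic trig-table cheat: a small pre-calculated sine table replacing the analytic function at runtime, with the worst-case error bounded by half a table step. The table size and the sweep are arbitrary choices for the sketch.

```python
import math

TABLE_SIZE = 256  # pre-calculated once, "offline"
SIN_TABLE = [math.sin(2.0 * math.pi * i / TABLE_SIZE) for i in range(TABLE_SIZE)]

def sin_lookup(angle):
    """Nearest-entry lookup: deterministic and trig-free at runtime."""
    idx = round(angle / (2.0 * math.pi) * TABLE_SIZE) % TABLE_SIZE
    return SIN_TABLE[idx]

# Sweep a full turn: the error never exceeds half a step (pi/256 ~ 0.012)
worst = max(abs(sin_lookup(a) - math.sin(a)) for a in
            (i * 0.01 for i in range(629)))
```

The same idea scales up to noise textures and pre-baked sample sequences, which is the sense in which "random" parts can be turned into stream-able data.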
 

winjer

Member
Disagree. According to our local experts, we can just load everything into the PC's massive pool of memory.

If there is enough memory, that would be faster than any SSD.
Even an average DDR4-3200 kit has a memory bandwidth of 51 GB/s. This is more than any SSD on the market.
And if we go to DDR5, these values increase a lot.
Now I'm just talking about system RAM. If we talk about VRAM, these values go much higher.
But the biggest difference is in access times. SSDs have access times in the milliseconds. DRAM has access times in the nanoseconds. That is two orders of magnitude greater.
If you are arguing that an SSD is faster than RAM or VRAM, then you are very, very wrong.

In fact, there are games where we can disable texture streaming and force everything into memory.
With UE all it takes is the command-line flag -NOTEXTURESTREAMING
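The 51 GB/s figure checks out from first principles. A rough back-of-the-envelope, assuming a dual-channel DDR4-3200 kit with a 64-bit bus per channel, and using ~7 GB/s as an assumed ballpark for the fastest consumer PCIe 4.0 NVMe drives of the era:

```python
# Peak bandwidth = transfers/s * bus width in bytes * channels
transfers_per_s = 3200e6           # DDR4-3200: 3200 MT/s
bytes_per_transfer = 8             # 64-bit channel
channels = 2                       # dual-channel kit
ddr4_gb_s = transfers_per_s * bytes_per_transfer * channels / 1e9  # 51.2 GB/s
nvme_gb_s = 7.0                    # assumed top-end PCIe 4.0 NVMe figure
ratio = ddr4_gb_s / nvme_gb_s      # system RAM is roughly 7x wider
```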
 

Shmunter

Gold Member
If there is enough memory, that would be faster than any SSD.
Even an average DDR4-3200 kit has a memory bandwidth of 51 GB/s. This is more than any SSD on the market.
And if we go to DDR5, these values increase a lot.
Now I'm just talking about system RAM. If we talk about VRAM, these values go much higher.
But the biggest difference is in access times. SSDs have access times in the milliseconds. DRAM has access times in the nanoseconds. That is two orders of magnitude greater.
If you are arguing that an SSD is faster than RAM or VRAM, then you are very, very wrong.

In fact, there are games where we can disable texture streaming and force everything into memory.
With UE all it takes is the command-line flag -NOTEXTURESTREAMING
Point of note, not necessarily related to your conversation. RAM/VRAM needs to be ultra fast because it is a computation target: to render a scene, the CPU/GPU performs millions of read/write operations on RAM per frame. Loading an asset into RAM from secondary storage is a one-off operation, detached from those computational requirements.

The faster the I/O, the faster new assets become available. E.g. zoom in on a character during gameplay and an ultra-detailed model can be loaded in, with the rest of the scene's assets flushed, allowing for cutscene-quality characters in real time depending on the camera's proximity, and vice versa when zooming out. Or enter a building and high-quality interiors can be swapped into RAM.

Next-gen I/O is fast enough to load these assets seamlessly during gameplay, without pauses or load screens.

Just simple examples, but in a nutshell next-gen I/O extends what is possible under regular RAM constraints.
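That zoom-in/zoom-out idea can be sketched as a toy proximity-driven asset cache. Every name, size, and threshold here is invented for illustration; real engines do this with LOD selection against a streaming budget.

```python
class AssetCache:
    """Toy RAM budget: LOD variants of one asset evict each other on load."""
    def __init__(self, budget_mb):
        self.budget = budget_mb
        self.resident = {}          # asset name -> size in MB

    def used(self):
        return sum(self.resident.values())

    def load(self, name, size_mb):
        base = name.split(":")[0]
        # Flush the other LOD variants of the same asset (the zoom-out case)
        for key in [k for k in self.resident
                    if k.startswith(base + ":") and k != name]:
            del self.resident[key]
        if self.used() + size_mb > self.budget:
            return False            # over budget: a slower fallback would kick in
        self.resident[name] = size_mb
        return True

def pick_lod(camera_distance):
    """Hypothetical LOD rule: close camera gets the cutscene-quality model."""
    return "hero:cutscene" if camera_distance < 5.0 else "hero:gameplay"

cache = AssetCache(budget_mb=100)
cache.load(pick_lod(50.0), 10)      # far away: small gameplay model
cache.load(pick_lod(2.0), 80)       # zoomed in: stream the detailed model in
```

Fast I/O is what makes the second `load` viable mid-frame instead of behind a loading screen.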
 

Whitecrow

Gold Member
The amazing thing about having an SSD that fast is that you can swap between two fully realized worlds in an instant.

Of course, you could do the trick with 32 GB of RAM and without touching the SSD. But since no dev will ever make a game assuming that, we come back to the standard 8/16 GB setup.

Here, you could have 2 or 3 half-empty worlds for the trick, or you could have an SSD to swap between any fully featured worlds you want. That's why R&C only works on SSDs: they are not sacrificing world detail to do any trick.
 

Shmunter

Gold Member
The amazing thing about having an SSD that fast is that you can swap between two fully realized worlds in an instant.

Of course, you could do the trick with 32 GB of RAM and without touching the SSD. But since no dev will ever make a game assuming that, we come back to the standard 8/16 GB setup.

Here, you could have 2 or 3 half-empty worlds for the trick, or you could have an SSD to swap between any fully featured worlds you want. That's why R&C only works on SSDs: they are not sacrificing world detail to do any trick.
Not to mention that storing assets in RAM that you may not see for a long time, or ever, is horrendously wasteful. And initial load times suffer too, copying mass data upfront instead of on demand.
 

Mister Wolf

Member
The amazing thing about having an SSD that fast is that you can swap between two fully realized worlds in an instant.

Of course, you could do the trick with 32 GB of RAM and without touching the SSD. But since no dev will ever make a game assuming that, we come back to the standard 8/16 GB setup.

Here, you could have 2 or 3 half-empty worlds for the trick, or you could have an SSD to swap between any fully featured worlds you want. That's why R&C only works on SSDs: they are not sacrificing world detail to do any trick.

I would be surprised if the 4.8 GB/s of I/O that DirectStorage is currently capable of without GPU decompression couldn't run Ratchet and Clank on PC.
 
I highly doubt some actually care for loading times over graphics/performance. Especially a fraction of a second in loading times, hence MY comment and the comment I was replying to... I'd take a fraction of a second longer to have better graphics and performance, and I guarantee over 90% of GAF will agree; no fanboyism is needed to paint the obvious picture.
Play Gran Turismo 7 on PS5 for a week with those 2-second loading times... then try to go back to the 20/30-second loading times of the PS4 version and tell me again that people don't care about loading times...
 

Whitecrow

Gold Member
I would be surprised if the 4.8 GB/s I/O Direct Storage is capable of currently without GPU decompression couldn't run Ratchet and Clank on PC.
DirectStorage still relies on the weakest link of the chain.

The speeds the API can handle don't make a difference if the source is still slow.

Although I might be mistaken, since I don't know exactly how DirectStorage works.
 
I thought we bought games for good gameplay, or story, or a combination of both, not just for their loading times. The last gameplay they showed was better, but I don't want to play anything like an old-school MMO anymore, where I have to smash the keyboard 1000 times to bring a boss down.
 
I want to play a game with fast, fluid movement and combat freedom, with many skills, where different skills produce different reactions, not standing still to cast a stupid fireball when you are supposed to be an unstoppable force.
 

assurdum

Member
You are completely missing the point. I was talking about PC gamers not needing to worry about having an SSD for any games due to slow level-load speeds. Loads will be fast enough that people won't need to complain. If the PS5 can load a level in 1 second and PC players get load times of 20 seconds, no PC gamer is going to care when, after that level is loaded, they are getting better FPS, better-quality graphics, and higher resolutions.

In the case of R&C, they can easily put 2-3 levels in PC memory to allow instant teleporting back and forth. The game only teleports in one direction (i.e. 2 very small linear levels back and forth).
I haven't missed the point. You clearly said R&C was feasible on all platforms. Then you retreated.
 
I highly doubt some actually care for loading times over graphics/performance. Especially a fraction of a second in loading times, hence MY comment and the comment I was replying to... I'd take a fraction of a second longer to have better graphics and performance, and I guarantee over 90% of GAF will agree; no fanboyism is needed to paint the obvious picture.
Nobody sane cares about 1 vs 2 sec of loading times. Or even 4 vs 10 seconds like with Elden Ring. The only difference that is actually impactful is the one between last gen and this gen consoles, where the delta can be up to one minute.
 

assurdum

Member
Nobody sane cares about 1 vs 2 sec of loading times. Or even 4 vs 10 seconds like with Elden Ring. The only difference that is actually impactful is the one between last gen and this gen consoles, where the delta can be up to one minute.
Sane? Really now?
 

Dick Jones

Gold Member
Agreed, playing Elden Ring and barely having enough time to let out a fart before it loads after I die is amazing. If only we could get a Bloodborne update. That game needs faster load times.
There is a way to reduce loading on Bloodborne. It's called git gud


Seriously I agree with you on loading times and Bloodborne. It fucking sucks, more so after being spoilt by the quick PS5 Demon's Souls loading.
 

VFXVeteran

Banned
I fully appreciate you writing the pseudo code out to illustrate the exact part of the algorithm you believe can't be pre-calculated, but even after looking through it and understanding your point, I would still argue that even the "random" parts can be faked/cheated with pre-calculated cyclical lookups, say using a Mandelbrot/Julia set image as the source of the random values. Even well-seeded generators in computer programs aren't random at all, deep down, because computation is fully deterministic if you just unroll the algorithm.

Cheating well enough for the untrained gamer eye has been a staple of 3D games since the start, where discrete samples from pre-calculated trig tables were used instead of trig functions, despite the user's viewpoint orientation and direction needing to appear to come from any possible random quaternion.

I'm not saying I currently have the answers to pre-calculate parts or all of your algorithm's computational hotspots, but the industry will find ways to use 2-3 frame check-in, massive IO with massive decompression, and pre-calculation to cheat in ways we don't expect, if the history of gaming is anything to go by.

Edit:
If you come back and say that the amount of pre-calculated data needed to fake a certain part of an algorithm, even when heavily compressed, would exceed 200 GB, then I would concede that this generation's IO isn't suited to substituting for that computation. But even then I still wouldn't rule out refactoring that data into smaller pre-calculated intermediate data plus some computation to replace the highly complex computation.
That's just not going to happen. Some things can be cheated, like procedural noise texture lookups, but we aren't moving towards more and more pre-calculation; we are moving further away from it. It would be a nightmare pipeline for artists, and it removes any chance of "natural" computation happening in an analytical way (i.e. the rendering equation) with reasonable results for dynamic scenes. Besides, pre-computation's biggest weakness is accuracy. Storing lookups takes an enormous amount of memory to get even feasibly "OK" approximations, a problem analytical solutions just don't suffer from.
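The accuracy objection can be made concrete with rough arithmetic. For a nearest-entry sine table the worst-case error is about pi/N (half a step times the function's maximum slope), so the entry count, and hence memory, grows linearly as the error target shrinks; and that's a one-dimensional function, while shading terms depending on several variables multiply the blow-up together.

```python
import math

def table_entries(eps):
    """Entries for a nearest-entry sin table with worst-case error <= eps.

    Worst error ~ (step / 2) * max|d sin/dx| = pi / N, so N ~ pi / eps."""
    return math.ceil(math.pi / eps)

for eps in (1e-2, 1e-4, 1e-6):
    n = table_entries(eps)
    print(f"eps={eps:g}: {n} entries ({n * 4 / 1024:.0f} KiB as float32)")
```

Three extra digits of accuracy costs a thousand times the memory, per dimension.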
 
Play Gran Turismo 7 on PS5 for a week with those 2-second loading times... then try to go back to the 20/30-second loading times of the PS4 version and tell me again that people don't care about loading times...
Well, I don't own last-gen or current-gen consoles, so I never had those long loading times, and it never affected me. But I definitely get your point, coming from last-gen consoles.
Nobody sane cares about 1 vs 2 sec of loading times. Or even 4 vs 10 seconds like with Elden Ring. The only difference that is actually impactful is the one between last gen and this gen consoles, where the delta can be up to one minute.
Exactly. That was the big change. But several people in here think PC was like the last-gen consoles, which is the furthest thing from the truth. We have had short load times in every game for the past decade or so.
 

PaintTinJr

Member
That's just not going to happen. Some things can be cheated, like procedural noise texture lookups, but we aren't moving towards more and more pre-calculation; we are moving further away from it. It would be a nightmare pipeline for artists, and it removes any chance of "natural" computation happening in an analytical way (i.e. the rendering equation) with reasonable results for dynamic scenes. Besides, pre-computation's biggest weakness is accuracy. Storing lookups takes an enormous amount of memory to get even feasibly "OK" approximations, a problem analytical solutions just don't suffer from.
I hear what you are saying. As an industry we are moving away from offline Lightmass-type pre-calculation, but probably because the results aren't known to be reliable until after an overnight calculation, so if there is an error the time is wasted. But pre-calculating data or results that would take at most a second to render on a PS5/XSX, so you can observe in real time the identical scene running on a more powerful graphics workstation, wouldn't be a problem for the art pipeline, and Lumen's incremental lighting already seems to be a step in that direction IMO.

Either way, it will be interesting to see how things pan out. My stance is that you are underestimating the depths to which game developers will go to cheat and ship a renderer that, in the eyes of gamers, blows the competition out of the water, which gives their game a massive chance of success on technical merit alone. But I'll happily acknowledge my error at the end of the generation if the IO complex/VA or RTX IO fail to lift the visual bar above what their respective hardware's teraflop specs would suggest.
 

avin

Member
But the biggest difference is in access times. SSDs have access times in the milliseconds. DRAM has access times in the nanoseconds. That is two orders of magnitude greater.

The difference between 1 millisecond and 1 nanosecond is 6 orders of magnitude. Looking up the actual numbers, the difference in access times is closer to 3 to 4 orders of magnitude.

But yes, I think that means your conclusion is still right.

avin
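For anyone checking that arithmetic: an order of magnitude is a factor of ten, i.e. the base-10 log of the ratio. The latency figures below are typical ballpark values, assumed for illustration rather than measured.

```python
import math

def orders_of_magnitude(a, b):
    """How many factors of ten separate two quantities."""
    return round(math.log10(a / b))

ms_vs_ns = orders_of_magnitude(1e-3, 1e-9)          # 1 ms vs 1 ns: 6 orders
nvme_vs_dram = orders_of_magnitude(100e-6, 100e-9)  # ~100 us vs ~100 ns: 3 orders
```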
 

Hezekiah

Member
Well I don't own last gen or next gen consoles, so I never had those long loading times, so that never affected me. But I definitely get your point, coming from last gen consoles.

Exactly. That was the big change in things. But several people in here are thinking PC was like last gen consoles, which is the furthest thing from the truth. We had short load times in every game, for the past decade or so.
You sold your PS5 DE already? Let me guess: Bloodborne, then straight onto eBay? The moment was so fleeting it's like you never even had one.
 

metaverse

Gold Member
If there is enough memory, that would be faster than any SSD.
Even an average DDR4-3200 kit has a memory bandwidth of 51 GB/s. This is more than any SSD on the market.
And if we go to DDR5, these values increase a lot.
Now I'm just talking about system RAM. If we talk about VRAM, these values go much higher.
But the biggest difference is in access times. SSDs have access times in the milliseconds. DRAM has access times in the nanoseconds. That is two orders of magnitude greater.
If you are arguing that an SSD is faster than RAM or VRAM, then you are very, very wrong.

In fact, there are games where we can disable texture streaming and force everything into memory.
With UE all it takes is the command-line flag -NOTEXTURESTREAMING
 