Even with DirectStorage, multiple latency problems remain, though it will be an improvement. To really unleash the speed of a good SSD, the PC platform would need to relinquish CPU control over I/O traffic, both on the motherboard and across the PCIe bus. And that won't happen because of the security issues it would create.
No, no, no. You don't need to do that, where did you get this idea from? PCIe 4.0 and 5.0 already enable byte-addressability schemes for NAND storage devices, thanks to things like SAM/BAR, etc.
The "security" thing you seem to be referring to is the lack of direct memory access (DMA), with the current paging system used for unified address translation in its place. But that's exactly what things like DirectStorage are reforming, and the CPU doesn't need to relinquish any control as long as it has spare resources to handle those functions on its cores and threads.
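To illustrate the "spare cores and threads" point with a toy sketch (plain Python file I/O, nothing DirectStorage-specific; the chunk size and worker count are made-up figures): issuing many positioned reads in parallel on worker threads lets the OS and NVMe queues overlap requests, so the CPU stays in control of I/O without serializing it.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor

CHUNK = 1 << 20  # 1 MiB per request — an assumed asset-sized chunk


def read_chunk(path, offset):
    # Each worker thread opens its own handle and does a positioned read;
    # the OS overlaps these requests in the drive's queues.
    with open(path, "rb") as f:
        f.seek(offset)
        return f.read(CHUNK)


def parallel_load(path, size, workers=8):
    # Fan the reads out across a thread pool, then reassemble in order.
    offsets = range(0, size, CHUNK)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        chunks = pool.map(lambda off: read_chunk(path, off), offsets)
    return b"".join(chunks)


# Demo against a throwaway 8 MiB file
payload = os.urandom(8 * CHUNK)
with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(payload)
    path = f.name
data = parallel_load(path, 8 * CHUNK)
assert data == payload
os.unlink(path)
```

The point is just that batched, parallel I/O on ordinary CPU threads goes a long way before you ever need to take the CPU out of the loop.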
From a gaming point of view, it is much more realistic that high-end GPU cards will get small onboard SSDs acting as a slow VRAM pool, free of the usual PC limitations, once SSD prices have come down.
Are you forgetting about write endurance cycles? Page/block erase schemes for NAND vs. RAM? You won't have SSDs integrated on high-end GPUs (or any GPUs) for reasons like that. If you're thinking of the NAND being used as a fast-access pool for data to feed into GPU VRAM, GPUs already have that with things like GPUDirect Storage. Heck, Vega had this with HBCC, which is redundant now thanks to the byte-addressability of PCIe 4.0 and later, SAM/Resizable BAR, and CXL (the latter still enterprise-only for now).
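Rough back-of-envelope on the endurance problem (every figure here is an assumption for illustration, not any real product's spec): treat a small onboard SSD as a VRAM spill pool with even modest sustained write traffic, and it chews through its rated writes in weeks.

```python
# Assumed figures for illustration only (not any real product's spec):
drive_tb = 0.5          # 500 GB onboard NAND pool
pe_cycles = 1500        # assumed P/E cycle rating for TLC-class NAND
spill_gbps = 1.0        # assumed sustained VRAM-style spill writes, GB/s
hours_per_day = 4       # assumed daily gaming time

tbw = drive_tb * pe_cycles                        # rough rated terabytes written
tb_per_day = spill_gbps * 3600 * hours_per_day / 1000
lifetime_days = tbw / tb_per_day

print(f"Rated endurance: ~{tbw:.0f} TBW")
print(f"Daily spill writes: {tb_per_day:.1f} TB")
print(f"Pool worn out in ~{lifetime_days:.0f} days")
```

With these numbers that's roughly 750 TBW against 14.4 TB of writes a day — under two months of use. DRAM-class VRAM shrugs off that traffic indefinitely, which is exactly why NAND can't sit in that role.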
Personally I think NAND's place in future systems will be as ROM-like memory with enforced cache coherency on future interconnects like CXL 2.0/3.0 (with maybe some of those features getting folded into PCIe 5.0 revisions or 6.0), plus smart caching schemes that use predictive knowledge of processed data to lighten future processing of cumulative results — freeing up bandwidth, memory accesses (i.e. energy), and so on.
Also, possibly locating memory close to processing components, which for GPUs could mean M.2 slot interfaces with FMC ASICs on them, for removable drives connected directly to the GPU. But soldered NAND modules? No way.
PCs have had PCIe 4.0 for how many years now? Tell me one game that loads in 1 or 2 seconds. By the time PCs take advantage of PCIe 5.0, new consoles will be out.
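To be fair, raw link bandwidth was never the bottleneck. With assumed round figures (these are illustrative numbers, not benchmarks of any particular drive or game), the math shows a PCIe 4.0 SSD could stream a whole level in a couple of seconds — it's the software pipeline, especially CPU-side decompression, that eats the time:

```python
# Assumed round figures for illustration:
seq_read_gbps = 7.0     # top-end PCIe 4.0 x4 NVMe sequential read, GB/s
level_gb = 15.0         # assumed on-disk size of a big level's assets
cpu_decomp_gbps = 1.5   # assumed throughput of a CPU decompression path

raw_seconds = level_gb / seq_read_gbps          # drive-limited load time
pipeline_seconds = level_gb / cpu_decomp_gbps   # CPU-decompression-limited time

print(f"Drive could stream it in ~{raw_seconds:.1f} s")
print(f"CPU-bound decompression path takes ~{pipeline_seconds:.1f} s")
```

Under these assumptions the drive finishes in about 2 seconds while the CPU path takes 10 — which is the gap things like DirectStorage (and GPU decompression) exist to close.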
Gotta keep in mind these new standards aren't being driven by the mass-consumer side of the tech market; they're being driven by the enterprise, medical, military, and scientific-research fields.
These environments use tons of hardware clusters that need faster and better interconnects between clusters, various shared memory pools, etc. Open-standard interfaces like CXL can be layered on top of PCIe 4.0 and 5.0, and other standards like RapidIO offer competitive features. Yet others like OMI (based on OpenCAPI) bring unified memory addressing and low latency to various storage, RAM, and processor clusters: you can have a chiplet-based central processor, for example, with decoupled memory controllers and buffer chips providing OMI links to HBM2E memory stacks with OMI logical-layer integration, other links for CXL connections layered on top of PCIe, others for fabric interconnects, and so on.
This stuff is WAYYY bigger than just the console or PC gaming markets. In fact, every single feature the new consoles tout — the various GPU abilities and even the storage I/O — has existed in enterprise and big-tech aerospace/military/medical mass-computing environments for years, if not a decade or longer. Gaming consoles and PC GPUs are just getting the trickle-down benefits.