
Samsung GDDR6W: Twice the Bandwidth and Density of GDDR6

winjer

Gold Member

High performance, high capacity and high bandwidth memory solutions are helping bring the virtual realm to a closer match with reality. To meet this growing market demand, Samsung Electronics has developed GDDR6W (x64): the industry's first next-generation graphics DRAM technology.

GDDR6W builds on Samsung's GDDR6 (x32) products by introducing a Fan-Out Wafer-Level Packaging (FOWLP) technology, drastically increasing memory bandwidth and capacity.

Since its launch, GDDR6 has already seen significant improvements. Last July, Samsung developed a 24 Gbps GDDR6 memory, the industry's fastest graphics DRAM. GDDR6W doubles that bandwidth (performance) and capacity while keeping the same package size as GDDR6. Thanks to the unchanged footprint, the new memory chips can slot into the same production processes customers have used for GDDR6, while the FOWLP construction and stacking technology cut manufacturing time and costs.

As shown in the picture below, since twice as many memory chips fit in an identically sized package, graphics DRAM capacity has increased from 16Gb to 32Gb, while bandwidth has doubled as the number of I/Os went from 32 to 64. In other words, the area required for memory has been reduced by 50% compared to previous models.
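
A back-of-the-envelope check of those per-package figures (a minimal sketch; it assumes GDDR6W runs at the same 24 Gbps per-pin rate cited above):

```python
# Per-package capacity and peak bandwidth, GDDR6 (x32) vs GDDR6W (x64).
# Assumes GDDR6W runs at the same 24 Gbps per-pin rate cited above.
def package_stats(io_pins, gbps_per_pin, capacity_gbit):
    bandwidth_gb_s = io_pins * gbps_per_pin / 8   # Gbit/s -> GB/s
    capacity_gb = capacity_gbit / 8               # Gbit   -> GB
    return bandwidth_gb_s, capacity_gb

gddr6_bw, gddr6_cap = package_stats(io_pins=32, gbps_per_pin=24, capacity_gbit=16)
gddr6w_bw, gddr6w_cap = package_stats(io_pins=64, gbps_per_pin=24, capacity_gbit=32)

print(f"GDDR6 : {gddr6_bw:.0f} GB/s, {gddr6_cap:.0f} GB per package")    # 96 GB/s, 2 GB
print(f"GDDR6W: {gddr6w_bw:.0f} GB/s, {gddr6w_cap:.0f} GB per package")  # 192 GB/s, 4 GB
```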

Generally, the size of a package increases as more chips are stacked. But there are physical factors that limit the maximum height of a package. What's more, though stacking chips increases capacity, there is a trade-off in heat dissipation and performance. In order to overcome these trade-offs, we've applied our FOWLP technology to GDDR6W.

FOWLP technology directly mounts memory die on a silicon wafer, instead of a PCB. In doing so, RDL (Re-distribution layer) technology is applied, enabling much finer wiring patterns. Additionally, as there's no PCB involved, it reduces the thickness of the package and improves heat dissipation.

The height of the FOWLP-based GDDR6W is 0.7 mm - 36% slimmer than the previous package with a height of 1.1 mm. And despite the chip being multi-layered, it still offers the same thermal properties and performance as the existing GDDR6. Unlike GDDR6, however, the bandwidth of the FOWLP-based GDDR6W can be doubled thanks to the expanded I/O per single package.

Packaging refers to the process of cutting fabricated wafers into individual semiconductor dies and connecting the wiring; in the industry, this is known as the 'back-end process.' While the semiconductor industry has continuously focused on scaling circuits as much as possible during the front-end process, packaging technology is becoming more and more important as the industry approaches the physical limits of chip size. That's why Samsung is using its 3D IC packaging technology in GDDR6W, creating a single package by stacking multiple chips at the wafer level. This is one of many innovations planned to make advanced packaging for GDDR6W faster and more efficient.

The newly developed GDDR6W technology can support HBM-level bandwidth at the system level. HBM2E delivers a system-level bandwidth of 1.6 TB/s based on 4K system-level I/O and a 3.2 Gbps transmission rate per pin. GDDR6W, on the other hand, can produce a bandwidth of 1.4 TB/s based on 512 system-level I/O and a transmission rate of 22 Gbps per pin. Furthermore, since GDDR6W reduces the number of I/Os to about 1/8 of what HBM2E requires, it removes the need for microbumps. That makes it more cost-effective, since no interposer layer is needed.
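
The system-level numbers in that comparison fall out of the same arithmetic (a small sketch using only the I/O counts and per-pin rates quoted above):

```python
# System-level peak bandwidth = total I/O pins * per-pin data rate / 8.
def system_bandwidth_tb_s(io_pins, gbps_per_pin):
    return io_pins * gbps_per_pin / 8 / 1000  # Gbit/s -> GB/s -> TB/s

hbm2e  = system_bandwidth_tb_s(io_pins=4096, gbps_per_pin=3.2)  # "4K" I/O
gddr6w = system_bandwidth_tb_s(io_pins=512,  gbps_per_pin=22)

print(f"HBM2E : {hbm2e:.2f} TB/s")   # ~1.64 TB/s
print(f"GDDR6W: {gddr6w:.2f} TB/s")  # ~1.41 TB/s
```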

[Image: GDDR6 vs. GDDR6W package comparison]


Double the bandwidth and double the capacity seems like a pretty great advancement.
So this will probably become the standard for GPUs and consoles, for the next few years.
 

//DEVIL//

Member
That's great, we'll probably start to see them on video cards next year.

32Gb VRAM is a godsend for gaming.
We haven't had a game that utilizes 16 gigs to its fullest, let alone the 24 gigs in today's cards, and you're thinking 32 will be a godsend?


does....not..compute......


Even if you think about the future, your GPU will be too weak and won't run games at settings that would require 32 gigs of VRAM... this is like running a 1060 6GB at 4K 144Hz native.


Does...not..compute....again...
 

HTK

Banned
Advancements are great and all, but if developers aren't utilizing any of it, what's the point? The current state of affairs on PC is that nobody has really utilized the high-end hardware at all. It's all very blah, to be honest.
 

Larogue

Member
We haven't had a game that utilizes 16 gigs to its fullest, let alone the 24 gigs in today's cards, and you're thinking 32 will be a godsend?


does....not..compute......


Even if you think about the future, your GPU will be too weak and won't run games at settings that would require 32 gigs of VRAM... this is like running a 1060 6GB at 4K 144Hz native.


Does...not..compute....again...
I don't know what kind of games you play, but if you check VRAM usage during gaming you will see that it's almost always the bottleneck.
Every game developer's dream is more VRAM, and less work wasted on optimizing the loading/unloading of textures due to a lack of available VRAM.
Apple solved this by sharing RAM between the CPU and GPU, so the GPU always has access to up to 128GB in the case of the M1 Ultra.
 

JimboJones

Member
I don't know what kind of games you play, but if you check VRAM usage during gaming you will see that it's almost always the bottleneck.
Every game developer's dream is more VRAM, and less work wasted on optimizing the loading/unloading of textures due to a lack of available VRAM.
Apple solved this by sharing RAM between the CPU and GPU, so the GPU always has access to up to 128GB in the case of the M1 Ultra.
I think you have to be careful with monitoring programs; some just show a game allocating all the VRAM even though the same game runs fine with a fraction of that VRAM on other cards.
 
1) How is this different from the GDDR6X that NVIDIA uses?
2) If AMD uses this, along with its Infinity Cache/3D V-Cache, can we get more theoretical bandwidth, exceeding 1.4TB/s?

What do you think, thicc_girls_are_teh_best? Is this better than HBM2E?
 

ToTTenTranz

Banned
32Gb VRAM is a godsend for gaming.
That's just 4GB (1GB = 8Gbit), and we've had that for a while on GPUs. In theory this looks like it supports about the same capacity as two GDDR6 dies in clamshell mode, but it probably won't support clamshell itself.

To summarize, it's not like this will drastically increase the memory amount of graphics cards, but we're probably going to see great memory performance out of a 16GB card (2.8 TB/s total?).
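
For anyone tripping over the units, a quick sketch (the 16GB card layout here is just an illustration, not a known product):

```python
# Gbit vs GB: DRAM package densities are quoted in gigabits.
package_gbit = 32
package_gb = package_gbit / 8                 # 4 GB per GDDR6W package

card_capacity_gb = 16
packages = card_capacity_gb / package_gb      # 4 packages
total_io = packages * 64                      # 256 I/O pins in total

print(f"{package_gbit} Gbit = {package_gb:.0f} GB per package")
print(f"A {card_capacity_gb} GB card needs {packages:.0f} packages ({total_io:.0f} I/O)")
```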
 

winjer

Gold Member
1) How is this different from the GDDR6X that NVIDIA uses?

GDDR6X uses PAM4 signaling to transfer data, so it carries 2 bits per symbol instead of 1, doubling the per-pin data rate compared to GDDR6's NRZ signaling. GDDR6W keeps standard GDDR6 signaling and instead doubles the number of I/Os per package, from 32 to 64, by stacking dies with FOWLP.
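
To put numbers on that (a rough sketch; the 10.5 GBaud symbol rate is only an example, chosen so the PAM4 case lands at a typical 21 Gbps GDDR6X speed):

```python
# PAM4 encodes 2 bits per symbol vs 1 for NRZ, so the same symbol rate
# carries twice the data. 10.5 GBaud is only an example figure.
def data_rate_gbps(symbol_rate_gbaud, bits_per_symbol):
    return symbol_rate_gbaud * bits_per_symbol

nrz  = data_rate_gbps(10.5, bits_per_symbol=1)   # NRZ  (GDDR6 signaling)
pam4 = data_rate_gbps(10.5, bits_per_symbol=2)   # PAM4 (GDDR6X signaling)

print(f"NRZ : {nrz:.1f} Gbps per pin")   # 10.5
print(f"PAM4: {pam4:.1f} Gbps per pin")  # 21.0
```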

2) If AMD uses this, along with its Infinity Cache/3D V-Cache, can we get more theoretical bandwidth, exceeding 1.4TB/s?

Cache bandwidth and memory bandwidth don't exactly add up.
But yes, it will provide more total bandwidth.
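
Roughly why they don't just add (a first-order estimate with made-up hit-rate and bandwidth numbers, purely to illustrate):

```python
# Requests that hit the on-die cache are served at cache bandwidth; misses
# fall through to GDDR. Effective bandwidth is a weighted mix, not a sum.
def effective_bandwidth_gb_s(hit_rate, cache_bw_gb_s, dram_bw_gb_s):
    return hit_rate * cache_bw_gb_s + (1 - hit_rate) * dram_bw_gb_s

# Illustrative numbers only: 55% hit rate, 5000 GB/s cache, 1400 GB/s GDDR6W.
print(effective_bandwidth_gb_s(0.55, 5000, 1400))  # 3380.0 GB/s, well below 5000 + 1400
```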

What do you think, thicc_girls_are_teh_best? Is this better than HBM2E?

The real advantage of HBM is latency, not so much memory bandwidth. And it's very expensive.
So it's used mostly for compute systems that have constraints on memory latency.
GDDR6W and HBM2 don't really compete in the same market.
 
Samsung is preparing GDDR7 memory with 36 Gbps per-pin speeds:

VideoCardz Samsung GDDR7 announcement article

How would GDDR6W compare to GDDR7? I am thinking Pro consoles will use GDDR6W simply as a cost-saving measure, and then use 3D V-Cache or Infinity Cache to boost bandwidth further. Pro consoles need to get past that 1TB/s bandwidth threshold.
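
Rough numbers with hypothetical console-style bus widths (nothing about actual Pro console configurations is known; this is just to show the scale):

```python
# Peak bandwidth for a hypothetical console memory bus:
# bus width (bits) * per-pin rate (Gbps) / 8 = GB/s.
def bus_bandwidth_gb_s(bus_width_bits, gbps_per_pin):
    return bus_width_bits * gbps_per_pin / 8

print(bus_bandwidth_gb_s(256, 22))  #  704 GB/s - GDDR6W-class 22 Gbps, 256-bit
print(bus_bandwidth_gb_s(320, 22))  #  880 GB/s - GDDR6W-class 22 Gbps, 320-bit
print(bus_bandwidth_gb_s(256, 36))  # 1152 GB/s - GDDR7 at 36 Gbps, 256-bit
```

On those assumptions, GDDR6W alone doesn't cross 1 TB/s at console-style bus widths, which is where the extra cache would have to make up the difference.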
 

Drew1440

Member
I don't know what kind of games you play, but if you check VRAM usage during gaming you will see that it's almost always the bottleneck.
Every game developer's dream is more VRAM, and less work wasted on optimizing the loading/unloading of textures due to a lack of available VRAM.
Apple solved this by sharing RAM between the CPU and GPU, so the GPU always has access to up to 128GB in the case of the M1 Ultra.
Does Apple not use regular DDR memory for their Apple Silicon Macs, which has much less bandwidth than typical VRAM?
 

Larogue

Member
Does Apple not use regular DDR memory for their Apple Silicon Macs, which has much less bandwidth than typical VRAM?
They use LPDDR5 with 800 GB/s of system bandwidth. That's impossible in a normal PC design, but they achieved it by placing the memory literally next to the CPU/GPU on the SoC package.
So overall it provides similar bandwidth to GDDR6/X, but with up to 128 GB of fully usable high-bandwidth memory for the GPU.


GDDR6W should bridge the gap with double the bandwidth and capacity.
 

winjer

Gold Member
They use LPDDR5 with 800 GB/s of system bandwidth. That's impossible in a normal PC design, but they achieved it by placing the memory literally next to the CPU/GPU on the SoC package.
So overall it provides similar bandwidth to GDDR6/X, but with up to 128 GB of fully usable high-bandwidth memory for the GPU.


GDDR6W should bridge the gap with double the bandwidth and capacity.

Apple uses a 1024-bit bus, which is very wide.
On PC, GPU makers did build 512-bit buses for a while, but because of power usage and die space they shifted to narrower buses with higher memory clock speeds.
Intel also dabbled for a while with triple- and quad-channel memory systems, but gave up on those as well, going back to the traditional dual-channel configuration.

The thing is that on PCs, the CPU and the GPU each have their own dedicated memory pool.
On the M1, Apple uses an SoC where the CPU and GPU have to share resources, which introduces issues with memory contention.
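
That 800 GB/s figure falls straight out of the bus width (a quick check, assuming the LPDDR5-6400 used in the M1 Ultra):

```python
# Peak bandwidth = bus width (bits) * per-pin data rate (Gbps) / 8.
bus_width_bits = 1024      # M1 Ultra's combined memory bus
data_rate_gbps = 6.4       # LPDDR5-6400

print(bus_width_bits * data_rate_gbps / 8)  # 819.2 GB/s, marketed as ~800 GB/s
```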
 
We haven't had a game that utilizes 16 gigs to its fullest, let alone the 24 gigs in today's cards, and you're thinking 32 will be a godsend?


does....not..compute......


Even if you think about the future, your GPU will be too weak and won't run games at settings that would require 32 gigs of VRAM... this is like running a 1060 6GB at 4K 144Hz native.


Does...not..compute....again...
24GB is great for production tasks and AI. If you want to game, you don't need 24GB right now; hell, you don't even need a 3090 or a 4090. That doesn't mean the tech can't be pushed, especially as certain tasks will always require more.
 
We haven't had a game that utilizes 16 gigs to its fullest, let alone the 24 gigs in today's cards, and you're thinking 32 will be a godsend?


does....not..compute......


Even if you think about the future, your GPU will be too weak and won't run games at settings that would require 32 gigs of VRAM... this is like running a 1060 6GB at 4K 144Hz native.


Does...not..compute....again...
Also, DirectStorage is going to put more pressure on VRAM requirements; we won't always be living by 2022 standards.
 

//DEVIL//

Member
24GB is great for production tasks and AI. If you want to game, you don't need 24GB right now; hell, you don't even need a 3090 or a 4090. That doesn't mean the tech can't be pushed, especially as certain tasks will always require more.
I do need the 4090, and I am enjoying my card. I have the 49-inch Odyssey Neo, which is almost 4K ultrawide at 240 Hz. I am enjoying these frames, trust me lol.
 

LordOfChaos

Member
Advancements are great and all, but if developers aren't utilizing any of it, what's the point? The current state of affairs on PC is that nobody has really utilized the high-end hardware at all. It's all very blah, to be honest.

I've been feeling like that too. The pitch for higher-end GPUs just seems to be pushing higher resolutions and framerates; it feels like we don't really have games that would make what we already have on the high end sweat at a more reasonable resolution and framerate target.
 