
GeForce GTX 970s seem to have an issue using all 4GB of VRAM, Nvidia looking into it

Status
Not open for further replies.
PSA: AMD drops price of 290X to $299 ($279 with rebate from Newegg) and offers discounts to 970 owners who want to return their card.

Nvidia clarifies: No specific GTX 970 driver to improve memory allocation performance planned

Retailers and AIC partners are taking all the heat right now. NV really needs to come up with a plan.

Perfectly Functional GTX 970 Cards Being Returned Over Memory Controversy

Techpowerup said:
In what is a major fallout of the GeForce GTX 970 memory allocation controversy, leading retailers in the EU are reporting returns of perfectly functional GTX 970 cards citing "false advertising." Heise.de reports that NVIDIA is facing a fierce blowback from retailers and customers over incorrect specs. Heise comments that the specifications "cheating could mean the greatest damage to the reputation of the company's history."

I hope this hurts Nvidia real good. It's like companies don't learn that bad PR in the social media age can cause massive damage (see: Xbone reveal/launch and the fallout that followed). They're attempting to sweep this one under the rug and hope people will forget.

Hilarious mock interview with Nvidia about the 970 fiasco (Youtube)

Hitler Reacts (Youtube)


NVIDIA RESPONSE #2:

Anandtech: GeForce GTX 970: Correcting The Specs & Exploring Memory Allocation

Anandtech said:


Most important for the conversation at hand, we were told that both possessed identical memory subsystems: 4GB of 7GHz GDDR5 on a 256-bit bus, split amongst 4 ROP/memory controller partitions. All 4 partitions would be fully active on the GTX 970, with 2MB of L2 cache and 64 ROPs available.

This, as it turns out, was incorrect.

As part of our discussion with NVIDIA, they laid out the fact that the original published specifications for the GTX 970 were wrong, and as a result the “unusual” behavior that users had been seeing from the GTX 970 was in fact expected behavior for a card configured as the GTX 970 was. To get straight to the point then, NVIDIA’s original publication of the ROP/memory controller subsystem was wrong; GTX 970 has a 256-bit memory bus, but 1 of the 4 ROP/memory controller partitions was partially disabled, not fully enabled like we were originally told. As a result GTX 970 only has 56 of 64 ROPs and 1.75MB of 2MB of L2 cache enabled. The memory controllers themselves remain unchanged, with all four controllers active and driving 4GB of VRAM over a combined 256-bit memory bus.
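The corrected numbers are all consistent with one of the eight ROP/L2 units in that partition being fused off; a quick sanity check (values taken from the article):

```python
# Sanity check: the corrected GTX 970 specs all follow from disabling
# 1/8 of the ROP/L2 resources of a fully enabled GM204 (GTX 980).
full_rops, full_l2_mb = 64, 2.0      # GTX 980 figures
enabled_fraction = 7 / 8             # one of eight ROP/L2 units disabled

print(full_rops * enabled_fraction)   # 56.0 ROPs
print(full_l2_mb * enabled_fraction)  # 1.75 MB of L2
# The fast VRAM segment scales the same way: 4 GB * 7/8 = 3.5 GB
print(4 * enabled_fraction)           # 3.5 GB
```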

This revelation significantly alters how we perceive the performance of the GTX 970 in certain situations, and is the missing piece of the puzzle in understanding the memory allocation issues that originally brought all of this to light. The ability to “partially disable” a ROP/memory controller partition is new to Maxwell, and we’ll fully explore how that works in a moment, but the important part to take away is that the ROP/MC layout on the GTX 970 is not fully enabled like the GTX 980, and as a result will not behave identically to the GTX 980. All of the behavior from the GTX 970 we’ve seen in light of this correction now makes sense, and it is immediately clear that this is not a hardware or software bug in GTX 970, but rather the planned/intentional behavior of the product.

The biggest and most painful question about all of this then is how did this happen? How did we get fed incorrect specifications? NVIDIA’s explanation, in a nutshell, is that this was completely accidental and that all of this stems from assumptions made by NVIDIA’s technical marketing team.

...

This in turn is why the 224GB/sec memory bandwidth number for the GTX 970 is technically correct and yet still not entirely useful as we move past the memory controllers, as it is not possible to actually get that much bandwidth at once on the read side. GTX 970 can read the 3.5GB segment at 196GB/sec (7GHz * 7 ports * 32-bits), or it can read the 512MB segment at 28GB/sec, but not both at once; it is a true XOR situation. Furthermore because the 512MB segment cannot be read at the same time as the 3.5GB segment, reading this segment blocks accessing the 3.5GB segment for that cycle, further reducing the effective memory bandwidth of the card. The larger the percentage of the time the crossbar is reading the 512MB segment, the lower the effective memory bandwidth from the 3.5GB segment.
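Plugging the article's figures into a short sketch shows where the 196GB/sec, 28GB/sec, and 224GB/sec numbers come from, and how the XOR restriction drags down the blended read bandwidth:

```python
# Bandwidth figures for the GTX 970's split memory, per the article:
# 7 GHz effective GDDR5 over 32-bit ports, 8 ports total.
GHZ = 7  # Gbit/s per pin

fast = GHZ * 7 * 32 / 8   # 7 ports -> 196 GB/s (3.5GB segment)
slow = GHZ * 1 * 32 / 8   # 1 port  -> 28 GB/s  (512MB segment)
peak = GHZ * 8 * 32 / 8   # all 8   -> 224 GB/s (not reachable on reads)

def effective_read_bw(f_slow):
    """Blended read bandwidth when a fraction f_slow of crossbar cycles
    go to the 512MB segment (reads of the two segments are exclusive)."""
    return (1 - f_slow) * fast + f_slow * slow

print(fast, slow, peak)        # 196.0 28.0 224.0
print(effective_read_bw(0.1))  # 179.2 -- even 10% slow-segment traffic hurts
```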

Extremetech said:
With that said, our 4K test did pick up a potential discrepancy in Shadows of Mordor. While the frame rates were equivalently positioned at both 4K and 1080p, the frame times weren’t. The graph below shows the 1% frame times for Shadows of Mordor, meaning the worst 1% times (in milliseconds).

[Graph: Shadows of Mordor 1% frame times, GTX 970 vs GTX 980]


The 1% frame times in Shadows of Mordor are significantly worse on the GTX 970 than the GTX 980. This implies that yes, there are some scenarios in which stuttering can negatively impact frame rate and that the complaints of some users may not be without merit. However, the strength of this argument is partly attenuated by the frame rate itself — at an average of 33 FPS, the game doesn’t play particularly smoothly or well even on the GTX 980.
http://www.extremetech.com/extreme/...idias-penultimate-gpu-have-a-memory-problem/2
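The "1% frame times" metric ExtremeTech uses is just the average of the worst 1% of per-frame render times. A minimal sketch with made-up sample data shows why it catches stutter that an FPS average hides:

```python
# Sketch of the "1% frame times" metric: the mean of the worst 1% of
# per-frame render times, in milliseconds. Sample data is illustrative.
def one_percent_frametime(frametimes_ms):
    worst = sorted(frametimes_ms, reverse=True)
    n = max(1, len(worst) // 100)   # worst 1% of frames
    return sum(worst[:n]) / n

# 99 smooth frames at ~16.7 ms plus a single 100 ms hitch: the average
# FPS barely moves, but the 1% frame time exposes the stutter.
frames = [16.7] * 99 + [100.0]
print(one_percent_frametime(frames))     # 100.0 ms
print(1000 * len(frames) / sum(frames))  # avg FPS ~57, still looks fine
```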


Full article on pcper: NVIDIA Discloses Full Memory Structure and Limitations of GTX 970

pcper said:
NVIDIA’s performance labs continue to work away at finding examples of this occurring and the consensus seems to be something in the 4-6% range. A GTX 970 without this memory pool division would run 4-6% faster than the GTX 970s selling today in high memory utilization scenarios. Obviously this is something we can’t accurately test though – we don’t have the ability to run a GTX 970 without a disabled L2/ROP cluster like NVIDIA can. All we can do is compare the difference in performance between a reference GTX 980 and a reference GTX 970 and measure the differences as best we can, and that is our goal for this week.

Accessing that 500MB of memory on its own is slower. Accessing that 500MB as part of the 4GB total slows things down by 4-6%, at least according to NVIDIA. So now the difficult question: did NVIDIA lie to us?

At the very least, the company did not fully disclose the missing L2 and ROP partition on the GTX 970, even if it was due to miscommunication internally. The question “should the GTX 970 be called a 3.5GB card?” is more of a philosophical debate. There is 4GB of physical memory on the card and you can definitely access all 4GB of it when the game and operating system determine it is necessary. But 1/8th of that memory can only be accessed in a slower manner than the other 7/8ths, even if that 1/8th is 4x faster than system memory over PCI Express. NVIDIA claims that the architecture is working exactly as intended and that with competent OS heuristics the performance difference should be negligible in real-world gaming scenarios.

The configurability of the Maxwell architecture allowed NVIDIA to make this choice. Had the GeForce GTX 970 been built on the Kepler architecture, the company would have had to disable the entire L2/MC block on the right hand side, resulting in a 192-bit memory bus and a 3GB frame buffer. GM204 allows NVIDIA to expand that to a 256-bit 3.5GB/0.5GB memory configuration and offers performance advantages, obviously.

Let’s be clear – the performance of the GTX 970 is what the performance is. This information is incredibly interesting and warrants some debate, but at the end of the day, my recommendations for the GTX 970 really won’t change at all. It still offers incredible performance for your dollar and is able to run at 4K in my experience and testing. Yes, there might in fact be specific instances where performance drops are more severe because of this memory hierarchy design, but I don’t think it changes the outlook for the card as a whole.

...

For users that are attempting to measure the impact of this issue, you should be aware that in some cases the software you are using to report in-use graphics memory could be wrong. Some applications are only aware of the first "pool" of memory and may only ever show up to 3.5GB in use for a game. Other applications, including MSI Afterburner as an example, do properly report total memory usage of up to 4GB. Because of the unique allocation of memory in the system, the OS, driver, and monitoring application may not always be on the same page. Many users, like bootski over at NeoGAF, have done a good job of compiling examples where the memory issue occurs, so look around for the right tools to use to test your own GTX 970. (Side note: we are going to try to do some of our own testing this afternoon.)

NVIDIA has come clean; all that remains is the response from consumers to take hold. For those of you that read this and remain affronted by NVIDIA calling the GeForce GTX 970 a 4GB card without equivocation: I get it. But I also respectfully disagree. Should NVIDIA have been more upfront about the changes this GPU brought compared to the GTX 980? Absolutely and emphatically. But does this change the stance or position of the GTX 970 in the world of discrete PC graphics? I don’t think it does.


NVIDIA RESPONSE:


pcper said:
Editor's Note: Commentary coming soon - wanted to get this statement out ASAP.

NVIDIA has finally responded to the widespread online complaints about GeForce GTX 970 cards only utilizing 3.5GB of their 4GB frame buffer. From the horse's mouth:

The GeForce GTX 970 is equipped with 4GB of dedicated graphics memory. However the 970 has a different configuration of SMs than the 980, and fewer crossbar resources to the memory system. To optimally manage memory traffic in this configuration, we segment graphics memory into a 3.5GB section and a 0.5GB section. The GPU has higher priority access to the 3.5GB section. When a game needs less than 3.5GB of video memory per draw command then it will only access the first partition, and 3rd party applications that measure memory usage will report 3.5GB of memory in use on GTX 970, but may report more for GTX 980 if there is more memory used by other commands. When a game requires more than 3.5GB of memory then we use both segments.

We understand there have been some questions about how the GTX 970 will perform when it accesses the 0.5GB memory segment. The best way to test that is to look at game performance. Compare a GTX 980 to a 970 on a game that uses less than 3.5GB. Then turn up the settings so the game needs more than 3.5GB and compare 980 and 970 performance again.

Here’s an example of some performance data:

[Table: GTX 980 vs GTX 970 performance at settings below and above 3.5GB of VRAM use]


On Shadows of Mordor, performance drops about 24% on GTX 980 and 25% on GTX 970, a 1% difference. On Battlefield 4, the drop is 47% on GTX 980 and 50% on GTX 970, a 3% difference. On CoD: AW, the drop is 41% on GTX 980 and 44% on GTX 970, a 3% difference. As you can see, there is very little change in the performance of the GTX 970 relative to GTX 980 on these games when it is using the 0.5GB segment.
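The comparison NVIDIA is making here is between each card's own FPS drop when settings push VRAM use past 3.5GB, then the difference between those two drops. A quick sketch reproduces the arithmetic (the FPS values below are illustrative, not NVIDIA's actual numbers):

```python
# Relative-drop comparison as in NVIDIA's statement: each card's FPS
# drop at higher settings, then the gap between the two drops.
# FPS values here are made up for illustration.
def drop_pct(fps_low_settings, fps_high_settings):
    return 100 * (fps_low_settings - fps_high_settings) / fps_low_settings

gtx980 = drop_pct(72.0, 54.7)   # ~24% drop on the 980
gtx970 = drop_pct(60.2, 45.1)   # ~25% drop on the 970
print(round(gtx980), round(gtx970), round(gtx970 - gtx980))  # 24 25 1
```

Note that this is a comparison of *averages*; it says nothing about frame-to-frame consistency, which is exactly the objection raised below.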

Nope, the bandwidth on the last 0.5GB is about 90% slower, but because they used an average test it has been nicely disguised as a minor issue. Nvidia trying to pull a con with percentages.

An average FPS number tells you nothing about the low points, stuttering, and frametime inconsistencies. It's just an average, not representative of the actual performance and issues.

The only clever thing here is how Nvidia are trying to cover up the problem by using vague and imprecise data. No clever driver work can make up for a massive bandwidth deficit and the cases where stuttering occurs because of either a refusal to use more than 3.5 GBs or using more but it not being fast enough.

An AVERAGE performance delta of single-digit percentage is misleading because the instantaneous performance degradation that leads to that lower average actually results in sporadic frametimes and hence annoying tearing and judder - as people with 970s are reporting to be the case.

Nvidia is reporting the average delta so it sounds better in a PR statement.
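The objection can be made concrete with a toy example: two hypothetical frametime traces with identical averages, one steady and one alternating between fast frames and long hitches. Any average-FPS benchmark scores them identically.

```python
import statistics

# Two made-up frametime traces (ms) with the same mean: one steady,
# one swinging between fast frames and long hitches every other frame.
steady = [20.0] * 10
spiky  = [10.0, 30.0] * 5   # same average, severe frame-to-frame swings

for name, trace in (("steady", steady), ("spiky", spiky)):
    avg_fps = 1000 / statistics.mean(trace)
    print(f"{name}: avg {avg_fps:.0f} FPS, worst frame {max(trace):.0f} ms")
# Both report "50 FPS average"; only the worst-frame number
# reveals the stutter the average hides.
```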


--------------------------------- original OP follows --------------------------------

GeForce GTX 970s seem to have an issue using all 4GB of VRAM, Nvidia looking into it


This has been reported by some users for a while now, with others dismissing it or calling it into question, but now shit seems to be starting to hit the fan.

From the card's launch thread:


At first I didn't really pay attention, but after some tests and other users' reports it seems that the 970 is really gimped (or it's a driver problem, dunno)

taken from overclock.net:



the last ~700MB are really gimped.

Quick, rebrand the 970s as the 3.5GB 760Ti

Sometimes it really is too good to be true.

Though, maybe not that big a deal. Nvidia should just sell it as a 3GB card.

From Nvidia moderator on GeForce.com forums:

ManuelG said:
We are still looking into this and will have an update as soon as possible.

If I understand the issue correctly, the VRAM bandwidth tanks when you push the card to use more than ~3.5GB causing framerates to take a huge hit.

GTX 980 not affected.

If you're looking to buy a 970 right now, you might want to hold your horses until more is known about this. Hopefully it's a driver issue or something.

If it turns out to be a permanent hardware limitation, I swear this will be the last time I'll be an early adopter of anything. So tired of beta testing expensive stuff.

Edit:

You might want to add a note in the first post that this may only be affecting 970 users with Hynix memory as well as instructions for people to check to see who is their VRAM manufacturer.

Lots of manufacturers started going with Hynix after the initial batch of cards. Pretty shady shit. Samsung memory overclocks much better too.

To check what kind of memory you have, install Nvidia Inspector v1.9.7.3.
GPU-Z also tells you what make the memory on your card is.

Edit 2: Don't know what to think anymore; a lot of people are saying the benchmark shown above is flawed, while others say it's a hardware issue for sure (and that it also affects Samsung memory and 980s) ¯\_(ツ)_/¯

Edit 3: OK, everyone is going crazy running that Nai benchmark and posting their results, and I'm not sure you should bother with that if you're reading this. What seems to be happening is that unless you are running Windows on your integrated graphics (by plugging the monitor into the port on the motherboard, if it has one [and booting with no monitor plugged into your GPU?]), Windows will use up part of the VRAM, causing the driver to crash during the benchmark. Also, people who checked the source code of the program say it would only pertain to compute performance? Don't know, just trying to update the OP with new knowledge to avoid useless repeat reports. The best thing to do for now seems to be to wait, unless you know what you're doing and can bring new evidence or insight to the table.

From Durante:
Everyone should at the very least
  • disable Shadowplay and
  • make note of their prior windows/application VRAM usage levels
before running this and posting their results. But it's still not particularly meaningful if you are using the same GPU for actual desktop graphics at the same time.

Also, I second the request for source code of the benchmark, I'd like to see how it decides where an allocated chunk is in physical memory, because I don't know a reliable way to do that.
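For those wondering what a benchmark like Nai's actually does: it allocates VRAM in fixed-size chunks and then measures read bandwidth per chunk, so a slow tail of memory shows up as a cliff in the per-chunk numbers. The real tool runs CUDA kernels against VRAM; the sketch below is only a CPU-side analogy on system RAM to illustrate the structure, with arbitrary chunk counts and sizes.

```python
import time

# CPU-side analogy of a chunked bandwidth benchmark: allocate memory in
# fixed-size chunks, then time reads of each chunk separately. Nai's
# benchmark does the equivalent in CUDA against the GPU's VRAM.
CHUNK_MB = 64
chunks = [bytearray(CHUNK_MB * 2**20) for _ in range(4)]

for i, chunk in enumerate(chunks):
    t0 = time.perf_counter()
    total = sum(chunk[::4096])   # touch one byte per 4 KB page
    dt = time.perf_counter() - t0
    print(f"chunk {i}: read in {dt * 1000:.2f} ms")
# On a GTX 970, chunks landing in the final 512MB of VRAM report far
# lower bandwidth than the rest -- that asymmetry is the signal.
```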

Update:
OP needs to put this quote from GURU3D in

64-bit memory controllers, 4 in total = 256-bit memory bus.
Assuming 3 of the raster engines each have one SMM disabled, leaving 1 raster engine with all 4 SMMs intact.
Mathematically:
16 SMM = 256-bit = 4096 MB
13 SMM = 208-bit = 3328 MB

208-bit = effective width after disabling SMMs, with 256-bit being the actual memory controller width

I just can't get over the fact that they sold a 3gb 208bit card as 4gb 256bit

Or not:
Well this is an encouraging result..

http://www.overclock.net/t/1535502/gtx-970s-can-only-use-3-5gb-of-4gb-vram-issue/300#post_23448215

I don't think it has anything to do with the number of SMM's being cut as my 980M (which is cut down further to 12 SMM's vs 970's 13 SMM's) doesn't appear to be impacted by this nasty bug.

I hope this is resolved for you guys and gals ASAP. I hope it is a software issue only.

GTX 980M 4GB, stock clocks and 344.75 driver:
Maybe it isn't hardware afterall.

This looks pretty damning to me:

It's not that the 970s can't ever use more than 3.5 GBs, it's just that most of the time even when it could benefit from it, it doesn't. Extreme situations such as 8xMSAA with Ultra textures on AC Unity is enough to force it to use more, but it's not performing smoothly in that case either and most of the time my 970s try to avoid overfilling 3.5 whereas a 980 would just use more and have less/no stuttering.

Every time I've ever exceeded ~3600 MBs in any game or just pushed up against that limit, I get similar stuttering regardless of by how much.
No, the bench shows the last 500MB can only be accessed at degraded bandwidth, hence the 970 driver chooses to use it only as a last resort. And when it does, the whole game stutters badly.

The driver is probably set up like this because Nvidia knew the hardware was gimped.

This is why in Watch Dogs the 980 allocated 4GB while the 970 allocated 3.5GB at the same settings.

Users also note that as you pass 3.5GB usage, GPU usage drops too, indicating that the low-bandwidth 500MB is bottlenecking the GPU core.

Update:
This is starting to pick up more steam. It'll be interesting to see bigger sites running tests.

guru3d: Does the GeForce GTX 970 have a memory allocation bug?
TechPowerUp: GeForce GTX 970 Design Flaw Caps Video Memory Usage to 3.3 GB: Report

They don't really add anything that wasn't said in this thread already, but Guru3D has a customized version of the benchmark and confirms that the 970 shows the issue while the 980 doesn't.
 

AJLma

Member
Been crushing everything over here from downsampled 4K to 1440p at 60FPS, so whatever the issue is probably isn't that major.
 
970 G1 Gaming here, and while the performance was good, it wasn't as much of a jump as I expected.
Satisfied, yes, but I expected more, though I thought it was a driver maturity issue.
 
I JUST bought one.

damnit. Imma cry in the corner real quick

Just pretend you don't know about it and you'll feel better. The cards obviously still run well despite this since it's taken this long for anyone to notice (they came out in what, September 2014?)

If it's really gimped I'll ask for a refund since it was false advertising.

Wait until we see if it can be fixed with an update before we get the pitchforks out.
 
Been crushing everything over here from downsampled 4K to 1440p at 60FPS, so whatever the issue is probably isn't that major.

Your magic card be blessed. Its a pretty major problem for a lot of people though. It manifested when I was using my brother's PC and I did not understand why until just now.
 
Once my card hits 3.6GB things start getting choppy. I thought the memory usage was being mis-reported but I'm pretty sure this is the issue I'm having.
 

Ryne

Member
I bought the 970 FTW+ knowing this. Hopefully Nvidia can get it fixed (however apparently it is a hardware issue), but for me it is a stop-gap card until I build my new PC.

This issue sucks though, especially for such a highly regarded card.
 
D

Deleted member 17706

Unconfirmed Member
Once my card hits 3.6GB things start getting choppy. I thought the memory usage was being mis-reported but I'm pretty sure this is the issue I'm having.

What games have you experienced this in?
 
I had a very weird thing happen today, unity usually only uses 3.5gb vram for me but i heard about this and activated msaa 4x and it used all my vram 4gb, then i put back on fxaa and it was still using my 4gb vram and the framerate was actually a little higher.
 

SpotAnime

Member
If they can't fix this with a firmware update, there's no way they won't issue a recall or replacement program, right? RIGHT?

And how come all the hard tech reviews didn't catch this at launch?
 

Dryk

Member
Well my nebulous plan to hold out for the higher memory 900s just solidified

If they can't fix this with a firmware update, there's no way they won't issue a recall or replacement program, right? RIGHT?

And how come all the hard tech reviews didn't catch this at launch?
Worst case scenario they could do what Sony did with the Cell chips and seal off the last 800MB of VRAM, couldn't they?
 

hoserx

Member
the posts before that where the guy seemingly made up some fake dialog box to prove his point........ I died

 

Orayn

Member
I am suddenly very glad that I've been dragging my feet on a planned upgrade to this exact card.

Suppose I'd also be willing to wait for a 960Ti based on the 970/980, if that's potentially happening still.
 

Vamphuntr

Member
If they can't fix this with a firmware update, there's no way they won't issue a recall or replacement program, right? RIGHT?

And how come all the hard tech reviews didn't catch this at launch?

A replacement program worldwide would cost a fortune if the problem is in the hardware.
 
A replacement program worldwide would cost a fortune if the problem is in the hardware.

It would probably cost less in the long run than a class action, though. Although it would depend on how they set up the replacement program.

EDIT: Not saying that it'll get that far, but if ends up being a hardware issue and they don't make efforts to bring it back up to a useable 4GB -- whether that's via replacements or something else -- it might very well end in a lawsuit.

Hopefully it's just a driver issue, though.
 

Zukuu

Banned
Maybe I'll hold off of buying one for now. Not that I wanted to get one immediately anyway.
Now I just have one more reason to wait a few more months.
 
What games have you experienced this in?

Shadow of Mordor with the ultra texture pack taps at around 3.6GB like I mentioned and then starts stuttering when you pan the camera.

Haven't really played any eye candy games since buying the card now that I think about it.
 

Nzyme32

Member
You're lucky to be living in the first world (I assume). I highly doubt the local retailer I bought this from would acknowledge this as a cause for return :(

Since the issue isn't your fault and is known, and if Nvidia can't fix this easily, you should be sorted under warranty, which everyone will have whether it's a reference card or not. Wait for more info.
 

Hip Hop

Member
wow.

I was about to buy one soon. Hope it gets sorted out and it's actually a driver issue or something.
 
D

Deleted member 17706

Unconfirmed Member
Shadow of Mordor with the ultra texture pack taps at around 3.6GB like I mentioned and then starts stuttering when you pan the camera.

Haven't really played any eye candy games since buying the card now that I think about it.

Hmmm... I've got a 970, too, and I don't recall experiencing this in Mordor. I actually went from an AMD 7970 (3GB) to a 970 midway through the game and it obviously ran a lot better on the 970, even after turning on Ultra textures. I didn't get any weird stuttering, but I definitely experienced some tearing.

Then again, I wasn't monitoring VRAM usage, so I can't be sure if I ever went to 3.6GB or higher.
 

jfoul

Member
Well, this memory issue actually being the real deal is disturbing. I just bought the 970 FTW+ to replace my Gigabyte G1.

How could Nvidia miss this issue before releasing the card? If this issue can't be fixed through drivers or firmware, Nvidia will need to possibly recall the cards. This will put the board vendors under heavy fire.

I'm at least in good hands with EVGA.
 

Derpcrawler

Member
I just ordered parts for new PC on X99 platform and bought 2 GTX970 for SLI. Should I just change to one GTX980 for now?

Where I live (China), replacement might be problematic with manufacturers and take months.
 

Negator

Member
I JUST bought this card. What the hell Nvidia!

Hope it's a driver issue or something. If it's hardware I might consider just sending it back.
 

LilJoka

Member
No way can they leave this unfixed, it's either drivers or hardware, and if it's hardware and there isn't a recall then I expect a humongous backlash.
 
Good thing I bought a PS4 instead of this card last year (inb4 you get mad: the PS4 was $350 CAD for the Destiny bundle vs. $450 CAD for the GTX 970). My GTX 570 can continue to handle the business.
 