The RTX 3000 cards are built on an architecture NVIDIA calls "Ampere," and its SM, in some ways, combines the Pascal and Turing approaches. Ampere keeps the 64 FP32 cores as before, but the 64 other cores are now designated as "FP32 and INT32." So, half the Ampere cores are dedicated to floating-point, but the other half can perform either floating-point or integer math, just like in Pascal.
With this switch, NVIDIA is now counting each SM as containing 128 FP32 cores, rather than the 64 that Turing had. The 3070's "5,888 CUDA cores" are perhaps better described as "2,944 CUDA cores, and 2,944 cores that can be CUDA."
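To make that counting concrete, here is a minimal sketch of the arithmetic in Python. It uses only the figures quoted above (128 counted cores per SM, split 64/64, and 5,888 advertised cores); the variable names are purely illustrative, and the SM count is derived from those figures rather than taken from a spec sheet.

```python
# Figures from the text: NVIDIA counts 128 FP32-capable cores per Ampere SM,
# 64 of them FP32-only and 64 able to run either FP32 or INT32.
CORES_PER_SM = 128
FP32_ONLY_PER_SM = 64
FP32_OR_INT32_PER_SM = 64

ADVERTISED_CUDA_CORES = 5888  # the RTX 3070 figure quoted above

# Derive the SM count implied by the advertised total.
sm_count, remainder = divmod(ADVERTISED_CUDA_CORES, CORES_PER_SM)

fp32_only = sm_count * FP32_ONLY_PER_SM
fp32_or_int32 = sm_count * FP32_OR_INT32_PER_SM

print(f"SMs implied by the advertised count: {sm_count}")
print(f"Dedicated FP32 cores: {fp32_only}")
print(f"FP32-or-INT32 cores:  {fp32_or_int32}")
```

The split comes out even because every SM contributes the same 64/64 mix, which is where the "2,944 and 2,944" description comes from.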
As games have become more complex, developers have begun to lean more heavily on integers. An NVIDIA slide from the original 2018 RTX launch suggested that integer math, on average, made up about a quarter of in-game GPU operations.
The downside of the Turing SM is the potential for under-utilization. If, for example, a workload is 25 percent integer math, the dedicated integer cores run out of work long before the floating-point cores do, and a good share of the GPU’s cores could be sitting around with nothing to do. That’s the thinking behind this new semi-unified core structure, and, on paper, it makes a lot of sense: You can still run integer and floating-point operations simultaneously, but when those integer-capable cores are dormant, they can run floating-point instead.
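A rough back-of-the-envelope model shows why this looks good on paper. The Python sketch below is an idealized throughput estimate, not a description of how the hardware actually schedules work: it assumes every core retires one operation per cycle, that work divides perfectly across cores, and that nothing else (memory, dispatch limits, dependencies) gets in the way. The function names and the 7,500/2,500 operation mix are made up for illustration.

```python
def cycles_split(fp_ops, int_ops, fp_cores=64, int_cores=64):
    """Turing-style SM: FP32 work runs only on FP32 cores, INT32 only on INT32 cores."""
    return max(fp_ops / fp_cores, int_ops / int_cores)


def cycles_shared(fp_ops, int_ops, fp_cores=64, flex_cores=64):
    """Ampere-style SM: INT32 work needs the flexible cores, FP32 work can run on any core."""
    int_bound = int_ops / flex_cores
    total_bound = (fp_ops + int_ops) / (fp_cores + flex_cores)
    return max(int_bound, total_bound)


def idle_fraction(fp_ops, int_ops, cycles, total_cores=128):
    """Share of core-cycles that went unused while the workload ran."""
    return 1 - (fp_ops + int_ops) / (cycles * total_cores)


# Hypothetical workload: 25 percent integer, 75 percent floating-point.
fp_ops, int_ops = 7500, 2500

split = cycles_split(fp_ops, int_ops)
shared = cycles_shared(fp_ops, int_ops)

print(f"Split SM:  {split:.1f} cycles, {idle_fraction(fp_ops, int_ops, split):.0%} idle")
print(f"Shared SM: {shared:.1f} cycles, {idle_fraction(fp_ops, int_ops, shared):.0%} idle")
print(f"Idealized speedup: {split / shared:.2f}x")
```

Under these assumptions, the split design leaves about a third of its core-cycles unused on a 25 percent integer mix, while the shared design keeps every core busy. Real games will land well short of that ideal, since the model ignores everything except raw core counts.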