hu
2020-08-25T14:05:58+00:00
NVIDIA's official Ampere architecture whitepaper:
[url]https://www.nvidia.com/content/dam/en-zz/Solutions/geforce/ampere/pdf/NVIDIA-ampere-GA102-GPU-Architecture-Whitepaper-V1.pdf[/url]
Ampere GPU Architecture In-Depth
GPC, TPC, and SM High-Level Architecture
Like prior NVIDIA GPUs, GA102 is composed of Graphics Processing Clusters (GPCs), Texture
Processing Clusters (TPCs), Streaming Multiprocessors (SMs), Raster Operators (ROPs), and
memory controllers. The full GA102 GPU contains seven GPCs, 42 TPCs, and 84 SMs.
The GPC is the dominant high-level hardware block with all of the key graphics processing units
residing inside the GPC. Each GPC includes a dedicated Raster Engine, and now also includes
two ROP partitions (each partition containing eight ROP units), which is a new feature for
NVIDIA Ampere Architecture GA10x GPUs and described in more detail below. The GPC
includes six TPCs that each include two SMs and one PolyMorph Engine.
Note: The GA102 GPU also features 168 FP64 units (two per SM), which are not depicted in this
diagram. The FP64 TFLOP rate is 1/64th the TFLOP rate of FP32 operations. The small number of FP64
hardware units are included to ensure any programs with FP64 code operate correctly, including FP64
Tensor Core code.
Figure 2. GA102 Full GPU with 84 SMs
Each SM in GA10x GPUs contains 128 CUDA Cores, four third-generation Tensor Cores, a 256
KB Register File, four Texture Units, one second-generation Ray Tracing Core, and 128 KB of
L1/Shared Memory, which can be configured for differing capacities depending on the needs of
the compute or graphics workloads.
The memory subsystem of GA102 consists of twelve 32-bit memory controllers (384-bit total).
512 KB of L2 cache is paired with each 32-bit memory controller, for a total of 6144 KB on the
full GA102 GPU.
128 x 84 = 10752 SPs. Ironclad proof.
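The totals in the excerpt all follow from the per-unit counts it quotes. A quick sanity check in Python, as a sketch only: the constant and function names are made up for illustration, and the ROP total is derived here from the per-GPC figures rather than stated in the excerpt.

```python
# Per-unit counts quoted in the whitepaper excerpt (full GA102 die).
GPCS = 7
TPCS_PER_GPC = 6
SMS_PER_TPC = 2
CUDA_CORES_PER_SM = 128
FP64_UNITS_PER_SM = 2
ROP_PARTITIONS_PER_GPC = 2
ROPS_PER_PARTITION = 8
MEMORY_CONTROLLERS = 12
BITS_PER_CONTROLLER = 32
L2_KB_PER_CONTROLLER = 512


def ga102_totals() -> dict:
    """Multiply the per-unit counts out to whole-chip totals."""
    tpcs = GPCS * TPCS_PER_GPC
    sms = tpcs * SMS_PER_TPC
    return {
        "tpcs": tpcs,
        "sms": sms,
        "cuda_cores": sms * CUDA_CORES_PER_SM,
        "fp64_units": sms * FP64_UNITS_PER_SM,
        # Derived, not quoted: 7 GPCs x 2 partitions x 8 ROPs.
        "rops": GPCS * ROP_PARTITIONS_PER_GPC * ROPS_PER_PARTITION,
        "bus_width_bits": MEMORY_CONTROLLERS * BITS_PER_CONTROLLER,
        "l2_kb": MEMORY_CONTROLLERS * L2_KB_PER_CONTROLLER,
    }


if __name__ == "__main__":
    for name, value in ga102_totals().items():
        print(f"{name}: {value}")
```

The results match the whitepaper's own figures: 42 TPCs, 84 SMs, 168 FP64 units, a 384-bit bus, and 6144 KB of L2, plus the 10752 SPs claimed in the post.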
[quote][pid=449978211,23220349,1]Reply[/pid] Post by [uid=39658527]苹果质量效应[/uid] (2020-09-03 22:14):
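How much of a performance gain can 256 extra SPs bring? [s:ac:汗][/quote]Look at the 980 Ti vs. the Titan X, the Titan P/1080 Ti vs. the Titan Xp, and the 2080 Ti vs. the RTX Titan: each gained nearly 10%.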
[quote][pid=449978644,23220349,1]Reply[/pid] Post by [uid=62480205]起司块wii[/uid] (2020-09-03 22:17):
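So they cut a whole new card just for this 2.44% gain? That really is splitting hairs. The gap is even smaller than the one between the 2080 and the 2080S.[/quote]Compare the original Titan to the 780 Ti/Titan Black, the Titan P/1080 Ti to the Titan Xp, and the 2080 Ti to the RTX Titan: raise the clocks and the performance goes up.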
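The 2.44% figure is consistent with 256 extra SPs on top of a baseline of 10752 - 256 = 10496 SPs. A minimal sketch of that arithmetic, assuming this is the comparison the posters had in mind; the function and constant names are hypothetical.

```python
FULL_DIE_SPS = 10752  # 128 x 84, from the whitepaper excerpt
EXTRA_SPS = 256       # figure quoted in the thread


def core_count_uplift(full: int, extra: int) -> float:
    """Percentage increase in SP count going from (full - extra) to full."""
    baseline = full - extra  # assumed baseline: 10496 SPs
    return extra / baseline * 100


if __name__ == "__main__":
    print(f"{core_count_uplift(FULL_DIE_SPS, EXTRA_SPS):.2f}%")  # prints 2.44%
```

This only measures the change in core count. Any real-world gain also depends on clocks, power limits, and memory, which is exactly what the replies are arguing about.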
2022-04-01 10:09
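[quote][pid=452160787,23220349,4]Reply[/pid] Post by [uid=7926267]Gseed[/uid] (2020-09-13 18:15):Flatly extrapolating a 10% gain, really? The RTX Titan's specs are 8% above the 2080 Ti's, and the 80 Ti has 40% more CUDA cores than the 80 yet is only about 30% faster. And you turn this 2% CUDA increase into 10%? I guess your "10%" means 1%, huh?[/quote][url]https://news.mydrivers.com/1/823/823191_4.htm[/url]
MyDrivers' review results: it really did gain about 10%~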
2022-04-01 22:26
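[quote][pid=599899232,23220349,4]Reply[/pid] Post by [uid=61190429]果然这世间不存在法[/uid] (2022-04-01 13:01):Prophet [s:ac:goodjob][/quote]Benchmark results from my Colorful RTX 3090 Ti Vulcan:
CPU: 9900KS
Motherboard: ROG Z390 Strix-H
Memory: Kingston DDR4-3733 RGB kit, 32 GB (2 x 16 GB, dual channel)
Storage: 905P 480 GB
Cooler: Thermalright Silver Arrow IB-E
PSU: Great Wall 猎金部落 G20 (2000 W!)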
[img]https://img.nga.178.com/attachments/mon_202204/01/9aQ8mls-7m0vZlT3cSn7-fa.jpg[/img]
[img]https://img.nga.178.com/attachments/mon_202204/01/9aQ8mls-by1cZaT3cSog-fj.jpg[/img]
[img]https://img.nga.178.com/attachments/mon_202204/01/9aQ8mls-4yqZhT3cSsg-ku.jpg[/img]
2022-07-23 15:52
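[quote][pid=450097722,23220349,2]Reply[/pid] Post by [uid=7177419]TurboWalker[/uid] (2020-09-04 12:47):If you say there will be a GA102 Titan, then I'm telling you there will definitely be a GA100 Titan with cut-down FP64 to hammer it. Whether it's the 3090 or a GA102 Titan, it's all just a fight for the flagship crown. The ceiling is right there, yet you insist on working out how many tiers get sliced in between.[/quote]The 40 series is almost here, so where exactly is that "GA100 Titan" you were talking about?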