Last active
March 5, 2026 12:50
| vkperf (0.99.5) tests various performance characteristics of Vulkan devices. | |
| Devices in the system: | |
| AMD Radeon Graphics (RADV RENOIR) | |
| NVIDIA GeForce RTX 4070 Ti SUPER | |
| llvmpipe (LLVM 19.1.7, 256 bits) | |
| Selected device: | |
| AMD Radeon Graphics (RADV RENOIR) | |
| VendorID: 0x1002 (AMD/ATI) | |
| DeviceID: 0x1638 | |
| Vulkan version: 1.4.305 | |
| Driver version: 25.0.5 (104857605, 0x6400005) | |
| Driver name: radv | |
| Driver info: Mesa 25.0.5 | |
| DriverID: MesaRadv | |
| Driver conformance version: 1.4.0.0 | |
| GPU memory: 10GiB (10718MiB) | |
| Max memory allocations: 4294967295 | |
| Standard (non-sparse) buffer alignment: 16 | |
| Number of triangles for tests: 100000 | |
| Sparse mode for tests: None | |
| Timestamp number of bits: 64 | |
| Timestamp period: 10ns | |
| Vulkan Instance version: 1.4.328 | |
| Operating system: < unknown, non-Windows > | |
| Processor: AMD Ryzen 7 5700G with Radeon Graphics | |
| Triangle throughput: | |
| Triangle list (triangle list primitive type, | |
| single per-scene vkCmdDraw() call, attributeless, | |
| constant VS output): 759.6 mega-triangles/s | |
| Indexed triangle list (triangle list primitive type, single | |
| per-scene vkCmdDrawIndexed() call, no vertices shared between triangles, | |
| attributeless, constant VS output): 758.4 mega-triangles/s | |
| Indexed triangle list that reuses two indices of the previous triangle | |
| (triangle list primitive type, single per-scene vkCmdDrawIndexed() call, | |
| attributeless, constant VS output): 1.262 giga-triangles/s | |
| Triangle strips of various lengths | |
| (per-strip vkCmdDraw() call, 1 to 1000 triangles per strip, | |
| attributeless, constant VS output): | |
| strip length 1: 70.63 mega-triangles/s | |
| strip length 2: 139.7 mega-triangles/s | |
| strip length 5: 345.9 mega-triangles/s | |
| strip length 8: 541.2 mega-triangles/s | |
| strip length 10: 666.6 mega-triangles/s | |
| strip length 20: 1.322 giga-triangles/s | |
| strip length 25: 1.495 giga-triangles/s | |
| strip length 40: 1.798 giga-triangles/s | |
| strip length 50: 1.872 giga-triangles/s | |
| strip length 100: 2.039 giga-triangles/s | |
| strip length 125: 2.076 giga-triangles/s | |
| strip length 1000: 2.216 giga-triangles/s | |
| Indexed triangle strips of various lengths | |
| (per-strip vkCmdDrawIndexed() call, 1-1000 triangles per strip, | |
| no vertices shared between strips, each index used just once, | |
| attributeless, constant VS output): | |
| strip length 1: 70.78 mega-triangles/s | |
| strip length 2: 140.1 mega-triangles/s | |
| strip length 5: 346.6 mega-triangles/s | |
| strip length 8: 543.1 mega-triangles/s | |
| strip length 10: 668.9 mega-triangles/s | |
| strip length 20: 1.326 giga-triangles/s | |
| strip length 25: 1.626 giga-triangles/s | |
| strip length 40: 2.027 giga-triangles/s | |
| strip length 50: 2.140 giga-triangles/s | |
| strip length 100: 2.140 giga-triangles/s | |
| strip length 125: 2.173 giga-triangles/s | |
| strip length 1000: 2.214 giga-triangles/s | |
| Primitive restart indexed triangle strips of various lengths | |
| (single per-scene vkCmdDrawIndexed() call, 1-1000 triangles per strip, | |
| no vertices shared between strips, each index used just once, | |
| attributeless, constant VS output): | |
| strip length 1: 957.4 mega-triangles/s | |
| strip length 2: 1.508 giga-triangles/s | |
| strip length 5: 2.200 giga-triangles/s | |
| strip length 8: 2.202 giga-triangles/s | |
| strip length 1000: 2.202 giga-triangles/s | |
| Primitive restart, each triangle is replaced by one -1 | |
| (single per-scene vkCmdDrawIndexed() call, | |
| no fragments produced): 3.654 giga-triangles/s | |
| Primitive restart, only zeros in the index buffer | |
| (single per-scene vkCmdDrawIndexed() call, | |
| no fragments produced): 756.2 mega-triangles/s | |
| Instancing throughput of vkCmdDraw() | |
| (one triangle per instance, constant VS output, one draw call, | |
| attributeless): 759.4 mega-triangles/s | |
| Instancing throughput of vkCmdDrawIndexed() | |
| (one triangle per instance, constant VS output, one draw call, | |
| attributeless): 758.4 mega-triangles/s | |
| Instancing throughput of vkCmdDrawIndirect() | |
| (one triangle per instance, one indirect draw call, | |
| one indirect record, attributeless): 755.5 mega-triangles/s | |
| Instancing throughput of vkCmdDrawIndexedIndirect() | |
| (one triangle per instance, one indirect draw call, | |
| one indirect record, attributeless): 754.8 mega-triangles/s | |
| vkCmdDraw() throughput | |
| (per-triangle vkCmdDraw() in command buffer, | |
| attributeless, constant VS output): 70.64 mega-triangles/s | |
| vkCmdDrawIndexed() throughput | |
| (per-triangle vkCmdDrawIndexed() in command buffer, | |
| attributeless, constant VS output): 70.75 mega-triangles/s | |
| VkDrawIndirectCommand processing throughput | |
| (per-triangle VkDrawIndirectCommand, one vkCmdDrawIndirect() call, | |
| attributeless): 24.43 mega-indirectRecords/s | |
| VkDrawIndirectCommand processing throughput with stride 32 | |
| (per-triangle VkDrawIndirectCommand, one vkCmdDrawIndirect() call, | |
| attributeless): 24.43 mega-indirectRecords/s | |
| VkDrawIndexedIndirectCommand processing throughput | |
| (per-triangle VkDrawIndexedIndirectCommand, | |
| 1x vkCmdDrawIndexedIndirect() call, | |
| attributeless): 23.43 mega-indirectRecords/s | |
| VkDrawIndexedIndirectCommand processing throughput with stride 32 | |
| (per-triangle VkDrawIndexedIndirectCommand, | |
| 1x vkCmdDrawIndexedIndirect() call, | |
| attributeless): 18.49 mega-indirectRecords/s | |
| Vertex and geometry shader throughput: | |
| VS throughput using vkCmdDraw() - minimal VS that just writes | |
| constant output position (per-scene vkCmdDraw() call, | |
| no attributes, no fragments produced): 2.278 giga-vertices/s | |
| VS throughput using vkCmdDrawIndexed() - minimal VS that just writes | |
| constant output position (per-scene vkCmdDrawIndexed() call, | |
| no attributes, no fragments produced): 2.275 giga-vertices/s | |
| VS producing output position from VertexIndex and InstanceIndex | |
| using vkCmdDraw() (single per-scene vkCmdDraw() call, | |
| attributeless, no fragments produced): 2.278 giga-vertices/s | |
| VS producing output position from VertexIndex and InstanceIndex | |
| using vkCmdDrawIndexed() (single per-scene vkCmdDrawIndexed() call, | |
| attributeless, no fragments produced): 2.274 giga-vertices/s | |
| GS one triangle in and no triangle out | |
| (empty VS, attributeless): 759.4 mega-invocations/s | |
| GS one triangle in and single constant triangle out | |
| (empty VS, attributeless): 438.4 mega-invocations/s | |
| GS one triangle in and two constant triangles out | |
| (empty VS, attributeless): 316.9 mega-invocations/s | |
| Attributes and buffers: | |
| One attribute performance - 1x vec4 attribute | |
| (attribute used, per-scene draw call): 2.239 giga-vertices/s | |
| One buffer performance - 1x vec4 buffer | |
| (1x read in VS, per-scene draw call): 2.237 giga-vertices/s | |
| One buffer performance - 1x vec3 buffer | |
| (1x read in VS, one draw call): 2.262 giga-vertices/s | |
| Two attributes performance - 2x vec4 attribute | |
| (both attributes used): 1.518 giga-vertices/s | |
| Two buffers performance - 2x vec4 buffer | |
| (both buffers read in VS): 1.384 giga-vertices/s | |
| Two buffers performance - 2x vec3 buffer | |
| (both buffers read in VS): 2.024 giga-vertices/s | |
| Two interleaved attributes performance - 2x vec4 | |
| (2x vec4 attribute fetched from the single buffer in VS | |
| from consecutive buffer locations): 1.507 giga-vertices/s | |
| Two interleaved buffers performance - 2x vec4 | |
| (2x vec4 fetched from the single buffer in VS | |
| from consecutive buffer locations): 1.508 giga-vertices/s | |
| Packed buffer performance - 1x buffer using 32-byte struct unpacked | |
| into position+normal+color+texCoord: 1.502 giga-vertices/s | |
| Packed attribute performance - 2x uvec4 attribute unpacked | |
| into position+normal+color+texCoord: 1.524 giga-vertices/s | |
| Packed buffer performance - 2x uvec4 buffers unpacked | |
| into position+normal+color+texCoord: 1.521 giga-vertices/s | |
| Packed buffer performance - 2x buffer using 16-byte struct unpacked | |
| into position+normal+color+texCoord: 1.541 giga-vertices/s | |
| Packed buffer performance - 2x buffer using 16-byte struct | |
| read multiple times and unpacked | |
| into position+normal+color+texCoord: 1.529 giga-vertices/s | |
| Four attributes performance - 4x vec4 attribute | |
| (all attributes used): 790.1 mega-vertices/s | |
| Four buffers performance - 4x vec4 buffer | |
| (all buffers read in VS): 788.8 mega-vertices/s | |
| Four buffers performance - 4x vec3 buffer | |
| (all buffers read in VS): 1.041 giga-vertices/s | |
| Four interleaved attributes performance - 4x vec4 | |
| (4x vec4 fetched from the single buffer | |
| on consecutive locations): 787.8 mega-vertices/s | |
| Four interleaved buffers performance - 4x vec4 | |
| (4x vec4 fetched from the single buffer | |
| on consecutive locations): 790.8 mega-vertices/s | |
| Four attributes performance - 2x vec4 and 2x uint attribute | |
| (2x vec4f32 + 2x vec4u8, 2x conversion from vec4u8 | |
| to vec4): 1.250 giga-vertices/s | |
| Transformations: | |
| Matrix performance - one matrix as uniform for all triangles | |
| (matrix read in VS, | |
| coordinates in vec4 attribute): 2.236 giga-vertices/s | |
| Matrix performance - per-triangle matrix in buffer | |
| (different matrix read for each triangle in VS, | |
| coordinates in vec4 attribute): 1.255 giga-vertices/s | |
| Matrix performance - per-triangle matrix in attribute | |
| (triangles are instanced and each triangle receives a different matrix, | |
| coordinates in vec4 attribute): 2.068 giga-vertices/s | |
| Matrix performance - one matrix in buffer for all triangles and 2x uvec4 | |
| packed attributes (each triangle reads matrix from the same place in | |
| the buffer, attributes unpacked): 1.384 giga-vertices/s | |
| Matrix performance - per-triangle matrix in the buffer and 2x uvec4 packed | |
| attributes (each triangle reads a different matrix from a buffer, | |
| attributes unpacked): 888.4 mega-vertices/s | |
| Matrix performance - per-triangle matrix in buffer and 2x uvec4 packed | |
| buffers (each triangle reads a different matrix from a buffer, | |
| packed buffers unpacked): 930.4 mega-vertices/s | |
| Matrix performance - GS reads per-triangle matrix from buffer and 2x uvec4 | |
| packed buffers (each triangle reads a different matrix from a buffer, | |
| packed buffers unpacked in GS): 625.5 mega-vertices/s | |
| Matrix performance - per-triangle matrix in buffer and four attributes | |
| (each triangle reads a different matrix from a buffer, | |
| 4x vec4 attribute): 593.7 mega-vertices/s | |
| Matrix performance - 1x per-triangle matrix in buffer, 2x uniform matrix and | |
| 2x uvec4 packed attributes (uniform view and projection matrices | |
| multiplied with per-triangle model matrix and with unpacked attributes of | |
| position, normal, color and texCoord): 880.3 mega-vertices/s | |
| Matrix performance - 2x per-triangle matrix (mat4+mat3) in buffer, | |
| 3x uniform matrix (mat4+mat4+mat3) and 2x uvec4 packed attributes | |
| (full position and normal computation with MVP and normal matrices, | |
| all matrices and attributes multiplied): 678.5 mega-vertices/s | |
| Matrix performance - 2x per-triangle matrix (mat4+mat3) in buffer, | |
| 2x non-changing matrix (mat4+mat4) in push constants, | |
| 1x constant matrix (mat3) and 2x uvec4 packed attributes (all | |
| matrices and attributes multiplied): 668.5 mega-vertices/s | |
| Matrix performance - 2x per-triangle matrix (mat4+mat3) in buffer, 2x | |
| non-changing matrix (mat4+mat4) in specialization constants, 1x constant | |
| matrix (mat3) defined by VS code and 2x uvec4 packed attributes (all | |
| matrices and attributes multiplied): 693.0 mega-vertices/s | |
| Matrix performance - 2x per-triangle matrix (mat4+mat3) in buffer, | |
| 3x constant matrix (mat4+mat4+mat3) defined by VS code and | |
| 2x uvec4 packed attributes (all matrices and attributes | |
| multiplied): 697.4 mega-vertices/s | |
| Matrix performance - GS five matrices processing, 2x per-triangle matrix | |
| (mat4+mat3) in buffer, 3x uniform matrix (mat4+mat4+mat3) and | |
| 2x uvec4 packed attributes passed through VS (all matrices and | |
| attributes multiplied): 509.2 mega-vertices/s | |
| Matrix performance - GS five matrices processing, 2x per-triangle matrix | |
| (mat4+mat3) in buffer, 3x uniform matrix (mat4+mat4+mat3) and | |
| 2x uvec4 packed data read from buffer in GS (all matrices and attributes | |
| multiplied): 521.1 mega-vertices/s | |
| Textured Phong and Matrix performance - 2x per-triangle matrix | |
| in buffer (mat4+mat3), 3x uniform matrix (mat4+mat4+mat3) and | |
| four attributes (vec4f32+vec3f32+vec4u8+vec2f32), | |
| no fragments produced: 618.3 mega-vertices/s | |
| Textured Phong and Matrix performance - 1x per-triangle matrix | |
| in buffer (mat4), 2x uniform matrix (mat4+mat4) and | |
| four attributes (vec4f32+vec3f32+vec4u8+vec2f32), | |
| no fragments produced: 807.6 mega-vertices/s | |
| Textured Phong and Matrix performance - 1x per-triangle matrix | |
| in buffer (mat4), 2x uniform matrix (mat4+mat4) and 2x uvec4 packed | |
| attribute, no fragments produced: 866.7 mega-vertices/s | |
| Textured Phong and Matrix performance - 1x per-triangle row-major matrix | |
| in buffer (mat4), 2x uniform not-row-major matrix (mat4+mat4), | |
| 2x uvec4 packed attributes, | |
| no fragments produced: 949.0 mega-vertices/s | |
| Textured Phong and Matrix performance - 1x per-triangle mat4x3 matrix | |
| in buffer, 2x uniform matrix (mat4+mat4) and 2x uvec4 packed attributes, | |
| no fragments produced: 1.045 giga-vertices/s | |
| Textured Phong and Matrix performance - 1x per-triangle row-major mat4x3 | |
| matrix in buffer, 2x uniform matrix (mat4+mat4), 2x uvec4 packed | |
| attribute, no fragments produced: 1.045 giga-vertices/s | |
| Textured Phong and PAT performance - PAT v1 (Position-Attitude-Transform, | |
| performing translation (vec3) and rotation (quaternion as vec4) using | |
| implementation 1), PAT is per-triangle 2x vec4 in buffer, | |
| 2x uniform matrix (mat4+mat4), 2x uvec4 packed attributes, | |
| no fragments produced: 1.176 giga-vertices/s | |
| Textured Phong and PAT performance - PAT v2 (Position-Attitude-Transform, | |
| performing translation (vec3) and rotation (quaternion as vec4) using | |
| implementation 2), PAT is per-triangle 2x vec4 in buffer, | |
| 2x uniform matrix (mat4+mat4), 2x uvec4 packed attributes, | |
| no fragments produced: 1.177 giga-vertices/s | |
| Textured Phong and PAT performance - PAT v3 (Position-Attitude-Transform, | |
| performing translation (vec3) and rotation (quaternion as vec4) using | |
| implementation 3), PAT is per-triangle 2x vec4 in buffer, | |
| 2x uniform matrix (mat4+mat4), 2x uvec4 packed attributes, | |
| no fragments produced: 1.174 giga-vertices/s | |
| Textured Phong and PAT performance - constant single PAT v2 sourced from | |
| the same index in buffer (vec3+vec4), 2x uniform matrix (mat4+mat4), | |
| 2x uvec4 packed attributes, | |
| no fragments produced: 1.404 giga-vertices/s | |
| Textured Phong and PAT performance - indexed draw call, per-triangle PAT v2 | |
| in buffer (vec3+vec4), 2x uniform matrix (mat4+mat4), 2x uvec4 packed | |
| attribute, no fragments produced: 1.089 giga-vertices/s | |
| Textured Phong and PAT performance - indexed draw call, constant single | |
| PAT v2 sourced from the same index in buffer (vec3+vec4), | |
| 2x uniform matrix (mat4+mat4), 2x uvec4 packed attributes, | |
| no fragments produced: 1.296 giga-vertices/s | |
| Textured Phong and PAT performance - primitive restart, indexed draw call, | |
| per-triangle PAT v2 in buffer (vec3+vec4), 2x uniform matrix (mat4+mat4), | |
| 2x uvec4 packed attributes, | |
| no fragments produced: 1.119 giga-vertices/s | |
| Textured Phong and PAT performance - primitive restart, indexed draw call, | |
| constant single PAT v2 sourced from the same index in buffer (vec3+vec4), | |
| 2x uniform matrix (mat4+mat4), 2x uvec4 packed attributes, | |
| no fragments produced: 1.317 giga-vertices/s | |
| Textured Phong and double precision matrix performance - double precision | |
| per-triangle matrix in buffer (dmat4), double precision per-scene view | |
| matrix in uniform (dmat4), both matrices converted to single precision | |
| before computations, single precision per-scene perspective matrix in | |
| uniform (mat4), single precision vertex positions, packed attributes | |
| (2x uvec4), no fragments produced: 676.3 mega-vertices/s | |
| Textured Phong and double precision matrix performance - double precision | |
| per-triangle matrix in buffer (dmat4), double precision per-scene view | |
| matrix in uniform (dmat4), both matrices multiplied in double precision, | |
| single precision vertex positions, single precision per-scene | |
| perspective matrix in uniform (mat4), packed attributes (2x uvec4), | |
| no fragments produced: 581.8 mega-vertices/s | |
| Textured Phong and double precision matrix performance - double precision | |
| per-triangle matrix in buffer (dmat4), double precision per-scene view | |
| matrix in uniform (dmat4), both matrices multiplied in double precision, | |
| double precision vertex positions (dvec3), single precision per-scene | |
| perspective matrix in uniform (mat4), packed attributes (3x uvec4), | |
| no fragments produced: 542.5 mega-vertices/s | |
| Textured Phong and double precision matrix performance using GS - double | |
| precision per-triangle matrix in buffer (dmat4), double precision | |
| per-scene view matrix in uniform (dmat4), both matrices multiplied in | |
| double precision, double precision vertex positions (dvec3), single | |
| precision per-scene perspective matrix in uniform (mat4), packed | |
| attributes (3x uvec4), | |
| no fragments produced: 222.0 mega-vertices/s | |
| Fragment throughput: | |
| Single full-framebuffer quad, | |
| constant color FS: 25.51 giga-fragments/s | |
| 10x full-framebuffer quad, | |
| constant color FS: 35.45 giga-fragments/s | |
| Four smooth interpolators (4x vec4), | |
| 10x fullscreen quad: 35.44 giga-fragments/s | |
| Four flat interpolators (4x vec4), | |
| 10x fullscreen quad: 35.47 giga-fragments/s | |
| Four textured phong interpolators (vec3+vec3+vec4+vec2), | |
| 10x fullscreen quad: 35.33 giga-fragments/s | |
| Textured Phong, packed uniforms (four smooth interpolators | |
| (vec3+vec3+vec4+vec2), 4x uniform (material (56 byte) + | |
| globalAmbientLight (12 byte) + light (64 byte) + sampler2D), | |
| 10x fullscreen quad): 8.501 giga-fragments/s | |
| Textured Phong, not packed uniforms (four smooth interpolators | |
| (vec3+vec3+vec4+vec2), 4x uniform (material (72 byte) + | |
| globalAmbientLight (12 byte) + light (80 byte) + sampler2D), | |
| 10x fullscreen quad): 8.500 giga-fragments/s | |
| Simplified Phong, no texture, no specular (2x smooth interpolator | |
| (vec3+vec3), 3x uniform (material (vec4+vec4) + globalAmbientLight | |
| (vec3) + light (48 bytes: position+attenuation+ambient+diffuse)), | |
| 10x fullscreen quad): 15.65 giga-fragments/s | |
| Simplified Phong, no texture, no specular, single uniform | |
| (2x smooth interpolator (vec3+vec3), 1x uniform | |
| (material+globalAmbientLight+light (vec4+vec4+vec4 + 3x vec4)), | |
| 10x fullscreen quad): 15.64 giga-fragments/s | |
| Constant color from uniform, 1x uniform (vec4) in FS, | |
| 10x fullscreen quad: 35.38 giga-fragments/s | |
| Constant color from uniform, 1x uniform (uint) in FS, | |
| 10x fullscreen quad: 35.41 giga-fragments/s | |
| Transfer throughput: | |
| Transfer of consecutive blocks: | |
| 4 bytes: 101.212ns per transfer (0.0368068 GiB/s) | |
| 4 bytes: 96.904ns per transfer (0.0384431 GiB/s) | |
| 8 bytes: 108.368ns per transfer (0.0687526 GiB/s) | |
| 16 bytes: 122.692ns per transfer (0.121452 GiB/s) | |
| 32 bytes: 154.088ns per transfer (0.193411 GiB/s) | |
| 64 bytes: 182.656ns per transfer (0.326322 GiB/s) | |
| 128 bytes: 187.24ns per transfer (0.636666 GiB/s) | |
| 256 bytes: 163.691ns per transfer (1.45651 GiB/s) | |
| 512 bytes: 169.15ns per transfer (2.81901 GiB/s) | |
| 1024 bytes: 181.133ns per transfer (5.26506 GiB/s) | |
| 2048 bytes: 205.391ns per transfer (9.28644 GiB/s) | |
| 4096 bytes: 158.984ns per transfer (23.9942 GiB/s) | |
| 8192 bytes: 313.594ns per transfer (24.3289 GiB/s) | |
| 16384 bytes: 621.875ns per transfer (24.5367 GiB/s) | |
| 32768 bytes: 1243.12ns per transfer (24.5491 GiB/s) | |
| 65536 bytes: 2473.75ns per transfer (24.6731 GiB/s) | |
| 131072 bytes: 4970ns per transfer (24.5614 GiB/s) | |
| 262144 bytes: 9900ns per transfer (24.6607 GiB/s) | |
| 524288 bytes: 19830ns per transfer (24.6234 GiB/s) | |
| 1048576 bytes: 39620ns per transfer (24.6482 GiB/s) | |
| 2097152 bytes: 79280ns per transfer (24.6358 GiB/s) | |
| Transfer of spaced blocks: | |
| 4 bytes: 101.916ns per transfer (0.0365526 GiB/s) | |
| 4 bytes: 102.104ns per transfer (0.0364853 GiB/s) | |
| 8 bytes: 115.428ns per transfer (0.0645474 GiB/s) | |
| 16 bytes: 143.096ns per transfer (0.104134 GiB/s) | |
| 32 bytes: 143.728ns per transfer (0.207352 GiB/s) | |
| 64 bytes: 145.828ns per transfer (0.408733 GiB/s) | |
| 128 bytes: 157.236ns per transfer (0.758155 GiB/s) | |
| 256 bytes: 161.997ns per transfer (1.47175 GiB/s) | |
| 512 bytes: 167.139ns per transfer (2.85294 GiB/s) | |
| 1024 bytes: 185.586ns per transfer (5.13872 GiB/s) | |
| 2048 bytes: 218.594ns per transfer (8.72554 GiB/s) | |
| 4096 bytes: 161.797ns per transfer (23.5771 GiB/s) | |
| 8192 bytes: 312.812ns per transfer (24.3897 GiB/s) | |
| 16384 bytes: 625.625ns per transfer (24.3897 GiB/s) | |
| 32768 bytes: 1251.88ns per transfer (24.3775 GiB/s) | |
| 65536 bytes: 2508.75ns per transfer (24.3289 GiB/s) | |
| 131072 bytes: 5012.5ns per transfer (24.3532 GiB/s) | |
| 262144 bytes: 9935ns per transfer (24.5738 GiB/s) | |
| 524288 bytes: 19740ns per transfer (24.7356 GiB/s) | |
| 1048576 bytes: 39580ns per transfer (24.6731 GiB/s) | |
| 2097152 bytes: 78960ns per transfer (24.7356 GiB/s) | |
| Measurement statistics: | |
| Triangle throughput measurement time: 3.05 seconds using 288 test rounds. | |
| Vertex throughput measurement time: 0.349 seconds using 288 test rounds. | |
| Attribute and Buffer measurement time: 1.27 seconds using 288 test rounds. | |
| Transformation measurement time: 3.65 seconds using 288 test rounds. | |
| Fragment throughput measurement time: 3.21 seconds using 288 test rounds. | |
| Transfer throughput measurement time: 7.32 seconds using 288 test rounds. | |
| Total device time: 18.5 seconds. | |
| Total real time: 20 seconds. |
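
The GiB/s figures in the transfer section appear to follow directly from the block size and the per-transfer time: bytes divided by seconds, with 1 GiB taken as 2^30 bytes. A minimal Python sketch of that arithmetic, assuming this is how the numbers are derived; the function name `gib_per_second` is hypothetical and not part of vkperf:

```python
def gib_per_second(block_bytes: int, ns_per_transfer: float) -> float:
    """Convert a block size and a per-transfer time into GiB/s (1 GiB = 2**30 bytes)."""
    bytes_per_second = block_bytes / (ns_per_transfer * 1e-9)
    return bytes_per_second / 2**30


# Inputs taken from two of the consecutive-block rows above.
print(gib_per_second(4096, 158.984))
print(gib_per_second(2097152, 79280))
```

Both results come out close to the reported 23.9942 GiB/s and 24.6358 GiB/s. The same relation shows why small transfers are dominated by a fixed per-transfer cost of roughly 100 to 200 ns, while blocks of 4096 bytes and larger level off at about 24.5 GiB/s.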