Admin
Explorer Investor
20 Mar 2026 11:26
·401 views
📌 Pinned
QubGPU Benchmarks — Share Your Performance Results
⚡ QubGPU — Neural Datagram Protocol is designed for extreme throughput on modern GPU clusters. This thread is the community benchmark board.
When sharing results, please include:
- GPU model and VRAM
- Driver version and CUDA/ROCm version
- Batch size and precision (FP32 / BF16 / INT8)
- Throughput (tokens/sec or TFLOPS)
- Latency P50 / P95 / P99
- Any custom kernel patches applied
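If you are collecting timings by hand, the throughput and latency fields above could be derived from raw measurements roughly as follows. This is only a minimal Python sketch: the function name `summarize`, its parameters, and the output keys are illustrative assumptions and are not part of the QubGPU SDK or the qubgpu-bench output format.

```python
import statistics


def summarize(latencies_ms, total_tokens, wall_seconds):
    """Summarise one benchmark run from raw per-request latencies.

    All names here are hypothetical; adapt them to your own harness.
    """
    if len(latencies_ms) < 2:
        raise ValueError("need at least two latency samples")
    if wall_seconds <= 0:
        raise ValueError("wall_seconds must be positive")

    # quantiles(n=100) returns 99 cut points; index k-1 is the k-th percentile.
    # method="inclusive" treats the samples as the full population of the run.
    cuts = statistics.quantiles(latencies_ms, n=100, method="inclusive")

    return {
        "throughput_tokens_per_sec": total_tokens / wall_seconds,
        "latency_ms_p50": cuts[49],
        "latency_ms_p95": cuts[94],
        "latency_ms_p99": cuts[98],
    }


if __name__ == "__main__":
    # Synthetic latencies purely for illustration, not real benchmark data.
    sample = [float(x) for x in range(1, 102)]
    print(summarize(sample, total_tokens=5000, wall_seconds=2.0))
```

Whichever percentile method you use, state it in your post, since different interpolation methods can shift P99 noticeably on small sample counts.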
Comparing implementations? Use the standardised qubgpu-bench CLI tool included in the SDK — it ensures reproducible results across environments.
1 reply
Category
QubGPU
Rules
1. This category is for QubGPU / Neural Datagram Protocol (NDP) discussions.
2. Benchmark posts must include hardware specs, driver version, and methodology.
3. Do not publish GPL/proprietary code without proper licence attribution.
4. Discussions of GPU overclocking or unofficial kernel patches are at your own risk.
5. QubitPage does not endorse unofficial modifications discussed in this forum.