98% GPU Utilization Achieved in 1K GPU-Scale AI Training Using Distributed Cache

In September 2023, MLPerf, the authoritative benchmark suite for AI performance, introduced its Storage Benchmark. It enables large-scale performance testing of storage systems under AI model training scenarios by simulating machine learning I/O workloads without requiring any GPUs.
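To make that mechanism concrete, below is a minimal Python sketch of the no-GPU emulation idea: the benchmark performs real reads against the storage system under test, but replaces each batch's GPU compute step with a sleep of a fixed, pre-measured duration. The constant `EMULATED_COMPUTE_TIME_S` and the `run_epoch` helper are illustrative assumptions for this sketch, not MLPerf Storage code.

```python
import time

# Sketch: emulate an accelerator's training loop without a GPU.
# Real I/O hits the storage system; compute is replaced by a sleep
# equal to an assumed per-batch GPU compute time.
EMULATED_COMPUTE_TIME_S = 0.05  # assumed per-batch compute time (illustrative)

def run_epoch(batch_paths):
    io_time = 0.0
    compute_time = 0.0
    for path in batch_paths:
        t0 = time.perf_counter()
        with open(path, "rb") as f:
            f.read()  # real read against the storage system under test
        io_time += time.perf_counter() - t0

        time.sleep(EMULATED_COMPUTE_TIME_S)  # stand-in for GPU compute
        compute_time += EMULATED_COMPUTE_TIME_S

    # Accelerator utilization: fraction of wall time the emulated
    # "GPU" spends computing rather than waiting on I/O.
    return compute_time / (compute_time + io_time)
```

MLPerf Storage scores runs with an accelerator-utilization metric of this kind: fast storage keeps the I/O wait small, so utilization approaches 100%, which is the context for the 98% figure in this article's title.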

MLPerf Storage supports two training workloads: BERT (a natural language model) and UNet3D (a 3D medical image segmentation model). Although it does not yet support large language models (LLMs) such as GPT and LLaMA, BERT shares the multi-layer Transformer architecture with LLMs, so LLM users can still draw valuable insights from the BERT training results.
