MinIO, a leader in high-performance storage for AI, has announced new optimizations and benchmarks for Arm®-based chipsets in its object store. These enhancements highlight the significance of low-power, computationally dense chips for AI-related tasks and demonstrate the Arm architecture’s readiness for modern AI and data processing workloads, including erasure coding, bit rot protection, and encryption.
Key Optimizations and Performance Improvements
- SVE Enhancements: MinIO leveraged the latest Scalable Vector Extension (SVE) improvements, which boost the performance and efficiency of the vector operations essential for high-performance computing, AI, machine learning, and data-intensive applications.
- Reed-Solomon Erasure Coding:
  - Throughput Gains: MinIO’s existing Reed-Solomon erasure coding library saw a 2x increase in throughput compared to the previous implementation based on the NEON instruction set.
  - Core Efficiency: The new implementation needs only 16 cores to reach the memory bandwidth consumption that previously required 32 cores, so the same work is done with half the cores.
- HighwayHash Algorithm:
  - Performance Scaling: The enhanced HighwayHash implementation used for bit-rot detection scales linearly with core count. For larger block sizes, throughput approaches the memory bandwidth limit at around 50 to 52 cores.
- Efficient Memory Access: SVE’s support for lane masking via predicated execution and extensive scatter and gather instructions facilitates efficient memory access.
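To make the two data-protection mechanisms above concrete, here is a minimal sketch of how erasure coding and bit-rot detection fit together. It makes several simplifying assumptions and is not MinIO's implementation. A single XOR parity shard stands in for Reed-Solomon, so it can repair only one damaged shard. BLAKE2b from the Python standard library stands in for HighwayHash. All function names are hypothetical.

```python
import hashlib


def split_into_shards(data: bytes, k: int) -> list[bytes]:
    """Split data into k equal-length shards, zero-padding the tail."""
    shard_len = -(-len(data) // k)  # ceiling division
    padded = data.ljust(shard_len * k, b"\0")
    return [padded[i * shard_len:(i + 1) * shard_len] for i in range(k)]


def xor_parity(shards: list[bytes]) -> bytes:
    """XOR equal-length shards together, byte by byte."""
    parity = bytearray(len(shards[0]))
    for shard in shards:
        for i, byte in enumerate(shard):
            parity[i] ^= byte
    return bytes(parity)


def checksum(shard: bytes) -> bytes:
    """Per-shard checksum. Assumption: BLAKE2b as a stand-in for HighwayHash."""
    return hashlib.blake2b(shard, digest_size=32).digest()


def encode(data: bytes, k: int) -> tuple[list[bytes], list[bytes]]:
    """Return k data shards plus one parity shard, and a checksum per shard."""
    shards = split_into_shards(data, k)
    shards.append(xor_parity(shards))
    return shards, [checksum(s) for s in shards]


def repair(shards: list[bytes], checksums: list[bytes]) -> tuple[list[bytes], list[int]]:
    """Detect damaged shards by checksum mismatch and rebuild a single one."""
    bad = [i for i, s in enumerate(shards) if checksum(s) != checksums[i]]
    if len(bad) > 1:
        raise ValueError("single parity can repair at most one shard")
    repaired = list(shards)
    if bad:
        # XOR of all surviving shards (data and parity) yields the missing one.
        survivors = [s for i, s in enumerate(shards) if i != bad[0]]
        repaired[bad[0]] = xor_parity(survivors)
    return repaired, bad


def decode(shards: list[bytes], k: int, length: int) -> bytes:
    """Join the k data shards and strip the padding."""
    return b"".join(shards[:k])[:length]


if __name__ == "__main__":
    payload = b"object data protected by parity and checksums"
    shards, sums = encode(payload, k=4)
    shards[2] = b"\xff" + shards[2][1:]  # simulate bit rot in one shard
    fixed, bad = repair(shards, sums)
    print("damaged shards:", bad)
    print("recovered intact:", decode(fixed, 4, len(payload)) == payload)
```

A real Reed-Solomon code replaces the XOR with arithmetic over a finite field, which allows several parity shards and the repair of several lost shards at once. The byte-wise inner loops here are the kind of work that vector instruction sets such as NEON and SVE accelerate.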
Industry Impact and Innovations
- Sustainable Infrastructure: Eddie Ramirez, vice president of marketing and ecosystem development at Arm, emphasized that delivering performance gains while improving power efficiency is essential to building sustainable infrastructure for the intensive data processing workloads that AI demands.
- NVIDIA BlueField-3 DPU Integration:
  - The latest NVIDIA® BlueField®-3 data processing unit (DPU) features an integrated 16-core Arm-based CPU, simplifying server designs by allowing NVMe drives to connect directly to the networking card, bypassing main server CPUs.
  - With 400Gb/s Ethernet, BlueField-3 DPUs enhance software-defined networking, storage, security, and management functions, underscoring the power of disaggregating storage and compute in modern AI architectures.
Continuing Innovation with Arm
- Benchmarking Against Intel Skylake: MinIO’s recent benchmarks showcase the performance of the Arm architecture in data-intensive tasks. They build on MinIO’s earlier foundational work benchmarking Arm-based AWS Graviton2 processors against Intel Skylake chips.
- Future of High-Performance Workloads: Manjusha Gangadharan, Head of Sales and Partnerships at MinIO, stated that as the tech industry evolves around GPUs and DPUs, the need for energy-efficient, computationally dense computing is critical. MinIO’s benchmarks affirm that Arm excels in high-performance data workloads, and the company looks forward to strengthening its partnership with Arm in the AI field.
MinIO’s latest optimizations and benchmarks for Arm-based chipsets not only enhance performance for data-intensive tasks but also set a new standard for sustainable, efficient computing in the AI landscape. The collaboration with Arm reinforces MinIO’s commitment to innovation and leadership in high-performance storage solutions.