Supercharge AI - GPU Power Meets Cyber Resilience
As artificial intelligence (AI) transforms enterprise operations - from customer engagement to product innovation - the demand for high-throughput, secure infrastructure has become paramount.
Central to this transformation is the graphics processing unit (GPU) - popularised by market leader NVIDIA - which now handles the overwhelming majority of AI model training and inference workloads.
According to KD Market Insights, the GPU market is projected to grow at a compound annual growth rate (CAGR) of 14.2% from 2024 to 2033, reaching an estimated revenue of USD 1,409.7 billion (roughly $1.4 trillion) by the end of 2033.
However, two often-overlooked components are equally critical to the performance of AI platforms: storage system throughput and cybersecurity readiness. Analysts such as Gartner have detailed the importance of both elements to a robust overall AI infrastructure.
Bridging The Gap: Storage systems must evolve to match GPU speed
Modern AI models process massive datasets, and their effectiveness increasingly hinges on how efficiently data can be delivered to GPU clusters. GPUs equipped with substantial internal memory configurations and high-speed networking technologies require external storage systems to deliver throughput far beyond what legacy storage can offer.
Given that each GPU can typically drive around 2GB per second of data throughput, an 8-GPU configuration demands 16GB per second - a requirement that multiplies significantly in larger AI superclusters. The priority, therefore, is not merely raw storage capacity but throughput efficiency per petabyte.
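The arithmetic scales linearly, as this minimal Python sketch shows; the 2GB-per-second figure is the per-GPU estimate cited above, and the function name is illustrative:

```python
def required_throughput_gbps(num_gpus: int, per_gpu_gbps: float = 2.0) -> float:
    """Aggregate storage throughput (GB/s) a GPU cluster must sustain,
    assuming each GPU drives roughly 2 GB/s, as cited above."""
    return num_gpus * per_gpu_gbps

# An 8-GPU server needs 16 GB/s; a 512-GPU supercluster needs ~1 TB/s.
print(required_throughput_gbps(8))    # 16.0
print(required_throughput_gbps(512))  # 1024.0
```

At supercluster scale, the aggregate figure quickly exceeds what any single legacy array can serve, which is why throughput per petabyte, not capacity alone, becomes the design target.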
File Systems, Object Storage & GPUDirect
POSIX-compliant file systems remain foundational to AI workflows, especially when paired with NVIDIA’s GPUDirect - a technology that enables direct I/O between storage and GPU memory, eliminating CPU bottlenecks. Yet, a transition is underway.
Object storage continues to gain ground - particularly in cloud environments, where hyperscale providers deploy it extensively. Given object storage's advantages in scale and its intrinsically lower overhead versus file systems, collaborations with storage vendors suggest that an object-native access method for GPUDirect will soon become standard practice, including in on-prem deployments. Analysts and recent articles have highlighted these advantages of object storage for AI model processing.
However, real-time inference workloads, which rely on rapid “in-memory” processing of billions (and soon trillions) of model “tokens”, are less suited to large-scale external storage. These applications require ultra-low latency and compute-proximate storage, reinforcing the need for storage architectures that are finely tuned to specific use cases.
Storage: Still a blind spot in AI team strategies
Despite its omnipresence, storage is often deprioritised by AI and data science teams. Many projects continue to rely on legacy infrastructure, even as newer solutions tailored specifically for AI workloads emerge. As AI models become more complex and data-intensive, the need for scalable, high-performance storage is critical. Disaggregated storage architectures, which separate storage resources from compute resources, enable independent scaling and efficient resource utilisation, addressing the high-performance demands of modern AI applications.
The Other Bottleneck: Security in high-performance AI environments
While performance remains at the forefront of our minds, the security posture of AI infrastructure is becoming increasingly critical. This is especially true as workloads move into multi-tenant, cloud-native environments. Technologies such as GPUDirect, while boosting throughput, can also introduce new security vulnerabilities.
For example, shared GPU memory can leak data across tenants, enabling unauthorised access, while direct interface access opens pathways for malware injection via memory buffer exploits.
Furthermore, in insufficiently isolated environments, a compromised workload from one tenant can threaten the integrity of others.
These risks are amplified in cloud and high-performance computing (HPC) contexts, where hardware is virtualised and shared across users. Yet, many organisations continue to assume such environments are inherently secure - an assumption that may prove costly.
Securing AI workloads: A strategic framework
To effectively secure AI workloads in high-performance computing environments, enterprises must evolve beyond static, perimeter-based defences and adopt infrastructure-deep, workload-aware security strategies. This necessitates a resilient framework that embeds security into the core of AI infrastructure. Key components include the implementation of granular access controls that enforce strict, identity-based policies governing access to GPUs and storage resources.
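Identity-based access control for GPU and storage resources can be sketched as a simple policy lookup; the roles, resource names, and policy table below are hypothetical placeholders, not any vendor's API:

```python
# Illustrative identity-based policy table: (role, resource) -> allowed actions.
# In production this would be backed by an IAM system, not a dict.
POLICIES = {
    ("ml-engineer", "gpu-cluster"): {"read", "execute"},
    ("ml-engineer", "training-data"): {"read"},
    ("data-admin", "training-data"): {"read", "write"},
}

def is_allowed(role: str, resource: str, action: str) -> bool:
    """Deny by default: an action is permitted only if the (role, resource)
    pair has an explicit policy entry that includes it."""
    return action in POLICIES.get((role, resource), set())
```

The deny-by-default stance is the key design choice: any access not explicitly granted is refused, which is the posture a workload-aware framework requires.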
Equally critical is the deployment of comprehensive encryption protocols that protect data across its entire lifecycle - at rest, in transit, and, where technically feasible, during processing - leveraging advanced technologies such as homomorphic encryption and Trusted Execution Environments (TEEs).
Additionally, organisations should adopt software-defined storage architectures that are inherently resilient, integrating cyber-defence mechanisms like data immutability, write-once-read-many (WORM) capabilities, and real-time anomaly detection. Finally, secure-by-design object storage solutions must be prioritised, particularly in cloud-native deployments, offering native telemetry, built-in threat detection, and automated recovery workflows to ensure data integrity and availability under adverse conditions.
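The write-once-read-many behaviour mentioned above can be illustrated in a few lines of Python; the class and method names here are a minimal sketch, not a real object-storage API:

```python
class WormStore:
    """Sketch of write-once-read-many (WORM) semantics: objects can be
    created and read, but never overwritten or deleted - mimicking
    object-lock style immutability used to resist ransomware tampering."""

    def __init__(self) -> None:
        self._objects: dict[str, bytes] = {}

    def put(self, key: str, data: bytes) -> None:
        # Reject any attempt to overwrite an existing object.
        if key in self._objects:
            raise PermissionError(f"'{key}' is immutable (WORM)")
        self._objects[key] = data

    def get(self, key: str) -> bytes:
        return self._objects[key]
```

Because existing objects can never be rewritten, a compromised workload cannot silently corrupt stored model checkpoints or backups - it can only add new versions, which keeps a clean recovery point available.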
Accelerate With Caution: Mastering the Speed-Security Balance
As AI platforms grow in scale and complexity, the trade-off between performance and security must be reimagined. Organisations can no longer afford to prioritise one at the expense of the other - both are indispensable.
The future of AI infrastructure lies in high-throughput, low-latency storage systems, increasingly built around object storage paradigms with direct GPU integration, and hardened through modern, adaptive cybersecurity measures. Enterprises that align their infrastructure strategies with this vision will be positioned to harness AI’s transformative power securely and sustainably.
Paul Speciale is Chief Marketing Officer at Scality