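The central argument of this report is that, for most enterprises, the best way to resolve the AI data bottleneck is not a wholesale rip-and-replace of existing storage systems. A more effective and more economical strategy is to deploy a software-defined "middleware" or data orchestration layer. This layer unifies heterogeneous, dispersed data assets so that AI workloads can access them efficiently regardless of physical location, preserving existing investments while bridging traditional infrastructure and the performance demands of modern AI. The report examines this approach in depth, focusing on its architectural principles, and compares the leading solutions on the market.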
The Legacy Dilemma: Running AI on Existing Storage Infrastructure
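When enterprises set out to use AI, their existing storage infrastructure often becomes a serious obstacle. Traditional storage architectures are fundamentally mismatched with the demands of modern AI workloads, in several respects. First, traditional storage area networks (SANs) were designed mainly for structured data and transactional workloads, and their architecture cannot cope effectively with the massively parallel processing and complex mixed I/O patterns that AI requires [9]. Traditional scale-out NAS in particular, with its controller-centric (filer head) design, readily becomes a performance bottleneck when faced with the metadata-intensive operations and highly concurrent access issued by thousands of GPU cores [9].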
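Second, the foremost problem for established enterprises is data silos and fragmentation. Valuable training data is trapped in mutually isolated NAS, SAN, object storage, and cloud buckets, in differing and incompatible formats [20]. This fragmentation denies AI models a comprehensive, unified view of the data, which severely limits their analytical power and ultimate effectiveness. Finally, the "data gravity" effect compounds the problem. The massive volumes of data an enterprise has accumulated carry enormous inertia, making data migration costly, slow, and risky [25]. The cost lies not only in the data transfer itself but also in rewriting application dependencies, retraining staff, and handling potential business disruption. Together these factors create strong resistance that often makes a large-scale, one-shot migration impractical [26].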
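To give a feel for why bulk migration is slow, the raw transfer time alone can be estimated as dataset size divided by sustained link throughput. The Python sketch below uses purely illustrative figures (a 1 PB dataset, a 10 Gbit/s link, and a 50% efficiency factor); none of these numbers come from the report, and the estimate ignores validation, cutover, and application rework.

```python
def transfer_days(dataset_bytes: float, link_bits_per_s: float,
                  efficiency: float = 1.0) -> float:
    """Days needed to move a dataset over a link at a given sustained efficiency."""
    seconds = dataset_bytes * 8 / (link_bits_per_s * efficiency)
    return seconds / 86_400  # seconds per day


# Assumed, illustrative values (not taken from the report).
PETABYTE = 1e15   # bytes
TEN_GBIT = 10e9   # bits per second

print(f"{transfer_days(PETABYTE, TEN_GBIT):.1f} days at line rate")
print(f"{transfer_days(PETABYTE, TEN_GBIT, efficiency=0.5):.1f} days at 50% efficiency")
```

Even under these optimistic assumptions the copy takes more than a week, before any of the non-transfer costs listed above are counted.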
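Faced with this dilemma, enterprises stand at a strategic crossroads: migrate all data to a new storage platform purpose-built for AI, or "modernize in place" on top of the existing infrastructure. The latter is achieved by layering a data orchestration platform over the existing storage systems. The two strategies differ markedly in cost, risk, and time to value, and the trade-offs need careful weighing.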
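Table 1: Strategic comparison of data modernization strategies

| Dimension | Full Migration | Modernize-in-Place |
|---|---|---|
| Upfront cost | High. Requires purchasing an entirely new hardware platform and software. | Low to medium. Mainly software licensing; existing hardware can be reused. |
| Operational disruption | High. Requires application downtime and complex data cutover and validation. | Low. Non-disruptive data "assimilation" has minimal impact on existing operations [29]. |
| Time to AI value | Long. AI projects cannot fully start and scale until the migration completes. | Short. AI projects can begin almost immediately on data brought under the unified view. |
| Peak performance | Very high potential. Hardware purpose-built for AI can reach the theoretical maximum. | High. Bounded by the underlying hardware, but greatly improved by techniques such as parallel access. |
| Risk profile | High. Exposed to migration failure, data loss, schedule slips, and budget overruns. | Low. Existing systems stay intact, and deployment can be incremental and phased. |
| Management complexity | Reduced. Everything is eventually consolidated onto one new management platform. | May increase. A new software layer must be managed, although it is designed to abstract and simplify the underlying complexity. |
| Waste of existing assets | High. Large amounts of legacy storage hardware are retired, writing off prior investment. | Low. Extends the life of existing hardware and maximizes its residual value. |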
The Architectural Bridge: Data Fabric and Orchestration Middleware
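To answer the question of how enterprises connect traditional data silos to modern AI compute clusters, it is essential to understand the role that "middleware" plays. In practice this "middleware" is not a single product but a technology stack made up of several complementary software layers.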
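The second layer is the data fabric, or global data platform (Data Fabric / Global Data Platform). This is an architectural layer whose core value is to provide a unified, logical view of data regardless of how it is physically dispersed. It automates data management and movement and abstracts away the underlying storage silos [36]. If workflow orchestration is the "conductor," the data fabric is the "universal translator" and the "data delivery system." By creating a global namespace, it lets any authorized application or user access data anywhere in the estate as if it were a local file. Hammerspace Global Data Platform, Microsoft Fabric, and IBM Data Fabric are representative of this class of solution [36].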
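As a rough illustration of the global namespace idea, the following Python sketch layers one logical tree over several storage silos. The class name `GlobalNamespace`, the mount prefixes, and the in-memory dictionaries standing in for a NAS share and a cloud bucket are all hypothetical. This is a minimal model of the concept, not a description of how any of the named products is implemented.

```python
from dataclasses import dataclass, field


@dataclass
class GlobalNamespace:
    """Toy model: one logical tree layered over several storage silos."""
    # Maps a logical mount prefix to a silo (here just a dict of name -> bytes).
    mounts: dict[str, dict[str, bytes]] = field(default_factory=dict)

    def attach(self, prefix: str, silo: dict[str, bytes]) -> None:
        """Expose an existing silo under a logical prefix without copying it."""
        self.mounts[prefix.rstrip("/")] = silo

    def listdir(self) -> list[str]:
        """Return every file as a logical path, whichever silo holds it."""
        return sorted(
            f"{prefix}/{name}"
            for prefix, silo in self.mounts.items()
            for name in silo
        )

    def read(self, path: str) -> bytes:
        """Resolve a logical path to the silo that holds the bytes."""
        for prefix, silo in self.mounts.items():
            if path.startswith(prefix + "/"):
                return silo[path[len(prefix) + 1:]]
        raise FileNotFoundError(path)


# Hypothetical silos standing in for an on-prem NAS share and a cloud bucket.
nas_share = {"images_0001.tar": b"nas-bytes"}
cloud_bucket = {"labels.csv": b"cloud-bytes"}

ns = GlobalNamespace()
ns.attach("/data/vision", nas_share)
ns.attach("/data/annotations", cloud_bucket)

print(ns.listdir())
print(ns.read("/data/annotations/labels.csv"))
```

The caller sees only logical paths under `/data`; which silo serves each read is a detail of the namespace layer.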
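These two layers work together to form the core of the modern AI data stack. Workflow orchestration tools define what to do and when, while the data fabric platform ensures that the data a task needs is delivered promptly and efficiently when the task runs.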
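Hammerspace is not storage hardware but a pure software platform whose core goal is to create a "Global Data Environment" [29]. Its architectural foundation is the complete decoupling of metadata from the data itself, which allows it to build a single, authoritative global namespace spanning all underlying storage, regardless of vendor or location [41].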
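The decoupling of metadata from data can be sketched in a few lines of Python. Here a hypothetical `MetadataCatalog` holds the authoritative record for each logical path, including which backend currently stores the bytes, so data can move between backends while the path stays the same. All names (`FileRecord`, `relocate`, the backend labels) are illustrative assumptions and do not reflect Hammerspace's actual interfaces.

```python
from dataclasses import dataclass


@dataclass
class FileRecord:
    """Metadata kept apart from the bytes it describes."""
    size: int
    backend: str      # which store currently holds the data
    object_key: str   # where the bytes live inside that store


class MetadataCatalog:
    def __init__(self, backends: dict[str, dict[str, bytes]]):
        self.backends = backends                   # data plane: backend -> key -> bytes
        self.records: dict[str, FileRecord] = {}   # metadata plane: path -> record

    def write(self, path: str, data: bytes, backend: str) -> None:
        key = path.lstrip("/").replace("/", "_")
        self.backends[backend][key] = data
        self.records[path] = FileRecord(len(data), backend, key)

    def read(self, path: str) -> bytes:
        rec = self.records[path]
        return self.backends[rec.backend][rec.object_key]

    def relocate(self, path: str, target: str) -> None:
        """Move the bytes to another backend; the logical path is unchanged."""
        rec = self.records[path]
        data = self.backends[rec.backend].pop(rec.object_key)
        self.backends[target][rec.object_key] = data
        rec.backend = target


# Hypothetical backends: a legacy NAS and a faster NVMe tier.
catalog = MetadataCatalog({"legacy_nas": {}, "nvme_tier": {}})
catalog.write("/train/shard-000.bin", b"\x00" * 16, backend="legacy_nas")
catalog.relocate("/train/shard-000.bin", target="nvme_tier")

print(catalog.records["/train/shard-000.bin"].backend)
print(len(catalog.read("/train/shard-000.bin")))
```

Because readers address files only through the catalog, relocating data is a metadata update plus a copy, and applications keep using the same path throughout.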
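This report's analysis shows that the spread of powerful AI compute, led by NVIDIA, has decisively shifted the bottleneck in enterprise IT infrastructure from compute to data. In response, a new class of software-centric data platforms has emerged. A clear trend is that the industry's value is moving away from storage hardware itself and toward the software layer that can unify data, orchestrate it intelligently, and accelerate access across complex, hybrid, multi-vendor environments.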
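Rather than declaring a single "winner," this report offers enterprise architects and decision-makers a pragmatic, step-by-step strategic framework for meeting the challenges of the AI era: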
Works cited
8. RAPIDS Suite of AI Libraries - NVIDIA Developer, accessed July 12, 2025, https://developer.nvidia.com/rapids
9. How legacy storage infrastructure could endanger your future - Blocks and Files, accessed July 12, 2025, https://blocksandfiles.com/2025/01/29/how-legacy-storage-infrastructure-could-endanger-your-future/
10. An Inside Look at Hammerspace's HPC-Grade Architecture - theCUBE Research, accessed July 12, 2025, https://thecuberesearch.com/an-inside-look-at-hammerspaces-hpc-grade-architecture/
11. Hitachi Accelerates AI-Driven Transformation for Physical and Industrial Applications, accessed July 12, 2025, https://www.hitachi.com/New/cnews/month/2025/03/250319.html
12. Hitachi Collaborates with NVIDIA to Accelerate Digital Transformation with Generative AI, accessed July 12, 2025, https://www.hitachi.com/New/cnews/month/2024/03/240319.html
13. Virtual Storage Platform One (VSP One) - Hitachi Vantara Federal, accessed July 12, 2025, https://www.hitachivantarafederal.com/what-we-do/products/data-storage/vsp-one/
14. Introducing Virtual Storage Platform (VSP) One: One Solution for Government Data Infrastructures - Hitachi Vantara Federal, accessed July 12, 2025, https://www.hitachivantarafederal.com/resources-insights/virtual-storage-platform-one-solution-profile/
15. AI Software Suite for Enterprise IT - NVIDIA, accessed July 12, 2025, https://www.nvidia.com/en-in/data-center/products/ai-enterprise-suite/
16. NVIDIA - Hammerspace, accessed July 12, 2025, https://hammerspace.com/hs-partners/technology-partners/nvidia/
17. Accelerate AI with NeuralMesh™: An Adaptive Storage System Built for AI - WEKA, accessed July 12, 2025, https://www.weka.io/
18. High-Performance AI Data Platform for AI at Scale - VAST Data, accessed July 12, 2025, https://www.vastdata.com/technology/nvidia
19. DDN on files, objects and AI training and inference - Blocks and Files, accessed July 12, 2025, https://blocksandfiles.com/2025/02/24/ddn-thinking-on-files-objects-and-ai-training-and-inference/
20. AI Integration into Legacy Systems: Challenges and Strategies - Optimum, accessed July 12, 2025, https://optimumcs.com/insights/ai-integration-into-legacy-systems-challenges-and-strategies/
21. Data Storage Solutions | NAS, SAN and DAS | Proactive, accessed July 12, 2025, https://proactive.co.in/solution/data-center/storage
22. Hammerspace Unveils the Fastest File System in the World for Training Enterprise AI Models at Scale, accessed July 12, 2025, https://hammerspace.com/hammerspace-introduces-high-performance-nas/
23. The AI Revolution Has a File Problem — And Your Current Storage Can't Solve It - Nasuni, accessed July 12, 2025, https://www.nasuni.com/blog/the-ai-revolution-has-a-file-problem-and-your-current-storage-cant-solve-it/
24. Enterprise AI and Legacy Systems: A Double-Edged Sword on the Path to Modernization, accessed July 12, 2025, https://medium.com/snowflake/enterprise-ai-and-legacy-systems-a-double-edged-sword-on-the-path-to-modernization-9f54e1da1fab
25. On-Premises vs. Cloud for AI Workloads - Redapt, accessed July 12, 2025, https://www.redapt.com/blog/on-premises-vs-cloud-for-ai-workloads
26. How AI Simplifies and Guards Data Migration | Built In, accessed July 12, 2025, https://builtin.com/articles/ai-assisted-data-migration
27. Automated Data Migration: The Future of Data Transfer - Functionize, accessed July 12, 2025, https://www.functionize.com/ai-agents-automation/automated-data-migration
28. Data Conversion vs Data Migration: Key Differences - Shinydocs, accessed July 12, 2025, https://shinydocs.com/blog-home/blog/data-conversion-vs-data-migration-key-differences/
29. Global Data Platform - Hammerspace, accessed July 12, 2025, https://hammerspace.com/software/
30. Hammerspace Software: Building a Global Data Environment, accessed July 12, 2025, https://www.titandatasolutions.com/ae/wp-content/uploads/sites/2/2022/10/Hammerspace-Data-Sheet-Oct-2022-v2.1.pdf
31. What is Machine Learning Orchestration | Giskard, accessed July 12, 2025, https://www.giskard.ai/glossary/machine-learning-orchestration
32. Best Machine Learning Workflow and Pipeline Orchestration Tools - Neptune.ai, accessed July 12, 2025, https://neptune.ai/blog/best-workflow-and-pipeline-orchestration-tools
33. Top 17 Data Orchestration Tools for 2025: Ultimate Review - lakeFS, accessed July 12, 2025, https://lakefs.io/blog/data-orchestration-tools-2023/
34. Build production-grade data and ML workflows, hassle-free with Flyte, accessed July 12, 2025, https://flyte.org/
35. The 6 Best ML Orchestration Tools for Developers - DuploCloud, accessed July 12, 2025, https://duplocloud.com/blog/ml-orchestration/
36. What is Microsoft Fabric - Microsoft Fabric - Learn Microsoft, accessed July 12, 2025, https://learn.microsoft.com/en-us/fabric/fundamentals/microsoft-fabric-overview
37. Data Fabric Solutions - IBM, accessed July 12, 2025, https://www.ibm.com/solutions/data-fabric
38. Workflow Data Fabric - ServiceNow, accessed July 12, 2025, https://www.servicenow.com/now-platform/workflow-data-fabric.html
39. Category: White Paper - Hammerspace, accessed July 12, 2025, https://hammerspace.com/category/white-paper/
40. Hammerspace Global Data Platform - dve advanced systems GmbH, accessed July 12, 2025, https://dveas.de/wp-content/uploads/2024/06/02-White-Paper-Global-Data-Platform_v1b-compressed_1.pdf
41. Hammerspace Global Data Environment, accessed July 12, 2025, https://4911315.fs1.hubspotusercontent-na1.net/hubfs/4911315/HS%20Site%202022%20Downloads/HS%20Whitepaper%202022%20-%20Final.pdf
42. Simplifying Data Management With Hammerspace - Insights From Analytics, accessed July 12, 2025, https://www.insightsfromanalytics.com/post/simplifying-data-management-with-hammerspace
43. How To Create a Global Namespace With Hammerspace - Medium, accessed July 12, 2025, https://medium.com/@hammerspace/how-to-create-a-global-namespace-with-hammerspace-a40618de4d45
44. Automating Data Management With Metadata - YouTube, accessed July 12, 2025, https://m.youtube.com/watch?v=2ev4Nk5WBUA&pp=ygUOI3N1cGVybWV0YWRhdGE%3D
45. Data Orchestration Services - Hammerspace, accessed July 12, 2025, https://hammerspace.com/data-orchestration/
46. How the Power of Metadata can Automate Data Management - Hammerspace, accessed July 12, 2025, https://hammerspace.com/how-the-power-of-metadata-can-automate-data-management/
47. Feed Your Models Faster: Optimizing AI Training Pipelines with Smart Data Storage Orchestration - Hammerspace, accessed July 12, 2025, https://hammerspace.com/feed-your-models-faster-optimizing-ai-training-pipelines-with-smart-data-storage-orchestration/
48. MetadataHub - Hammerspace, accessed July 12, 2025, https://hammerspace.com/hs-partners/technology-partners/metadatahub/
49. Hammerspace challenges object storage norms for AI - Blocks and Files, accessed July 12, 2025, https://blocksandfiles.com/2025/02/07/hammerspace-file-vs-object-ai-training/
50. Category: Technical Brief - Hammerspace, accessed July 12, 2025, https://hammerspace.com/category/technical-brief/
51. Hyperscale NAS For Dummies - Hammerspace, accessed July 12, 2025, https://hammerspace.com/hyperscale-nas-for-dummies/
52. Hammerspace Hyperscale NAS Achieves GPUDIRECT Storage Self-Certification from NVIDIA, accessed July 12, 2025, https://hammerspace.com/hammerspace-hyperscale-nas-achieves-gpudirect-storage-self-certification-from-nvidia/
53. GPUDirect Demystified: Why Your File System is Crucial for Maximum GPU Throughput & Efficient AI Data Storage - Hammerspace, accessed July 12, 2025, https://hammerspace.com/gpudirect-demystified-why-your-file-system-is-crucial-for-maximum-gpu-throughput-efficient-ai-data-storage/
54. Hammerspace boosts GPU server performance with latest update - Blocks and Files, accessed July 12, 2025, https://blocksandfiles.com/2024/11/14/hammerspace-opens-up-fast-local-gpu-server-nvme-storage-to-speed-ai-training/
55. Meet VAST Data's DASE Architecture, accessed July 12, 2025, https://www.vastdata.com/platform/how-it-works
56. EXAScaler Product Family, accessed July 12, 2025, https://www.aspsys.com/wp-content/uploads/2022/01/ddn-exa-data-sheet.pdf
57. WEKA Data Platform for Generative AI, accessed July 12, 2025, https://www.weka.io/solutions/weka-for-generative-ai/
58. WEKA Data Platform for NVIDIA DGX BasePOD with H100 Systems, accessed July 12, 2025, https://www.weka.io/wp-content/uploads/files/resources/2024/01/weka-basepod-certification-sb.pdf
59. Rethinking Data Architecture: VAST Data's Performance, Simplicity & Flexibility, accessed July 12, 2025, https://www.vastdata.com/blog/rethinking-modern-data-architectures
60. Simplify AI Data Management with DDN EXAscaler Hot Pools, accessed July 12, 2025, https://www.ddn.com/blog/optimized-ai-data-management-using-ddn-exascaler-hot-pools/
61. Hammerspace Advances GPU Data Orchestration Capabilities, accessed July 12, 2025, https://www.dbta.com/Editorial/News-Flashes/Hammerspace-Advances-GPU-Data-Orchestration-Capabilities-164541.aspx
62. All-Flash Data Storage for Artificial Intelligence - VAST Data, accessed July 12, 2025, https://www.vastdata.com/usecase/artificial-intelligence
63. DDN Powers Google Cloud Managed Lustre for AI and HPC Workloads - HPCwire, accessed July 12, 2025, https://www.hpcwire.com/off-the-wire/ddn-powers-google-cloud-managed-lustre-for-ai-and-hpc-workloads/
64. EXAScaler Cloud - DDN - Microsoft Azure Marketplace, accessed July 12, 2025, https://azuremarketplace.microsoft.com/en-us/marketplace/apps/ddn-whamcloud-5345716.exascaler_cloud_app?tab=overview
65. High-Performance Storage For NVIDIA Cloud Partners - WEKA, accessed July 12, 2025, https://www.weka.io/wp-content/uploads/files/resources/2024/09/weka-ncp-reference-architecture.pdf
66. High-Performance Storage for NVIDIA Cloud Partners - WEKA, accessed July 12, 2025, https://www.weka.io/resources/reference-architecture/high-performance-storage-for-nvidia-cloud-partners/
67. Spotlight: NVIDIA BlueField DPUs Power the VAST Data Platform for AI Workload Optimization, accessed July 12, 2025, https://developer.nvidia.com/blog/spotlight-nvidia-bluefield-dpus-power-the-vast-data-platform-for-ai-workload-optimization/