NetApp Deduplication Technology
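A-SIS Deduplication: Commands
Activate the license
– license add <a_sis>
Enable deduplication on a volume
– sis on <vol>
Deduplicate data that already exists on the volume
– sis start -s <vol>
Schedule deduplication, or run it manually
– sis config [-s schedule] <vol>
– sis start <vol>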
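The command order above can be sketched as a small helper that assembles the console commands for one volume. This is only an illustration: the function name `dedup_setup_commands` and its parameters are hypothetical, the schedule value is passed through as an opaque string because its format is not given here, and the license step is left out because the license code is site-specific.

```python
def dedup_setup_commands(volume, schedule=None, scan_existing=True):
    """Return the console commands listed above, in order, for one volume.

    Hypothetical helper: it only formats strings and does not contact a
    storage system.
    """
    commands = [f"sis on {volume}"]  # enable deduplication on the volume
    if schedule is not None:
        # The schedule string is passed through unchanged (format not assumed).
        commands.append(f"sis config -s {schedule} {volume}")
    if scan_existing:
        # -s also processes data that was already on the volume.
        commands.append(f"sis start -s {volume}")
    else:
        commands.append(f"sis start {volume}")
    return commands


if __name__ == "__main__":
    for line in dedup_setup_commands("/vol/vol5"):
        print(line)
```

Keeping the commands as plain strings makes the sequence easy to review before anyone types it on a console.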
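[Figure: Original Data, Deduplicated Data, New Data; Actual Storage Consumed]
– Deduplication over time
– Removes duplicates across multiple backup copies
– The space savings ratio increases over time
– Run a deduplication scan after each backup completes
– Savings expressed as a ratio: 20:1 or even more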
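Why the ratio grows over time can be shown with a simple model. The model is an assumption made for illustration, not something stated on the slide: every backup is a full copy of the same size, the first copy is stored in full, and each later copy contributes only a fixed fraction of new blocks. The name `backup_savings_ratio` and the 2% change rate are hypothetical.

```python
def backup_savings_ratio(num_backups, change_rate):
    """Logical-to-physical ratio for num_backups equal-sized full backups.

    Assumed model: backup 1 is stored in full; each later backup adds only
    the fraction `change_rate` of its blocks as new data.
    """
    if num_backups < 1:
        raise ValueError("num_backups must be at least 1")
    if not 0.0 <= change_rate <= 1.0:
        raise ValueError("change_rate must be between 0 and 1")
    logical = num_backups                          # what the backups add up to
    physical = 1 + (num_backups - 1) * change_rate  # what is actually stored
    return logical / physical


if __name__ == "__main__":
    for n in (1, 5, 20, 60):
        print(n, "backups ->", round(backup_savings_ratio(n, 0.02), 1), ": 1")
```

Under this model the ratio starts at 1:1 and climbs with every retained copy; with no changes between backups, 20 copies give exactly 20:1.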
Volume Deduplication
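[Figure: Original Data Volume; Duplicates Identified And Removed; Actual Storage Consumed]
– Deduplication scan of a volume
– Removes duplicate data within a single volume
– Suited to archives and lightly loaded primary storage systems
– Deduplication runs periodically, driven by changed data
– Savings expressed as a percentage of the full volume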
Sorting
Path: /vol/vol5 | State: Enabled | Status: Active | Progress: 25 MB Searched
State: Enabled | Status: Active | Progress: 40MB (20%) done
Verifying
Path: /vol/vol5
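The industry's first general-purpose deduplication technology
As of May 2008, ~6,600 licenses had been installed
– Total system capacity of about 185PB
– Average space savings of 30%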
© 2008 NetApp. All rights reserved.
NetApp Confidential -- Do Not Distribute
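FAS systems that support Deduplication
Application-transparent merging of duplicates
Significant capacity savings for:
– Backup data
– Archive data
– Lightly accessed primary data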
Underlying technology: WAFL block sharing
– Deduplication implements block sharing within the WAFL file system tree
– A single data block can be referenced 256 times
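Block sharing with a reference limit can be pictured with a small in-memory model. This is a sketch and not WAFL's actual data structure: the class `BlockStore` and its methods are hypothetical, and the slide's figure of 256 is interpreted here as a cap on the total number of references to one block.

```python
MAX_REFS = 256  # per-block reference limit taken from the slide (interpretation)


class BlockStore:
    """Toy block store in which several files may point at the same block."""

    def __init__(self):
        self.data = {}   # block id -> payload bytes
        self.refs = {}   # block id -> number of references
        self._next_id = 0

    def allocate(self, payload):
        """Store a new block with a single reference and return its id."""
        block_id = self._next_id
        self._next_id += 1
        self.data[block_id] = payload
        self.refs[block_id] = 1
        return block_id

    def share(self, block_id):
        """Add one reference; return False if the block is already at the limit."""
        if self.refs[block_id] >= MAX_REFS:
            return False  # caller must keep a separate physical copy
        self.refs[block_id] += 1
        return True

    def release(self, block_id):
        """Drop one reference and free the block when none remain."""
        self.refs[block_id] -= 1
        if self.refs[block_id] == 0:
            del self.refs[block_id]
            del self.data[block_id]
```

When `share` returns False, the duplicate simply stays as its own block, so the limit bounds the savings on any one block to 256:1 in this model.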
A-SIS Deduplication: How it really works!
[Figure: Sorting and Deduplicating stages. Labels: Block Write, Log New FPs, Fingerprint File, qsort ... qsort, Merge Sort, Duplicate Entry File, Block Ref Count File, Update Inode]
– Space savings vary with your data type
– NetApp's space savings estimation tool can be used for POC testing
[Chart: vertical axis from 0% to 100%]
NetApp Deduplication History
NetApp Deduplication for FAS:
– Formerly named "A-SIS deduplication"
– Supports R200, FAS2000, FAS3000, and FAS6000
– Note: the minimum supported version is 7.2.4
[Figure: Initialization (only necessary on a pre-existing volume). Labels: Block Write, Log New FPs, Change Log File, Gather, Gatherer File, qsort ... qsort, Merge Sort]
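Deduplication: "block-level" merging of duplicates
Original data file -> duplicate data blocks identified -> duplicate data blocks removed
(after byte-level verification)
To applications and users, the files are completely unchanged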
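The identify, verify, remove sequence can be sketched in a few lines. This is a minimal illustration under stated assumptions, not NetApp's implementation: the 4096-byte block size, the use of SHA-256 as the fingerprint, and the names `split_blocks`, `deduplicate`, and `rebuild` are all choices made for the example.

```python
import hashlib

BLOCK_SIZE = 4096  # assumed block size for this example


def split_blocks(data, block_size=BLOCK_SIZE):
    """Cut a byte string into fixed-size blocks (the last one may be shorter)."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


def sha256_fingerprint(block):
    """Example fingerprint; the real fingerprint function is not given here."""
    return hashlib.sha256(block).digest()


def deduplicate(blocks, fingerprint=sha256_fingerprint):
    """Return (unique_blocks, block_map); block_map[i] indexes unique_blocks."""
    unique = []
    by_fingerprint = {}  # fingerprint -> indices into `unique`
    block_map = []
    for block in blocks:
        fp = fingerprint(block)
        for idx in by_fingerprint.get(fp, []):
            if unique[idx] == block:  # byte-level check before sharing
                block_map.append(idx)
                break
        else:
            # No verified match: keep this block as a new unique block.
            unique.append(block)
            by_fingerprint.setdefault(fp, []).append(len(unique) - 1)
            block_map.append(len(unique) - 1)
    return unique, block_map


def rebuild(unique, block_map):
    """Reassemble the original bytes, as an application would read them."""
    return b"".join(unique[i] for i in block_map)
```

Because a matching fingerprint is only a hint and the bytes are compared before a block is shared, `rebuild` returns exactly the original data, which is the sense in which files are unchanged for applications and users.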
A-SIS Deduplication: How it really works!
[Figure: Initialization (only necessary on a pre-existing volume) and the deduplication flow. Labels: Block Write, Log New FPs, Change Log File, qsort ... qsort, Merge Sort, Fingerprint File, Duplicate Entry File, Sort by Inode, Byte-by-Byte Compare, Increment and decrement Block Ref. Count File, Update new inode]
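The diagram labels suggest a flow in which fingerprints of newly written blocks are logged, sorted, merged against the existing fingerprint file, and then confirmed byte by byte. A rough sketch of that sort-and-merge step follows. It is a reading of the labels rather than the actual implementation: `dedup_pass`, its parameters, and the tuple layout are hypothetical, and the inode sort and reference-count updates are reduced to a returned remapping.

```python
import heapq
from itertools import groupby


def dedup_pass(fingerprint_file, change_log, read_block):
    """One deduplication pass over newly logged fingerprints.

    fingerprint_file: sorted list of (fingerprint, block_id) already known
    change_log:       unsorted list of (fingerprint, block_id) for new writes
    read_block:       callable mapping block_id -> bytes
    Returns (new_fingerprint_file, remap), where remap maps each duplicate
    block_id to the block_id it can share.
    """
    sorted_log = sorted(change_log)  # stands in for the qsort step
    merged = heapq.merge(fingerprint_file, sorted_log)  # merge of sorted runs
    new_file = []
    remap = {}
    for _, group in groupby(merged, key=lambda entry: entry[0]):
        kept = []  # verified-distinct blocks sharing this fingerprint
        for fp, block_id in group:
            donor = next(
                (k for k in kept if read_block(k) == read_block(block_id)),
                None,
            )  # byte-by-byte compare against earlier candidates
            if donor is None:
                kept.append(block_id)
                new_file.append((fp, block_id))
            else:
                remap[block_id] = donor  # caller would adjust reference counts
    return new_file, remap
```

Sorting first means equal fingerprints end up adjacent, so duplicate candidates are found in a single sequential pass over the merged list instead of by random lookups.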
Why Deduplication for FAS? To lower storage costs
[Chart: effective $/GB for FC-based systems and for SATA-based systems with RAID-DP. Legend: Primary (FC), Primary & NearStore (SATA), Dedupe Space Savings, "Other" Space Savings]
A-SIS Deduplication Upcoming Features
Deduplication for FAS
Advanced Single Instance Storage
– Block-level identification of duplicates
[Figure: INODE 1 and INODE 2, four indirect blocks (IND), and four data blocks (DATA)]
A-SIS Deduplication: How it really works!
[Figure: SIS Check stage. Labels: Block Write, Log New FPs, Change Log File, Fingerprint File, qsort ... qsort, Merge Sort, Duplicate Entry File, Sort by Inode, Byte-by-Byte Compare, Increment and decrement Block Ref. Count File, Update new inode]
State: Enabled | Status: Active | Progress: 30MB Verified
OR
Path: /vol/vol5 | State: Enabled | Status: Active | Progress: 10% Merged
Check the status
– sis status [-l] <vol>
Check the space saved!
– df -s <vol>
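A savings percentage can be derived from two numbers: the space still in use and the space saved. The sketch below assumes that the percentage is computed as saved divided by (used plus saved), and that both figures are read off the `df -s` output by hand; that formula, the function name `percent_saved`, and the sample figures are assumptions made for illustration.

```python
def percent_saved(used_kb, saved_kb):
    """Percentage of space saved, assuming saved / (used + saved)."""
    if used_kb < 0 or saved_kb < 0:
        raise ValueError("sizes must be non-negative")
    total = used_kb + saved_kb  # space the data would need without sharing
    if total == 0:
        return 0.0
    return 100.0 * saved_kb / total


if __name__ == "__main__":
    # Hypothetical figures, not taken from a real system.
    print(percent_saved(used_kb=70, saved_kb=30))
```

Expressing savings against used plus saved, rather than against used alone, keeps the figure between 0% and 100%.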
A-SIS Deduplication Space Savings
[Figure: Gathering stage. Labels: Change Log File, Gather, Gatherer File, Byte-by-Byte Compare, Increment and decrement Block Ref. Count File, Update new inode]