SAR: SSD assisted restore optimization for deduplication-based storage systems in the cloud
- School of Software: Conference Papers
The explosive growth of digital content places enormous strain on storage systems in the cloud. Data deduplication has been shown to be very effective at shortening the backup window and saving network bandwidth and storage space in cloud backup, archiving, and primary storage systems such as VM platforms. However, the latency and power consumption of restore operations from deduplicated storage can be significantly higher than those from storage without deduplication. The main reason is that, after deduplication, a file or block is split into multiple small data chunks that are often stored in non-sequential locations on HDDs, so a subsequent read operation can invoke many HDD I/O requests involving multiple disk seeks. To address this problem, we propose SAR, an SSD Assisted Restore scheme, which exploits the high random-read performance and low power consumption of SSDs together with the data sharing characteristic unique to deduplication-based storage systems: SAR stores in SSDs the unique data chunks that have a high reference count, a small size, and non-sequential placement. In this way, many critical random-read requests to HDDs are replaced by read requests to SSDs, significantly improving system performance and energy efficiency. Extensive trace-driven and VM restore evaluations on a prototype implementation of SAR show that SAR significantly outperforms traditional deduplication-based schemes in both restore performance and energy efficiency. © 2012 IEEE.
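The chunk selection idea described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the `Chunk` record, the `select_for_ssd` function, the thresholds `min_refs` and `max_size`, and the greedy fill order are all assumptions made for illustration. The abstract states only that unique chunks with a high reference count, small size, and non-sequential placement are stored in SSDs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    """Hypothetical metadata record for one unique chunk in the dedup store."""
    fingerprint: str   # content hash identifying the unique chunk
    size: int          # chunk size in bytes
    ref_count: int     # number of files/blocks that reference this chunk
    sequential: bool   # True if the chunk sits in a sequential run on the HDD


def select_for_ssd(chunks, ssd_capacity, min_refs=2, max_size=8192):
    """Pick chunks to place on the SSD (illustrative policy, not SAR's exact one).

    A chunk qualifies if it is shared (ref_count >= min_refs), small
    (size <= max_size), and non-sequential on the HDD. Qualifying chunks
    are taken greedily, highest reference count first and smaller size
    first on ties, until the assumed SSD capacity is used up.
    """
    candidates = [
        c for c in chunks
        if c.ref_count >= min_refs and c.size <= max_size and not c.sequential
    ]
    candidates.sort(key=lambda c: (-c.ref_count, c.size))

    chosen, used = [], 0
    for c in candidates:
        if used + c.size <= ssd_capacity:
            chosen.append(c)
            used += c.size
    return chosen


if __name__ == "__main__":
    store = [
        Chunk("a", 4096, 10, False),   # shared, small, scattered: qualifies
        Chunk("b", 4096, 1, False),    # referenced once: stays on HDD
        Chunk("c", 65536, 20, False),  # too large under the assumed threshold
        Chunk("d", 4096, 5, True),     # sequential on HDD: cheap to read there
        Chunk("e", 2048, 10, False),   # shared, small, scattered: qualifies
    ]
    for chunk in select_for_ssd(store, ssd_capacity=6144):
        print(chunk.fingerprint, chunk.size, chunk.ref_count)
```

In this sketch, a restore would read the selected chunks from the SSD and everything else from the HDD, so the seek-heavy random reads are the ones redirected. The threshold values are placeholders, not figures from the paper.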