OSCA (ATC20) 论文阅读

Abstract

略

cloud block storage (CBS)

Cloud infrastructures typically employ cache servers, consisting of multiple cache instances competing for the same pool of resources.
The currently used even-allocation policy called EAP or equal cache partitioning determines the cache requirements in advance according to the respective subscribed SLOs and then provisions cache resources for each cache instance. However, this static configuration method is often suboptimal for the cloud environment and induces resource wastage, because the cloud I/O workloads are commonly highly-skewed.

Contributions:

online cache model based on re-access ratio (not collect+process date, but trace unique data blocks)
total hit traffic as metric to gauge cache efficency
search optimal configuration using dynamic programming

暂略

RAR-CM 相比 SHARDS 的好处是用更多的空间换时间，注意，两者都可以 online。

Simulation-based cache modeling: concurrently run multiple simulation instances to determine the cache hit ratio in different cache sizes

略

这块是重点，论文说实话看得我有点乱，引用 slides 里面的例子：

reuse distance 是一个块从上一次被挪到队头，往后移动的实际距离

那么具体如何设计实现呢？

Goal: Overall hit traffic

略

首先是几个问题：

How to manage reuse records (hash map)?
- How to refresh start timestamp? How to do aging? (In this paper, once per day refresh or even not)
- When lots of items being evicted, how long should such records in hash map be kept? How to efficiently/sufficiently clean them?
How fast is MRC construct? (In this paper, sequential construction)
- Maybe coarse-grained bucket is enough

本文还缺省了很多 sensitivity analysis，比如：