Pesto: Online Storage Performance Management in Virtualized Datacenters

(没有搜到其他的中文博客,尽量写写)

Abstract

Previous: resort to gross overprovisioning or static partitioning of storage

Pesto:

  • completely automate storage performance management for virtualized datacenters
  • provide IO load balancing with cost-benefit analysis, per-device congestion management, and initial placement of new workloads

Introduction

2 types of existing approaches:

  • offline measurements
  • analytical approaches relying on detailed device information

Background

Our goal is to monitor the array from the outside and predict performance metrics (e.g., expected average latency, peak device IOPS, proximity to peak performance) by utilizing a black-box approach. 这个 black-box approach 可以用最本质的方式解释,也就是 queueing theory,我在后面的位置解释。

Define Outstanding IOs: When an application has requested for certain IO to be performed (reads or writes), we’ve already seen how its broken down into IO commands.These commands are sent to storage devices, and until they return and complete, they are termed as outstanding IO commands. –> 其实就是 queueing

Pesto System Overview

LQ-slope Performance Model

简单来说,这在 queueing theory 里,就是一个 closed-system benchmark(因此也不需要考虑 workload patterns,相比 open system),高 MPL (multi-player level) -> “Unlike previous techniques, we do not try to model service times or IOPS for light workloads”(因此只需要知道 bottleneck 的数字,也就是 LQ-slope 代表的)。

latency 和 queue length 之间满足 Little’s Law,throughput 在高 MPL 的情况下被 bottleneck 限制,可以直接算。

后面略。

评论

不能说不漂亮,虽然处理的是最简单的一类假设情况下的问题,但是我略掉的一部分也处理了很多根据预测出来的 latency 和 throughput 如何做 placement / load balancing / datastore removal or addition / cost analysis。