C2DN: How to Harness Erasure Codes at the Edge for Efficient Content Delivery

Abstract

introduce erasure coding into the CDN architecture and use the parity chunks to re-balance the write load across servers

Introduction

逻辑是这样的:在 CDN cache 里,miss 会带来诸多坏处,而且更重要的是,tail performance 也非常值得关注: “The design goal of a CDN is to consistently improve download performance for all objects on a content provider’s site, in every time window, and for each client location.”

本文非常想避免 miss spike,来源是 server unavailibilty(并不少见): A server is declared to be “unavailable” and (temporarily) taken out of service if it is deemed incapable of serving content to users within specified performance bounds.

虽然直观上备份 cache 就可以做到大概率避免 miss spike(因为总能在另一台机器里找到数据),但这样会浪费 50% 的空间(如果备份更多就更 space-inefficient 了)。本文希望用 erasure-coding 解决这个问题,同时顺带优化一下 write imbalance to SSD (due to DNS based load balancing),分配数据考虑为 Max Flow problem。

Background

CDN request routing stands in contrast to sharding in key-value caches, such as Memcached and Redis, where consistent hashing is often applied at a per-object granularity. In CDNs, load balancing decisions are taken on the granularity of groups of objects called buckets. Each bucket, in a DNS-based load balancer, correspond to a domain name that is resolved to obtain one or more server IPs that host objects in that bucket. This resolution is computed using consistent hashing.

Operating Costs of a CDN: 25% Bandwidth (miss traffic, midgress, 20% + egress, 5%) + 20% server deprecation

Production CDN Trace Analysis

Unavailability rates can appear high compared to published failure rates in large data centers and HPC-systems, due to less efficient cooling systems, less power redundancy, and rigorous definition of server unavailability.

In contrast to storage systems, where replica- tion guarantees durability, in CDN clusters, servers perform cache evictions independently. Objects that are admitted to two caches at the same time may be evicted at different times. This is particularly common if the two caches evict objects at very different rates, making replication ineffective.

C2DN System Design

C2DN Implementation

后略

评论

疑问:

  • 虽然一直说 “the goal is to ensure good “tail” performance in every time interval for every content provider from every cluster.” 但真的有必要那么在意 tail latency 吗?偶尔一个文件下载比较慢真的会很影响吗?(当然我理解了少一点 miss traffic 可以省钱)
  • 另外虽然一直说这是个cache,以及 write rate difference 会影响 replication 或者在 erasure coding 的情况下会出现 partial hits,但对于一个 caching 工作,却不提 eviction / promotion strategies?