Traditional cache solutions treat each entry as an immutable blob of data, which poses problems for the append-heavy ingestion workloads that are common in Pravega. Each Event appended to a Stream would either require its own cache entry or need an expensive read-modify-write operation to be included in the Cache. To enable high-performance ingestion of events, big or small, while also providing near-real-time tail reads and high-throughput historical reads, Pravega needs a specialized cache that can natively support the types of workloads that are prevalent in Streaming Storage Systems.
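To make the read-modify-write cost concrete, here is a minimal sketch of a cache that stores each entry as an immutable blob. It is a toy illustration only, not Pravega code: `BlobCache`, `ingest`, and the `"segment-0"` key are hypothetical names invented for this example.

```python
class BlobCache:
    """Toy cache whose entries are immutable byte strings (hypothetical, not Pravega code)."""

    def __init__(self):
        self._entries = {}
        self.bytes_copied = 0  # total bytes written into the cache, including rewrites

    def put(self, key, value: bytes):
        # Storing an entry always writes the whole blob.
        self._entries[key] = bytes(value)
        self.bytes_copied += len(value)

    def get(self, key) -> bytes:
        return self._entries[key]

    def append(self, key, event: bytes):
        # Read-modify-write: fetch the old blob, build a new one, store it back.
        old = self._entries.get(key, b"")
        self.put(key, old + event)


def ingest(event_count: int, event_size: int) -> int:
    """Append event_count events of event_size bytes to one entry; return bytes copied."""
    cache = BlobCache()
    event = b"x" * event_size
    for _ in range(event_count):
        cache.append("segment-0", event)
    return cache.bytes_copied


if __name__ == "__main__":
    payload = 1000 * 100
    print(f"payload bytes: {payload}, bytes copied: {ingest(1000, 100)}")
```

In this sketch, every append rewrites the entire entry, so the bytes copied grow quadratically with the number of events: ingesting 1,000 events of 100 bytes each (100,000 bytes of payload) copies 50,050,000 bytes. The alternative of one cache entry per Event avoids the copying but multiplies the number of entries the cache has to track.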
The Streaming Cache, introduced in Pravega with release 0.7, was designed from the ground up with streaming data in mind: it optimizes for appends and organizes the data in a layout that makes eviction and disk spilling easy.
Not all caches are created equal. It is essential to choose a cache that fits the requirements of the system where it will be used, and streaming solutions are no exception to that rule. In this blog post, we describe an innovative way to look at caching that works well with streaming use cases.