The best way to learn the fundamentals of stream semantics in Pravega is to get familiar with its client APIs. In this article, we give an overview of Pravega’s client APIs through a handful of simple examples. By the end, you should have seen Pravega in action, understand the guarantees that Pravega streams provide, and be familiar with several of the facilities the client API offers.
Pravega client APIs provide read and write access to data streams. Streams store sequences of bytes. Writers commit new sequences of bytes at the tail position(s) of a stream. Writes to a single stream can be spread across parallel shards, which Pravega calls segments. When writes are accompanied by routing keys, the routing key determines the segment, and writes that share a routing key are ordered within that segment.
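To make the role of routing keys concrete, here is a minimal sketch of the idea as a toy in-memory model. It is not the Pravega client API: the class `ToyStream`, its methods, and the hash-modulo mapping from key to segment are all assumptions made for illustration. It only shows why sending every write for a given key to the same append-only segment preserves the relative order of those writes.

```java
import java.util.ArrayList;
import java.util.List;

// Toy in-memory model (NOT the Pravega client API): a stream with a fixed
// number of append-only segments, where a routing key picks the segment.
class ToyStream {
    private final List<List<String>> segments = new ArrayList<>();

    ToyStream(int segmentCount) {
        for (int i = 0; i < segmentCount; i++) {
            segments.add(new ArrayList<>());
        }
    }

    // Assumption: a simple hash-modulo mapping stands in for whatever
    // key-to-segment mapping a real system uses.
    int segmentFor(String routingKey) {
        return Math.floorMod(routingKey.hashCode(), segments.size());
    }

    // Appends the event at the tail of the segment chosen by the routing key.
    void write(String routingKey, String event) {
        segments.get(segmentFor(routingKey)).add(event);
    }

    // Returns a copy of one segment's contents, in append order.
    List<String> readSegment(int index) {
        return new ArrayList<>(segments.get(index));
    }

    public static void main(String[] args) {
        ToyStream stream = new ToyStream(3);
        stream.write("sensor-a", "a1");
        stream.write("sensor-b", "b1");
        stream.write("sensor-a", "a2");
        stream.write("sensor-a", "a3");
        // Events for one key always share a segment, so a1, a2, a3 appear
        // in the order they were written (possibly interleaved with others).
        System.out.println(stream.readSegment(stream.segmentFor("sensor-a")));
    }
}
```

Note that the model gives no ordering guarantee between different keys: events for `sensor-a` and `sensor-b` may or may not land in the same segment, and nothing relates their order when they do not.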
Streams have scaling policies that allow them to split into several parallel segments. A stream is akin to a UNIX file opened in append mode, where several writers are guaranteed to append after each other and not overwrite each other’s contents. An open append-only file typically has one data stream, whereas a stream in Pravega can have many parallel data streams, called segments, allowing an influx of writes to scale horizontally across a cluster according to scaling policies and routing keys. Unlike most other distributed message passing or data storage systems, the parallelism of a stream can change over time in response to write throughput. Writes are distributed across these parallel stream segments, and the number of segments can grow or shrink over time.
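The way a stream's parallelism can grow and shrink can be sketched with another toy model. This is an illustration under stated assumptions, not Pravega's implementation: the class `ToyScaling`, the idea that each segment owns a slice of a key space `[0.0, 1.0)`, and the `split` and `merge` operations are hypothetical names chosen for this example. In a real deployment the scaling policy, not the caller, would decide when to split or merge.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model (NOT Pravega's implementation): each segment owns a slice of a
// key space [0.0, 1.0), and scaling splits or merges adjacent slices.
class ToyScaling {
    // A segment that owns routing keys whose position falls in [low, high).
    static final class Range {
        final double low;
        final double high;
        Range(double low, double high) { this.low = low; this.high = high; }
        @Override public String toString() { return "[" + low + ", " + high + ")"; }
    }

    private final List<Range> active = new ArrayList<>();

    // A new stream starts with a single segment covering the whole key space.
    ToyScaling() { active.add(new Range(0.0, 1.0)); }

    // Scale up: replace one segment with two that cover the same range.
    void split(int index) {
        Range r = active.remove(index);
        double mid = (r.low + r.high) / 2;
        active.add(index, new Range(r.low, mid));
        active.add(index + 1, new Range(mid, r.high));
    }

    // Scale down: replace two adjacent segments with one covering both.
    void merge(int index) {
        Range left = active.remove(index);
        Range right = active.remove(index);
        active.add(index, new Range(left.low, right.high));
    }

    // Assumption: the key's hash is mapped into [0.0, 1.0) to find its owner.
    int segmentFor(String routingKey) {
        double position = Math.floorMod(routingKey.hashCode(), 1000) / 1000.0;
        for (int i = 0; i < active.size(); i++) {
            Range r = active.get(i);
            if (position >= r.low && position < r.high) return i;
        }
        throw new IllegalStateException("ranges must cover [0.0, 1.0)");
    }

    int segmentCount() { return active.size(); }

    public static void main(String[] args) {
        ToyScaling stream = new ToyScaling();
        stream.split(0);   // write load rises: 1 -> 2 segments
        stream.split(1);   // load keeps rising: 2 -> 3 segments
        System.out.println(stream.segmentCount() + " segments: " + stream.active);
        stream.merge(1);   // load falls: 3 -> 2 segments
        System.out.println(stream.segmentCount() + " segments: " + stream.active);
    }
}
```

Because the active ranges always cover the whole key space, every routing key has exactly one owning segment at any moment, even as the number of segments changes.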