Several of the difficulties with tailing a data stream boil down to the dynamics of the source and of the stream processor. For example, if the source increases its production rate in an unplanned manner, then the ingestion system must be able to accommodate such a change. The same happens in the case a processor downstream experiences issues and struggles to keep up with the rate. To be able to accommodate all such variations, it is critical that a system for storing stream data, like Pravega, is sufficiently flexible.
The flexibility of Pravega comes from breaking stream data into segments: append-only sequences of bytes that are organized both sequentially and in parallel to form streams. Segments enable important features, like parallel reads and writes, auto-scaling, and transactions; they are designed to be inexpensive to create and maintain. We can create new segments for a given stream when it needs more parallelism, when it needs to scale, or when it needs to begin a transaction.
The control plane in Pravega is responsible for all the operations that affect the lifecycle of a stream, e.g., create, delete, and scale. The data plane stores and serves the data of segments. The following figure depicts the high-level Pravega architecture with its core components. Continue Reading