Deploying Pravega on Kubernetes

By Raúl Gracia. Posted on June 20, 2020 in Best Practices

Pravega is a storage system for data streams with an innovative design and an attractive set of features that address today’s stream processing requirements (e.g., event ordering, scalability, and performance). The project has plenty of documentation and great blog posts that explain every technical aspect of Pravega in detail. But if you would rather get your first Pravega cluster up and running in the cloud now and leave the technical reading for later, you are in the right place.

In this post, we show you how to deploy your “first Pravega cluster in Kubernetes”. We provide a step-by-step guide to deploy Pravega in both Google Kubernetes Engine (GKE) and Amazon Elastic Kubernetes Service (EKS). Our goal is to keep things as simple as possible and, at the same time, give you valuable insights into the services that form a Pravega cluster and the operators we developed to deploy them. For this reason, while we have a “one-click” Pravega installation ready to use, we recommend that you work through this blog post. If you do so, you will have a Pravega cluster ready to serve sample applications and, why not, even your first application working against Pravega. Let’s get started!

Creating and Setting Up the Kubernetes Cluster

First, we need to create the Kubernetes cluster in which to deploy Pravega. As a prerequisite, we assume that you have an account with at least one of the cloud providers mentioned above (Google Cloud and/or AWS). If so, it is time to create a Kubernetes cluster for Pravega.

GKE

Creating a Kubernetes cluster in GKE is straightforward. The defaults in general are enough for running a demo Pravega cluster, but we suggest just a couple of setting changes to deploy Pravega:

Go to the Kubernetes Engine drop-down menu and select Clusters > Create Cluster.
Pick a name for your Kubernetes cluster (e.g., pravega-gke).
In the Master version section, make sure to select Kubernetes version 1.15 or later. The reason is that we are going to exercise the latest Pravega and Bookkeeper Operators, which require Kubernetes 1.15+.
As the Pravega cluster consists of several services, we also need a slightly larger node flavor than the default one. Go to default-pool > Nodes > Machine type, select n1-standard-4 nodes (4 vCPUs, 15GB of RAM), and set the number of nodes to 4 instead of 3 (the default). Note that this deployment is still available with the trial account.
Press the Create button, and that’s it.

Note that we use the Cloud Shell provided by GKE to deploy Pravega from the browser itself, without installing any CLI locally (but feel free to use the Google Cloud CLI instead).

The Pravega and Bookkeeper Operators also require elevated privileges in order to watch for custom resources. For this reason, in GKE you first need to grant those permissions by executing:
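A minimal sketch of that command, meant to be run from Cloud Shell. It binds the built-in cluster-admin role to the account that gcloud is currently using; the binding name is an arbitrary choice of ours.

```shell
# Arbitrary name for the role binding (our choice, not mandated by GKE).
BINDING_NAME="cluster-admin-binding"

if command -v gcloud >/dev/null 2>&1 && command -v kubectl >/dev/null 2>&1; then
  # Grant cluster-admin to the Google account configured in gcloud.
  kubectl create clusterrolebinding "$BINDING_NAME" \
    --clusterrole=cluster-admin \
    --user="$(gcloud config get-value core/account)"
else
  echo "gcloud and kubectl are required (Cloud Shell ships with both)" >&2
fi
```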

EKS

In the case of AWS, we are going to use the EKS CLI, which automates and simplifies different aspects of the cluster creation and configuration (e.g., VPC, subnets, etc.). You will need to install and configure the EKS CLI before proceeding with the cluster creation.

Once the EKS CLI is installed, we just require one command to create an EKS cluster:
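The following is a sketch of such a command, assuming that the EKS CLI in question is eksctl. The cluster name, region, node group name, node counts, and key path are placeholder assumptions of ours; only the t3.xlarge node type is taken from this post.

```shell
# Placeholder values (assumptions); adjust them to your AWS account.
CLUSTER_NAME="pravega-eks"
REGION="us-west-2"
SSH_PUBLIC_KEY="$HOME/.ssh/pravega_aws.pub"

if command -v eksctl >/dev/null 2>&1; then
  # Create a managed node group of t3.xlarge instances with SSH access.
  eksctl create cluster \
    --name "$CLUSTER_NAME" \
    --region "$REGION" \
    --nodegroup-name standard-workers \
    --node-type t3.xlarge \
    --nodes 3 \
    --nodes-min 1 \
    --nodes-max 4 \
    --ssh-access \
    --ssh-public-key "$SSH_PUBLIC_KEY" \
    --managed
else
  echo "eksctl not found; install and configure it first" >&2
fi
```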

Similar to the GKE case, the previous command uses a larger node type compared to the default one (--node-type t3.xlarge). Note that the --ssh-public-key parameter expects a public key that has been generated when installing the AWS CLI to securely connect with your cluster (for more info, please read this document). Also, take into account that the region for the EKS cluster should match the configured region in your AWS CLI.

Now, we are ready to prepare our Kubernetes cluster for the installation of Pravega.

Install Helm

To simplify the deployment of Pravega, we use Helm charts. You will need to install a Helm 3 client to proceed with the installation instructions in this blog post.

Once you install the Helm client, you just need to get the public charts we provide to deploy a Pravega cluster:
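As a sketch, registering the chart repository looks as follows. The repository alias (pravega) is our choice, and the URL is an assumption about where the charts are published; check the Pravega documentation if it has moved.

```shell
# Assumed location of the public Pravega charts.
PRAVEGA_CHARTS_URL="https://charts.pravega.io"

if command -v helm >/dev/null 2>&1; then
  # Register the repository under the alias "pravega" and refresh the index.
  helm repo add pravega "$PRAVEGA_CHARTS_URL"
  helm repo update
else
  echo "helm not found; install the Helm 3 client first" >&2
fi
```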

Webhook conversion and Cert-Manager

The most recent versions of the Pravega Operator rely on the Webhook Conversion feature, which has been in beta since Kubernetes 1.15. For this reason, Cert-Manager or some other certificate management solution must be deployed to manage the webhook service certificates. To install Cert-Manager, just execute this command:
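A minimal sketch of the installation, assuming the single-manifest install that Cert-Manager publishes with its releases. The pinned version is an assumption of ours, not necessarily the one used when this post was written; pick a release compatible with your Kubernetes version.

```shell
# Assumed Cert-Manager release; adjust as needed.
CERT_MANAGER_VERSION="v0.14.1"
CERT_MANAGER_MANIFEST="https://github.com/jetstack/cert-manager/releases/download/${CERT_MANAGER_VERSION}/cert-manager.yaml"

if command -v kubectl >/dev/null 2>&1; then
  # Install the Cert-Manager CRDs and controllers from the release manifest.
  kubectl apply -f "$CERT_MANAGER_MANIFEST"
else
  echo "kubectl not found" >&2
fi
```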

Deploying Pravega

Next, we show you step by step how to deploy Pravega, which involves deploying Apache Zookeeper, Apache Bookkeeper (the journal), and Pravega itself, as well as their respective Operators. Also, given that Pravega moves “cold” data to what we call long-term storage (a.k.a. Tier 2), we need to instantiate a storage backend for that purpose.

Apache Zookeeper

Apache Zookeeper is a distributed system that provides reliable coordination services, such as consensus and group management. Pravega uses Zookeeper to store specific pieces of metadata as well as to offer a consistent view of data structures used by multiple service instances.

As part of the Pravega project, we have developed a Zookeeper Operator to manage the deployment of Zookeeper clusters in Kubernetes. Thus, deploying the Zookeeper Operator is the first step to deploy Zookeeper:
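A sketch of this step, assuming the chart is published as pravega/zookeeper-operator in the repository registered earlier under the alias pravega. The release name is our choice.

```shell
# Assumed chart name within the "pravega" Helm repository.
ZK_OPERATOR_CHART="pravega/zookeeper-operator"

if command -v helm >/dev/null 2>&1; then
  # Install the operator under the release name "zookeeper-operator".
  helm install zookeeper-operator "$ZK_OPERATOR_CHART"
else
  echo "helm not found; install the Helm 3 client first" >&2
fi
```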

With the Zookeeper Operator up and running, the next step is to deploy Zookeeper. We can do so with the helm chart we published for Zookeeper:
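For illustration, and assuming the chart is named pravega/zookeeper, the installation could look like this. We pick zookeeper as the release name so that the pods are named zookeeper-0, zookeeper-1, and so on.

```shell
# Assumed chart name within the "pravega" Helm repository.
ZK_CHART="pravega/zookeeper"

if command -v helm >/dev/null 2>&1; then
  # The release name determines the name of the Zookeeper cluster and its pods.
  helm install zookeeper "$ZK_CHART"
else
  echo "helm not found; install the Helm 3 client first" >&2
fi
```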

This chart instantiates a Zookeeper cluster made of 3 instances and their respective Persistent Volume Claims (PVC) of 20GB of storage each, which is enough for a demo Pravega cluster.

Once the previous command has been executed, you can see both Zookeeper Operator and Zookeeper running in the cluster:

$ kubectl get pods
NAME                                  READY   STATUS    RESTARTS   AGE
zookeeper-0                           1/1     Running   0          3m46s
zookeeper-1                           1/1     Running   0          3m6s
zookeeper-2                           1/1     Running   0          2m25s
zookeeper-operator-6b9759bbcb-9j25s   1/1     Running   0          4m

Apache Bookkeeper

Apache Bookkeeper is a distributed and reliable storage system that provides a distributed log abstraction. Bookkeeper excels at low-latency, append-only writes. This is why Pravega uses Bookkeeper for journaling: Pravega writes data to Bookkeeper, which provides low-latency, persistent, and replicated storage for stream appends. Pravega uses the data in Bookkeeper to recover from failures, and that data is truncated once it has been flushed to long-term storage.

As in the case of Zookeeper, we have also developed a Bookkeeper Operator to manage the lifecycle of Bookkeeper clusters deployed in Kubernetes. Thus, the next step is to deploy the Bookkeeper Operator:
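A sketch of this step under the same assumptions as before: the chart name pravega/bookkeeper-operator is assumed, and the release name is our choice. Depending on the chart version, the operator may also need a certificate in place, so check the chart notes if the installation fails.

```shell
# Assumed chart name within the "pravega" Helm repository.
BK_OPERATOR_CHART="pravega/bookkeeper-operator"

if command -v helm >/dev/null 2>&1; then
  # Install the operator under the release name "bookkeeper-operator".
  helm install bookkeeper-operator "$BK_OPERATOR_CHART"
else
  echo "helm not found; install the Helm 3 client first" >&2
fi
```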

Once running, we can proceed to deploy Bookkeeper. In this case, we will use the Helm chart publicly available to quickly spin up a Bookkeeper cluster:
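As a sketch, and assuming the chart is named pravega/bookkeeper, the command could look as follows. The release name here is hypothetical; the exact pod names you get (such as those in the listing below) depend on the release name and chart version you use.

```shell
# Hypothetical release name and assumed chart name.
BK_RELEASE="bookkeeper"
BK_CHART="pravega/bookkeeper"

if command -v helm >/dev/null 2>&1; then
  # Deploy a Bookkeeper cluster managed by the Bookkeeper Operator.
  helm install "$BK_RELEASE" "$BK_CHART"
else
  echo "helm not found; install the Helm 3 client first" >&2
fi
```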

As a result, you can see below both Zookeeper and Bookkeeper up and running:

$ kubectl get pods
NAME                                   READY   STATUS    RESTARTS   AGE
bookkeeper-operator-85568f8949-d652z   1/1     Running   0          4m10s
bookkeeper-pravega-bk-bookie-0         1/1     Running   0          2m10s
bookkeeper-pravega-bk-bookie-1         1/1     Running   0          2m10s
bookkeeper-pravega-bk-bookie-2         1/1     Running   0          2m10s
zookeeper-0                            1/1     Running   0          8m59s
zookeeper-1                            1/1     Running   0          8m19s
zookeeper-2                            1/1     Running   0          7m38s
zookeeper-operator-6b9759bbcb-9j25s    1/1     Running   0          9m13s

Long-Term Storage

We mentioned before that Pravega automatically moves data to Long-Term Storage (or Tier 2). This feature is very interesting, because it positions Pravega in a “sweet spot” in the latency vs throughput trade-off: Pravega achieves low latency writes by using Bookkeeper for appends. At the same time, it also provides high throughput reads when accessing historical data.

As our goal is to keep things as simple as possible, we deploy a simple storage option: the NFS Server provisioner. With such a provisioner, we have a pod that acts as an NFS Server for Pravega. To deploy it, you need to execute the next command:
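A minimal sketch, assuming the provisioner is installed from the (now archived) Helm stable repository; the repository URL is an assumption about where that archive is hosted. Letting Helm generate the release name yields pod names like the nfs-server-provisioner-... one shown later in this post.

```shell
# Assumed location of the archived "stable" chart repository.
STABLE_REPO_URL="https://charts.helm.sh/stable"

if command -v helm >/dev/null 2>&1; then
  helm repo add stable "$STABLE_REPO_URL"
  # Let Helm generate a unique release name for the provisioner.
  helm install stable/nfs-server-provisioner --generate-name
else
  echo "helm not found; install the Helm 3 client first" >&2
fi
```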

Once the NFS Server provisioner is up and running, Pravega will require a PVC for long-term storage pointing to the NFS Server provisioner that we have just deployed. To create the PVC, you can just copy the following manifest (namely tier2_pvc.yaml):
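The snippet below writes a sketch of such a manifest to tier2_pvc.yaml. The claim name (pravega-tier2), the storage class (nfs, which we assume is the class created by the NFS Server provisioner), and the requested size are assumptions; make sure they match what your Pravega chart and provisioner expect.

```shell
# Write the PVC manifest; the quoted EOF keeps the content literal.
cat > tier2_pvc.yaml <<'EOF'
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: pravega-tier2        # assumed claim name
spec:
  storageClassName: "nfs"    # assumed storage class of the NFS provisioner
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 5Gi           # arbitrary size for a demo cluster
EOF
```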

And create the PVC for long-term storage as follows:
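A short sketch, assuming the manifest was saved as tier2_pvc.yaml in the current directory:

```shell
PVC_MANIFEST="tier2_pvc.yaml"

if command -v kubectl >/dev/null 2>&1 && [ -f "$PVC_MANIFEST" ]; then
  # Create the PVC that Pravega will use for long-term storage.
  kubectl create -f "$PVC_MANIFEST"
else
  echo "kubectl or $PVC_MANIFEST is missing" >&2
fi
```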

As you may have noticed, the long-term storage option suggested in this post is only for demo purposes, to keep things simple. If you want to run a real Pravega cluster in the cloud, we suggest that you use actual storage services such as FileStore in GKE and EFS in AWS. There are instructions on how to deploy production long-term storage options in the documentation of the Pravega Operator.

Pravega

We are almost there! The last step is to deploy the Pravega Operator and Pravega, much as we did for Zookeeper and Bookkeeper. As usual, we first need to deploy the Pravega Operator (and its required certificate) as follows:
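A sketch of this step. We assume that the certificate manifest lives at deploy/certificate.yaml in the Pravega Operator repository and that the chart is named pravega/pravega-operator; both are assumptions to verify against the operator documentation for the version you install.

```shell
OPERATOR_REPO="https://github.com/pravega/pravega-operator"
# Assumed path of the certificate manifest inside the cloned repository.
CERT_MANIFEST="pravega-operator/deploy/certificate.yaml"

if command -v git >/dev/null 2>&1 \
   && command -v kubectl >/dev/null 2>&1 \
   && command -v helm >/dev/null 2>&1; then
  git clone "$OPERATOR_REPO"
  # Create the certificate used by the operator webhook (requires Cert-Manager).
  kubectl create -f "$CERT_MANIFEST"
  helm install pravega-operator pravega/pravega-operator
else
  echo "git, kubectl and helm are required for this step" >&2
fi
```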

Once deployed, we can deploy Pravega with the default Helm chart publicly available as follows:
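As a sketch, assuming the chart is named pravega/pravega. We use pravega as the release name, which gives pod names such as pravega-pravega-controller-... This sketch also assumes that the chart defaults point at the Zookeeper, Bookkeeper, and long-term storage PVC deployed above; if your names differ, override the corresponding chart values.

```shell
# Assumed chart name within the "pravega" Helm repository.
PRAVEGA_CHART="pravega/pravega"

if command -v helm >/dev/null 2>&1; then
  # Deploy the Pravega Controller and Segment Store through the operator.
  helm install pravega "$PRAVEGA_CHART"
else
  echo "helm not found; install the Helm 3 client first" >&2
fi
```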

That’s it! Once this command gets executed, you will have your first Pravega cluster up and running:

$ kubectl get pods
NAME                                         READY   STATUS    RESTARTS  AGE
bookkeeper-operator-85568f8949-d652z         1/1     Running   0         11m
bookkeeper-pravega-bk-bookie-0               1/1     Running   0         9m6s
bookkeeper-pravega-bk-bookie-1               1/1     Running   0         9m6s
bookkeeper-pravega-bk-bookie-2               1/1     Running   0         9m6s
nfs-server-provisioner-1592297085-0          1/1     Running   0         5m26s
pravega-operator-6c6d9db459-mpjr4            1/1     Running   0         4m19s
pravega-pravega-controller-5b447c85b-t8jsx   1/1     Running   0         2m56s
pravega-pravega-segment-store-0              1/1     Running   0         2m56s
zookeeper-0                                  1/1     Running   0         15m
zookeeper-1                                  1/1     Running   0         15m
zookeeper-2                                  1/1     Running   0         14m
zookeeper-operator-6b9759bbcb-9j25s          1/1     Running   0         16m

Executing a Sample Application

Finally, we would like to help you to exercise the Pravega cluster you just deployed. Let’s deploy a pod in our Kubernetes cluster to run samples and applications, like the one we propose in the manifest below (test-pod.yaml):
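The snippet below writes a sketch of such a manifest to test-pod.yaml. The pod and container names are our choice, and the container simply sleeps in a loop so that it stays alive while we work inside it.

```shell
# Write the pod manifest; the quoted EOF keeps the content literal.
cat > test-pod.yaml <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: test-pod             # arbitrary pod name
spec:
  containers:
    - name: test-pod
      image: ubuntu:18.04
      # Keep the container running so that we can open a shell in it.
      command: ["/bin/bash", "-c", "while true; do sleep 3600; done"]
EOF
```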

You can directly use this manifest and create your Ubuntu 18.04 pod as follows:
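A short sketch, assuming the manifest was saved as test-pod.yaml in the current directory:

```shell
POD_MANIFEST="test-pod.yaml"

if command -v kubectl >/dev/null 2>&1 && [ -f "$POD_MANIFEST" ]; then
  # Create the Ubuntu pod from which we will run the samples.
  kubectl create -f "$POD_MANIFEST"
else
  echo "kubectl or $POD_MANIFEST is missing" >&2
fi
```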

Once the pod is up and running, we suggest that you log in to the pod and build the Pravega samples to interact with the Pravega cluster by executing the following commands:
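The sketch below runs the build non-interactively through kubectl exec rather than from a login shell; the commands inside the pod are the same either way. The pod name and the JDK package are assumptions: the samples may require a different Java version depending on the branch you build.

```shell
POD_NAME="test-pod"   # assumed name of the pod created above

# Commands executed inside the pod: install tools, fetch and build the samples.
BUILD_SAMPLES='
apt-get update &&
apt-get install -y git openjdk-8-jdk &&
git clone https://github.com/pravega/pravega-samples &&
cd pravega-samples &&
./gradlew installDist
'

if command -v kubectl >/dev/null 2>&1; then
  kubectl exec "$POD_NAME" -- /bin/bash -c "$BUILD_SAMPLES"
else
  echo "kubectl not found" >&2
fi
```

To work interactively instead, open a shell with `kubectl exec -it test-pod -- /bin/bash` and type the same commands.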

With this, we can go to the location where the Pravega samples executable files have been generated and execute one of them, making sure that we point to the Pravega Controller service:
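A sketch of this step, with several assumptions: the samples were cloned into the default working directory of the pod, the Controller service is named pravega-pravega-controller and listens on port 9090, and consoleWriter accepts the Controller URI through a -u option. Check the samples documentation if any of these differ in your setup.

```shell
POD_NAME="test-pod"   # assumed name of the pod created above
# Assumed service name and port of the Pravega Controller.
CONTROLLER_URI="tcp://pravega-pravega-controller:9090"
# Where Gradle installDist places the client examples.
SAMPLES_DIR="pravega-samples/pravega-client-examples/build/install/pravega-client-examples"

if command -v kubectl >/dev/null 2>&1; then
  # -it attaches a terminal, since consoleWriter reads commands interactively.
  kubectl exec -it "$POD_NAME" -- /bin/bash -c \
    "cd $SAMPLES_DIR && bin/consoleWriter -u $CONTROLLER_URI"
else
  echo "kubectl not found" >&2
fi
```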

That’s it, you have executed your first sample against the Pravega cluster! With the consoleWriter, you can write regular events or transactions to Pravega. We also encourage you to run the consoleReader in another terminal, so you can see how events are written and read at the same time (for more info, see the Pravega samples documentation). There are many other interesting samples for Pravega in the repository, so please be curious and try them out.

Wrap Up

In this blog post, we have provided a step-by-step guide to deploy Pravega in GKE and EKS. While the resulting Pravega cluster is mainly for demo purposes, it is enough to help you understand the deployment process of Pravega and get to know the ecosystem of operators contributed by this project. We have also demonstrated how to quickly build the Pravega samples in a pod, so you can exercise your new Pravega cluster. We really hope that this helps you get familiar with Pravega and get excited about the set of features it provides.

Acknowledgements

Thanks to Srishti Takkar, Derek Moore and Flavio Junqueira for their work and feedback on this blog post.

About the Author

Raúl Gracia is a Principal Engineer at DellEMC and part of the Pravega development team. He holds an M.Sc. in Computer Engineering and Security (2011) and a Ph.D. in Computer Engineering (2015) from Universitat Rovira i Virgili (Tarragona, Spain). During his PhD, Raúl was an intern at IBM Research (Haifa, Israel) and Tel-Aviv University. Raúl is a researcher interested in distributed systems, cloud storage, data analytics and software engineering, with more than 20 papers published in international conferences and journals.