When to use VolumeClaimTemplates versus predefined Persistent Volumes and Claims versus standard volumes

I am struggling to understand the advantages and disadvantages of volumeClaimTemplates versus predefined PVs/PVCs versus standard volumes.

We have StatefulSet A that comprises a couple hundred pods, each needing a handful of static databases (DB1-DB4) mounted. We don't currently save any state explicitly; instead we use the StatefulSet to provide determinism and ordering to a large swath of parallelized real-time computing. Each replica does a portion of the computing that is later ordered and processed further.

We currently use NFS volumes to mount the databases. My understanding from the Kubernetes documentation is that NFS volumes are more or less persistent by nature. If that's correct, is there really any benefit to using PVs/PVCs rather than plain volumes for static, pre-populated data, other than maybe abstracting the NFS configuration info out of the individual Helm templates? And how does the use case for volumeClaimTemplates compare with predefined PVs/PVCs?

My understanding of volumeClaimTemplates is that their primary benefit is giving each replica its own PV/PVC (and possibly reusing it across scaling and Helm reinstalls). What I don't necessarily understand is why that is a benefit. With NFS specifically, aren't they all using the same underlying storage anyway?

Also, is there any potential downside to standing up so many PVs/PVCs? Is there any impact on node-level resources or startup time? I feel like I understand most of the basics but am missing a few key details that'll make it all click.

I've tried switching to manually defined PVs/PVCs in the past - again using NFS - mostly to abstract away the NFS configuration info for our volumes (to be taken over by the cluster administrators). That is, I created a single PV and PVC for each of our static databases and modified the StatefulSet to use those PVCs.
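
For illustration, each pair looked roughly like this (the NFS server address, export path, size, and names here are placeholders, not our real values):

    apiVersion: v1
    kind: PersistentVolume
    metadata:
      name: db1-pv
    spec:
      capacity:
        storage: 10Gi
      accessModes:
        - ReadOnlyMany              # many pods read the same static data
      persistentVolumeReclaimPolicy: Retain
      nfs:
        server: nfs.example.com
        path: /exports/db1
    ---
    apiVersion: v1
    kind: PersistentVolumeClaim
    metadata:
      name: db1-pvc
    spec:
      accessModes:
        - ReadOnlyMany
      storageClassName: ""          # disable dynamic provisioning; bind statically
      volumeName: db1-pv            # pin the claim to the PV above
      resources:
        requests:
          storage: 10Gi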

In practice, this meant that when I installed the chart, hundreds of pods started up, all trying to make use of the same claims. Because of this, I would occasionally see pods fail to mount a volume.

And unfortunately, due to the current infrastructure around us and the needs of the application, we cannot install the chart once, scale as needed, and leave it be. Instead, each time a user wants to do some processing, they set up the specific parameters and kick it off. In practice this means we will have a handful of separate Helm releases at a time. When a run is done, that release is uninstalled; when a new one is desired, it is created and is expected to be running in a reasonable time.

Answer from David Maze:

For the setup you're using, and particularly in a Helm context, I'd probably put inline volumes: in the Pod spec.
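
A trimmed sketch of that (the image name, NFS server, and paths are placeholders; repeat the volume/mount pair for DB2-DB4):

    apiVersion: apps/v1
    kind: StatefulSet
    metadata:
      name: set-a
    spec:
      serviceName: set-a
      replicas: 200
      selector:
        matchLabels:
          app: set-a
      template:
        metadata:
          labels:
            app: set-a
        spec:
          containers:
            - name: worker
              image: registry.example.com/worker:latest
              volumeMounts:
                - name: db1
                  mountPath: /data/db1
                  readOnly: true
          volumes:
            - name: db1
              nfs:                   # inline volume: no PV or PVC objects involved
                server: nfs.example.com
                path: /exports/db1
                readOnly: true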

The PersistentVolume(Claim) mechanism works better where the cluster itself is allocating the storage. Just from the top-level description of "I'm trying to run a database in a StatefulSet", I'd expect the normal (non-NFS) path to look like this (sketched after the list):

  1. Your Kubernetes YAML declares a StatefulSet with volumeClaimTemplates.
  2. The cluster creates a PersistentVolumeClaim for each replica; you do not create a PVC yourself.
  3. The cluster creates a PersistentVolume for each PVC; you do not create a PV yourself.
  4. The cluster allocates storage (on AWS/EKS, for example, an EBS volume); you do not allocate the storage yourself.
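
A minimal sketch of step 1, assuming a cluster with a dynamic provisioner (the storage class name here, gp2, is an EKS-style example; names and sizes are illustrative):

    apiVersion: apps/v1
    kind: StatefulSet
    metadata:
      name: db
    spec:
      serviceName: db
      replicas: 3
      selector:
        matchLabels:
          app: db
      template:
        metadata:
          labels:
            app: db
        spec:
          containers:
            - name: db
              image: registry.example.com/db:latest
              volumeMounts:
                - name: data
                  mountPath: /var/lib/db
      volumeClaimTemplates:
        - metadata:
            name: data
          spec:
            accessModes: ["ReadWriteOnce"]
            storageClassName: gp2   # provisioner-specific
            resources:
              requests:
                storage: 20Gi

Steps 2-4 then happen automatically: the controller creates one PVC per replica, named data-db-0, data-db-1, and so on, and those claims (and their volumes) survive pod restarts and scale-downs.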

This runs in the other direction too: if you delete the PVC, the cluster will delete the PV and the underlying storage (given the usual Delete reclaim policy on dynamically provisioned volumes).

In your case, though, you don't want the cluster to manage the storage; you have an existing out-of-cluster NFS store you want to use. I don't think there are particular benefits to setting up a PersistentVolume(Claim) to manually point at this. This is doubly true since a PersistentVolume is a cluster-global object: you have to name it carefully so there aren't conflicts between multiple installations.
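
If you do go the manual route from a Helm chart, one mitigation (a sketch, using standard Helm templating) is to fold the release name into the PV name:

    apiVersion: v1
    kind: PersistentVolume
    metadata:
      # PVs are cluster-scoped, not namespaced, so make the name unique per release
      name: {{ .Release.Name }}-db1-pv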

In a Helm context more specifically, you could have a configuration option that chooses whether to use a specific NFS store or to use volumeClaimTemplates. In-cluster storage might make sense for a developer setup, but out-of-cluster storage for (pre-)production.
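
A sketch of such a toggle (the useNfs flag and the nfs values are made-up names for this example, and the matching volumeMounts are elided):

    # values.yaml
    useNfs: true
    nfs:
      server: nfs.example.com
      basePath: /exports

    # templates/statefulset.yaml (fragment)
    spec:
      template:
        spec:
          {{- if .Values.useNfs }}
          volumes:
            - name: db1
              nfs:
                server: {{ .Values.nfs.server }}
                path: {{ .Values.nfs.basePath }}/db1
          {{- end }}
      {{- if not .Values.useNfs }}
      volumeClaimTemplates:
        - metadata:
            name: db1
          spec:
            accessModes: ["ReadWriteOnce"]
            resources:
              requests:
                storage: 20Gi
      {{- end }}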

At a scale of merely hundreds of PV(C)s, I wouldn't expect this to stress the cluster. In my day job we have a shared development cluster where each developer namespace can easily create a dozen PVCs and that hasn't been a problem for us.