devops-stack-module-velero

A DevOps Stack module to deploy a backup solution based on Velero.

This module adds the ability to back up a cluster, including all of its Kubernetes objects. With the GitOps approach of the DevOps Stack, however, backing up every Kubernetes object is not necessary, because they are generally disposable and recreated by Argo CD.

The Velero chart used by this module is shipped in this repository as well, in order to avoid any unwanted behaviors caused by unsupported versions.

Current Chart Version | Original Repository | Default Values
5.0.2 | Chart | values.yaml

Velero needs a storage location for backups, such as an S3-compatible storage (AWS S3, MinIO, etc.) or Azure Blob Storage.

Since this module is meant to be instantiated using its variants, the usage documentation is available in each variant (EKS).

Usage

Backup

For a deployment example, see the documentation of each variant.

Once the Velero controller is deployed on the cluster, you can also perform a manual backup with the velero client:

velero backup create \
    --included-namespaces wordpress,loki \
    --included-resources pv,pvc,pods \
    --storage-location my-s3-bucket \
    my-new-backup

Restore

The backup can be restored manually with the following commands:

# Restore from a backup
velero restore create --from-backup my-new-backup
# Restore from the last backup of a schedule
velero restore create --from-schedule restic-schedule

More options are described in the Velero documentation.

For a restore to succeed, the objects being restored must not already exist in the target cluster. As a disaster recovery tool, Velero cannot merge an existing object, such as a PVC, with its backup.

Monitoring

If enabled, this module exposes Velero metrics and sets up a scrape target for Prometheus.

Grafana dashboard

This module contains a Grafana dashboard to monitor the frequency, state and duration of backups, as shown below.

Figure: Grafana dashboard with backup information

Alerts

This module also sets up alerts in Alertmanager for the following cases:

  • The ratio of partially failed backups is higher than a specified amount (alert_partially_failed_ratio);

  • The ratio of failed backups is higher than a specified amount (alert_failed_ratio);

  • No backup has succeeded for a specified amount of time (alert_backup_timeout).
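The thresholds above can be tuned through the corresponding input variables. The following is only a sketch: the source value, cluster name, and base domain are placeholders, the threshold values are arbitrary, and any inputs required by the chosen variant are omitted.

```hcl
# Hypothetical module call. Replace the source with the one given in the
# documentation of the variant you deploy (for example EKS).
module "velero" {
  source = "<source-of-the-chosen-variant>" # placeholder, not a real path

  cluster_name = "my-cluster"  # placeholder
  base_domain  = "example.com" # placeholder

  # Alert when more than half of the backups partially fail.
  alert_partially_failed_ratio = 0.5

  # Alert when more than 10% of the backups fail.
  alert_failed_ratio = 0.1

  # Alert when no backup has succeeded for 48 hours (value in seconds).
  alert_backup_timeout = 172800
}
```

The defaults documented in the Technical Reference (0.25, 0.25 and 86400) apply when these variables are left unset.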

Limitations

KinD setup

Velero cannot perform backups of hostPath volumes, which are the volumes used in a KinD platform. Therefore, this module cannot be used in a DevOps Stack on KinD.

Technical Reference

Requirements

The following requirements are needed by this module:

Providers

The following providers are used by this module:

Resources

The following resources are used by this module:

Required Inputs

The following input variables are required:

cluster_name

Description: Name given to the cluster. Value used for naming some of the resources created by the module.

Type: string

base_domain

Description: Base domain of the cluster. Value used for the ingress' URL of the application.

Type: string

Optional Inputs

The following input variables are optional (have default values):

argocd_namespace

Description: Namespace used by Argo CD where the Application and AppProject resources should be created.

Type: string

Default: "argocd"

target_revision

Description: Override of target revision of the application chart.

Type: string

Default: "v1.0.0"

cluster_issuer

Description: SSL certificate issuer to use. Usually you would configure this value as letsencrypt-staging or letsencrypt-prod in your root *.tf files.

Type: string

Default: "ca-issuer"

namespace

Description: Namespace where the application's Kubernetes resources should be created. The namespace is created if it does not exist.

Type: string

Default: "velero"

helm_values

Description: Helm chart value overrides. They should be passed as a list of HCL structures.

Type: any

Default: []
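As a rough illustration of the "list of HCL structures" format, the fragment below overrides the resource requests of the Velero deployment. It rests on assumptions: the top-level velero key (on the guess that the chart in this repository wraps the upstream chart) and the request values are not taken from this documentation, so check the module's values.yaml before relying on them.

```hcl
helm_values = [{
  # Assumption: the chart values are nested under a "velero" key.
  velero = {
    # "resources" follows the usual Kubernetes requests/limits layout.
    resources = {
      requests = {
        cpu    = "500m"  # arbitrary example value
        memory = "128Mi" # arbitrary example value
      }
    }
  }
}]
```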

app_autosync

Description: Automated sync options for the Argo CD Application resource.

Type:

object({
    allow_empty = optional(bool)
    prune       = optional(bool)
    self_heal   = optional(bool)
  })

Default:

{
  "allow_empty": false,
  "prune": true,
  "self_heal": true
}

dependency_ids

Description: IDs of the other modules on which this module depends.

Type: map(string)

Default: {}
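A possible way to fill this map is to reference the id output of the modules that must be deployed first. The module names below are hypothetical, and each is assumed to expose an id output similar to the one exported by this module.

```hcl
# Fragment of a module call; "module.argocd" and "module.traefik" are
# hypothetical names for modules declared elsewhere in the root configuration.
dependency_ids = {
  argocd  = module.argocd.id
  traefik = module.traefik.id
}
```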

backup_schedules

Description: TBD

Type:

map(object({
    disabled    = optional(bool, false)
    labels      = optional(map(string), {})
    annotations = optional(map(string), {})
    schedule    = string
    template = object({
      # labels             = optional(map(string), {}) # TODO: test
      # annotations        = optional(map(string), {}) # TODO: test
      storageLocation    = optional(string)
      ttl                = optional(string)
      includedNamespaces = list(string)
      includedResources  = list(string)
      # enableSnapshot     = optional(bool, true)
    })
  }))

Default: null
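Since the description of this variable is still to be written, here is a tentative example derived only from the type definition above. It assumes that each map entry becomes a Velero schedule, that schedule takes a cron expression, and that ttl uses Velero's duration format. The schedule name is made up; the storage location, namespaces, and resources mirror the manual backup example in the Usage section.

```hcl
backup_schedules = {
  # Hypothetical schedule name used as the map key.
  "daily-wordpress" = {
    schedule = "0 3 * * *" # assumed cron syntax: every day at 03:00

    template = {
      storageLocation    = "my-s3-bucket"
      ttl                = "168h0m0s" # assumed duration format: 7 days
      includedNamespaces = ["wordpress", "loki"]
      includedResources  = ["pv", "pvc", "pods"]
    }
  }
}
```

The optional attributes disabled, labels, and annotations are left out here and fall back to the defaults declared in the type.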

enable_monitoring_dashboard

Description: Boolean to enable the provisioning of a Velero dashboard for Grafana.

Type: bool

Default: true

alert_partially_failed_ratio

Description: Ratio of partially failed backups (for example, 0.25 for 25%) above which a Prometheus alert is triggered.

Type: number

Default: 0.25

alert_failed_ratio

Description: Ratio of failed backups (for example, 0.25 for 25%) above which a Prometheus alert is triggered.

Type: number

Default: 0.25

alert_backup_timeout

Description: Timeout in seconds before triggering the last successful backup alert

Type: number

Default: 86400

Outputs

The following outputs are exported:

id

Description: ID to pass to other modules in order to refer to this module as a dependency.

restic_repo_password

Description: The password used to access the restic repositories.

Reference in table format


Requirements

Name Version

>= 4

~> 2

>= 3

>= 1

Providers

Name Version

n/a

~> 2

>= 1

>= 4

>= 3

Resources

Name Type

resource

resource

resource

resource

resource

resource

resource

data source

Inputs

Name | Description | Type | Default | Required
cluster_name | Name given to the cluster. Value used for naming some of the resources created by the module. | string | n/a | yes
base_domain | Base domain of the cluster. Value used for the ingress' URL of the application. | string | n/a | yes
argocd_namespace | Namespace used by Argo CD where the Application and AppProject resources should be created. | string | "argocd" | no
target_revision | Override of target revision of the application chart. | string | "v1.0.0" | no
cluster_issuer | SSL certificate issuer to use. Usually you would configure this value as letsencrypt-staging or letsencrypt-prod in your root *.tf files. | string | "ca-issuer" | no
namespace | Namespace where the application's Kubernetes resources should be created. The namespace is created if it does not exist. | string | "velero" | no
helm_values | Helm chart value overrides. They should be passed as a list of HCL structures. | any | [] | no
app_autosync | Automated sync options for the Argo CD Application resource. | object({ allow_empty = optional(bool), prune = optional(bool), self_heal = optional(bool) }) | { "allow_empty": false, "prune": true, "self_heal": true } | no
dependency_ids | IDs of the other modules on which this module depends. | map(string) | {} | no
backup_schedules | TBD | map(object({ disabled = optional(bool, false), labels = optional(map(string), {}), annotations = optional(map(string), {}), schedule = string, template = object({ storageLocation = optional(string), ttl = optional(string), includedNamespaces = list(string), includedResources = list(string) }) })) | null | no
enable_monitoring_dashboard | Boolean to enable the provisioning of a Velero dashboard for Grafana. | bool | true | no
alert_partially_failed_ratio | Ratio of partially failed backups (for example, 0.25 for 25%) above which a Prometheus alert is triggered. | number | 0.25 | no
alert_failed_ratio | Ratio of failed backups (for example, 0.25 for 25%) above which a Prometheus alert is triggered. | number | 0.25 | no
alert_backup_timeout | Timeout in seconds before triggering the last successful backup alert | number | 86400 | no

Outputs

Name | Description
id | ID to pass to other modules in order to refer to this module as a dependency.
restic_repo_password | The password used to access the restic repositories.