Install the Controller in Kubernetes
ziti-controller
Host an OpenZiti controller in Kubernetes
Requirements
Repository | Name | Version |
---|---|---|
https://charts.jetstack.io | cert-manager | ~1.11.0 |
https://charts.jetstack.io | trust-manager | ~0.7.0 |
https://kubernetes.github.io/ingress-nginx/ | ingress-nginx | ~4.5.2 |
Note that ingress-nginx is not strictly required, but the chart is parameterized to allow for conveniently declaring pass-through TLS.
You must patch the ingress-nginx deployment to enable the SSL passthrough option.
kubectl patch deployment "ingress-nginx-controller" \
    --namespace ingress-nginx \
    --type json \
    --patch '[{"op": "add",
        "path": "/spec/template/spec/containers/0/args/-",
        "value":"--enable-ssl-passthrough"
    }]'
Overview
This chart runs a Ziti controller in Kubernetes. It uses the custom resources provided by cert-manager and trust-manager, i.e., Issuer, Certificate, and Bundle. Delete the controller pod after an upgrade for the new controller configuration to take effect.
Requirements
Add the OpenZiti Charts Repo to Helm
helm repo add openziti https://docs.openziti.io/helm-charts/
This chart requires the custom resource definitions (CRDs) for Certificate, Issuer, and Bundle to be applied before installing the chart. The cert-manager and trust-manager sub-charts will be installed automatically. You may disable the sub-charts if you wish to provide these resources separately; if you do, use the sub-chart values at the foot of values.yaml to ensure those charts are correctly configured.
Install Required Custom Resource Definitions
This step satisfies Helm's requirement that the CRDs used in the umbrella chart already exist in Kubernetes before installing the controller chart.
kubectl apply -f https://github.com/cert-manager/cert-manager/releases/latest/download/cert-manager.crds.yaml
kubectl apply -f https://raw.githubusercontent.com/cert-manager/trust-manager/v0.9.0/deploy/crds/trust.cert-manager.io_bundles.yaml
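Before installing the chart, it can help to confirm that the CRDs are registered. The following is a minimal sketch, not part of the chart: the helper name missing_crds is hypothetical, and the three CRD names are the ones that cert-manager and trust-manager normally register for Certificate, Issuer, and Bundle. The demonstration runs offline against sample text.

```shell
# CRD names that back the Certificate, Issuer, and Bundle resources.
required_crds="certificates.cert-manager.io issuers.cert-manager.io bundles.trust.cert-manager.io"

# Hypothetical helper: reads `kubectl get crd -o name` output on stdin and
# prints each required CRD that is absent; prints nothing when all exist.
missing_crds() {
  installed=$(cat)
  for crd in $required_crds; do
    printf '%s\n' "$installed" |
      grep -q -x -F "customresourcedefinition.apiextensions.k8s.io/$crd" ||
      echo "$crd"
  done
}

# Offline demonstration: sample output in which the Bundle CRD is absent.
sample='customresourcedefinition.apiextensions.k8s.io/certificates.cert-manager.io
customresourcedefinition.apiextensions.k8s.io/issuers.cert-manager.io'
printf '%s\n' "$sample" | missing_crds   # prints bundles.trust.cert-manager.io
```

Against a live cluster, the same check would be kubectl get crd -o name | missing_crds.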
Minimal Installation
This first example shows a minimal installation for a Kubernetes distribution that provides TLS pass-through for Service type LoadBalancer, e.g., K3S, Minikube.
You must supply these values when you install the chart.
Key | Type | Default | Description |
---|---|---|---|
clientApi.advertisedHost | string | nil | the DNS name that edge clients and routers will resolve to reach this controller's edge client API |
clientApi.advertisedPort | string | nil | the TCP port associated with the advertisedHost to advertise to edge clients and routers |
helm install \
    --namespace ziti-controller --create-namespace \
    ziti-controller-minimal1 \
    openziti/ziti-controller \
    --set clientApi.advertisedHost="ziti-controller-minimal.example.com" \
    --set clientApi.advertisedPort="443"
A default admin user and password will be generated and saved to a secret during installation. The credentials can be retrieved using this command:
kubectl get secret \
-n ziti-controller ziti-controller-minimal1-admin-secret \
-o go-template='{{range $k,$v := .data}}{{printf "%s: " $k}}{{if not $v}}{{$v}}{{else}}{{$v | base64decode}}{{end}}{{"\n"}}{{end}}'
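Kubernetes stores Secret values base64-encoded, which is why the template above pipes each value through base64decode. As a small offline illustration, assuming a base64 utility that accepts -d (GNU coreutils and BusyBox do), a single value can be decoded by hand. The helper name decode_secret_value and the sample value are illustrative only.

```shell
# Hypothetical helper: decode one base64-encoded value as found under
# .data in a Secret.
decode_secret_value() {
  printf '%s' "$1" | base64 -d
}

# "YWRtaW4=" is the base64 encoding of the string "admin" (sample value only).
decode_secret_value 'YWRtaW4='
echo
```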
You may log in with the ziti CLI in one command, or omit the -p (--password) part to be prompted:
ziti edge login ziti-controller-minimal.example.com:443 \
    --yes \
    --username admin \
    --password "$(
        kubectl -n ziti-controller \
            get secrets ziti-controller-minimal1-admin-secret \
            -o go-template='{{index .data "admin-password" | base64decode }}'
    )"
Managed Kubernetes Installation
Managed Kubernetes providers typically configure server TLS for a Service of type LoadBalancer. Ziti needs pass-through TLS because edge clients and routers authenticate with client certificates. We'll accomplish this by changing the Service type to ClusterIP and creating Ingress resources with pass-through TLS for each cluster service.
This example demonstrates creating TLS pass-through Ingress resources for use with ingress-nginx.
Ensure you have the ingress-nginx chart installed with controller.extraArgs.enable-ssl-passthrough=true. You can verify this feature is enabled by running kubectl describe pods {ingress-nginx-controller pod} and checking the args for --enable-ssl-passthrough=true.
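The check can also be scripted. This is a minimal sketch rather than an official procedure: the helper name has_ssl_passthrough is hypothetical, and the sample argument list is made up so that the demonstration runs without a cluster.

```shell
# Hypothetical helper: succeeds when the container args read from stdin
# include the pass-through flag.
has_ssl_passthrough() {
  grep -q -e '--enable-ssl-passthrough'
}

# Offline demonstration with an illustrative args list.
sample_args='["/nginx-ingress-controller","--election-id=ingress-nginx-leader","--enable-ssl-passthrough=true"]'
if printf '%s\n' "$sample_args" | has_ssl_passthrough; then
  echo "pass-through enabled"
else
  echo "pass-through missing"
fi
```

Against a live cluster, assuming the deployment name and namespace used in the patch command earlier, the input would come from kubectl get deployment ingress-nginx-controller --namespace ingress-nginx -o jsonpath='{.spec.template.spec.containers[0].args}'.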
Create a Helm chart values file like this.
# /tmp/controller-values.yml
clientApi:
  advertisedHost: ziti-controller-managed.example.com
  advertisedPort: 443
  service:
    type: ClusterIP
  ingress:
    enabled: true
    ingressClassName: nginx
    annotations:
      kubernetes.io/ingress.allow-http: "false"
      nginx.ingress.kubernetes.io/ssl-passthrough: "true"
Now install or upgrade this controller chart with your values file.
helm install \
    --namespace ziti-controller --create-namespace \
    ziti-controller-managed1 \
    openziti/ziti-controller \
    --values /tmp/controller-values.yml
Expose the Router Control Plane
This applies if you have any routers outside the Ziti controller's cluster. You must configure a pass-through TLS LoadBalancer or Ingress for the control plane service. Routers running in the same cluster as the controller can use the cluster service named {controller release}-ctrl (the "ctrl" endpoint). This example demonstrates a pass-through Ingress resource for ingress-nginx.
Merge this with your Helm chart values file before installing or upgrading.
ctrlPlane:
  advertisedHost: ziti-controller-managed-ctrl.example.com
  advertisedPort: 443
  service:
    enabled: true
  ingress:
    enabled: true
    ingressClassName: nginx
    annotations:
      kubernetes.io/ingress.allow-http: "false"
      nginx.ingress.kubernetes.io/ssl-passthrough: "true"
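As an alternative to merging files by hand, Helm accepts several --values flags and merges them left to right, with later files taking precedence. The sketch below assumes the control plane values were saved to a second, hypothetical file named /tmp/ctrl-plane-values.yml; the helper name values_flags is also hypothetical. The command is printed with echo rather than executed.

```shell
# Hypothetical helper: print one "--values FILE" pair per argument, in order.
# Helm merges the files left to right, so later files override earlier ones.
values_flags() {
  for f in "$@"; do
    printf '%s %s ' --values "$f"
  done
}

# The second file name is an assumption; it holds the ctrlPlane values.
flags=$(values_flags /tmp/controller-values.yml /tmp/ctrl-plane-values.yml)

# Printed with echo so that nothing changes until you run the command yourself.
echo helm upgrade --install --namespace ziti-controller \
  ziti-controller-managed1 openziti/ziti-controller $flags
```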
Extra Security for the Management API
You can split the client and management APIs into separate cluster services by setting managementApi.service.enabled=true. With this configuration, you'll have an additional cluster service named {controller release}-mgmt that is the management API, and the client API will not have management features.
This Helm chart's values allow for both operational scenarios: combined and split. The default choice is to expose the combined client and management APIs as the cluster service named {controller release}-client, which is convenient because you can use the ziti CLI immediately. For additional security, you may shelter the management API by splitting these two sets of features, exposing them as separate API servers. After the split, you can access the management API in several ways:
- running the web console inside the cluster,
- hosting a Ziti service, or
- kubectl port-forward.
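The port-forward option might look like the sketch below. It assumes the release name and namespace from the managed installation example, and it assumes the management service listens on 443, the default of managementApi.advertisedPort; adjust both if your values differ. The command is printed with echo rather than executed.

```shell
release="ziti-controller-managed1"   # Helm release name from the managed example
namespace="ziti-controller"
mgmt_service="${release}-mgmt"       # follows the {controller release}-mgmt pattern
local_port=1281                      # any free local port will do
service_port=443                     # assumption: managementApi.advertisedPort default

# Printed with echo so that no tunnel is opened until you run it yourself.
echo kubectl port-forward --namespace "$namespace" \
  "service/${mgmt_service}" "${local_port}:${service_port}"
```

With the tunnel open, the management API would be reachable at localhost:1281 for as long as the kubectl process runs.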
Advanced PKI
The default configuration generates a single PKI root of trust for all the controller's servers and the edge signer CA. Optionally, you may provide the name of a cert-manager Issuer or ClusterIssuer to become the root of trust for the Ziti controller's identity.
Merge this with your Helm chart values file before installing or upgrading.
ctrlPlane:
  alternativeIssuer:
    kind: ClusterIssuer
    name: my-alternative-cluster-issuer
You may also configure the Ziti controller to use separate PKI roots of trust for its three main identities: control plane, edge signer, and web bindings.
For example, to use a separate CA for the edge signer function, merge this with your Helm chart values file before installing or upgrading.
edgeSignerPki:
  enabled: true
Prometheus Monitoring
This chart provides a ziti-controller-prometheus Kubernetes service for Prometheus, disabled by default, which can be enabled with prometheus.service.enabled. Enabling it will create a Prometheus ServiceMonitor for configuring the Prometheus endpoint. It is also important that you enable fabric.events.enabled to get a full set of metrics.
For more information, please check here.
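As a sketch, both values can be switched on during an upgrade with --set flags. The release name, namespace, and values file are taken from the managed installation example and are assumptions if your deployment differs. The command is printed with echo rather than executed.

```shell
release="ziti-controller-managed1"        # release name from the managed example
namespace="ziti-controller"
values_file="/tmp/controller-values.yml"  # values file from the managed example

# Both keys are chart values named in this section.
metrics_flags="--set prometheus.service.enabled=true --set fabric.events.enabled=true"

# Printed with echo so that nothing changes until you run the command yourself.
echo helm upgrade --namespace "$namespace" "$release" openziti/ziti-controller \
  --values "$values_file" $metrics_flags
```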
Values Reference
Key | Type | Default | Description |
---|---|---|---|
additionalVolumes | list | [] | additional volumes to mount to ziti-controller container |
affinity | object | {} | deployment template spec affinity |
ca.clusterDomain | string | "cluster.local" | Set a custom cluster domain if other than cluster.local |
ca.duration | string | "87840h" | Go time.Duration string format |
ca.renewBefore | string | "720h" | Go time.Duration string format |
cert-manager.enableCertificateOwnerRef | bool | true | clean up secret when certificate is deleted |
cert-manager.enabled | bool | false | install the cert-manager subchart |
cert-manager.installCRDs | bool | false | CRDs must be applied in advance of installing the parent chart |
cert.duration | string | "87840h" | server certificate duration as Go time.Duration string format |
cert.renewBefore | string | "720h" | renew server certificates before expiry as Go time.Duration string format |
clientApi.advertisedHost | string | nil | global DNS name by which routers can resolve a reachable IP for this service |
clientApi.advertisedPort | int | 443 | cluster service, node port, load balancer, and ingress port |
clientApi.containerPort | int | 1280 | cluster service target port on the container |
clientApi.dnsNames | list | [] | additional DNS SANs |
clientApi.ingress.annotations | string | nil | ingress annotations, e.g., to configure ingress-nginx |
clientApi.ingress.enabled | bool | false | create an ingress for the cluster service |
clientApi.service.enabled | bool | true | create a cluster service for the deployment |
clientApi.service.type | string | "LoadBalancer" | expose the service as a ClusterIP, NodePort, or LoadBalancer |
ctrlPlane.advertisedHost | string | nil | global DNS name by which routers can resolve a reachable IP for this service: default is cluster service DNS name which assumes all routers are inside the same cluster |
ctrlPlane.advertisedPort | int | 443 | cluster service, node port, load balancer, and ingress port |
ctrlPlane.alternativeIssuer | string | nil | kind and name of alternative issuer for the controller's identity |
ctrlPlane.containerPort | int | 6262 | cluster service target port on the container |
ctrlPlane.dnsNames | list | [] | additional DNS SANs |
ctrlPlane.ingress.annotations | string | nil | ingress annotations, e.g., to configure ingress-nginx |
ctrlPlane.ingress.enabled | bool | false | create an ingress for the cluster service |
ctrlPlane.service.enabled | bool | true | create a cluster service for the deployment |
ctrlPlane.service.type | string | "ClusterIP" | expose the service as a ClusterIP, NodePort, or LoadBalancer |
ctrlPlaneCasBundle.namespaceSelector | object | {} | namespaces where trust-manager will create the Bundle resource containing Ziti's trusted CA certs (default: empty means all namespaces) |
dbFile | string | "ctrl.db" | name of the BoltDB file |
edgeSignerPki.admin_client_cert.duration | string | "8760h" | admin client certificate duration as Go time.Duration |
edgeSignerPki.admin_client_cert.renewBefore | string | "720h" | renew admin client certificate before expiry as Go time.Duration |
edgeSignerPki.enabled | bool | true | generate a separate PKI root of trust for the edge signer CA |
fabric.events.enabled | bool | false | enable fabric event logger and file handler |
fabric.events.fileName | string | "fabric-events.json" | |
fabric.events.mountDir | string | "/var/run/ziti" | |
fabric.events.network.intervalAgeThreshold | string | "5s" | matching interval age and reporting interval ensures coherent metrics from fabric events |
fabric.events.network.metricsReportInterval | string | "5s" | matching interval age and reporting interval ensures coherent metrics from fabric events |
fabric.events.subscriptions[0].type | string | "fabric.circuits" | |
fabric.events.subscriptions[1].type | string | "fabric.links" | |
fabric.events.subscriptions[2].type | string | "fabric.routers" | |
fabric.events.subscriptions[3].type | string | "fabric.terminators" | |
fabric.events.subscriptions[4].metricFilter | string | ".*" | |
fabric.events.subscriptions[4].sourceFilter | string | ".*" | |
fabric.events.subscriptions[4].type | string | "metrics" | |
fabric.events.subscriptions[5].type | string | "edge.sessions" | |
fabric.events.subscriptions[6].type | string | "edge.apiSessions" | |
fabric.events.subscriptions[7].type | string | "fabric.usage" | |
fabric.events.subscriptions[7].version | int | 3 | |
fabric.events.subscriptions[8].type | string | "services" | |
fabric.events.subscriptions[9].interval | string | "5s" | |
fabric.events.subscriptions[9].type | string | "edge.entityCounts" | |
highAvailability.mode | string | "standalone" | Ziti controller HA mode |
highAvailability.replicas | int | 1 | Ziti controller HA swarm replicas |
image.args | list | ["{{ include \"configMountDir\" . }}/ziti-controller.yaml"] | args for the entrypoint command |
image.command | list | ["ziti","controller","run"] | container entrypoint command |
image.homeDir | string | "/home/ziggy" | homeDir for admin login shell must align with container image's ~/.bashrc for ziti CLI auto-complete to work |
image.pullPolicy | string | "IfNotPresent" | deployment image pull policy |
image.repository | string | "docker.io/openziti/ziti-controller" | container image repository for app deployment |
image.tag | string | "" | override the container image tag specified in the chart |
ingress-nginx.controller.extraArgs.enable-ssl-passthrough | string | "true" | configure subchart ingress-nginx to enable the pass-through TLS feature |
ingress-nginx.enabled | bool | false | install the ingress-nginx subchart |
managementApi.advertisedHost | string | nil | global DNS name by which routers can resolve a reachable IP for this service |
managementApi.advertisedPort | int | 443 | cluster service, node port, load balancer, and ingress port |
managementApi.containerPort | int | 1281 | cluster service target port on the container |
managementApi.dnsNames | list | [] | additional DNS SANs |
managementApi.ingress.annotations | string | nil | ingress annotations, e.g., to configure ingress-nginx |
managementApi.ingress.enabled | bool | false | create an ingress for the cluster service |
managementApi.service.enabled | bool | false | create a cluster service for the deployment |
managementApi.service.type | string | "ClusterIP" | expose the service as a ClusterIP, NodePort, or LoadBalancer |
network.createCircuitRetries | int | 2 | createCircuitRetries controls the number of retries that will be attempted to create a path (and terminate it) for new circuits. |
network.cycleSeconds | int | 15 | Defines the period that the controller re-evaluates the performance of all of the circuits running on the network. |
network.initialLinkLatency | string | "65s" | Sets the latency of link when it's first created. Will be overwritten as soon as latency from the link is actually reported from the routers. Defaults to 65 seconds. |
network.minRouterCost | int | 10 | Sets router minimum cost. Defaults to 10 |
network.pendingLinkTimeoutSeconds | int | 10 | pendingLinkTimeoutSeconds controls how long we'll wait before creating a new link between routers where there isn't an established link, but a link request has been sent |
network.routeTimeoutSeconds | int | 10 | routeTimeoutSeconds controls the number of seconds the controller will wait for a route attempt to succeed. |
network.routerConnectChurnLimit | string | "1m" | Sets how often a new control channel connection can take over for a router with an existing control channel connection. Defaults to 1 minute. |
network.smart.rerouteCap | int | 4 | Defines the hard upper limit of underperforming circuits that are candidates to be re-routed. If smart routing detects 100 circuits that are underperforming, and smart.rerouteCap is set to 1, and smart.rerouteFraction is set to 0.02, then the upper limit of circuits that will be re-routed in this cycleSeconds period will be limited to 1. |
network.smart.rerouteFraction | float | 0.02 | Defines the fractional upper limit of underperforming circuits that are candidates to be re-routed. If smart routing detects 100 circuits that are underperforming, and smart.rerouteFraction is set to 0.02, then the upper limit of circuits that will be re-routed in this cycleSeconds period will be limited to 2 (2% of 100). |
nodeSelector | object | {} | deployment template spec node selector |
persistence.VolumeName | string | nil | PVC volume name |
persistence.accessMode | string | "ReadWriteOnce" | PVC access mode: ReadWriteOnce (concurrent mounts not allowed), ReadWriteMany (concurrent allowed) |
persistence.annotations | object | {} | annotations for the PVC |
persistence.enabled | bool | true | required: place a storage claim for the BoltDB persistent volume |
persistence.existingClaim | string | "" | A manually managed Persistent Volume and Claim. Requires persistence.enabled=true. If defined, PVC must be created manually before volume will be bound. |
persistence.size | string | "2Gi" | 2GiB is enough for tens of thousands of entities, but feel free to make it larger |
persistence.storageClass | string | nil | Storage class of PV to bind. By default it looks for the default storage class. If the PV uses a different storage class, specify that here. |
podAnnotations | object | {} | annotations to apply to all pods deployed by this chart |
podSecurityContext | object | {"fsGroup":2171} | deployment template spec security context |
podSecurityContext.fsGroup | int | 2171 | the GID of the group that should own any files created by the container, especially the BoltDB file |
prometheus.advertisedHost | string | "" | DNS name to advertise in place of the default internal cluster name built from the Helm release name |
prometheus.advertisedPort | int | 443 | cluster service, node port, load balancer, and ingress port |
prometheus.containerPort | int | 9090 | cluster service target port on the container |
prometheus.service.annotations | object | {} | |
prometheus.service.enabled | bool | false | create a cluster service for the deployment |
prometheus.service.labels | object | {"app":"prometheus"} | extra labels for matching only this service, i.e., serviceMonitor |
prometheus.service.type | string | "ClusterIP" | expose the service as a ClusterIP, NodePort, or LoadBalancer |
prometheus.serviceMonitor.annotations | object | {} | ServiceMonitor annotations |
prometheus.serviceMonitor.enabled | bool | true | If enabled, and prometheus service is enabled, ServiceMonitor resources for Prometheus Operator are created |
prometheus.serviceMonitor.interval | string | nil | ServiceMonitor scrape interval |
prometheus.serviceMonitor.labels | object | {} | Additional ServiceMonitor labels |
prometheus.serviceMonitor.metricRelabelings | list | [] | ServiceMonitor relabel configs to apply to samples as the last step before ingestion https://github.com/prometheus-operator/prometheus-operator/blob/master/Documentation/api.md#relabelconfig (defines metric_relabel_configs ) |
prometheus.serviceMonitor.namespace | string | nil | Alternative namespace for ServiceMonitor resources |
prometheus.serviceMonitor.namespaceSelector | object | {} | Namespace selector for ServiceMonitor resources |
prometheus.serviceMonitor.relabelings | list | [] | ServiceMonitor relabel configs to apply to samples before scraping https://github.com/prometheus-operator/prometheus-operator/blob/master/Documentation/api.md#relabelconfig (defines relabel_configs ) |
prometheus.serviceMonitor.scheme | string | "https" | ServiceMonitor will use http by default, but you can pick https as well |
prometheus.serviceMonitor.scrapeTimeout | string | nil | ServiceMonitor scrape timeout in Go duration format (e.g. 15s) |
prometheus.serviceMonitor.targetLabels | list | [] | ServiceMonitor will add labels from the service to the Prometheus metric https://github.com/prometheus-operator/prometheus-operator/blob/main/Documentation/api.md#servicemonitorspec |
prometheus.serviceMonitor.tlsConfig | object | {"insecureSkipVerify":true} | ServiceMonitor will use these tlsConfig settings to make the health check requests |
prometheus.serviceMonitor.tlsConfig.insecureSkipVerify | bool | true | set TLS skip verify, because the SAN will not match with the pod IP |
resources | object | {} | deployment container resources |
securityContext | object | {} | deployment container security context |
spireAgent.enabled | bool | false | if you are running a container with the spire-agent binary installed then this will allow you to add the hostpath necessary for connecting to the spire socket |
spireAgent.spireSocketMnt | string | "/run/spire/sockets" | file path of the spire socket mount |
tolerations | list | [] | deployment template spec tolerations |
trust-manager.app.trust.namespace | string | "{{ .Release.Namespace }}" | trust-manager needs to be configured to trust the namespace in which the controller is deployed so that it will create the Bundle resource for the ctrl plane trust bundle |
trust-manager.crds.enabled | bool | false | CRDs must be applied in advance of installing the parent chart |
trust-manager.enabled | bool | false | install the trust-manager subchart |
webBindingPki.enabled | bool | true | generate a separate PKI root of trust for web bindings, i.e., client, management, and prometheus APIs |
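The interaction between network.smart.rerouteCap and network.smart.rerouteFraction in the table above can be worked through numerically. The sketch below assumes the per-cycle limit is the smaller of the cap and the fractional share rounded down; the rounding rule is an assumption, since the descriptions only give whole-number examples. The helper name reroute_limit is hypothetical.

```shell
# Hypothetical helper: reroute_limit UNDERPERFORMING CAP FRACTION
# Prints min(CAP, floor(UNDERPERFORMING * FRACTION)).
reroute_limit() {
  awk -v n="$1" -v cap="$2" -v frac="$3" 'BEGIN {
    by_fraction = int(n * frac + 1e-9)   # small epsilon guards against float error
    print (by_fraction < cap ? by_fraction : cap)
  }'
}

reroute_limit 100 4 0.02   # defaults: the fraction allows 2, below the cap of 4
reroute_limit 100 1 0.02   # a cap of 1 overrides the fractional limit of 2
```

The first call prints 2 and the second prints 1, matching the two examples given in the descriptions.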
TODOs
- replicas - Each controller replica needs to be its own HA member. We have to wait until HA (https://github.com/openziti/ziti/blob/release-next/doc/ha/overview.md) is officially released.
- lower CA / Cert lifetime; how to refresh stuff when Certs are updated?
- Deploy Prometheus scraper configuration when prometheus.enabled = true
- cert-manager allows issuing only one cert per key, i.e., ClientCertKeyReuseIssue prevents us from issuing a user cert and server cert backed by same private key, hence the controller config.yaml re-uses server certs in place of user certs to allow startup and testing to continue