Once you have installed Provider Connectivity Assurance, you can install the components used to ingest Faults and Mobility performance monitoring data.
To do this, follow the instructions in the three articles below, in this exact order:
RKE2 Cluster Installation
This article details the prerequisites and procedure for installing RKE2 as an offline (air-gapped) cluster. It is written for an on-premises cluster environment.
A standard RKE2 installation consists of the following components:
RKE2 - Security-focused Kubernetes
Longhorn - Unified storage layer
Deployment Prerequisites
Provision VMs with the following resources (for a Dev/Staging environment):
VM Name | CPU | RAM | DISK |
---|---|---|---|
Registry VM (Internet Enabled) | 8 | 16 GB | 300 GB |
3 Control Plane VM | 12 | 16 GB | 300 GB |
N Agent (Worker VM) | 12 | 16 GB | 300 GB |
VM Interfaces should be configured for Dual Stack (IPv4 and IPv6).
Each VM should have a single interface.
Server (SSH) access should be set up.
SELinux should be disabled on all the cluster nodes and Internet machine.
Firewall should be disabled on all nodes.
VM Partition should be as per the below table.
NTP should be configured on all the nodes.
Note: The table below applies to all nodes except the image registry server, where the /var partition should be at least 150 GB instead of 50 GB. A quick verification sketch follows the partition table.
Mount Point | Partition Size | File System Type |
---|---|---|
/ | 50 GB | xfs |
/boot | 1 GB | xfs |
/boot/efi (optional) | 1 GB | xfs |
/home | 20 GB | xfs |
Swap | 10 GB | xfs |
/var | 50 GB | xfs |
/matrix | 90% of Remaining | xfs |
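A quick way to confirm several of these prerequisites on each node is sketched below (a minimal check, assuming an AlmaLinux-style host with chrony for NTP; adjust to your environment):
# Confirm SELinux is disabled
getenforce
# Confirm the firewall is stopped and disabled
systemctl is-active firewalld; systemctl is-enabled firewalld
# Confirm NTP synchronization (if chrony is used)
chronyc tracking
# Review the partition layout and sizes against the table above
lsblk
df -hT
# Confirm the single interface carries both IPv4 and IPv6 addresses
ip -4 addr show
ip -6 addr show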
Registry Server Creation
The following section outlines the installation of application services across each server role.
Note: Ensure that the operating system matches that of the other nodes within the cluster.
Download RKE2 Scripts
With the setup ready, it's time to deploy the RKE2 cluster. Download the necessary scripts and RPM packages from the SharePoint link below and upload them to the internet machine and RKE2 servers (control plane and worker nodes):
Cross-Domain Analytics - RKE2_Scripts - All Documents
Description | Upload Machine/Server | Upload Path |
---|---|---|
rke2_deployment.zip | Internet Machine & RKE2 Servers (CP) | /opt |
rke2_docker_rpm_packages.zip | Internet Machine | /matrix |
rke2_rpm_packages.zip | Internet Machine & RKE2 Servers | /matrix |
Step 1: Log in to the Internet Machine and Unzip the Build Files
# Connect to the server via SSH:
ssh root@<internet_machine_ip>
# Navigate to the /opt directory:
cd /opt
# Verify the script folder exists:
ls -lrt
Ensure rke2_deployment.zip is listed in the output.
# Unzip the rke2_deployment.zip file:
unzip -o rke2_deployment.zip -d /opt
# Check if the scripts are present in the /opt directory:
ls -lrt /opt/rke2_deployment
Ensure the following scripts are listed:
rke2_longhorn_build.sh
rke2_control.sh
rke2_worker.sh
rke2_ltp.sh
# Grant execution permissions to the scripts:
chmod +x /opt/rke2_deployment/*.sh
Step 2: Unzip and Install RPM Packages on the Internet Machine and Start Docker
# Continue from above.
# Navigate to the /matrix directory:
cd /matrix
# Unzip the required RPM packages:
unzip rke2_docker_rpm_packages.zip
unzip rke2_rpm_packages.zip
# Install the RPM packages:
rpm -ivh --force --nodeps /matrix/rke2_docker_rpm_packages/*.rpm
rpm -ivh --force --nodeps /matrix/rke2_rpm_packages/*.rpm
Note: If the VM runs an operating system other than AlmaLinux, install the equivalent packages for that base OS.
# To enable and start Docker, run the following commands:
systemctl enable docker
systemctl start docker
Step 3: Create the .zst Build and Transfer It to the Control Plane Server
# Continue from above.
# Navigate to the /opt directory:
cd /opt/rke2_deployment
Note: Before running the build script, ensure that the /opt directory has at least 25 GB of free space.
# Execute the build script:
./rke2_longhorn_build.sh build
# Run the following command to check if the .zst file exists in the /opt directory:
ls -lhrt /opt
cd /opt
# To securely copy the rke2_longhorn.zst file to the control plane server, use the following command:
scp rke2_longhorn.zst <username>@<control_plane_server_ip>:/opt/
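Optionally, you can verify the archive transferred intact by comparing checksums on both machines (a simple sanity check, not part of the provided scripts):
# On the internet machine
sha256sum /opt/rke2_longhorn.zst
# On the control plane server; the two hashes should match
sha256sum /opt/rke2_longhorn.zst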
Create a Registry Server
Step 1: Add the Below Entries to /etc/docker/daemon.json on the Registry Server
# Create the daemon.json file:
vi /etc/docker/daemon.json
{
"ipv6": true,
"fixed-cidr-v6": "2405:420:54ff:84::/64",
"live-restore": true,
"userns-remap": "default",
"log-level": "info"
}
# Restart the Docker daemon:
systemctl restart docker
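To confirm Docker picked up the daemon.json settings after the restart, the daemon's runtime view can be inspected (a quick check; the exact field names may vary slightly between Docker versions):
# Confirm the daemon is active
systemctl is-active docker
# Review live-restore and user namespace remapping in the daemon output
docker info | grep -i -E 'live restore|userns|root dir'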
Step 2: Create SSL Certificates for the Docker Registry
#Update OpenSSL Configuration with VM IP
# Add the following entries to the /etc/hosts file on each node in the cluster:
vim /etc/hosts
...
127.0.0.1 <registry-name>
<Registry-Server-IPv4> <registry-name>
<Registry-Server-IPv6> <registry-name>
...
#Locate and edit the OpenSSL configuration file:
find / -name openssl.cnf
vi /etc/pki/tls/openssl.cnf
#Add the following entries under the [ v3_ca ] and [ alt_names ] sections:
...
[ v3_ca ]
subjectAltName=IP:IP_ADDRESS_OF_YOUR_VM
[ alt_names ]
DNS.1 = <registry-name>
IP.1 = IP_ADDRESS_OF_YOUR_VM
...
#Create a Directory for Certificates.
mkdir -p /certificates && cd /certificates
mkdir -p /matrix/docker_data
# Generate SSL Certificates
openssl req \
-newkey rsa:4096 -nodes -sha256 -keyout docker.key \
-x509 -days 365 -out docker.crt
# During the prompts, enter the required details. When asked for:
# Common Name (e.g. server FQDN or YOUR name) []: enter IP_ADDRESS_OF_YOUR_VM
# For all other prompts, press Enter to accept the defaults.
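If you prefer to skip the interactive prompts, the same certificate can typically be generated non-interactively. The sketch below is an assumed alternative that requires OpenSSL 1.1.1 or later (for the -addext flag); substitute your registry name and VM IP:
# Non-interactive variant (assumes OpenSSL 1.1.1+); replace the placeholders before running
openssl req \
-newkey rsa:4096 -nodes -sha256 -keyout docker.key \
-x509 -days 365 -out docker.crt \
-subj "/CN=IP_ADDRESS_OF_YOUR_VM" \
-addext "subjectAltName=DNS:<registry-name>,IP:IP_ADDRESS_OF_YOUR_VM"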
# Set permissions on the certificate directories and key:
chmod 775 /certificates /matrix/docker_data
chmod 444 /certificates/docker.key
#Create a Directory for Authentication Files
mkdir -p ~/registry/auth
# Install httpd-tools (if not already installed)
yum install httpd-tools -y
# Create the htpasswd credentials file (username: admin, password: admin123):
htpasswd -Bbn admin admin123 > ~/registry/auth/htpasswd
sudo mkdir -p /etc/docker/certs.d/<registry-name>:5000
# Make sure to copy docker.crt as ca.crt (or rename it later):
sudo cp /certificates/docker.crt /etc/docker/certs.d/<registry-name>:5000/ca.crt
sudo ls /etc/docker/certs.d/<registry-name>:5000/
# Restart the Docker daemon so it uses the ca.crt certificate:
systemctl restart docker
Step 3: Download the Registry Image from Docker Hub
Download the Registry Image from docker hub as per below table.
Service Name | Image Details |
---|---|
Registry | dockerhub.cisco.com/matrixcx-docker/matrix4/rke2-local-registry:3.0.0 |
Step 4: Create the Local Registry on Registry Server
#Download the registry image on host machine
#Ensure required directories and files exist:
ls -ld /certificates /root/registry/auth /matrix/docker_data
ls -l /certificates/docker.crt /certificates/docker.key /root/registry/auth/htpasswd
#If any are missing, create them and set proper permissions.
#Pull the registry image listed in the table above:
docker pull <image_name>
#Run the registry container.
docker run -d \
--name registry \
--restart=on-failure:5 \
--read-only \
-v /certificates:/certificates \
-v /root/registry/auth:/auth \
-v /matrix/docker_data:/var/lib/registry \
-e REGISTRY_HTTP_TLS_CERTIFICATE=/certificates/docker.crt \
-e REGISTRY_HTTP_TLS_KEY=/certificates/docker.key \
-e REGISTRY_AUTH=htpasswd \
-e REGISTRY_AUTH_HTPASSWD_REALM="Registry Realm" \
-e REGISTRY_AUTH_HTPASSWD_PATH=/auth/htpasswd \
-p <VM_IPv4>:5000:5000 \
-p <VM_IPv6>:5000:5000 \
<Registry_Image_name>:<Tag>
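Once the container is up, it is worth confirming that the registry answers over TLS with the credentials created above (a hedged check; the catalog will be empty until images are pushed):
# Confirm the registry container is running
docker ps --filter name=registry
# Query the registry catalog over TLS using the self-signed certificate
curl --cacert /certificates/docker.crt -u admin:admin123 https://<registry-name>:5000/v2/_catalog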
Step 5: Push all Images to the Local Registry Server
(Push all the images listed in the table above.)
# Log in to your local registry:
docker login <registry-name>
# Navigate to the /opt directory:
cd /opt/rke2_deployment
# Execute the script that pushes the images to the local registry:
# Note: It will prompt for the registry name; enter your actual registry name.
# Example prompt: Enter the registry name (e.g., caloregistry.io):
./rke2_ltp.sh /opt/rancher/images/longhorn
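If you ever need to push a single image by hand rather than through the script, the usual tag-and-push sequence looks like this (an illustrative sketch; the image below is the registry image from the table above, used only as an example):
# Pull (or load) the image, retag it against the local registry, then push
docker pull dockerhub.cisco.com/matrixcx-docker/matrix4/rke2-local-registry:3.0.0
docker tag dockerhub.cisco.com/matrixcx-docker/matrix4/rke2-local-registry:3.0.0 <registry-name>:5000/rke2-local-registry:3.0.0
docker push <registry-name>:5000/rke2-local-registry:3.0.0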
Deploy RKE2 (Control Plane)
Step 1: Log in to the Control Plane Server and Unzip the Build Files
# Connect to the server via SSH:
ssh root@<control_plane_ip>
# Navigate to the /opt directory:
cd /opt
# Verify the script folder exists:
ls -lrt
Ensure rke2_deployment.zip is listed in the output.
# Unzip the rke2_deployment.zip file:
unzip -o rke2_deployment.zip -d /opt
# Check if the scripts are present in the /opt directory:
ls -lrt /opt/rke2_deployment
Ensure the following scripts are listed:
rke2_longhorn_build.sh
rke2_control.sh
rke2_worker.sh
rke2_ltp.sh
# Grant execution permissions to the scripts:
cd /opt/rke2_deployment
chmod +x *.sh
# Move the scripts to the /opt directory and remove the now-empty folder:
mv * /opt
cd ..
rm -rf rke2_deployment
Step 2: Unzip and Install RPM Packages on the Control Plane
# Continue from above.
# Navigate to the /matrix directory:
cd /matrix
# Unzip the required RPM packages:
unzip rke2_rpm_packages.zip
# Install the RPM packages:
rpm -ivh --force --nodeps /matrix/rke2_rpm_packages/*.rpm
Step 3: Disable SELinux and the Firewall on the System
# edit selinux config file
sed -i 's/^SELINUX=enforcing/SELINUX=disabled/' /etc/selinux/config
# Stop and disable the firewall service to prevent it from starting on boot
systemctl stop firewalld
systemctl disable firewalld
# Reboot the node
reboot
Step 4: Create the main control-plane node
# Navigate to the script directory:
cd /opt/
# Run the following command to check if the .zst file exists in the /opt directory:
ls -lhrt /opt/
# Execute the build script:
./rke2_control.sh control
Step 5: Verify RKE2 Server Status and Enable kubectl Binary
# Check if the rke2-server service is running:
systemctl status rke2-server.service
# Enable the kubectl binary by updating environment variables:
echo 'export KUBECONFIG=/etc/rancher/rke2/rke2.yaml' >> ~/.bashrc
echo 'export CRI_CONFIG_FILE=/var/lib/rancher/rke2/agent/etc/crictl.yaml' >> ~/.bashrc
echo 'export PATH=$PATH:/var/lib/rancher/rke2/bin' >> ~/.bashrc
# Apply the changes immediately
source ~/.bashrc
Step 6: Create registries.yaml Configuration File
# Transfer /certificates from the registry machine to /opt/rancher on the control plane.
# Connect to the registry (internet) machine via SSH:
ssh root@<internet_machine_ip>
# Copy the registry certificates to the control plane server:
scp -r /certificates root@<control-plane-ip>:/opt/rancher/
# On the control plane, create the file for editing:
vi /etc/rancher/rke2/registries.yaml
# Add the following configuration:
mirrors:
  "<registry_name>":    # example: caloregistry5.io
    endpoint:
      - "https://<registry_name>:5000"    # example: caloregistry5.io
configs:
  "<registry_name>:5000":    # example: caloregistry5.io
    auth:
      username: admin
      password: admin123
    tls:
      cert_file: /opt/rancher/certificates/docker.crt
      key_file: /opt/rancher/certificates/docker.key
      insecure_skip_verify: true
# Update the /etc/hosts file:
vi /etc/hosts
# Add the registry entry (use the registry server's IPv4/IPv6 addresses):
<Registry-Server-IPv4> <registry_name>
<Registry-Server-IPv6> <registry_name>
# To apply changes and ensure the RKE2 server is running properly, restart the service:
systemctl restart rke2-server.service
# After restarting, check if the service is active and running:
systemctl status rke2-server.service
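To confirm that RKE2's embedded containerd picked up the mirror configuration after the restart, you can inspect the generated containerd config and optionally pull a test image (a hedged check; the paths are the RKE2 defaults and the image reference is only a placeholder):
# The registries.yaml content is rendered into containerd's config on restart
grep -A3 "<registry_name>" /var/lib/rancher/rke2/agent/etc/containerd/config.toml
# Optionally pull a test image through the local registry
/var/lib/rancher/rke2/bin/crictl --config /var/lib/rancher/rke2/agent/etc/crictl.yaml pull <registry_name>:5000/<repository>/<image>:<tag>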
Deploy Worker Nodes
Step 1: Login to the Worker Server and Unzip the build files
# Connect to the server via SSH:
ssh root@<worker_ip>
# Create the /opt/rancher directory:
mkdir -p /opt/rancher
# Mount the shared /opt/rancher directory from the control plane:
mount -t nfs <control-plane-ip>:/opt/rancher /opt/rancher
# From the control plane, copy the worker script to the worker node:
scp -r rke2_worker.sh root@<worker_ip>:/opt
# On the worker node, grant execution permissions to the script (if not already set):
cd /opt
chmod +x *.sh
Step 2: Unzip and Install RPM Packages on the worker-nodes
# Continue from above.
# Navigate to the /matrix directory:
cd /matrix
# Unzip the required RPM packages:
unzip rke2_rpm_packages.zip
# Install the RPM packages:
rpm -ivh --force --nodeps /matrix/rke2_rpm_packages/*.rpm
Step 3: Disable SELinux and Firewall on the System
# edit selinux config file
sed -i 's/^SELINUX=enforcing/SELINUX=disabled/' /etc/selinux/config
# Stop and disable the firewall service to prevent it from starting on boot
systemctl stop firewalld
systemctl disable firewalld
# Reboot the node
reboot
Step 4: Add the Mount Point and Create the Main Worker Node
# Navigate to the /opt directory:
cd /opt
# Run the following command to check if the .zst file exists in the /opt directory:
ls -lhrt /opt
# Execute the build script:
./rke2_worker.sh worker
Step 5: Verify RKE2 Agent Status
# Check if the rke2-agent service is running:
systemctl status rke2-agent.service
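From the first control plane node, you can also confirm that the new worker has registered with the cluster (the node name below is a placeholder):
# Run on the control plane; the worker should appear and reach the Ready state
kubectl get nodes -o wide
kubectl describe node <worker-node-name> | grep -i -E 'roles|ready'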
Step 6: Create registries.yaml Configuration File
# Transfer /certificates from the registry machine to /opt/rancher on the worker node.
# Connect to the registry (internet) machine via SSH:
ssh root@<internet_machine_ip>
# Copy the registry certificates to the worker server:
scp -r /certificates root@<worker1-ip>:/opt/rancher/
# On the worker node, create the file for editing:
vi /etc/rancher/rke2/registries.yaml
# Add the following configuration:
mirrors:
  "<registry_name>":    # example: caloregistry5.io
    endpoint:
      - "https://<registry_name>:5000"    # example: caloregistry5.io
configs:
  "<registry_name>:5000":    # example: caloregistry5.io
    auth:
      username: admin
      password: admin123
    tls:
      cert_file: /opt/rancher/certificates/docker.crt
      key_file: /opt/rancher/certificates/docker.key
      insecure_skip_verify: true
# Update the /etc/hosts file:
vi /etc/hosts
# Add the registry entry (use the registry server's IPv4/IPv6 addresses):
<Registry-Server-IPv4> <registry_name>
<Registry-Server-IPv6> <registry_name>
# To apply changes and ensure the RKE2 agent is running properly, restart the service:
systemctl restart rke2-agent.service
# After restarting, check if the agent is active and running:
systemctl status rke2-agent.service
Adding Master Nodes for HA
Follow and complete Steps 1 to 5 above from the Deploy Worker Nodes section, then follow the step below to transition the node to a control plane (HA) node.
Step 1: Transition to a Control Plane HA Node
# Stop the RKE2 Agent Service:
systemctl stop rke2-agent.service
# Disable the RKE2 Agent Service to prevent it from starting on boot:
systemctl disable rke2-agent.service
# Start the RKE2 Server Service for the control plane:
systemctl enable --now rke2-server.service
# Verify that the server service is running:
systemctl status rke2-server.service
# Enable the kubectl binary by updating environment variables:
echo 'export KUBECONFIG=/etc/rancher/rke2/rke2.yaml' >> ~/.bashrc
echo 'export CRI_CONFIG_FILE=/var/lib/rancher/rke2/agent/etc/crictl.yaml' >> ~/.bashrc
echo 'export PATH=$PATH:/var/lib/rancher/rke2/bin' >> ~/.bashrc
# Apply the changes immediately
source ~/.bashrc
# Verify kubectl is working:
kubectl get nodes
Next, follow and complete Step 6 from the Deploy RKE2 (Control Plane) section. Once that is done, continue with the steps below.
Install Helm on 2nd and 3rd Control Plane Nodes
Step 1: Install Helm on the Control Plane Nodes
# Run the following commands on the control plane 2 and control plane 3 machines:
cd /opt/rancher/helm
tar -zxvf helm-v3.14.3-linux-amd64.tar.gz > /dev/null 2>&1
rsync -avP linux-amd64/helm /usr/local/bin/ > /dev/null 2>&1
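A quick sanity check that the Helm binary is in place on each of these nodes:
# Confirm helm is on the PATH and reports its version
helm version --short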
Deploy Longhorn on 1st Control Plane
Note: By default, Longhorn creates 2 replicas, but the replica count can be adjusted by modifying the values.yaml file under /opt/rancher/helm/longhorn.
Step 1: Configure Longhorn Replica Settings in values.yaml
# Navigate to the Longhorn Helm Chart Directory
cd /opt/rancher/helm/longhorn
# Open the values.yaml File for Editing
vi values.yaml
# Locate and update these configurations:
# Image Repository Configuration
image:
  repository: <local_registry_name>   # Example: 10.126.87.14/longhornio/livenessprobe
  tag: <update_tag>                   # Example: v2.9.0
# Persistence Settings
persistence:
  defaultClassReplicaCount: 2
# Default Settings
defaultSettings:
  defaultDataPath: /matrix
# Install Longhorn on the first control plane:
helm install longhorn /opt/rancher/helm/longhorn --namespace longhorn-system --create-namespace --version {{ LONGHORN_VERSION }}
# Install cert-manager on the first control plane:
helm upgrade -i cert-manager /opt/rancher/helm/cert-manager-{{ CERT_VERSION }}.tgz --namespace cert-manager --create-namespace --set installCRDs=true --set image.repository={{ registry_name }}/cert/cert-manager-controller --set webhook.image.repository={{ registry_name }}/cert/cert-manager-webhook --set cainjector.image.repository={{ registry_name }}/cert/cert-manager-cainjector --set startupapicheck.image.repository={{ registry_name }}/cert/cert-manager-ctl
Step 2: Disable Node Scheduling for Longhorn PVCs
Node scheduling should be disabled for Longhorn PVCs on the master (control plane) nodes and database nodes. (A CLI alternative is sketched after this list.)
1) Open the Longhorn UI by running the following commands on a master node:
kubectl get po -n longhorn-system
kubectl port-forward pod/<longhorn-UI-pod-name> -n longhorn-system 8000:8000 --address='0.0.0.0'
2) Go to the browser.
3) Open https://[ip6]:8000
4) Go to the "Nodes" section.
5) Select the database and master nodes.
6) Disable node scheduling.
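If the UI is not convenient, the same setting can usually be changed through the Longhorn node custom resource. The sketch below assumes the Longhorn CRDs are installed and uses a placeholder node name:
# List the Longhorn node objects
kubectl -n longhorn-system get nodes.longhorn.io
# Disable storage scheduling on a specific node (placeholder name)
kubectl -n longhorn-system patch nodes.longhorn.io <node-name> --type=merge -p '{"spec":{"allowScheduling":false}}'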
Verify the Nodes
#To check the nodes details
kubectl get nodes
kubectl get nodes -o wide
#To validate deployed longhorn system
kubectl get all -n longhorn-system
kubectl get pods -n longhorn-system
# To validate deployed cattle system
kubectl get all -n cattle-system
kubectl get pods -n cattle-system
# To validate all deployment
kubectl get all -A
Verify the CoreDNS Pod in the kube-system Namespace
If you find that kube-system pods are in a pending state, follow the steps below:
Create the file /var/lib/rancher/rke2/server/manifests/rke2-coredns-config.yaml with content similar to the example below, then restart the rke2-server service to apply the change.
# Create the following file:
vi /var/lib/rancher/rke2/server/manifests/rke2-coredns-config.yaml
# add the below content:
...
apiVersion: helm.cattle.io/v1
kind: HelmChartConfig
metadata:
  name: rke2-coredns
  namespace: kube-system
spec:
  valuesContent: |-
    zoneFiles:
      - filename: doit.tech.conf
        domain: doit.tech
        contents: |
          doit.tech:53 {
            errors
            cache 30
            forward . 10.0.254.1
          }
...
# Restart the rke2-server service
systemctl restart rke2-server.service
# Configmap rke2-coredns-rke2-coredns can be reviewed to determine if the change was successful.
kubectl -n kube-system get configmap rke2-coredns-rke2-coredns -o json
Longhorn PVC Issue (Optional)
If you encounter PVC-related issues while connecting the Longhorn storage class to service deployments, follow these instructions.
Since we are using the Alma 8.8 operating system, certain bugs may be present. Applying this patch is recommended to resolve the issue, though its effectiveness may vary depending on the operating system.
# Patch the affected Longhorn share manager (lhsm) resource, replacing <pvc-name> with the actual name:
kubectl -n longhorn-system patch lhsm <pvc-name> --type=merge --subresource status --patch 'status: {state: error}'
# Restart the RKE2 server:
systemctl restart rke2-server.service