UI-Access Details
All services that have a web-based GUI are listed in the table below. More details can be found in their respective sections:
Service | Link | Credentials |
---|---|---|
Matrix-web | If the nginx port has been changed, use the alternate URL to open the GUI | Username: Password: |
Matrix-rabbitmq | If the nginx port has been changed, use the alternate URL to open the GUI | Username: Password: |
Matrix-kafka | If the nginx port has been changed, use the alternate URL to open the GUI | Username: Password: |
Matrix-Flower | If the nginx port has been changed, use the alternate URL to open the GUI | NA |
Matrix-pgadmin4 | If the nginx port has been changed, use the alternate URL to open the GUI | Username: Password: |
Collector-syfter | | Username: Password: |
Collector-rabbitMq | | Username: Password: |
AO-Alert-Manager | | NA |
Grafana | | Username: Password: |
Prometheus | | NA |
NOTE: The Alert Manager UI needs additional configuration if you are opening it over IPv6.
The Alert Manager UI can be opened using the port-forwarding command, but that does not provide an IPv6 endpoint for the UI. To open the Alert Manager UI over IPv6, a DNS name must be added in Alertmanager.
Enable the ingress in the alertmanager Helm chart first:
ingress:
  enabled: false   # Set this to true to enable the ingress
  className: ""
  annotations: {}
    # kubernetes.io/ingress.class: nginx
    # kubernetes.io/tls-acme: "true"
  hosts:
    - host: alertmanager.domain.com
Upgrade the alertmanager Helm chart:
helm upgrade matrix-prometheus -n matrix-ao ./
Execute port forwarding command for alertmanager UI in master node for ipv6:
kubectl port-forward --namespace matrix-ao service/matrix-prometheus-alertmanager 9093:9093 --address='::'
After this configuration is done, map the IPv6 master VM IP to the DNS name on the same device from which you are opening the Alert Manager UI. IMPORTANT: Every user must perform these steps on their own device before opening the Alert Manager UI.
Windows
C:\WINDOWS\system32\drivers\etc
notepad.exe hosts
Add the following line to the hosts file for the mapping and save the file.
Replace the IPv6 address below with the master node's IPv6 address:
2001:420:54ff:84::203 alertmanager.domain.com
Access the Alert Manager UI after enabling all the settings and executing the port-forwarding command:
URL: http://alertmanager.domain.com:9093
Validation
Once installation is complete, it is necessary to validate that the servers are running properly and as expected. This section will detail the validation steps for the Provider Connectivity Assurance Fault Management and Mobility Performance Monitoring feature On-premise.
POD Validation for PM components
kubectl get pods -n matrix-pm-analytics
NAME READY STATUS RESTARTS AGE
10-t20-alarm-consumer-deployment-796b6f5987-k87ch 1/1 Running 0 17d
10-t20-alarm-consumer-deployment-796b6f5987-rnblf 1/1 Running 0 17d
11-t20-dnac-event-consumer-deployment-549b66f5c6-lps7c 1/1 Running 0 17d
11-t20-dnac-event-consumer-deployment-549b66f5c6-m474h 1/1 Running 1 (4h47m ago) 17d
20-hemtyagi-alarm-consumer-deployment-67559bfb-q9wnt 1/1 Running 0 4d23h
20-hemtyagi-alarm-consumer-deployment-67559bfb-xnf9d 1/1 Running 0 4d23h
22-t30-metrics-consumer-deployment-f6c4d466-d454q 1/1 Running 6 (3h52m ago) 3h55m
22-t30-metrics-consumer-deployment-f6c4d466-m7hwz 1/1 Running 5 (3h53m ago) 3h55m
23-t30-inventory-consumer-deployment-766478df89-mjcpd 1/1 Running 6 (3h52m ago) 3h56m
23-t30-inventory-consumer-deployment-766478df89-zqhlp 1/1 Running 6 (3h52m ago) 3h56m
24-t30-alarm-consumer-deployment-8489787764-wbfpc 1/1 Running 6 (3h52m ago) 3h56m
24-t30-alarm-consumer-deployment-8489787764-xcqzz 1/1 Running 6 (3h52m ago) 3h56m
25-t40-metrics-consumer-deployment-856d548dc-mggl4 1/1 Running 0 3h39m
25-t40-metrics-consumer-deployment-856d548dc-s5r7q 1/1 Running 0 3h39m
26-t40-inventory-consumer-deployment-6749898c8-9c4qs 1/1 Running 0 3h39m
26-t40-inventory-consumer-deployment-6749898c8-lhhdd 1/1 Running 0 3h39m
27-t40-alarm-consumer-deployment-79487b5969-d2jvj 1/1 Running 0 3h39m
27-t40-alarm-consumer-deployment-79487b5969-w44gx 1/1 Running 0 3h39m
31-biplab-alarm-consumer-deployment-767f867786-56mxf 1/1 Running 0 166m
31-biplab-alarm-consumer-deployment-767f867786-k6t8x 1/1 Running 0 166m
8-t20-metrics-consumer-deployment-796cc9b5db-ntr2x 1/1 Running 0 17d
8-t20-metrics-consumer-deployment-796cc9b5db-v69zn 1/1 Running 0 17d
9-t20-inventory-consumer-deployment-6b7dddcd5-bmh4z 1/1 Running 0 17d
9-t20-inventory-consumer-deployment-6b7dddcd5-xxgtb 1/1 Running 0 17d
matrix-celerybeat-6444bdc547-gz87q 1/1 Running 0 3d4h
matrix-celeryworker-6cb6d6c4b8-6mq57 1/1 Running 21 (2d23h ago) 17d
matrix-celeryworker-6cb6d6c4b8-xp7vj 1/1 Running 20 (2d23h ago) 17d
matrix-coordinator-6547585d9b-6mc66 1/1 Running 0 3h57m
matrix-dbsync-deployment-6567bc6c69-nrhln 1/1 Running 0 18d
matrix-fileservice-775cb68d45-m8qht 1/1 Running 0 18d
matrix-flower-6c8b986bc-4qs4f 1/1 Running 0 5d3h
matrix-kafka-0 1/1 Running 0 17d
matrix-kafka-1 1/1 Running 0 2d23h
matrix-kafka-2 1/1 Running 0 17d
matrix-nginx-ff44bdc65-qjrjz 1/1 Running 0 4d4h
matrix-nginx-ff44bdc65-wjg5v 1/1 Running 0 4d4h
matrix-nginx-ff44bdc65-z5nx2 1/1 Running 1 (4h47m ago) 4d4h
matrix-pgadmin-pgadmin4-85d45c65ff-tz4mj 1/1 Running 0 6d
matrix-rabbitmq-0 1/1 Running 1 (4h47m ago) 4d23h
matrix-rabbitmq-1 1/1 Running 0 4d23h
matrix-rabbitmq-2 1/1 Running 0 2d23h
matrix-redis-cluster-0 1/1 Running 2 (4h47m ago) 18d
matrix-redis-cluster-1 1/1 Running 1 (18d ago) 18d
matrix-redis-cluster-2 1/1 Running 0 2d23h
matrix-redis-cluster-3 1/1 Running 2 (4h47m ago) 18d
matrix-redis-cluster-4 1/1 Running 1 (18d ago) 18d
matrix-redis-cluster-5 1/1 Running 1 (2d23h ago) 2d23h
matrix-timescaledb-0 2/2 Running 0 3d4h
matrix-timescaledb-1 2/2 Running 0 3d4h
matrix-ui-kafka-manager-5c6c87bc5f-vtj89 1/1 Running 0 5d4h
matrix-webapp-7c595d9f4b-4p5j6 1/1 Running 0 4d5h
matrix-webapp-7c595d9f4b-cvrpj 1/1 Running 0 4h22m
matrix-webapp-7c595d9f4b-wqqt4 1/1 Running 0 4h26m
matrix-zookeeper-0 1/1 Running 1 (4h47m ago) 18d
matrix-zookeeper-1 1/1 Running 0 18d
matrix-zookeeper-2 1/1 Running 0 6d5h
Service Validation for PM pipeline components
kubectl get service -n matrix-pm-analytics
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
matrix-coordinator ClusterIP 10.0.20.53 <none> 8010/TCP 17d
matrix-fileservice ClusterIP 10.0.240.66 <none> 8005/TCP,8004/TCP 18d
matrix-flower ClusterIP 10.0.52.71 <none> 8888/TCP 13d
matrix-kafka LoadBalancer 10.0.196.127 10.224.3.223 29092:31765/TCP,9092:31941/TCP 18d
matrix-kafka-0-external LoadBalancer 10.0.42.21 10.224.3.220 9092:32222/TCP 18d
matrix-kafka-1-external LoadBalancer 10.0.184.82 10.224.3.221 9092:31660/TCP 18d
matrix-kafka-2-external LoadBalancer 10.0.233.11 10.224.3.222 9092:32403/TCP 18d
matrix-kafka-headless ClusterIP None <none> 29092/TCP,9093/TCP 18d
matrix-nginx ClusterIP 10.0.143.1 <none> 8080/TCP,8443/TCP 17d
matrix-pgadmin-pgadmin4 ClusterIP 10.0.94.222 <none> 5050/TCP 13d
matrix-rabbitmq ClusterIP 10.0.175.217 <none> 5672/TCP,4369/TCP,25672/TCP,15672/TCP,9419/TCP 5d
matrix-rabbitmq-headless ClusterIP None <none> 4369/TCP,5672/TCP,25672/TCP,15672/TCP 5d
matrix-redis-cluster ClusterIP 10.0.67.92 <none> 6379/TCP 18d
matrix-redis-cluster-headless ClusterIP None <none> 6379/TCP,16379/TCP 18d
matrix-timescaledb ClusterIP 10.0.179.134 <none> 6432/TCP,5432/TCP 18d
matrix-timescaledb-config ClusterIP None <none> <none> 2d22h
matrix-timescaledb-replica ClusterIP 10.0.170.84 <none> 6432/TCP,5432/TCP 18d
matrix-ui-kafka-manager ClusterIP 10.0.48.34 <none> 9000/TCP 17d
matrix-webapp ClusterIP 10.0.169.74 <none> 8080/TCP 17d
matrix-zookeeper ClusterIP 10.0.107.188 <none> 2181/TCP,2888/TCP,3888/TCP 18d
matrix-zookeeper-headless ClusterIP None <none> 2181/TCP,2888/TCP,3888/TCP 18d
matrix-zookeeper-metrics ClusterIP 10.0.168.123 <none> 9141/TCP 18d
PVC Validation for PM pipeline components
kubectl get pvc -n matrix-pm-analytics
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
consumer-pv Bound pvc-7232aba0-9290-46c6-b69d-dd3cef537fdd 20Gi RWX longhorn-replica2 2d5h
data-matrix-kafka-0 Bound pvc-96cc236a-aa6e-4180-99e4-8ba6897a4a99 20Gi RWO longhorn 14d
data-matrix-kafka-1 Bound pvc-2d6147ff-7841-43c1-a76d-60160969d825 20Gi RWO longhorn 14d
data-matrix-kafka-2 Bound pvc-4e5ed7eb-f047-4c02-8eba-538aabc2e887 20Gi RWO longhorn 14d
data-matrix-rabbitmq-0 Bound pvc-934b7b9f-0e88-433e-9311-bc7a8740a1fe 20Gi RWO longhorn 14d
data-matrix-rabbitmq-1 Bound pvc-31ad9671-2ea2-4607-8fbd-9e67b1edc305 20Gi RWO longhorn 14d
data-matrix-rabbitmq-2 Bound pvc-442bf5d0-9e3a-4e44-8dd4-e38e728dde1d 20Gi RWO longhorn 14d
matrix-celeryworker-pvc Bound pvc-105c9740-b32e-4361-a977-fa6a8bc044eb 40Gi RWX longhorn 13d
matrix-fileservice-pvc Bound pvc-e011bcfa-ee80-412c-bf5a-2c202953bb12 20Gi RWX longhorn 14d
matrix-pgadmin-pgadmin4 Bound pvc-75dd1289-6a0e-4a7b-a90b-3347041e6971 10Gi RWO longhorn 14d
matrix-webapp-pvc Bound pvc-f7bbaf4c-713e-4949-b98c-9ac900f94b85 20Gi RWX longhorn-replica2 2d5h
redis-data-matrix-redis-cluster-0 Bound pvc-380ba229-be6a-466d-96d6-78a6eab133fd 8Gi RWO longhorn 14d
redis-data-matrix-redis-cluster-1 Bound pvc-fe186af1-3f95-4d2d-97e5-b0f8d35a1555 8Gi RWO longhorn 14d
redis-data-matrix-redis-cluster-2 Bound pvc-7fecd6dd-6cf4-4980-a39e-f5e480e3552f 8Gi RWO longhorn 14d
redis-data-matrix-redis-cluster-3 Bound pvc-cf2f3cc6-781a-4ef0-8a59-b54eeebcbf87 8Gi RWO longhorn 14d
redis-data-matrix-redis-cluster-4 Bound pvc-e9fa20de-dfdc-4a94-a74f-9f8e3a175bde 8Gi RWO longhorn 14d
redis-data-matrix-redis-cluster-5 Bound pvc-7a8c4701-3b01-4907-ba7e-307be69b7518 8Gi RWO longhorn 14d
storage-volume-matrix-timescaledb-0 Bound timescaledb-data-vol-0-node2 2Gi RWO local-storage-data 15d
storage-volume-matrix-timescaledb-1 Bound timescaledb-data-vol-0-node1 2Gi RWO local-storage-data 15d
wal-volume-matrix-timescaledb-0 Bound timescaledb-data-wal-0-node2 1Gi RWO local-storage-wal 15d
wal-volume-matrix-timescaledb-1 Bound timescaledb-data-wal-0-node1 1Gi RWO local-storage-wal 15d
Service and Role Bindings Validation for PM Pipeline
kubectl get rolebinding,role,serviceaccount -n matrix-pm-analytics
NAME ROLE AGE
rolebinding.rbac.authorization.k8s.io/matrix-coordinator-deployment-management Role/matrix-deployment-management 17d
rolebinding.rbac.authorization.k8s.io/matrix-rabbitmq-endpoint-reader Role/matrix-rabbitmq-endpoint-reader 5d
rolebinding.rbac.authorization.k8s.io/matrix-timescaledb Role/matrix-timescaledb 18d
NAME CREATED AT
role.rbac.authorization.k8s.io/matrix-deployment-management 2023-11-16T12:41:37Z
role.rbac.authorization.k8s.io/matrix-rabbitmq-endpoint-reader 2023-11-29T11:16:24Z
role.rbac.authorization.k8s.io/matrix-timescaledb 2023-11-16T06:19:42Z
NAME SECRETS AGE
serviceaccount/default 0 18d
serviceaccount/matrix-coordinator-serviceaccount 0 17d
serviceaccount/matrix-kafka 0 18d
serviceaccount/matrix-rabbitmq 1 5d
serviceaccount/matrix-timescaledb 0 18d
serviceaccount/matrix-ui-kafka-manager 0 17d
POD Validation for FM Components
kubectl get pods -n matrix-fm-analytics
NAME READY STATUS RESTARTS AGE
matrix-of-alertmanager-6964fff568-qkmn6 1/1 Running 0 11d
matrix-of-alertmanager-6964fff568-stk4b 1/1 Running 0 11d
matrix-of-alertmanager-6964fff568-vkgjc 1/1 Running 0 11d
matrix-of-alertservice-645ffc7869-9sbrp 1/1 Running 0 8d
matrix-of-alertservice-645ffc7869-gf6wl 1/1 Running 0 8d
matrix-of-alertservice-645ffc7869-pwff6 1/1 Running 0 8d
matrix-of-consumer-5d4f69b9f5-7br7m 1/1 Running 239 (29h ago) 11d
matrix-of-consumer-5d4f69b9f5-jn7mj 1/1 Running 237 (29h ago) 11d
matrix-of-consumer-5d4f69b9f5-t426b 1/1 Running 238 (29h ago) 11d
matrix-of-framework-bdbf78bb9-42kbn 1/1 Running 0 11d
matrix-of-framework-bdbf78bb9-tglpd 1/1 Running 0 11d
matrix-of-framework-bdbf78bb9-v9lms 1/1 Running 0 11d
matrix-of-snmppipeline-f6596c45f-dgzwg 1/1 Running 0 11d
matrix-of-snmppipeline-f6596c45f-kl22t 1/1 Running 0 11d
matrix-of-snmppipeline-f6596c45f-xstm6 1/1 Running 0 11d
matrix-of-snmptrapd-b5fc7cf74-sg6x4 1/1 Running 0 11d
matrix-of-snmptrapd-b5fc7cf74-sw8bp 1/1 Running 0 11d
matrix-of-snmptrapd-b5fc7cf74-x5bcq 1/1 Running 0 11d
Service Validation for FM Components
kubectl get service -n matrix-fm-analytics
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
matrix-of-alertmanager ClusterIP 10.43.7.40 <none> 9094/TCP 11d
matrix-of-alertservice LoadBalancer 10.43.50.238 10.126.87.99,2001:420:54ff:84::263 8090:31580/TCP 11d
matrix-of-framework ClusterIP 10.43.45.188 <none> 8082/TCP 11d
matrix-of-snmppipeline ClusterIP 10.43.29.29 <none> 5044/TCP 11d
matrix-of-snmptrapd LoadBalancer 10.43.193.177 10.126.87.98,2001:420:54ff:84::262 1162:30507/UDP 11d
PVC Validation for FM components
kubectl get pvc -n matrix-fm-analytics
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
matrix-of-alertmanager-pvc Bound pvc-00aa89e2-d1de-4908-8a3e-fab46aaaa64d 1Gi RWX longhorn 11d
Service and Role Bindings Validation for FM Pipeline
kubectl get rolebinding,role,serviceaccount -n matrix-fm-analytics
NAME SECRETS AGE
serviceaccount/default 0 12d
Note: Similarly, you can validate the pods, services, PVCs, and role bindings for the AO and Collector components using their respective namespaces.
Check Pod Level Logs
kubectl logs -f <pod_name> -n <namespace>
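For example, to follow the logs of one of the webapp pods from the earlier listing (the pod name is illustrative; substitute a pod from your own cluster):
kubectl logs -f matrix-webapp-7c595d9f4b-4p5j6 -n matrix-pm-analytics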
Validate Kafka-UI
To validate the Kafka UI, first confirm that the Kafka pods are running; a sample check is shown below.
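A minimal check, assuming the matrix-pm-analytics namespace used in the earlier pod listing:
kubectl get pods -n matrix-pm-analytics | grep kafka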
Then access the Kafka Manager UI by opening the following URL in a web browser:
http://<kafkaui_url>
Example: https://lb/kafka
Validate RabbitMQ Cluster
kubectl get pod -n <namespace> | grep rabbitmq
matrix-rabbitmq-0 1/1 Running 1 (28h ago) 5d23h
matrix-rabbitmq-1 1/1 Running 0 5d23h
matrix-rabbitmq-2 1/1 Running 0 3d23h
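For a deeper check of cluster health, you can also query RabbitMQ directly; a sketch assuming the matrix-rabbitmq-0 pod name from the listing above:
kubectl exec -it matrix-rabbitmq-0 -n <namespace> -- rabbitmqctl cluster_status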
Rabbitmq-UI
https://lb/rabbitmq/
Validate Redis Cluster
kubectl get pod -n <namespace> | grep redis
matrix-redis-cluster-0 1/1 Running 2 (29h ago) 19d
matrix-redis-cluster-1 1/1 Running 1 (19d ago) 19d
matrix-redis-cluster-2 1/1 Running 0 4d
matrix-redis-cluster-3 1/1 Running 2 (29h ago) 19d
matrix-redis-cluster-4 1/1 Running 1 (19d ago) 19d
matrix-redis-cluster-5 1/1 Running 1 (4d ago) 4d
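Optionally, verify the cluster state from inside one of the Redis pods; a sketch assuming no password is required (add -a <password> if authentication is enabled):
kubectl exec -it matrix-redis-cluster-0 -n <namespace> -- redis-cli cluster info
The output should report cluster_state:ok.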
kubectl logs <pod-name> -n <namespace>
1:M 05 Dec 2023 11:39:02.052 * Background saving started by pid 465625
465625:C 05 Dec 2023 11:39:02.072 * DB saved on disk
465625:C 05 Dec 2023 11:39:02.072 * Fork CoW for RDB: current 0 MB, peak 0 MB, average 0 MB
1:M 05 Dec 2023 11:39:02.153 * Background saving terminated with success
1:M 05 Dec 2023 11:54:03.095 * 1 changes in 900 seconds. Saving...
1:M 05 Dec 2023 11:54:03.096 * Background saving started by pid 469624
469624:C 05 Dec 2023 11:54:03.110 * DB saved on disk
469624:C 05 Dec 2023 11:54:03.111 * Fork CoW for RDB: current 0 MB, peak 0 MB, average 0 MB
1:M 05 Dec 2023 11:54:03.196 * Background saving terminated with success
1:M 05 Dec 2023 12:09:04.057 * 1 changes in 900 seconds. Saving...
1:M 05 Dec 2023 12:09:04.058 * Background saving started by pid 473611
473611:C 05 Dec 2023 12:09:04.073 * DB saved on disk
473611:C 05 Dec 2023 12:09:04.074 * Fork CoW for RDB: current 0 MB, peak 0 MB, average 0 MB
1:M 05 Dec 2023 12:09:04.159 * Background saving terminated with success
Validate Timescaledb Service
kubectl get pod -n <namespace> | grep timesca
matrix-timescaledb-0 2/2 Running 0 4d5h
matrix-timescaledb-1 2/2 Running 0 4d5h
kubectl logs <pod-name> -n <namespace>
2023-12-05 07:55:05 UTC [295784]: [656ed759.48368-2] postgres@postgres,app=[unknown] [00000] LOG: connection authenticated: identity="postgres" method=peer (/var/lib/postgresql/data/pg_hba.conf:3)
2023-12-05 07:55:05 UTC [295784]: [656ed759.48368-3] postgres@postgres,app=[unknown] [00000] LOG: connection authorized: user=postgres database=postgres application_name=pg_isready
2023-12-05 07:55:05 UTC [295784]: [656ed759.48368-4] postgres@postgres,app=pg_isready [00000] LOG: disconnection: session time: 0:00:00.002 user=postgres database=postgres host=[local]
2023-12-05 07:55:09 UTC [295786]: [656ed75d.4836a-1] [unknown]@[unknown],app=[unknown] [00000] LOG: connection received: host=10.224.2.113 port=57298
2023-12-05 07:55:09 UTC [295786]: [656ed75d.4836a-2] postgres@postgres,app=[unknown] [28000] FATAL: pg_hba.conf rejects connection for host "10.224.2.113", user "postgres", database "postgres", SSL encryption
2023-12-05 07:55:09 UTC [295787]: [656ed75d.4836b-1] [unknown]@[unknown],app=[unknown] [00000] LOG: connection received: host=10.224.2.113 port=57300
2023-12-05 07:55:09 UTC [295787]: [656ed75d.4836b-2] postgres@postgres,app=[unknown] [28000] FATAL: pg_hba.conf rejects connection for host "10.224.2.113", user "postgres", database "postgres", no encryption
2023-12-05 07:55:19 UTC [295792]: [656ed767.48370-1] [unknown]@[unknown],app=[unknown] [00000] LOG: connection received: host=10.224.2.113 port=42928
2023-12-05 07:55:19 UTC [295792]: [656ed767.48370-2] postgres@postgres,app=[unknown] [28000] FATAL: pg_hba.conf rejects connection for host "10.224.2.113", user "postgres", database "postgres", SSL encryption
2023-12-05 07:55:19 UTC [295793]: [656ed767.48371-1] [unknown]@[unknown],app=[unknown] [00000] LOG: connection received: host=10.224.2.113 port=42942
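Optionally, check database readiness from inside the TimescaleDB pod; a hedged sketch (if the pod runs multiple containers, add -c <container_name> for the database container):
kubectl exec -it matrix-timescaledb-0 -n <namespace> -- pg_isready -U postgres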
Validate Pgadmin Service/UI
kubectl get pod -n <namespace> | grep pgadmin
matrix-pgadmin-pgadmin4-85d45c65ff-tz4mj 1/1 Running 0 7d1h
pgadmin-UI
http://<Nginx-LB-IP>/pgadmin4/
Right-click Servers > Register > Server... to register the database server; example connection details follow.
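Connection details along these lines can be used (illustrative values; the service name and ports come from the earlier service listing, and the username/password depend on your installation):
Host name/address: matrix-timescaledb
Port: 5432 (or 6432 if connecting through the connection pooler)
Username: postgres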
Validate Web Application Services
kubectl get pod -n <namespace> | grep web
matrix-webapp-7c595d9f4b-4p5j6 1/1 Running 0 5d6h
matrix-webapp-7c595d9f4b-cvrpj 1/1 Running 0 30h
matrix-webapp-7c595d9f4b-wqqt4 1/1 Running 0 30h
Matrix-webapp UI:
Access the Web UI using the following URL:
http://<webapp-url>
Example: https://<Nginx-LB-IP>/
Validate Flower Service/UI
Access the Flower UI using the following URL:
http://<flower-url>
Example: http://Nginx-LB-IP/flower/
Once you start the celery workers, this Flower page will become populated.
Verify that the web application tables exist in the database on the Accessnode server; a hedged check is shown below.
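A minimal sketch from the TimescaleDB pod (the database name is illustrative; replace it with the database used by your web application):
kubectl exec -it matrix-timescaledb-0 -n <namespace> -- psql -U postgres -d <webapp_db> -c '\dt'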
Validate File Service
kubectl get pod -n <namespace> | grep file
matrix-fileservice-775cb68d45-m8qht 1/1 Running 0 19d
kubectl logs <pod-name> -n <namespace>
2023-11-17 14:15:31.874166569 +0000 UTC m=+114406.126128761
[]
Timer On
2023-11-17 14:15:32.133569023 +0000 UTC m=+114406.386887833
[]
Timer On
2023-11-17 14:16:31.930097923 +0000 UTC m=+114466.182060115
[]
Timer On
2023-11-17 14:16:32.190842189 +0000 UTC m=+114466.444160999
[]
Timer On
2023-11-17 14:17:31.989028759 +0000 UTC m=+114526.240990951
[]
Timer On
...
Logging in for the First Time
Now the Fault Management and Mobility Performance Monitoring feature is running. Go to the VIP of the Webserver in a browser (https://URL) and you should arrive at the Fault Management and Mobility Performance Monitoring feature.
You may get a security warning; it is safe to proceed.
You can log in with the credentials set during the Webserver installation in the web configmap file; a hedged way to locate that configmap is shown below.
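If you need to confirm which configmap holds the web credentials, you can list the configmaps in the PM namespace (the exact configmap name depends on your installation):
kubectl get configmap -n matrix-pm-analytics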
General Troubleshooting Commands
This section lists helpful commands for troubleshooting.
System administration commands:
To view pod logs and to inspect pods and services, use the following commands:
kubectl logs -f <pod_name> -n <namespace>
kubectl describe pod <pod_name> -n <namespace>
kubectl describe service <service_name> -n <namespace>
Start a bash shell in a Kubernetes pod:
kubectl exec -it <pod_name> -n <namespace> -- bash
kubectl get pod -n <namespace> -o wide
kubectl get all -n <namespace> -o wide
kubectl logs <pod_name> -n <namespace>
kubectl describe <resource_type> <resource_name> -n <namespace>
helm list -n <namespace>
helm install <helm_name> -n <namespace> -f values.yaml ./
helm delete <helm_name> -n <namespace>
Base Image Upgrade Steps
Make the following changes to upgrade the base image:
Get the base image details
#Example: matrix4-base:latest
Pull the image from the Docker registry
#Example: docker pull matrix4-base:latest
Re-tag the image for the local Docker registry
#Example: docker tag matrix4-base:latest 10.126.87.96:5000/matrix4-base:latest
Push the tagged image to the local registry
#Example: docker push 10.126.87.96:5000/matrix4-base:latest
Go to each service's Helm chart path and update the values.yaml accordingly
#Example:
imageVersions:
  matrixweb:
    name: 10.126.87.96/matrix4-base
    tag: latest
Run the Helm upgrade
#Example: helm upgrade -n <namespace> <service_name> ./
Then validate the updated image inside the pod
#Example: kubectl describe pod -n <namespace> <pod_name> | grep image
Helm Chart Upgrade Steps
Here's how to upgrade the Helm charts to the latest versions:
Get the latest chart details: Use the URL provided in the release notes to access the latest Helm chart information.
Pull Docker images: Pull the required Docker images into the registry using docker pull <image-name>:latest.
Tag for local registry (optional): If using a local registry, tag the pulled images for your local registry using docker tag <image-name>:latest <registry-address>/<image-name>:latest. Replace <registry-address> with your local registry address (e.g., 10.126.87.96:5000).
Push to local registry (optional): If using a local registry, push the tagged images using docker push <registry-address>/<image-name>:latest.
Update values.yaml: In each service's Helm chart directory, update the values.yaml file with the new image details. For example, update the imageVersions section for matrixweb with the new image name and tag.
Upgrade the release: Use helm upgrade -n <namespace> <service_name> ./ to upgrade the Helm release in the specified namespace.
Verify image updates: Validate that the pods are using the updated image versions. Run kubectl describe pod -n <namespace> <pod_name> | grep image to check the image used by a specific pod.
BulkStat VM Creation Steps
First, configure the VM as an SFTP server.
Log in to the SFTP server VM (<VMIP>).
Step 1: Install ssh
yum install openssh-server
Step 2: Create an SFTP User
Create an SFTP user for the SFTP server:
adduser matrixsftp
Set a password for the matrixsftp user:
passwd matrixsftp
Step 3: Set Up Directories for SFTP
mkdir -p /matrix/{bulkstat,ssd}
Step 4: Set Ownership and Permissions
usermod -d /matrix/ -s /bin/false -g matrixsftp matrixsftp
sudo chown -R matrixsftp:matrixsftp /matrix/{bulkstat,ssd}
sudo chmod go+rx /matrix/
Step 5: Change SSHD Configuration for SFTP Group
vi /etc/ssh/sshd_config
Find the line that starts with Subsystem and ensure it looks like this:
Subsystem sftp internal-sftp
Then, add the following lines at the end of the file to configure the SFTP-only group and directory:
Match User matrixsftp
ForceCommand internal-sftp
ChrootDirectory /matrix
PasswordAuthentication yes
PermitTunnel no
AllowAgentForwarding no
AllowTcpForwarding no
X11Forwarding no
Save the file after adding the content above.
Step 6: Restart ssh Server
systemctl restart sshd
Files can now be shared through SFTP; a quick connectivity test is shown below.
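A simple check, assuming the matrixsftp user created above:
sftp matrixsftp@<VMIP>
After logging in, running ls should show the bulkstat and ssd directories.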
Note: To send node information to Prometheus for monitoring purposes, you need to deploy node_exporter on the host machine.
Prerequisite: Docker v25.0.3 must be installed
Install the node exporter
Run the node exporter container
docker run -d -p 9100:9100 --name=node_exporter prom-node-exporter:v1.0.1
Verify the Node Exporter is Running
curl http://<VMIP>:9100/metrics
Then configure Prometheus to scrape the exporter metrics; a hedged example scrape configuration is shown below.
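A minimal sketch of a Prometheus scrape configuration for the exporter, assuming a job name of node_exporter (adjust the target and config file layout to match how your Prometheus is deployed):
scrape_configs:
  - job_name: 'node_exporter'
    static_configs:
      - targets: ['<VMIP>:9100']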
Keepalived Configuration
Prerequisite: Keepalived must be installed on both SFTP server VMs.
On server 1, add the following configuration and set the priority in the keepalived configuration file.
Take a backup of the existing keepalived.conf:
cp /etc/keepalived/keepalived.conf /etc/keepalived/keepalived.conf_bkp
vi /etc/keepalived/keepalived.conf
Add the following lines:
vrrp_script chk_sftp {
    script "[ $(ps aux | grep -v grep | grep 'sshd' | wc -l) -gt 0 ]"
    interval 2
}
vrrp_instance VI_1 {
    state MASTER
    interface ens160
    virtual_router_id 100
    priority 150
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    virtual_ipaddress {
        <VIP>
    }
    virtual_ipaddress_excluded {
        <IPV6 VIP>
    }
    track_script {
        chk_sftp
    }
}
On server 2, add the following configuration and set the priority in the keepalived configuration file.
Take a backup of the existing keepalived.conf:
cp /etc/keepalived/keepalived.conf /etc/keepalived/keepalived.conf_bkp
vi /etc/keepalived/keepalived.conf
Add the following lines:
vrrp_script chk_sftp {
    script "[ $(ps aux | grep -v grep | grep 'sshd' | wc -l) -gt 0 ]"
    interval 2
}
vrrp_instance VI_1 {
    state BACKUP
    interface ens160
    virtual_router_id 100
    priority 100
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    virtual_ipaddress {
        <VIP>
    }
    virtual_ipaddress_excluded {
        <IPV6 VIP>
    }
    track_script {
        chk_sftp
    }
}
Restart the keepalived service on both servers:
systemctl restart keepalived
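To confirm that keepalived is healthy and that the VIP has been assigned on the MASTER node, a couple of hedged checks (assuming the ens160 interface used in the configuration above):
systemctl status keepalived
ip addr show ens160
The <VIP> should appear on ens160 on the current MASTER.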
© 2025 Cisco and/or its affiliates. All rights reserved.
For more information about trademarks, please visit: Cisco trademarks
For more information about legal terms, please visit: Cisco legal terms
For legal information about Accedian Skylight products, please visit: Accedian legal terms and trademarks