added k3s-upgrade procedure

2025-05-05 11:19:21 +03:00
parent b451cf2830
commit fd8a011aa8
7 changed files with 122 additions and 54 deletions

View File

@ -35,6 +35,7 @@
- 🍵 Gitea Git Server and Actions for CI/CD
### 📋 Coming Soon
- Nextcloud
- Monitoring Stack with Prometheus and Grafana
@ -51,6 +52,7 @@
### 1. Setting up Proxmox Infrastructure
#### Proxmox Base Installation
- Boot mini PCs from Proxmox USB drive
- Install on SSD and configure networking
- Set up cluster configuration
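Once Proxmox is installed on each node, the cluster itself can be formed from a node shell; a minimal sketch, where the cluster name `homelab` and the first node's IP are placeholders, not values from the original:
```bash
# On the first node: create the cluster
pvecm create homelab
# On each additional node: join using the first node's IP
pvecm add 192.168.1.10
# Verify quorum and membership
pvecm status
```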
@ -59,6 +61,7 @@
#### 3. Cloud Image Implementation
Cloud images provide:
- 🚀 Pre-configured, optimized disk images
- 📦 Minimal software footprint
- ⚡ Quick VM deployment
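As a concrete illustration (the VM ID 9000, storage `local-lvm`, bridge `vmbr0`, and the Ubuntu 24.04 image are assumptions, not values from the original), a cloud image can be turned into a reusable VM template with the `qm` CLI:
```bash
# Download an Ubuntu cloud image (any distro's cloud image works similarly)
wget https://cloud-images.ubuntu.com/noble/current/noble-server-cloudimg-amd64.img
# Create an empty VM and import the image as its boot disk
qm create 9000 --name ubuntu-cloud --memory 2048 --net0 virtio,bridge=vmbr0
qm importdisk 9000 noble-server-cloudimg-amd64.img local-lvm
qm set 9000 --scsihw virtio-scsi-pci --scsi0 local-lvm:vm-9000-disk-0
# Add a cloud-init drive and boot from the imported disk
qm set 9000 --ide2 local-lvm:cloudinit --boot order=scsi0 --serial0 socket
# Convert the VM into a reusable template
qm template 9000
```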
@ -70,6 +73,7 @@ your homelab environment.
#### Proxmox VM Disk Management
**Expanding VM Disk Size:**
1. Access Proxmox web interface
2. Select target VM
3. Navigate to Hardware tab
@ -77,6 +81,7 @@ your homelab environment.
5. Click Resize and enter new size (e.g., 50G)
**Post-resize VM Configuration:**
```bash
# Access VM and configure partitions
sudo fdisk /dev/sda
@ -89,7 +94,9 @@ sudo mkfs.ext4 /dev/sdaX
```
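If the goal is to grow the existing root filesystem rather than create a new one, `growpart` plus `resize2fs` is a common non-destructive route; a sketch, assuming an ext4 root on `/dev/sda1` and the `cloud-guest-utils` package installed:
```bash
# Grow partition 1 of /dev/sda into the newly added space
sudo growpart /dev/sda 1
# Grow the ext4 filesystem to match the partition
sudo resize2fs /dev/sda1
# Confirm the new size
df -h /
```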
#### Physical Disk Passthrough
Pass physical disks (e.g., NVMe storage) to VMs:
```bash
# List disk IDs
lsblk |awk 'NR==1{print $0" DEVICE-ID(S)"}NR>1{dev=$1;printf $0" ";system("find /dev/disk/by-id -lname \"*"dev"\" -printf \" %p\"");print "";}'|grep -v -E 'part|lvm'
@ -100,19 +107,23 @@ qm set 103 -scsi2 /dev/disk/by-id/usb-WD_BLACK_SN770_1TB_012938055C4B-0:0
# Verify configuration
grep 5C4B /etc/pve/qemu-server/103.conf
```
> 📚 Reference: [Proxmox Disk Passthrough Guide](https://pve.proxmox.com/wiki/Passthrough_Physical_Disk_to_Virtual_Machine_(VM))
> 📚 Reference: [Proxmox Disk Passthrough Guide](<https://pve.proxmox.com/wiki/Passthrough_Physical_Disk_to_Virtual_Machine_(VM)>)
### 2. Kubernetes Cluster Setup
#### K3s Cluster Configuration
Setting up a 4-node cluster (2 master + 2 worker):
**Master Node 1:**
```bash
curl -sfL https://get.k3s.io | sh -s - server --cluster-init --disable servicelb
```
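The `<token>` used by the other nodes below can be read from the first server once it is running (default k3s install paths assumed):
```bash
# Join token for additional servers and agents
sudo cat /var/lib/rancher/k3s/server/node-token
```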
**Master Node 2:**
```bash
export TOKEN=<token>
export MASTER1_IP=<ip>
@ -120,6 +131,7 @@ curl -sfL https://get.k3s.io | sh -s - server --server https://${MASTER1_IP}:644
```
**Worker Nodes:**
```bash
export TOKEN=<token>
export MASTER1_IP=<ip>
@ -127,6 +139,7 @@ curl -sfL https://get.k3s.io | K3S_URL=https://${MASTER1_IP}:6443 K3S_TOKEN=${TO
```
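Once all four nodes have joined, verify the cluster from a server node (k3s bundles kubectl):
```bash
# All nodes should show Ready with their respective roles
sudo k3s kubectl get nodes -o wide
```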
#### MetalLB Load Balancer Setup
```bash
# Install MetalLB
kubectl apply -f https://raw.githubusercontent.com/metallb/metallb/v0.13.7/config/manifests/metallb-native.yaml
@ -139,6 +152,7 @@ kubectl apply -f /home/taqi/homeserver/k3s-infra/metallb/metallbConfig.yaml
```
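For reference, `metallbConfig.yaml` typically holds an `IPAddressPool` and an `L2Advertisement`; a sketch in which the pool name and the address range are assumptions, not values from the original:
```yaml
apiVersion: metallb.io/v1beta1
kind: IPAddressPool
metadata:
  name: default-pool
  namespace: metallb-system
spec:
  # Addresses MetalLB may hand out to LoadBalancer services
  addresses:
    - 192.168.1.200-192.168.1.220
---
apiVersion: metallb.io/v1beta1
kind: L2Advertisement
metadata:
  name: default-l2
  namespace: metallb-system
spec:
  ipAddressPools:
    - default-pool
```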
**Quick Test:**
```bash
# Deploy test nginx
kubectl create namespace nginx
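# (Sketch, not from the original; assumes MetalLB hands out an address from its pool)
kubectl create deployment nginx --image=nginx -n nginx
kubectl expose deployment nginx --port=80 --type=LoadBalancer -n nginx
# EXTERNAL-IP should show an address from the MetalLB pool
kubectl get svc -n nginx
curl http://<EXTERNAL-IP>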
@ -156,3 +170,60 @@ Contributions welcome! Feel free to open issues or submit PRs.
## 📝 License
MIT License - feel free to use this as a template for your own homelab!
# Upgrade K3s cluster
Ref: https://github.com/k3s-io/k3s-upgrade
## Deploying the K3s Upgrade Controller
First, deploy the upstream system-upgrade-controller:
```bash
kubectl apply -f https://raw.githubusercontent.com/rancher/system-upgrade-controller/master/manifests/system-upgrade-controller.yaml
```
Check that the controller pod is running. If it is not, check whether the service account is
bound to the correct role, and bind it if needed:
```bash
kubectl get pods -n system-upgrade
kubectl create clusterrolebinding system-upgrade \
--clusterrole=cluster-admin \
--serviceaccount=system-upgrade:system-upgrade
```
## Label the node for upgrade
First, label the node to be upgraded with the `k3s-upgrade=true` label; the upgrade plan's
node selector uses this label to pick nodes for upgrade.
```bash
kubectl label node <node-name> k3s-upgrade=true
```
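A quick check of which nodes currently carry the label:
```bash
kubectl get nodes -l k3s-upgrade=true
```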
It is best practice to upgrade nodes one at a time: the cluster stays operational throughout
the upgrade, and if anything goes wrong the upgrade can be rolled back.
## Create the upgrade plan
Then create the upgrade plan. The plan lives in the `system-upgrade` namespace, as set in its
manifest metadata; adjust the namespace there if the controller was deployed elsewhere.
```bash
kubectl apply -f /home/taqi/homeserver/kubernetes/k3s-upgrade/plan.yaml
```
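Upgrade progress can be followed through the jobs the controller creates (assuming the default `system-upgrade` namespace):
```bash
# Plan status and the per-node upgrade jobs
kubectl -n system-upgrade get plans,jobs,pods
# Nodes report the new k3s version once their job completes
kubectl get nodes -o wide
```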
The plan will first try to cordon and drain the node. If it fails, check
the logs of the upgrade job.
The Longhorn CSI pods may prevent the node from being drained. In that case, you can
cordon the node and drain it manually.
Ref: https://github.com/longhorn/longhorn/discussions/4102
```bash
kubectl drain vm4 --ignore-daemonsets \
--delete-emptydir-data \
--pod-selector='app!=csi-attacher,app!=csi-provisioner'
```
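If the node was cordoned or drained manually, uncordon it once the upgrade job has finished so workloads can return to it:
```bash
kubectl uncordon vm4
kubectl get nodes
```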

View File

@ -408,7 +408,7 @@ psql -U $POSTGRES_USER -d postgres --host 192.168.1.145 -p 5432
## Backup and Restore PostgreSQL Database
```bash
# To backup
# To backup
# Dump format is compressed and allows parallel restore
pg_dump -U $POSTGRES_USER -h 192.168.1.145 -p 5432 -F c \
-f db_backup.dump postgres
@ -466,7 +466,7 @@ kubectl get secret wildcard-cert-secret --namespace=cert-manager -o yaml \
| sed 's/namespace: cert-manager/namespace: gitea/' | kubectl apply -f -
# The configMap contains the app.ini file values for gitea
kubectl apply -f gitea/configMap.yaml -n gitea
envsubst < gitea/configMap.yaml | kubectl apply -n gitea -f -
helm install gitea gitea-charts/gitea -f gitea/values.yaml \
--namespace gitea \
@ -511,7 +511,8 @@ envsubst < traefik-middleware/auth_secret.yaml | kubectl apply -n my-portfolio -
kubectl apply -f traefik-middleware/auth.yaml -n my-portfolio
```
Following middleware deployment, the authentication must be enabled by adding the appropriate annotation to the service's Ingress object specification:
Following middleware deployment, the authentication must be enabled by adding
the appropriate annotation to the service's Ingress object specification:
```
traefik.ingress.kubernetes.io/router.middlewares: my-portfolio-basic-auth@kubernetescrd

View File

@ -15,8 +15,8 @@ gitea:
    email: email
image:
  repository: gitea/gitea
  tag: 1.23.4
  repository: gitea
  tag: 1.23.7
postgresql:
  enabled: false

View File

@ -0,0 +1,17 @@
---
apiVersion: upgrade.cattle.io/v1
kind: Plan
metadata:
  name: k3s-latest
  namespace: system-upgrade
spec:
  concurrency: 1
  version: v1.32.4-k3s1
  nodeSelector:
    matchExpressions:
      - {key: k3s-upgrade, operator: Exists}
  serviceAccountName: system-upgrade
  drain:
    force: true
  upgrade:
    image: rancher/k3s-upgrade

View File

@ -72,7 +72,7 @@ spec:
type: ClusterIP
---
apiVersion: traefik.containo.us/v1alpha1
apiVersion: traefik.io/v1alpha1
kind: IngressRoute
metadata:
name: jellyfin-ingress
@ -91,7 +91,7 @@ spec:
secretName: wildcard-cert-secret
---
apiVersion: traefik.containo.us/v1alpha1
apiVersion: traefik.io/v1alpha1
kind: Middleware
metadata:
name: jellyfin-headers

View File

@ -60,24 +60,3 @@ spec:
            name: portfolio-app-svc
            port:
              number: 80
      - path: /experience
        pathType: Prefix
        backend:
          service:
            name: portfolio-app-svc
            port:
              number: 80
      - path: /interest
        pathType: Prefix
        backend:
          service:
            name: portfolio-app-svc
            port:
              number: 80
      - path: /project
        pathType: Prefix
        backend:
          service:
            name: portfolio-app-svc
            port:
              number: 80