added k3s-upgrade procedure

2025-05-05 11:19:21 +03:00
parent b451cf2830
commit fd8a011aa8
7 changed files with 122 additions and 54 deletions

View File

@ -35,6 +35,7 @@
- 🍵 Gitea Git Server and Actions for CI/CD
### 📋 Coming Soon
- Nextcloud
- Monitoring Stack with Prometheus and Grafana
@ -51,6 +52,7 @@
### 1. Setting up Proxmox Infrastructure
#### Proxmox Base Installation
- Boot mini PCs from Proxmox USB drive
- Install on SSD and configure networking
- Set up cluster configuration
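Once Proxmox is installed on each node, the cluster itself can be formed from a node shell; a minimal sketch, where the cluster name `homelab` and the first node's IP are placeholders, not values from the original:
```bash
# On the first node: create the cluster
pvecm create homelab
# On each additional node: join using the first node's IP
pvecm add 192.168.1.10
# Verify quorum and membership
pvecm status
```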
@ -59,6 +61,7 @@
#### 3. Cloud Image Implementation
Cloud images provide:
- 🚀 Pre-configured, optimized disk images
- 📦 Minimal software footprint
- ⚡ Quick VM deployment
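As a concrete illustration (the VM ID 9000, storage `local-lvm`, bridge `vmbr0`, and the Ubuntu 24.04 image are assumptions, not values from the original), a cloud image can be turned into a reusable VM template with the `qm` CLI:
```bash
# Download an Ubuntu cloud image (any distro's cloud image works similarly)
wget https://cloud-images.ubuntu.com/noble/current/noble-server-cloudimg-amd64.img
# Create an empty VM and import the image as its boot disk
qm create 9000 --name ubuntu-cloud --memory 2048 --net0 virtio,bridge=vmbr0
qm importdisk 9000 noble-server-cloudimg-amd64.img local-lvm
qm set 9000 --scsihw virtio-scsi-pci --scsi0 local-lvm:vm-9000-disk-0
# Add a cloud-init drive and boot from the imported disk
qm set 9000 --ide2 local-lvm:cloudinit --boot order=scsi0 --serial0 socket
# Convert the VM into a reusable template
qm template 9000
```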
@ -70,6 +73,7 @@ your homelab environment.
#### Proxmox VM Disk Management
**Expanding VM Disk Size:**
1. Access Proxmox web interface
2. Select target VM
3. Navigate to Hardware tab
@ -77,6 +81,7 @@ your homelab environment.
5. Click Resize and enter new size (e.g., 50G)
**Post-resize VM Configuration:**
```bash
# Access VM and configure partitions
sudo fdisk /dev/sda
@ -89,7 +94,9 @@ sudo mkfs.ext4 /dev/sdaX
```
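If the goal is to grow the existing root filesystem rather than create a new one, `growpart` plus `resize2fs` is a common non-destructive route; a sketch, assuming an ext4 root on `/dev/sda1` and the `cloud-guest-utils` package installed:
```bash
# Grow partition 1 of /dev/sda into the newly added space
sudo growpart /dev/sda 1
# Grow the ext4 filesystem to match the partition
sudo resize2fs /dev/sda1
# Confirm the new size
df -h /
```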
#### Physical Disk Passthrough
Pass physical disks (e.g., NVMe storage) to VMs:
```bash
# List disk IDs
lsblk |awk 'NR==1{print $0" DEVICE-ID(S)"}NR>1{dev=$1;printf $0" ";system("find /dev/disk/by-id -lname \"*"dev"\" -printf \" %p\"");print "";}'|grep -v -E 'part|lvm'
@ -100,19 +107,23 @@ qm set 103 -scsi2 /dev/disk/by-id/usb-WD_BLACK_SN770_1TB_012938055C4B-0:0
# Verify configuration
grep 5C4B /etc/pve/qemu-server/103.conf
```
> 📚 Reference: [Proxmox Disk Passthrough Guide](https://pve.proxmox.com/wiki/Passthrough_Physical_Disk_to_Virtual_Machine_(VM))
> 📚 Reference: [Proxmox Disk Passthrough Guide](<https://pve.proxmox.com/wiki/Passthrough_Physical_Disk_to_Virtual_Machine_(VM)>)
### 2. Kubernetes Cluster Setup
#### K3s Cluster Configuration
Setting up a 4-node cluster (2 master + 2 worker):
**Master Node 1:**
```bash
curl -sfL https://get.k3s.io | sh -s - server --cluster-init --disable servicelb
```
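The `<token>` used by the other nodes below can be read from the first server once it is running (default k3s install paths assumed):
```bash
# Join token for additional servers and agents
sudo cat /var/lib/rancher/k3s/server/node-token
```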
**Master Node 2:**
```bash
export TOKEN=<token>
export MASTER1_IP=<ip>
@ -120,6 +131,7 @@ curl -sfL https://get.k3s.io | sh -s - server --server https://${MASTER1_IP}:644
```
**Worker Nodes:**
```bash
export TOKEN=<token>
export MASTER1_IP=<ip>
@ -127,6 +139,7 @@ curl -sfL https://get.k3s.io | K3S_URL=https://${MASTER1_IP}:6443 K3S_TOKEN=${TO
```
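Once all four nodes have joined, verify the cluster from a server node (k3s bundles kubectl):
```bash
# All nodes should show Ready with their respective roles
sudo k3s kubectl get nodes -o wide
```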
#### MetalLB Load Balancer Setup
```bash
# Install MetalLB
kubectl apply -f https://raw.githubusercontent.com/metallb/metallb/v0.13.7/config/manifests/metallb-native.yaml
@ -139,6 +152,7 @@ kubectl apply -f /home/taqi/homeserver/k3s-infra/metallb/metallbConfig.yaml
```
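For reference, `metallbConfig.yaml` typically holds an `IPAddressPool` and an `L2Advertisement`; a sketch in which the pool name and the address range are assumptions, not values from the original:
```yaml
apiVersion: metallb.io/v1beta1
kind: IPAddressPool
metadata:
  name: default-pool
  namespace: metallb-system
spec:
  # Addresses MetalLB may hand out to LoadBalancer services
  addresses:
    - 192.168.1.200-192.168.1.220
---
apiVersion: metallb.io/v1beta1
kind: L2Advertisement
metadata:
  name: default-l2
  namespace: metallb-system
spec:
  ipAddressPools:
    - default-pool
```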
**Quick Test:**
```bash
# Deploy test nginx
kubectl create namespace nginx
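# (Sketch, not from the original; assumes MetalLB hands out an address from its pool)
kubectl create deployment nginx --image=nginx -n nginx
kubectl expose deployment nginx --port=80 --type=LoadBalancer -n nginx
# EXTERNAL-IP should show an address from the MetalLB pool
kubectl get svc -n nginx
curl http://<EXTERNAL-IP>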
@ -156,3 +170,60 @@ Contributions welcome! Feel free to open issues or submit PRs.
## 📝 License
MIT License - feel free to use this as a template for your own homelab!
# Upgrade K3s cluster
Ref: https://github.com/k3s-io/k3s-upgrade
## Deploying the K3s Upgrade Controller
First, deploy the upstream system-upgrade-controller:
```bash
kubectl apply -f https://raw.githubusercontent.com/rancher/system-upgrade-controller/master/manifests/system-upgrade-controller.yaml
```
Check that the controller pod is running. If it is not, check whether the service account is
bound to the correct role, and bind it if needed:
```bash
kubectl get pods -n system-upgrade
kubectl create clusterrolebinding system-upgrade \
--clusterrole=cluster-admin \
--serviceaccount=system-upgrade:system-upgrade
```
## Label the node for upgrade
First, label the node to be upgraded with the `k3s-upgrade=true` label; the upgrade plan's
node selector uses this label to pick nodes for upgrade.
```bash
kubectl label node <node-name> k3s-upgrade=true
```
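A quick check of which nodes currently carry the label:
```bash
kubectl get nodes -l k3s-upgrade=true
```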
It is best practice to upgrade nodes one at a time: the cluster stays operational throughout
the upgrade, and if anything goes wrong the upgrade can be rolled back.
## Create the upgrade plan
Then create the upgrade plan. The plan lives in the `system-upgrade` namespace, as set in its
manifest metadata; adjust the namespace there if the controller was deployed elsewhere.
```bash
kubectl apply -f /home/taqi/homeserver/kubernetes/k3s-upgrade/plan.yaml
```
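Upgrade progress can be followed through the jobs the controller creates (assuming the default `system-upgrade` namespace):
```bash
# Plan status and the per-node upgrade jobs
kubectl -n system-upgrade get plans,jobs,pods
# Nodes report the new k3s version once their job completes
kubectl get nodes -o wide
```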
The plan will first try to cordon and drain the node. If it fails, check
the logs of the upgrade job.
The Longhorn CSI pods may prevent the node from being drained. In that case, you can
cordon the node and drain it manually.
Ref: https://github.com/longhorn/longhorn/discussions/4102
```bash
kubectl drain vm4 --ignore-daemonsets \
--delete-emptydir-data \
--pod-selector='app!=csi-attacher,app!=csi-provisioner'
```
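If the node was cordoned or drained manually, uncordon it once the upgrade job has finished so workloads can return to it:
```bash
kubectl uncordon vm4
kubectl get nodes
```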

View File

@ -408,7 +408,7 @@ psql -U $POSTGRES_USER -d postgres --host 192.168.1.145 -p 5432
## Backup and Restore PostgreSQL Database
```bash
# To backup
# To backup
# Dump format is compressed and allows parallel restore
pg_dump -U $POSTGRES_USER -h 192.168.1.145 -p 5432 -F c \
-f db_backup.dump postgres
@ -466,7 +466,7 @@ kubectl get secret wildcard-cert-secret --namespace=cert-manager -o yaml \
| sed 's/namespace: cert-manager/namespace: gitea/' | kubectl apply -f -
# The configMap contains the app.ini file values for gitea
kubectl apply -f gitea/configMap.yaml -n gitea
envsubst < gitea/configMap.yaml | kubectl apply -n gitea -f -
helm install gitea gitea-charts/gitea -f gitea/values.yaml \
--namespace gitea \
@ -511,7 +511,8 @@ envsubst < traefik-middleware/auth_secret.yaml | kubectl apply -n my-portfolio -
kubectl apply -f traefik-middleware/auth.yaml -n my-portfolio
```
Following middleware deployment, the authentication must be enabled by adding the appropriate annotation to the service's Ingress object specification:
Following middleware deployment, the authentication must be enabled by adding
the appropriate annotation to the service's Ingress object specification:
```
traefik.ingress.kubernetes.io/router.middlewares: my-portfolio-basic-auth@kubernetescrd

View File

@ -15,8 +15,8 @@ gitea:
    email: email
image:
  repository: gitea/gitea
  tag: 1.23.4
  repository: gitea
  tag: 1.23.7
postgresql:
  enabled: false

View File

@ -0,0 +1,17 @@
---
apiVersion: upgrade.cattle.io/v1
kind: Plan
metadata:
  name: k3s-latest
  namespace: system-upgrade
spec:
  concurrency: 1
  version: v1.32.4-k3s1
  nodeSelector:
    matchExpressions:
      - {key: k3s-upgrade, operator: Exists}
  serviceAccountName: system-upgrade
  drain:
    force: true
  upgrade:
    image: rancher/k3s-upgrade

View File

@ -72,7 +72,7 @@ spec:
type: ClusterIP
---
apiVersion: traefik.containo.us/v1alpha1
apiVersion: traefik.io/v1alpha1
kind: IngressRoute
metadata:
name: jellyfin-ingress
@ -91,7 +91,7 @@ spec:
secretName: wildcard-cert-secret
---
apiVersion: traefik.containo.us/v1alpha1
apiVersion: traefik.io/v1alpha1
kind: Middleware
metadata:
name: jellyfin-headers

View File

@ -60,24 +60,3 @@ spec:
            name: portfolio-app-svc
            port:
              number: 80
      - path: /experience
        pathType: Prefix
        backend:
          service:
            name: portfolio-app-svc
            port:
              number: 80
      - path: /interest
        pathType: Prefix
        backend:
          service:
            name: portfolio-app-svc
            port:
              number: 80
      - path: /project
        pathType: Prefix
        backend:
          service:
            name: portfolio-app-svc
            port:
              number: 80