First, let me introduce the background of the practice:
I have always preferred building my own NAS over buying a Synology, because rolling my own forces me to learn Linux commands instead of leaning on a user-friendly NAS operating system like Synology's. So ever since I got into home media streaming, I have used pve (Proxmox VE) as the base and manually built my media streaming system out of virtualized Linux systems. This keeps me motivated to learn Linux operations.
Naturally, the NAS cache described here is also built on top of pve.
In the second half of 2023, after I started playing with NAS, I gradually found that my disk space was not enough. For the sake of speed, I initially used a RAID10 scheme and attached the array to the virtual machine. At that time, QB (qBittorrent) and the NAS management system were separate, and the download directory was mounted over NFS and watched by both the downloader and the media management tool, so I never really felt the impact of the HDDs' write speed. It wasn't until the spring of 2024, when I moved because of a work relocation, that I figured that since I hadn't seeded for a month anyway, I might as well take the opportunity to reorganize my NAS. And so the tinkering began again.
1. Reconstructing the system base#
I kept the pve base. QB used to run in its own virtual machine; now the NAS management tools, subtitle tools, and seeding tools are all deployed together in a k8s environment. I created a 6C/64G Rocky Linux VM to host the media library and allocated the combined 30T of mechanical hard disk space to it.
The disks are combined into one pool of space using LVM logical volumes.
What is an LVM logical volume?
LVM (Logical Volume Management) is the standard way to manage disk drives and similar storage devices in a Linux environment. It makes storage management far more flexible: a logical volume is the unit of storage space created and used under LVM.
LVM is built from three components: the Physical Volume (PV), the Volume Group (VG), and the Logical Volume (LV).
LVM allows volumes to be resized dynamically and also supports snapshots.
Compared to RAID10, LVM does not sacrifice half of the disk space for data redundancy, but in return it offers neither RAID10's performance nor its redundancy.
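As a quick illustration of those last two points, here is a minimal sketch of growing a volume online and taking a snapshot (the volume group vg0 and logical volume data are hypothetical names, and vg0 is assumed to still have free extents):
## Grow an existing logical volume by 100G and resize its ext4 file system in the same step
> lvextend -L +100G -r /dev/vg0/data
## Take a 10G snapshot of the logical volume
> lvcreate -s -L 10G -n data_snap /dev/vg0/data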
1.1 Building LVM logical volumes#
> root@ubuntu:~# lsblk
NAME                   MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
sdc                      8:32   0  3.6T  0 disk
└─lvgroup-lv29_corig   253:12   0 29.1T  0 lvm
  └─lvgroup-lv29       253:4    0 29.1T  0 lvm  /mnt/lv29
sdd                      8:48   0  3.6T  0 disk
└─lvgroup-lv29_corig   253:12   0 29.1T  0 lvm
  └─lvgroup-lv29       253:4    0 29.1T  0 lvm  /mnt/lv29
sde                      8:64   0  3.6T  0 disk
sdf                      8:80   0  3.6T  0 disk
└─lvgroup-lv29_corig   253:12   0 29.1T  0 lvm
  └─lvgroup-lv29       253:4    0 29.1T  0 lvm  /mnt/lv29
sdh                      8:112  0 14.6T  0 disk
└─lvgroup-lv29_corig   253:12   0 29.1T  0 lvm
  └─lvgroup-lv29       253:4    0 29.1T  0 lvm  /mnt/lv29
sdi                      8:128  0  3.6T  0 disk
└─lvgroup-lv29_corig   253:12   0 29.1T  0 lvm
  └─lvgroup-lv29       253:4    0 29.1T  0 lvm  /mnt/lv29
sdj                      8:144  0  7.3T  0 disk
└─sdj1                   8:145  0  7.3T  0 part
The result of the build is shown above: the disks sdc, sdd, sdf, sdh, sdi, and sdj are combined into a single 29.1T logical volume named lv29.
The specific operations are as follows:
## Create the PVs (physical volumes)
> pvcreate /dev/sdc /dev/sdd /dev/sdf /dev/sdh /dev/sdi /dev/sdj
## Create a VG (volume group) named lvgroup
> vgcreate lvgroup /dev/sdc /dev/sdd /dev/sdf /dev/sdh /dev/sdi /dev/sdj
## Create the LV (logical volume), allocating all of the lvgroup volume group's space to it
> lvcreate -n lv29 -l 100%VG lvgroup
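The newly created objects can be double-checked before moving on; a quick sanity-check sketch (the exact output depends on your disks):
## Inspect the physical volumes, the volume group, and the logical volume just created
> pvs
> vgs lvgroup
> lvs lvgroup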
Once created, the logical volume exists in the system just like a regular disk. Next we create a file system on it and set up a mount point, after which it can be used normally.
## Format the logical volume with an ext4 file system
> mkfs.ext4 /dev/lvgroup/lv29
## Create mount point
> mkdir -p /mnt/media
## Mount the logical volume
> mount /dev/lvgroup/lv29 /mnt/media
## Automatically mount at startup
> echo "/dev/lvgroup/lv29 /mnt/media ext4 defaults 0 0" | tee -a /etc/fstab
1.2 Mounting the disk to Linux in pve#
In the virtual machine's hardware options, add the newly created logical volume and assign it all of the available space; that completes attaching the disk to the VM.
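For reference, the same attachment can be done from the pve shell instead of the web UI. This is only a sketch: it assumes a hypothetical VM ID of 100 and passes the whole logical volume through to the guest as an extra raw SCSI disk.
## Attach the logical volume to VM 100 as an additional SCSI disk
> qm set 100 --scsi1 /dev/lvgroup/lv29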
2. LVM cache layer#
After the disk had been mounted and in use for a while, the drawbacks of not having a cache layer gradually became apparent. For example, when torrents are added in the NAS management system and QB takes over several download tasks at once, the NAS management system can lag while searching for media or doing other work, simply because QB is busy writing to the disk.
When downloaded files are written straight to the HDD, memory acts as the first-level cache: the page cache improves write efficiency and cuts down the number of physical disk writes.
But memory is only GB-scale. File blocks pass through the page cache and still have to be flushed out through disk IO, so when a large number of files from several concurrent downloads are being written to the HDD logical volume at once, the disk IO is easily overwhelmed. Meanwhile, the NAS management tool watches the download directory, which means reading from those same disks; mixing heavy reads and writes on the same spindles congests the disk IO and severely hurts the experience. Concretely, once I had more than 3 download tasks running, the NAS management tool became noticeably sluggish and even refreshing a page took 5 to 30 seconds, because the disk IO and memory were badly backed up.
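This kind of saturation is easy to confirm with a disk utilization tool; a minimal sketch using iostat from the sysstat package (device names will differ per system):
## Print extended per-device IO statistics every second; an HDD sitting near 100 %util is saturated
iostat -x 1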
Therefore, an LVM cache layer is introduced: by leveraging the high IOPS of the SSDs, HDD IO congestion can be effectively relieved. Concretely, when a download task runs, the downloaded file blocks first land in memory, are then written to the LVM cache layer (the SSDs), and are finally flushed to the HDD.
2.1 Adding a cache layer to an existing LVM logical volume#
I have two SSDs, sdb and sdg, that I now want to attach to LVM as a write cache. Since LVM manages the logical volume's cache logic itself, all I need to do is add them to the LVM cache layer.
## Create PVs on the SSDs
pvcreate /dev/sdb /dev/sdg
## Extend the lvgroup volume group with the SSDs
vgextend lvgroup /dev/sdb /dev/sdg
## Create a cache pool from all remaining free space (at this point, only the SSDs)
lvcreate --type cache-pool -l 100%FREE -n lv_cache_pool lvgroup
## Attach the cache pool to the existing lv29 logical volume
lvconvert --type cache --cachepool lvgroup/lv_cache_pool lvgroup/lv29
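After the conversion, it is worth verifying that the cache pool is attached and checking the cache mode: LVM's default cache mode is writethrough, so for a write cache you will likely want writeback. A sketch (the lvs fields shown are just one reasonable selection):
## Confirm lv29 now carries a cache pool and show its cache mode and usage
lvs -a -o lv_name,lv_size,pool_lv,cache_mode,data_percent lvgroup
## Optionally switch the cached LV to writeback so writes land on the SSDs before the HDDs
lvchange --cachemode writeback lvgroup/lv29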