{"id":5207,"date":"2025-09-10T15:10:18","date_gmt":"2025-09-10T20:10:18","guid":{"rendered":"https:\/\/neosmart.net\/blog\/?p=5207"},"modified":"2025-09-11T16:05:08","modified_gmt":"2025-09-11T21:05:08","slug":"zfs-on-linux-quickstart-cheat-sheet","status":"publish","type":"post","link":"https:\/\/neosmart.net\/blog\/zfs-on-linux-quickstart-cheat-sheet\/","title":{"rendered":"The idiomatic ZFS on Linux quickstart cheat-sheet"},"content":{"rendered":"<p>I&#8217;m a FreeBSD guy that has had a long, serious, and very much monogamous relationship with ZFS. I experimented with Solaris 9 to learn about ZFS, adopted OpenSolaris (2008?) back in the &#8220;aughts&#8221; for my first ZFS server, transitioned my installations over to OpenIndiana after Oracle bought Sun Microsystems out, and then at some point switched to FreeBSD, which I found to be a better designed OS once I had moved everything headless and was ready to completely bid the need for a desktop environment goodbye. But every once in a while I have to stand up a ZFS installation on Ubuntu, and then I spend a little too much time trying to remember how to do ZFS things that FreeBSD makes easy out-of-the-box. 
After doing that one time too many, I decided to put down my Linux-specific notes in a post for others (and myself) to reference in the future.<\/p>\n<h2>A fully functional ZFS setup following ZFS best practices and Linux\/Ubuntu idiomatic approaches<\/h2>\n<p>This guide will focus mainly on the Linux sysadmin side of things; a basic understanding of ZFS concepts and principles is assumed, but I&#8217;ll do my best to provide a succinct summary of what we&#8217;re doing and why we&#8217;re doing it at each point.<\/p>\n<p><!--more--><\/p>\n<h3>Step 1: Installing ZFS<\/h3>\n<p>Unlike on FreeBSD, on Linux you need to manually install the ZFS kernel modules and userland tooling to bring in ZFS filesystem support and the venerable <code>zfs<\/code> and <code>zpool<\/code> utils used to manage a ZFS installation. Canonical&#8217;s Ubuntu was, to my knowledge, the first to offer a pre-packaged ZFS option for Linux users (after gambling that Oracle wouldn&#8217;t sue them for violating the CDDL license if they included ZFS support in their repos), and I believe it&#8217;s still the most popular Linux distribution for ZFS users, so the specific command-line incantations below are for Ubuntu:<\/p>\n<pre><code class=\"language-shell\">sudo apt install zfs-dkms\r\n<\/code><\/pre>\n<p>This will download, build, and install the ZFS kernel modules to match the version of the Linux kernel you&#8217;re currently running. 
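<\/p>\n<p>If you want to double-check that the module was actually built and loaded before moving on, the following should do it (a sketch &#8211; the exact output of each command varies by release):<\/p>\n<pre><code class=\"language-shell\"># confirm dkms built the zfs module for the running kernel\r\ndkms status zfs\r\n# load the module into the running kernel without rebooting\r\nsudo modprobe zfs\r\n# confirm the userland tools and the kernel module agree on a version\r\nzfs version\r\n<\/code><\/pre>\n<p>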
Unlike most kernel modules,<sup id=\"rf1-5207\"><a href=\"#fn1-5207\" title=\"Largely due to licensing restriction workarounds\" rel=\"footnote\">1<\/a><\/sup> ZFS support isn&#8217;t built or distributed as part of the base kernel that Canonical maintains for its distributions; instead, you have to manually build and load the kernel module that provides ZFS support (but this is automated by the <code>.deb<\/code> installed with <code>apt<\/code>) &#8211; and this needs to be done each time you upgrade the kernel.<sup id=\"rf2-5207\"><a href=\"#fn2-5207\" title=\"This too is *normally* taken care of by the package manager, provided all the packages have been correctly built and uploaded to the repository by the time you try to install a newer kernel version. There&rsquo;s very little you have to do manually.\" rel=\"footnote\">2<\/a><\/sup> All this really means is that installing <code>zfs-dkms<\/code> will take longer than installing most packages &#8211; expect the installation process to look like it&#8217;s stuck and be extra patient. Installing <code>zfs-dkms<\/code> will also pull in an automatic dependency on the userspace tools, <code>zfsutils-linux<\/code>, as well as other ZFS-related libraries and dependencies.<\/p>\n<h3>Step 2: Setting up your zpool<\/h3>\n<p>This part of the process is largely going to be the same regardless of which operating system you are using and is standard ZFS fare. You&#8217;ll need to identify the drives you wish to use in your zpool (the ZFS abstraction over the physical disks, arranged in the hierarchy\/topology you desire) and use <code>sudo zpool create<\/code> to create your first zpool (traditionally named <code>tank<\/code>). 
The only thing of note here is that you should use a stable path to identify your disks, so instead of doing something like <code>sudo zpool create tank mirror \/dev\/sda \/dev\/sdb<\/code> to create a two-disk mirror zpool comprised of the two disks <code>\/dev\/sda<\/code> and <code>\/dev\/sdb<\/code>, you should use a different path to the same device, such as via <code>\/dev\/disk\/by-id\/<\/code> or <code>\/dev\/disk\/by-uuid\/<\/code> (going with <code>by-id\/<\/code>\u00a0might make it easier to figure out which disk to use, as the contents of <code>by-uuid\/<\/code> are all GUIDs).<sup id=\"rf3-5207\"><a href=\"#fn3-5207\" title=\"You could use \/dev\/disk\/by-path\/ but that means if you physically swap disks around in their cages the references would become switched around, so it&rsquo;s best not to.\" rel=\"footnote\">3<\/a><\/sup> On Linux, <code>lsblk<\/code> is your friend here to list disks attached to the system.<\/p>\n<pre><code class=\"language-shell\"># to create a mirror of the two volumes:\r\nsudo zpool create -o ashift=12 tank mirror \/dev\/disk\/by-id\/scsi-0VOLUME_NAME_01 \/dev\/disk\/by-id\/scsi-0VOLUME_NAME_02\r\n<\/code><\/pre>\n<p>And you can verify that the operation has succeeded by using <code>zpool list<\/code> to see the list of zpools live on the system.<\/p>\n<p>Most ZFS properties and features are configurable, can be set at any time,<sup id=\"rf4-5207\"><a href=\"#fn4-5207\" title=\"The most notable exception to this is the ashift property set above with -o ashift=12, which is a decent value for any ssd or 4k\/512e hdd.\" rel=\"footnote\">4<\/a><\/sup> and are inherited from parent datasets. 
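<\/p>\n<p>You can always inspect a property&#8217;s effective value &#8211; and whether it was set locally, inherited, or left at its default &#8211; with <code>zfs get<\/code>, whose <code>SOURCE<\/code> column reports exactly that. A quick sketch, using the pool name from this guide:<\/p>\n<pre><code class=\"language-shell\"># show a single property on the pool root dataset, with its source\r\nzfs get compression tank\r\n# show several properties at once, recursively for every child dataset\r\nzfs get compression,recordsize -r tank\r\n<\/code><\/pre>\n<p>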
Let&#8217;s set some default properties that are good starting values (we can always change them or override them for specific child datasets at any time):<\/p>\n<pre><code class=\"language-bash\">sudo zfs set compression=lz4 tank # or =zstd if you're on the very newest versions\r\nsudo zfs set recordsize=1m tank # i\/o optimization for storing content that rarely changes\r\n<\/code><\/pre>\n<h3>Step 3: Creating your ZFS datasets<\/h3>\n<p>The &#8220;zpool&#8221; is, as mentioned, the abstraction over the physical disks in your PC. Its closest analogue is a &#8220;smart disk&#8221; comprised of multiple physical disks arranged in some specific topology with certain striping\/redundancy\/parity managed by the lower-level <code>zpool<\/code> command. Just as a zpool is a virtual disk, a dataset is a &#8220;smart partition&#8221; used to break up your &#8220;disk&#8221; into multiple logical storage units. Unlike real partitions, zfs datasets aren&#8217;t fixed in size; rather, they straddle the line between a partition and a folder. They can be nested (like a folder), but you can&#8217;t rename a file across datasets (like a separate filesystem\/partition). You can snapshot them individually (or altogether, atomically) for backup and cloning purposes (see below), and it&#8217;s the finest-grained level of control you have for turning on\/off or re-configuring ZFS features and properties like the record size (changes to which only affect newly written files), automatic block-level compression, etc.<\/p>\n<p>ZFS automatically creates a dataset for the root zpool (in this case, we now have <code>\/tank\/<\/code> mounted and ready), but it&#8217;s <em>generally<\/em> not good practice to write directly to this dataset. Instead, you should create one or more child datasets where most content will go. 
We&#8217;ll just create one dataset for now:<\/p>\n<pre><code class=\"language-shell\">sudo zfs create tank\/data\r\n<\/code><\/pre>\n<p>and we can see all our datasets with <code>zfs list<\/code>, which shows them in their hierarchy\/tree as configured.<\/p>\n<h3>Step 4: Configuring automatic snapshots<\/h3>\n<p>One of the coolest and most important ZFS features is undoubtedly its instant, zero-cost snapshotting (enabled by its copy-on-write design). This lets you freeze an image of any dataset (or all of them) as it exists at any point in time, then restore back to it (or selectively copy files\/data back, as needed) at any point in the future, regardless of any changes you&#8217;ve made. (You only pay the storage cost of data added or deleted thereafter.) ZFS snapshots can be made manually with <code>sudo zfs snap -r tank@snapshot_name_or_date<\/code> (which snapshots <code>tank<\/code> and all its child datasets, instantly) or <code>sudo zfs snap tank\/data@snapshot_name<\/code> (which snapshots only the one <code>tank\/data<\/code> dataset). But since they&#8217;re virtually free, why not go a step further and automatically take snapshots of the data on a schedule? That way you&#8217;re protected in case of inadvertent data loss, not just when you remember to take a snapshot before performing a potentially destructive action.<\/p>\n<p>On Linux, the best way to automate these snapshots is with <code>zfs-auto-snapshot<\/code>, which we&#8217;ll install with <code>sudo apt install zfs-auto-snapshot<\/code>. It&#8217;ll automatically create new snapshots every month\/week\/day\/hour of designated zfs datasets, and delete the oldest ones too so you&#8217;re not paying the storage price forever.<\/p>\n<p>After installing <code>zfs-auto-snapshot<\/code>, it&#8217;s time to choose which datasets we want to protect and how often we want to take the snapshots. 
Instead of using a configuration file, <code>zfs-auto-snapshot<\/code> uses zfs properties to determine which zfs datasets to include in each snapshot interval, and since zfs properties are inherited by default, if we set up snapshots for the root dataset, it&#8217;ll automatically do the same for all child datasets.<\/p>\n<p>Let&#8217;s enable a daily snapshot of the root volume (and all child datasets):<\/p>\n<pre><code class=\"language-shell\">sudo zfs set com.sun:auto-snapshot:daily=true tank\r\n<\/code><\/pre>\n<p>You can repeat this, replacing <code>daily<\/code> with <code>monthly<\/code>, <code>weekly<\/code>, or <code>hourly<\/code>, to (additionally) opt into that frequency of snapshots. To <em>exclude<\/em> a dataset (and its children) from being included in a particular schedule, you can e.g. use <code>sudo zfs set com.sun:auto-snapshot:daily=false tank\/no_backups<\/code> to turn off daily snapshots for the <code>tank\/no_backups<\/code> dataset (assuming it exists).<\/p>\n<p>You can verify this is working (after waiting the prescribed amount of time) by listing the snapshots you have:<\/p>\n<pre><code class=\"language-shell\">zfs list -t snap\r\n<\/code><\/pre>\n<h3>Step 5: Automatic monthly ZFS scrubs on Linux with systemd<\/h3>\n<p>One thing that makes ZFS stand out compared to other filesystems like <code>ext4<\/code> or even <code>xfs<\/code> is that it calculates the hash of each block of data you store on it. 
In the event of <a href=\"https:\/\/en.wikipedia.org\/wiki\/Data_degradation\" rel=\"follow\">bitrot<\/a> (the silent corruption of data already stored to disk), zfs can a) flag that a file has been silently corrupted, and b) automatically restore a good copy from another disk or parity (assuming your zpool topology provides redundancy).<sup id=\"rf5-5207\"><a href=\"#fn5-5207\" title=\"Or assuming you are using zfs set copies=2 tank (or greater).\" rel=\"footnote\">5<\/a><\/sup> It does this automatically every time you read a file, but what if you have terabytes of data just sitting there, silently rotting away? How do you catch that corruption in time to fix it from a second copy on the zpool? The <code>zpool scrub tank<\/code> operation runs a low-priority scan in the background to detect and (hopefully) repair just that, but it needs to be scheduled (or run manually).<\/p>\n<p>On FreeBSD, this would be accomplished with the help of a simple monthly periodic script, but on Linux (Ubuntu particularly) it&#8217;s not as simple. The idiomatic way of setting up monthly work on Ubuntu is via systemd units (aka services) and timers; unfortunately, this requires setting up two separate files, but the good news is that you can just copy and paste what I&#8217;ve provided below, only modifying the zpool name from <code>tank<\/code> to whatever you are using, as needed.<\/p>\n<p>The first file we need to create is the actual systemd service, which is what is tasked with running the <code>zpool scrub<\/code> operation. 
Copy the following to <code>\/etc\/systemd\/system\/zfs-scrub.service<\/code>:<\/p>\n<pre><code class=\"language-ini\">[Unit]\r\nDescription=ZFS scrub of the tank pool\r\n\r\n[Service]\r\nType=oneshot\r\nExecStart=\/usr\/sbin\/zpool scrub tank\r\n<\/code><\/pre>\n<p>And copy this timer file (which specifies when <code>zfs-scrub.service<\/code> is automatically run) to <code>\/etc\/systemd\/system\/zfs-scrub.timer<\/code>:<\/p>\n<pre><code class=\"language-ini\">[Unit]\r\nDescription=Run ZFS scrub service monthly\r\n\r\n[Timer]\r\nOnCalendar=monthly\r\n# Run if missed while machine was off\r\nPersistent=true\r\n# Add some randomization to start time to prevent thundering herd\r\nRandomizedDelaySec=30m\r\nAccuracySec=1h\r\n\r\n[Install]\r\nWantedBy=timers.target\r\n<\/code><\/pre>\n<p>Then execute the following to get systemd to see and activate the monthly scrub service:<\/p>\n<pre><code class=\"language-shell\">sudo systemctl daemon-reload\r\nsudo systemctl enable --now zfs-scrub.timer\r\n<\/code><\/pre>\n<p>and verify that the timer has been started with the following:<\/p>\n<pre><code class=\"language-shell\">systemctl list-timers zfs-scrub.timer\r\n<\/code><\/pre>\n<p>which should show you output along the lines of the following:<\/p>\n<pre><code class=\"language-shell\">$ systemctl list-timers zfs-scrub.timer\r\nNEXT LEFT LAST PASSED UNIT ACTIVATES\r\nWed 2025-10-01 00:23:31 UTC 2 weeks 6 days - - zfs-scrub.timer zfs-scrub.service\r\n\r\n1 timers listed.\r\n<\/code><\/pre>\n<p>You can see if this is working by checking when the last scrub took place with <code>zpool status<\/code>:<\/p>\n<pre><code class=\"language-shell\">$ zpool status\r\npool: tank\r\nstate: ONLINE\r\nscan: scrub repaired 0B in 00:00:00 with 0 errors on Wed Sep 10 18:43:22 2025\r\n<\/code><\/pre>\n<p>And with that, you&#8217;re all set!<\/p>\n<hr class=\"footnotes\"><ol class=\"footnotes\"><li id=\"fn1-5207\"><p>Largely due to licensing restriction workarounds&nbsp;<a href=\"#rf1-5207\" class=\"backlink\" 
title=\"Jump back to footnote 1 in the text.\">&#8617;<\/a><\/p><\/li><li id=\"fn2-5207\"><p>This too is <em>normally<\/em> taken care of by the package manager, provided all the packages have been correctly built and uploaded to the repository by the time you try to install a newer kernel version. There&#8217;s very little you have to do manually.&nbsp;<a href=\"#rf2-5207\" class=\"backlink\" title=\"Jump back to footnote 2 in the text.\">&#8617;<\/a><\/p><\/li><li id=\"fn3-5207\"><p>You <em>could<\/em> use <code>\/dev\/disk\/by-path\/<\/code> but that means if you physically swap disks around in their cages the references would become switched around, so it&#8217;s best not to.&nbsp;<a href=\"#rf3-5207\" class=\"backlink\" title=\"Jump back to footnote 3 in the text.\">&#8617;<\/a><\/p><\/li><li id=\"fn4-5207\"><p>The most notable exception to this is the <code>ashift<\/code> property set above with <code>-o ashift=12<\/code>, which is a decent value for any ssd or 4k\/512e hdd.&nbsp;<a href=\"#rf4-5207\" class=\"backlink\" title=\"Jump back to footnote 4 in the text.\">&#8617;<\/a><\/p><\/li><li id=\"fn5-5207\"><p>Or assuming you are using <code>zfs set copies=2 tank<\/code> (or greater).&nbsp;<a href=\"#rf5-5207\" class=\"backlink\" title=\"Jump back to footnote 5 in the text.\">&#8617;<\/a><\/p><\/li><\/ol>","protected":false},"excerpt":{"rendered":"<p>I&#8217;m a FreeBSD guy who has had a long, serious, and very much monogamous relationship with ZFS. I experimented with Solaris 9 to learn about ZFS, adopted OpenSolaris (2008?) 
back in the &#8220;aughts&#8221; for my first ZFS server, transitioned my &hellip; <a href=\"https:\/\/neosmart.net\/blog\/zfs-on-linux-quickstart-cheat-sheet\/\">Continue reading <span class=\"meta-nav\">&rarr;<\/span><\/a><\/p>\n","protected":false},"author":505,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"jetpack_post_was_ever_published":false,"_jetpack_newsletter_access":"","_jetpack_dont_email_post_to_subs":false,"_jetpack_newsletter_tier_id":0,"_jetpack_memberships_contains_paywalled_content":false,"_jetpack_memberships_contains_paid_content":false,"footnotes":""},"categories":[1],"tags":[656,1036,505,1035],"class_list":["post-5207","post","type-post","status-publish","format-standard","hentry","category-software","tag-freebsd","tag-systemd","tag-ubuntu","tag-zfs"],"aioseo_notices":[],"jetpack_featured_media_url":"","jetpack_shortlink":"https:\/\/wp.me\/p4xDa-1lZ","jetpack_sharing_enabled":true,"_links":{"self":[{"href":"https:\/\/neosmart.net\/blog\/wp-json\/wp\/v2\/posts\/5207","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/neosmart.net\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/neosmart.net\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/neosmart.net\/blog\/wp-json\/wp\/v2\/users\/505"}],"replies":[{"embeddable":true,"href":"https:\/\/neosmart.net\/blog\/wp-json\/wp\/v2\/comments?post=5207"}],"version-history":[{"count":11,"href":"https:\/\/neosmart.net\/blog\/wp-json\/wp\/v2\/posts\/5207\/revisions"}],"predecessor-version":[{"id":5219,"href":"https:\/\/neosmart.net\/blog\/wp-json\/wp\/v2\/posts\/5207\/revisions\/5219"}],"wp:attachment":[{"href":"https:\/\/neosmart.net\/blog\/wp-json\/wp\/v2\/media?parent=5207"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/neosmart.net\/blog\/wp-json\/wp\/v2\/categories?post=5207"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/neosma
rt.net\/blog\/wp-json\/wp\/v2\/tags?post=5207"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}