ZFS Filesystem Best Practices#
This README is a practical ZFS quick-help guide. It focuses on everyday administration, safe defaults, common pool layouts, snapshots, backups, monitoring, recovery, and examples you can adapt.
Naming convention used in every example:
- Pools are named pool0, pool1, pool2, and so on.
- Filesystem datasets are named volume0, volume1, volume2, and so on.
- Block devices created with zfs create -V are also called volumes here, but the examples explicitly call them zvols when they are block devices.
- Disk identifiers are examples only. Prefer stable paths such as /dev/disk/by-id/... instead of /dev/sdX.
Warning: Many ZFS commands can destroy data. Read every command before using it, replace the device names with the correct ones, and keep tested backups.
ZFS is a copy-on-write storage system that combines filesystem, volume manager, software RAID, checksumming, snapshots, replication, compression, quotas, and optional native encryption into one coherent design. Its main strength is data integrity: every block is checksummed, redundant pools can repair bad copies automatically, snapshots make point-in-time recovery cheap, and zfs send / zfs receive can replicate exact dataset history to another pool or host. ZFS also makes everyday administration cleaner because storage is organized into pools, vdevs, datasets, zvols, and properties instead of separate RAID, partition, LVM, and filesystem layers.
Its limits matter just as much: ZFS is not a backup, cannot save data after too many devices in the same vdev fail, cannot make an unsafe topology safe after the fact, does not protect against bad commands or destroyed snapshots, and may become less portable when newer pool features are enabled. Good ZFS systems are planned around redundancy, stable disk identifiers, free space, regular scrubs, tested backups, and datasets designed for the workload.
Table Of Contents#
- Core Ideas
- Golden Rules
- Recommended Pool Layouts
- Mirror Vdevs
- Convert A Single-Disk Pool To A Mirror
- RAIDZ1
- RAIDZ2
- RAIDZ3
- What Not To Do
- Pool Creation Checklist
- Dataset Design
- Common Dataset Properties
- Compression
- Record Size
- Quotas And Reservations
- Mountpoints
- Everyday Commands
- Snapshots
- Snapshot Holds
- Snapshot Retention
- Pool Checkpoints
- Backups With ZFS Send And Receive
- Resume Interrupted Replication
- ZFS Bookmarks For Replication
- Scrubs
- Disk Replacement
- Expanding Pools
- Add Another Mirror Vdev
- Add A RAIDZ Vdev
- Expand An Existing RAIDZ Vdev
- Import And Export
- Import Using A Specific Device Directory
- Import By Pool ID Or Temporary Name
- Encryption
- Encrypted Dataset Recovery Checklist
- Zvols
- Virtual Machines
- Databases
- Media And Archive Storage
- Shares
- Delegation
- Cache, Log, And Special Vdevs
- TRIM
- Monitoring
- Interpreting ZFS Errors
- Events And Alerting
- SMART Checks
- Performance Basics
- Security And Permissions
- Boot Pools
- Disaster Recovery
- Recovery Decision Tree
- Failed Pool Recovery Triage
- Clone Failing Disks Before Recovery Attempts
- Physical Pool Inspection With zdb
- Pool Rewind Recovery With -F
- Recover A Destroyed Pool Entry
- Handling Permanent Data Errors
- Ransomware Or Mass Deletion Recovery
- Recovery Practice Lab
- Community FAQ: Top 20 Recurring ZFS Questions
- Common Mistakes
- Example Build: General Home Or Small Server
- Example Build: Backup Pool
- Example Build: VM Pool
- Quick Reference
- Maintenance Schedule
- Final Best Practices Checklist
Core Ideas#
ZFS combines a volume manager and filesystem. Instead of creating a hardware RAID array and then putting a filesystem on top, you normally give ZFS direct access to the disks and let it manage redundancy, checksums, repair, snapshots, compression, and replication.
Important terms:
- Pool: top-level storage container, for example pool0.
- Vdev: one redundancy group inside a pool, such as one mirror or one RAIDZ2 group.
- Dataset: a ZFS filesystem, for example pool0/volume0.
- Zvol: a ZFS block device, for example pool0/volume2.
- Snapshot: read-only point-in-time copy, for example pool0/volume0@daily-2026-05-14.
- Clone: writable dataset based on a snapshot.
- Scrub: online checksum verification and repair.
- Resilver: rebuild after replacing or adding redundancy to a device.
Golden Rules#
- Use ECC RAM when possible, especially for large pools or important data.
- Use reliable disks and monitor them. ZFS protects against many failures, not
against neglect.
- Never build ZFS on top of hardware RAID. Use HBA or IT mode so ZFS can see
each disk directly.
- Use stable disk paths from /dev/disk/by-id/.
- Plan vdev width and redundancy before creating the pool. You cannot remove a RAIDZ vdev from a pool in normal designs.
- Do not fill pools. Keep at least 20% free space for performance and recovery
room.
- Use snapshots, but do not treat snapshots as backups.
- Back up to another pool, system, disk, or remote host with zfs send.
- Scrub regularly and read the results.
- Test restores, not only backups.
- Set properties at creation time when possible.
- Keep pools, operating system packages, and boot environments maintained.
- Do not casually run zpool upgrade if the pool must remain importable on older systems.
- Avoid mixing very different disk sizes or speeds inside one vdev.
- Replace failing disks early. Redundancy is not a reason to wait.
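The free-space rule above is easy to check automatically. The following is a minimal sketch, assuming the tab-separated output of zpool list -H -o name,capacity as input; the function name flag_full_pools and the 80% default limit are illustrative choices, not part of ZFS.

```shell
# Reads "name<TAB>capacity" lines on stdin, for example from:
#   zpool list -H -o name,capacity
# Prints every pool whose used capacity is above the limit (default 80).
# flag_full_pools is a hypothetical helper name.
flag_full_pools() (
    limit="${1:-80}"
    awk -v limit="$limit" '{
        cap = $2
        sub(/%$/, "", cap)          # "83%" becomes "83"
        if (cap + 0 > limit + 0) print $1 ": " cap "% used"
    }'
)
```

On a live system this could be run as zpool list -H -o name,capacity | flag_full_pools 80 from a scheduled job, with any output treated as a warning.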
Recommended Pool Layouts#
Mirror Vdevs#
Mirrors are usually the best general-purpose layout for home labs, small servers, VM storage, databases, and workloads with random I/O.
Example: create pool0 from two mirrored disks.
zpool create \
-o ashift=12 \
-O compression=zstd \
-O atime=off \
-O xattr=sa \
-O acltype=posixacl \
-m /pool0 \
pool0 \
mirror \
/dev/disk/by-id/disk0 \
/dev/disk/by-id/disk1
Example: create pool0 from two mirror vdevs.
zpool create \
-o ashift=12 \
-O compression=zstd \
-O atime=off \
-m /pool0 \
pool0 \
mirror /dev/disk/by-id/disk0 /dev/disk/by-id/disk1 \
mirror /dev/disk/by-id/disk2 /dev/disk/by-id/disk3
Why use mirrors:
- Fast rebuilds compared with wide RAIDZ.
- Good random read and write performance.
- Easy expansion by adding another mirror vdev.
- Simpler replacement and growth strategy.
Tradeoff:
- A two-way mirror gives 50% usable capacity.
Convert A Single-Disk Pool To A Mirror#
If pool0 was created from one disk, you can attach a second disk to turn the plain single-disk vdev into a mirror. This is useful when you started without redundancy and want to add it later.
Check the current device name:
zpool status pool0
Attach a second disk:
zpool attach pool0 /dev/disk/by-id/disk0 /dev/disk/by-id/disk1
Watch the resilver:
zpool status pool0
After the resilver completes, pool0 has mirror redundancy.
Example before:
pool0
/dev/disk/by-id/disk0
Example after:
pool0
mirror-0
/dev/disk/by-id/disk0
/dev/disk/by-id/disk1
If zpool status shows a short device name because the pool was imported that way, use the name shown there:
zpool attach pool0 sdb /dev/disk/by-id/disk1
Best practice:
- Back up first. The original single disk is still a single point of failure
until the resilver completes.
- Use a new disk that is at least as large as the existing disk.
- Use stable /dev/disk/by-id/ names for the new disk when possible.
- Do not confuse attach with add. attach mirrors an existing vdev; add creates another top-level vdev.
RAIDZ1#
RAIDZ1 is single-parity RAIDZ. It is usually only acceptable for small, non-critical pools with small disks and good backups.
Example:
zpool create \
-o ashift=12 \
-O compression=zstd \
-O atime=off \
-m /pool0 \
pool0 \
raidz1 \
/dev/disk/by-id/disk0 \
/dev/disk/by-id/disk1 \
/dev/disk/by-id/disk2
Best practice:
- Prefer mirrors or RAIDZ2 for important data.
- Avoid RAIDZ1 with large modern disks when the data matters.
RAIDZ2#
RAIDZ2 stores two disks' worth of parity in each vdev, so a vdev survives any two disk failures. It is a good choice for larger media, archive, and backup pools.
Example:
zpool create \
-o ashift=12 \
-O compression=zstd \
-O atime=off \
-m /pool0 \
pool0 \
raidz2 \
/dev/disk/by-id/disk0 \
/dev/disk/by-id/disk1 \
/dev/disk/by-id/disk2 \
/dev/disk/by-id/disk3 \
/dev/disk/by-id/disk4 \
/dev/disk/by-id/disk5
Best practice:
- Use RAIDZ2 instead of RAIDZ1 for valuable data on large disks.
- Keep vdevs at reasonable widths. Common widths are 6, 8, 10, or 12 disks.
- Expansion normally means adding another RAIDZ2 vdev of similar width.
RAIDZ3#
RAIDZ3 stores three disks' worth of parity in each vdev, so a vdev survives any three disk failures. It is suitable for large, slower, high-capacity archive pools where rebuild time is long and capacity matters more than write performance.
Example:
zpool create \
-o ashift=12 \
-O compression=zstd \
-O atime=off \
-m /pool0 \
pool0 \
raidz3 \
/dev/disk/by-id/disk0 \
/dev/disk/by-id/disk1 \
/dev/disk/by-id/disk2 \
/dev/disk/by-id/disk3 \
/dev/disk/by-id/disk4 \
/dev/disk/by-id/disk5 \
/dev/disk/by-id/disk6 \
/dev/disk/by-id/disk7
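The capacity tradeoff between the layouts above can be estimated with simple arithmetic: a mirror vdev holds one disk of data, and a RAIDZ vdev holds its width minus the parity count. The sketch below is a rough planning aid only; usable_tb is a hypothetical helper, and it ignores RAIDZ padding, metadata, reserved space, and the free-space headroom recommended earlier.

```shell
# usable_tb LAYOUT WIDTH DISK_TB
# Prints the approximate data capacity of ONE vdev in TB.
# usable_tb is an illustrative name; real pools report less than this.
usable_tb() (
    layout=$1
    width=$2
    disk_tb=$3
    case "$layout" in
        mirror) data=1 ;;                   # every disk holds the same data
        raidz1) data=$((width - 1)) ;;
        raidz2) data=$((width - 2)) ;;
        raidz3) data=$((width - 3)) ;;
        *) echo "unknown layout: $layout" >&2; exit 1 ;;
    esac
    awk -v d="$data" -v s="$disk_tb" 'BEGIN { printf "%.1f\n", d * s }'
)
```

For example, usable_tb raidz2 6 8 estimates a six-disk RAIDZ2 vdev of 8 TB disks, and a pool of two mirror vdevs is twice the result of usable_tb mirror 2 8.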
What Not To Do#
Avoid these layouts unless you fully understand the risk:
# No redundancy. Any disk failure can destroy the pool.
zpool create pool0 /dev/disk/by-id/disk0 /dev/disk/by-id/disk1
# Hardware RAID hides disks and errors from ZFS.
zpool create pool0 /dev/disk/by-id/hardware-raid-volume0
# Unstable device names can change after reboot.
zpool create pool0 /dev/sdb /dev/sdc
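The third mistake above can be caught before zpool create runs. This is a minimal sketch of a pre-flight check, assuming that only /dev/disk/by-id/ paths should be accepted; check_device_paths is a hypothetical helper. It only inspects the path strings, so it cannot detect a hardware RAID volume or a missing redundancy keyword.

```shell
# check_device_paths DEVICE...
# Fails if any argument is not a /dev/disk/by-id/ path.
# check_device_paths is an illustrative name, not a ZFS command.
check_device_paths() (
    status=0
    for dev in "$@"; do
        case "$dev" in
            /dev/disk/by-id/*) ;;           # stable path, accept
            *) echo "unstable device path: $dev" >&2; status=1 ;;
        esac
    done
    exit "$status"
)
```

A wrapper script could call check_device_paths on its device list and refuse to continue on failure.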
Pool Creation Checklist#
Before creating a pool:
- Confirm the disks with lsblk -o NAME,SIZE,MODEL,SERIAL,TYPE.
- Use /dev/disk/by-id/ paths.
- Decide mirror, RAIDZ2, or RAIDZ3 before writing data.
- Use ashift=12 for most modern disks and SSDs.
- Use compression=zstd on current OpenZFS unless you have a specific reason not to.
- Use atime=off for most server datasets.
- Use acltype=posixacl and xattr=sa on Linux when POSIX ACLs are needed.
- Decide whether the pool needs encryption.
- Decide mountpoints before creating many datasets.
Preview disk identifiers:
ls -l /dev/disk/by-id/
Show block devices:
lsblk -o NAME,SIZE,MODEL,SERIAL,TYPE,MOUNTPOINTS
Check existing pools:
zpool list
zpool status
Dataset Design#
Do not put everything directly in the root of the pool. Create datasets for different data types so you can apply different snapshots, quotas, compression, record sizes, mountpoints, and backup policies.
Example dataset layout:
pool0
pool0/volume0 # general files
pool0/volume1 # user files
pool0/volume2 # media or archives
pool0/volume3 # virtual machines
pool0/volume4 # databases
pool0/volume5 # backups
Create filesystem datasets:
zfs create -o mountpoint=/pool0/volume0 pool0/volume0
zfs create -o mountpoint=/pool0/volume1 pool0/volume1
zfs create -o mountpoint=/pool0/volume2 pool0/volume2
List datasets:
zfs list
Show important properties:
zfs get compression,atime,recordsize,mountpoint,quota,reservation pool0/volume0
Common Dataset Properties#
Compression#
Use Zstandard compression by default on current OpenZFS systems. zstd is a good modern default because it usually compresses better than lz4 while still offering good performance; in OpenZFS, zstd is equivalent to zstd-3.
zfs set compression=zstd pool0
zfs set compression=zstd pool0/volume0
Use lz4 when you need the lowest CPU overhead, have older systems that must import the pool, or have latency-sensitive VM/database workloads where testing shows zstd costs too much CPU.
zfs set compression=lz4 pool0/volume3
Use stronger Zstandard levels for cold or archival datasets. Higher levels can save more space, but they cost more CPU during writes.
zfs set compression=zstd-6 pool0/volume2
Check compression:
zfs get compressratio,compression pool0/volume2
Common compression choices:
| Setting | Best Use | Notes |
|---|---|---|
| zstd | New general-purpose datasets | Good default on current OpenZFS; same as zstd-3. |
| zstd-1 | Faster Zstandard | Lower CPU than default zstd, usually less compression. |
| zstd-6 | Cold data and archives | Better compression, more write CPU. |
| zstd-fast | Fast Zstandard mode | Useful when lz4 is too light but regular zstd is too costly. |
| lz4 | Old systems, weak CPUs, very low latency | Very fast and still a safe conservative fallback. |
| gzip / gzip-N | Legacy compatibility only | Usually not worth using now; zstd is generally better. |
| zle | Mostly-zero data | Compresses runs of zeros only. |
| off | Rare exceptions | Usually avoid disabling compression. |
Best practice:
- Set compression before writing data.
- Changing compression affects newly written blocks only.
- To recompress old data, rewrite it or replicate it to a new dataset.
- Already-compressed media, backups, and archives may not shrink much, but ZFS
will store blocks uncompressed when compression is not useful.
- Check OpenZFS feature compatibility before using zstd on pools that must be imported by older systems.
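The compressratio value shown earlier is a ratio such as 1.25x, which can be hard to compare across datasets. A ratio r corresponds to a space saving of 1 - 1/r. The sketch below does that conversion; saved_percent is a hypothetical helper, and it assumes the value is passed in the form zfs prints it, with a trailing x.

```shell
# saved_percent RATIO
# Converts a compressratio such as "1.25x" into the percentage of
# logical space saved. saved_percent is an illustrative name.
saved_percent() {
    # ${1%x} strips the trailing "x"; the saving is 1 - 1/ratio.
    awk -v r="${1%x}" 'BEGIN {
        if (r + 0 <= 0) exit 1
        printf "%.1f\n", (1 - 1 / r) * 100
    }'
}
```

It could be combined with zfs get -H -o value compressratio pool0/volume2 to report savings per dataset.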
Access Time#
Disable access time updates for most datasets.
zfs set atime=off pool0
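Use relatime=on only if applications need access-time behavior. It takes effect only while atime=on, so enable both on that dataset.
zfs set atime=on pool0/volume0
zfs set relatime=on pool0/volume0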
Record Size#
recordsize affects filesystem datasets. It does not affect zvols, which use volblocksize instead.
General files:
zfs set recordsize=128K pool0/volume0
Large media and archives:
zfs set recordsize=1M pool0/volume2
Databases with small random I/O:
zfs set recordsize=16K pool0/volume4
Virtual machine image files:
zfs set recordsize=64K pool0/volume3
Best practice:
- Set recordsize before writing data.
- Changing recordsize affects newly written blocks only.
- Match database record size to the database page size when possible.
Quotas And Reservations#
A quota limits maximum dataset usage.
zfs set quota=500G pool0/volume0
A reservation guarantees space to a dataset.
zfs set reservation=100G pool0/volume1
Reference quotas limit only the data the dataset itself references, excluding snapshots and child datasets.
zfs set refquota=200G pool0/volume1
Show space usage:
zfs list -o name,used,avail,refer,quota,reservation
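To see how close each dataset is to its quota, the space listing can be reduced to a percentage. This is a minimal sketch that assumes tab-separated byte values from zfs list -Hp -o name,used,quota; quota_usage is a hypothetical helper, and datasets without a quota are skipped whether the field reads 0, none, or -.

```shell
# Reads "name<TAB>used<TAB>quota" lines on stdin, for example from:
#   zfs list -Hp -o name,used,quota
# Prints "name NN%" for every dataset that has a quota set.
# quota_usage is an illustrative name.
quota_usage() {
    awk '{
        if ($3 + 0 == 0) next       # no quota set on this dataset
        printf "%s %.0f%%\n", $1, ($2 / $3) * 100
    }'
}
```

The used column counts snapshots and child datasets, which is the same space that quota limits.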
Mountpoints#
Set explicit mountpoints for clarity.
zfs set mountpoint=/pool0/volume0 pool0/volume0
zfs set mountpoint=/pool0/volume1 pool0/volume1
Temporarily unmount and mount:
zfs unmount pool0/volume0
zfs mount pool0/volume0
Mount all ZFS datasets:
zfs mount -a
Read-Only Datasets#
Make a dataset read-only:
zfs set readonly=on pool0/volume2
Make it writable again:
zfs set readonly=off pool0/volume2
Copies#
The copies property stores extra copies of blocks inside the same pool. It is not a replacement for redundancy or backups, but it can help protect very small, important datasets.
zfs set copies=2 pool0/volume1
Best practice:
- Use real vdev redundancy first.
- Use copies=2 only for selected important datasets, not huge media stores.
Everyday Commands#
Check Pool Health#
zpool status
zpool status pool0
Show only pools that have known problems:
zpool status -x
Short list:
zpool list
Show pool I/O:
zpool iostat -v pool0 5
Show dataset usage:
zfs list
Show snapshots:
zfs list -t snapshot
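For unattended checks, the output of zpool status -x can be turned into an exit code. The sketch below assumes the two healthy summary lines that OpenZFS prints ("all pools are healthy" and "pool 'NAME' is healthy") and treats anything else as a problem; pools_healthy is a hypothetical helper.

```shell
# Reads the output of "zpool status -x" (optionally for one pool) on stdin.
# Succeeds only for the healthy summary line; otherwise echoes the
# status text to stderr and fails. pools_healthy is an illustrative name.
pools_healthy() {
    status_text=$(cat)
    case "$status_text" in
        "all pools are healthy") return 0 ;;
        "pool '"*"' is healthy") return 0 ;;
        *) printf '%s\n' "$status_text" >&2; return 1 ;;
    esac
}
```

A scheduled job could run zpool status -x | pools_healthy and send an alert when it fails.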
Create A Dataset#
zfs create pool0/volume0
Create with mountpoint and properties:
zfs create \
-o mountpoint=/pool0/volume0 \
-o compression=zstd \
-o atime=off \
pool0/volume0
Rename A Dataset#
zfs rename pool0/volume0 pool0/volume1
Destroy A Dataset#
Destroying a dataset deletes its data.
zfs destroy pool0/volume0
Destroy a dataset together with its snapshots and child datasets:
zfs destroy -r pool0/volume0
Move Files Into A Dataset#
Create the dataset:
zfs create -o mountpoint=/pool0/volume0 pool0/volume0
Copy data with preserved permissions:
rsync -aHAX --info=progress2 /source/volume0/ /pool0/volume0/
After verifying the copy, switch services or users to the new path.
See What Uses Space#
zfs list -o name,used,avail,refer,mountpoint
Show snapshots and written data:
zfs list -t filesystem,snapshot -o name,used,refer,written
Show pool allocation:
zpool list -o name,size,alloc,free,capacity,fragmentation,health
Clear A Resolved Error#
Only clear errors after understanding and fixing the cause.
zpool clear pool0
Clear one device:
zpool clear pool0 /dev/disk/by-id/disk0
Snapshots#
Snapshots are cheap, read-only points in time. They protect against accidental deletion, bad updates, and ransomware that does not have permission to destroy snapshots.
Create a snapshot:
zfs snapshot pool0/volume0@manual-2026-05-14
Create recursive snapshots:
zfs snapshot -r pool0@manual-2026-05-14
List snapshots:
zfs list -t snapshot
List snapshots for one dataset:
zfs list -t snapshot -r pool0/volume0
Destroy a snapshot:
zfs destroy pool0/volume0@manual-2026-05-14
Snapshot Naming#
Use sortable names:
pool0/volume0@hourly-2026-05-14-1300
pool0/volume0@daily-2026-05-14
pool0/volume0@weekly-2026-W20
pool0/volume0@monthly-2026-05
Avoid vague names:
pool0/volume0@new
pool0/volume0@backup
pool0/volume0@test
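The sortable names above can be generated rather than typed. This is a minimal sketch, assuming UTC timestamps and the four tiers shown; snapshot_name is a hypothetical helper, and the optional third argument exists only so a fixed stamp can be supplied.

```shell
# snapshot_name DATASET TIER [STAMP]
# Prints DATASET@TIER-STAMP. Without STAMP, the current UTC time is
# formatted to match the tier. snapshot_name is an illustrative name.
snapshot_name() (
    dataset=$1
    tier=$2
    stamp=${3:-}
    if [ -z "$stamp" ]; then
        case "$tier" in
            hourly)  stamp=$(date -u +%Y-%m-%d-%H%M) ;;
            daily)   stamp=$(date -u +%Y-%m-%d) ;;
            weekly)  stamp=$(date -u +%G-W%V) ;;   # ISO year and week
            monthly) stamp=$(date -u +%Y-%m) ;;
            *) echo "unknown tier: $tier" >&2; exit 1 ;;
        esac
    fi
    printf '%s@%s-%s\n' "$dataset" "$tier" "$stamp"
)
```

A snapshot could then be taken with zfs snapshot "$(snapshot_name pool0/volume0 daily)".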
Snapshot Holds#
A hold protects a snapshot from accidental destruction. This is useful during recovery work, ransomware investigation, backup validation, or any time a snapshot must survive cleanup scripts.
Create a snapshot and hold it:
zfs snapshot pool0/volume0@recovery-2026-06-30
zfs hold keep pool0/volume0@recovery-2026-06-30
Apply a hold recursively to snapshots with the same name:
zfs snapshot -r pool0@recovery-2026-06-30
zfs hold -r keep pool0@recovery-2026-06-30
List holds:
zfs holds pool0/volume0@recovery-2026-06-30
zfs holds -r pool0@recovery-2026-06-30
Release a hold when the snapshot no longer needs protection:
zfs release keep pool0/volume0@recovery-2026-06-30
zfs release -r keep pool0@recovery-2026-06-30
Best practice:
- Use holds on snapshots that are part of an active recovery or legal hold.
- Use clear hold tags such as keep, incident-2026-06-30, or restore-test.
- Do not leave holds undocumented; they can prevent expected snapshot pruning.
Restore One File#
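Every snapshot is reachable under .zfs/snapshot in the dataset's mountpoint. The .zfs directory is hidden from directory listings by default but can still be entered by name.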
Enable snapshot directory visibility:
zfs set snapdir=visible pool0/volume0
Restore one file:
cp /pool0/volume0/.zfs/snapshot/daily-2026-05-14/example.txt /pool0/volume0/example.txt
Restore a directory:
rsync -aHAX /pool0/volume0/.zfs/snapshot/daily-2026-05-14/dir0/ /pool0/volume0/dir0/
Roll Back A Dataset#
Rollback returns the entire dataset to the snapshot state. Newer changes are lost. Without -r, you can only roll back to the most recent snapshot.
zfs rollback pool0/volume0@daily-2026-05-14
Rollback and destroy newer snapshots if required:
zfs rollback -r pool0/volume0@daily-2026-05-14
Best practice:
- Prefer restoring individual files when possible.
- Use rollback only when you want the whole dataset back in time.
Clone A Snapshot#
Use a clone to inspect or test from an old state without rolling back.
zfs clone pool0/volume0@daily-2026-05-14 pool0/volume1
Destroy the clone when finished:
zfs destroy pool0/volume1
Snapshot Retention#
Example retention policy:
- Keep hourly snapshots for 24 hours.
- Keep daily snapshots for 14 days.
- Keep weekly snapshots for 8 weeks.
- Keep monthly snapshots for 12 months.
Use an existing snapshot tool where possible, such as sanoid, zrepl, syncoid, zfs-auto-snapshot, or a platform-native scheduler.
Simple manual snapshot example:
zfs snapshot -r pool0@daily-2026-05-14
Simple manual cleanup example:
zfs destroy pool0/volume0@daily-2026-04-14
Best practice:
- Automate snapshot creation and pruning.
- Monitor snapshot space usage.
- Keep snapshots for recovery convenience, not as your only backup.
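If no snapshot tool is available, the retention policy above can be approximated by selecting everything older than the newest N snapshots of a tier. The sketch below only prints candidates and never destroys anything; prune_candidates is a hypothetical helper, and it assumes sortable names for a single dataset, as in the naming section.

```shell
# prune_candidates TIER KEEP
# Reads snapshot names on stdin and prints those of the given tier
# that fall outside the newest KEEP. Review the list before destroying.
# prune_candidates is an illustrative name.
prune_candidates() (
    tier=$1
    keep=$2
    grep -F "@${tier}-" | LC_ALL=C sort -r | awk -v keep="$keep" 'NR > keep'
)
```

Input for one dataset could come from zfs list -H -t snapshot -o name -d 1 pool0/volume0, for example piped into prune_candidates daily 14. Snapshots with holds will still refuse to be destroyed.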
Pool Checkpoints#
A pool checkpoint is a short-term, pool-wide rewind point. It captures the entire state of pool0, including datasets, snapshots, pool properties, and vdev configuration. It is useful before risky pool-wide operations or destructive maintenance, such as a large cleanup, zfs destroy, pool feature upgrade testing on the same OpenZFS implementation, or a complicated migration step.
Check whether the checkpoint feature is available on the pool:
zpool get feature@zpool_checkpoint pool0
Create a checkpoint:
zpool checkpoint pool0
Check whether a checkpoint exists and how much space it uses:
zpool status pool0
zpool list -o name,size,alloc,free,checkpoint,health pool0
Discard a checkpoint after the maintenance succeeds:
zpool checkpoint -d pool0
Wait until checkpoint discard finishes:
zpool checkpoint -d -w pool0
Rewind to a checkpoint. The pool must be exported first, then imported with the rewind flag:
zpool export pool0
zpool import --rewind-to-checkpoint pool0
Preview the checkpointed state read-only before committing to the rewind:
zpool export pool0
zpool import -o readonly=on --rewind-to-checkpoint pool0
Important limits:
- A checkpoint is not a backup. It lives inside the same pool.
- A pool can have only one active checkpoint.
- Keep checkpoints temporary. They can consume space as the live pool changes.
- Rewinding permanently loses all changes written after the checkpoint.
- Once a pool is imported with --rewind-to-checkpoint, that checkpoint is consumed and cannot be used again.
- While a checkpoint exists, some operations are blocked, including vdev
remove, attach, detach, mirror split, and reguid.
- Adding a new vdev after a checkpoint is possible, but if you rewind, that vdev
must be added again.
- Scrubs do not repair checkpointed data that has been freed in the current
live state.
- Reservations and refreservations can become misleading while a checkpoint
exists because the checkpoint may consume space they normally protect.
Best practice:
- Use snapshots for normal file and dataset recovery.
- Use checkpoints for short maintenance windows where a whole-pool rewind would
be acceptable.
- Discard the checkpoint as soon as you are sure the operation succeeded.
- Do not keep a checkpoint around for routine retention.
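The checkpoint workflow above (create, do the maintenance, discard on success) can be wrapped so the discard step is not forgotten. This is a sketch under stated assumptions: run and with_checkpoint are hypothetical helpers, the DRY_RUN variable is an invention of this example, and by default the commands are only printed rather than executed.

```shell
# Prints the command when DRY_RUN is 1 (the default); runs it otherwise.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# with_checkpoint POOL COMMAND [ARG...]
# Takes a checkpoint, runs the maintenance command, and discards the
# checkpoint only if the command succeeded.
with_checkpoint() {
    pool=$1
    shift
    run zpool checkpoint "$pool" || return 1
    if run "$@"; then
        run zpool checkpoint -d -w "$pool"
    else
        echo "maintenance failed; checkpoint on $pool kept for review" >&2
        return 1
    fi
}
```

For example, with_checkpoint pool0 zfs destroy -r pool0/volume5 prints the three commands in order. On failure the checkpoint is deliberately left in place, because the decision to rewind should be made by a person.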
Backups With ZFS Send And Receive#
ZFS replication is one of the strongest ZFS features. Use zfs send and zfs receive to copy snapshots exactly to another pool or host.
For a first full receive, let zfs receive create the destination dataset. If the destination already exists and has diverged, do not force it unless you are intentionally replacing or rolling it back.
Local Backup To Another Pool#
Create a snapshot:
zfs snapshot -r pool0/volume0@backup-2026-05-14
Send it to pool1:
zfs send -R pool0/volume0@backup-2026-05-14 | zfs receive -u pool1/volume0
The -u option receives the dataset without mounting it immediately. Check the received mountpoint before mounting because recursive sends can preserve source properties:
zfs get mountpoint pool1/volume0
zfs set mountpoint=/pool1/volume0 pool1/volume0
Incremental Backup#
Create the next snapshot:
zfs snapshot -r pool0/volume0@backup-2026-05-15
Send only the difference:
zfs send -R -I pool0/volume0@backup-2026-05-14 pool0/volume0@backup-2026-05-15 | zfs receive -u pool1/volume0
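An incremental send needs a starting snapshot that exists on both source and destination. The sketch below finds the newest snapshot name present in both lists; latest_common_snapshot is a hypothetical helper that compares names only, so it assumes sortable names and that equal names refer to the same snapshot.

```shell
# latest_common_snapshot SOURCE_LIST DEST_LIST
# Each file holds snapshot names (the part after "@"), one per line.
# Prints the newest name found in both, or nothing if there is none.
# latest_common_snapshot is an illustrative name.
latest_common_snapshot() {
    grep -Fx -f "$2" "$1" | LC_ALL=C sort | tail -n 1
}
```

The lists could be produced with zfs list -H -t snapshot -o name -d 1 pool0/volume0 | sed 's/.*@//' on each side. An empty result means there is no common base and a full send is required.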
Remote Backup Over SSH#
Create a snapshot:
zfs snapshot -r pool0/volume0@backup-2026-05-14
Send to another host:
zfs send -R pool0/volume0@backup-2026-05-14 | ssh backup0.example.net zfs receive -u pool1/volume0
Incremental remote send:
zfs send -R -I pool0/volume0@backup-2026-05-14 pool0/volume0@backup-2026-05-15 | ssh backup0.example.net zfs receive -u pool1/volume0
Receive Into A Different Dataset Name#
zfs send -R pool0/volume0@backup-2026-05-14 | zfs receive -u pool1/volume1
Raw Encrypted Sends#
For encrypted datasets, raw sends preserve encryption without exposing plaintext to the receiving system.
zfs send -w pool0/volume0@backup-2026-05-14 | zfs receive -u pool1/volume0
Incremental raw send:
zfs send -w -I pool0/volume0@backup-2026-05-14 pool0/volume0@backup-2026-05-15 | zfs receive -u pool1/volume0
Resume Interrupted Replication#
Large sends can fail because of network loss, remote reboot, disk errors, or an interrupted terminal. Use resumable receives for large backups and restores so you do not need to restart from zero.
Start a receive in resumable mode:
zfs send -R pool0/volume0@backup-2026-06-30 | ssh backup0.example.net zfs receive -s -u pool1/volume0
If the receive is interrupted, check the resume token on the receiving side:
zfs get receive_resume_token pool1/volume0
Resume from the sender with the token value:
zfs send -t TOKEN | ssh backup0.example.net zfs receive -s -u pool1/volume0
If you decide to abandon the partial receive, abort it on the receiving side:
zfs receive -A pool1/volume0
Best practice:
- Use zfs receive -s for long transfers over unreliable links.
- Save the exact source and destination snapshot names in your backup logs.
- Do not destroy the source snapshot until the receive has completed and a test restore succeeds.
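A replication script has to decide between resuming and starting over. This minimal sketch assumes the token was read with zfs get -H -o value receive_resume_token, which prints - when no partial receive exists; resume_or_restart is a hypothetical helper that only prints the sender-side command to use.

```shell
# resume_or_restart TOKEN
# TOKEN is the receive_resume_token value from the destination.
# Prints the send command for a resume, or a note that none is pending.
# resume_or_restart is an illustrative name.
resume_or_restart() {
    case "$1" in
        ""|-) echo "no resume token: start a new send" ;;
        *)    echo "zfs send -t $1" ;;
    esac
}
```

The printed zfs send -t command is then piped into zfs receive -s on the destination, as in the resume example above.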
ZFS Bookmarks For Replication#
A bookmark records the creation point of a snapshot and can be used as the source side of a later incremental send. Bookmarks are useful when you want to delete old snapshots on the source but still keep an incremental replication anchor.
Create a snapshot and bookmark:
zfs snapshot pool0/volume0@backup-2026-06-30
zfs bookmark pool0/volume0@backup-2026-06-30 pool0/volume0#backup-2026-06-30
Use the bookmark as the incremental source:
zfs snapshot pool0/volume0@backup-2026-07-01
zfs send -i pool0/volume0#backup-2026-06-30 pool0/volume0@backup-2026-07-01 | zfs receive -u pool1/volume0
List bookmarks:
zfs list -t bookmark
Destroy a bookmark when it is no longer needed:
zfs destroy pool0/volume0#backup-2026-06-30
Best practice:
- Keep the actual snapshot until you know the receiver has the matching state.
- Use bookmarks to reduce long-term source snapshot clutter, not as a
replacement for real backup snapshots on the destination.
- Document which destination snapshot a bookmark corresponds to.
Best practice for replication in general:
- Keep at least one backup outside the primary machine.
- Use recursive sends for datasets with children.
- Use raw sends for encrypted datasets when the receiver should not have the
key.
- Use resumable receives for large transfers.
- Regularly test receiving and restoring.
Scrubs#
A scrub reads data, verifies checksums, and repairs bad copies when redundancy exists.
Start a scrub:
zpool scrub pool0
Check progress:
zpool status pool0
Stop a scrub:
zpool scrub -s pool0
Recommended schedule:
- Consumer disks: scrub every 2 to 4 weeks.
- Enterprise disks: scrub monthly or according to workload.
- Backup pools that are often offline: scrub after import and before trusting a
restore.
Best practice:
- Scrub during low activity windows.
- Investigate checksum, read, or write errors.
- Do not ignore recurring errors after clearing them.
Disk Replacement#
Identify A Failing Disk#
Check status:
zpool status -v pool0
Look at disks:
lsblk -o NAME,SIZE,MODEL,SERIAL,TYPE
Check SMART data:
smartctl -a /dev/disk/by-id/disk0
Offline A Disk#
If the disk is still present and you need to replace it:
zpool offline pool0 /dev/disk/by-id/disk0
Replace A Disk#
Replace old disk with new disk:
zpool replace pool0 /dev/disk/by-id/disk0 /dev/disk/by-id/disk4
If ZFS already sees the old disk as unavailable:
zpool status pool0
# Use the unavailable device GUID shown by zpool status.
zpool replace pool0 1234567890123456789 /dev/disk/by-id/disk4
Watch resilver progress:
zpool status pool0
After a successful replacement, clear old errors if needed:
zpool clear pool0
Online A Disk#
zpool online pool0 /dev/disk/by-id/disk4
Detach A Disk From A Mirror#
Only detach from mirrors when you understand the redundancy left behind.
zpool detach pool0 /dev/disk/by-id/disk0
Expanding Pools#
Add Another Mirror Vdev#
This is a common and clean expansion method.
zpool add pool0 mirror /dev/disk/by-id/disk4 /dev/disk/by-id/disk5
Best practice:
- Add vdevs with similar redundancy and performance.
- Do not add a single disk vdev to a redundant pool.
- Use zpool attach, not zpool add, when the goal is to mirror an existing single disk.
Bad example:
# This can make the whole pool depend on one disk.
zpool add pool0 /dev/disk/by-id/disk4
Grow After Replacing All Disks#
Enable autoexpand:
zpool set autoexpand=on pool0
After every disk in a vdev has been replaced with larger disks, expand:
zpool online -e pool0 /dev/disk/by-id/disk4
Add A RAIDZ Vdev#
The traditional and widely supported way to expand a RAIDZ pool is to add another complete RAIDZ vdev.
zpool add pool0 raidz2 \
/dev/disk/by-id/disk6 \
/dev/disk/by-id/disk7 \
/dev/disk/by-id/disk8 \
/dev/disk/by-id/disk9 \
/dev/disk/by-id/disk10 \
/dev/disk/by-id/disk11
Best practice:
- Keep new RAIDZ vdevs similar to existing vdevs.
- Avoid mixing a RAIDZ2 vdev with a single disk or weak vdev.
Expand An Existing RAIDZ Vdev#
OpenZFS 2.3 and newer support RAIDZ expansion. This widens an existing RAIDZ vdev by attaching another disk to that RAIDZ vdev.
Check whether the pool advertises the feature:
zpool get feature@raidz_expansion pool0
Find the RAIDZ vdev name:
zpool status pool0
Expand a six-disk raidz2-0 vdev into a seven-disk raidz2-0 vdev:
zpool attach pool0 raidz2-0 /dev/disk/by-id/disk12
Watch progress:
zpool status pool0
Important limits:
- RAIDZ expansion needs OpenZFS support and the raidz_expansion pool feature.
- If the feature is unavailable or disabled, check your operating system and OpenZFS version before upgrading pool feature flags. New feature flags can make a pool unimportable on older systems.
- The new disk must be at least as large as the smallest disk in that RAIDZ
vdev.
- Expansion keeps the same parity level. A RAIDZ1 vdev stays RAIDZ1, RAIDZ2
stays RAIDZ2, and RAIDZ3 stays RAIDZ3.
- Existing blocks keep their old data-to-parity ratio. New blocks use the wider
layout after expansion.
- Expansion reads and rewrites allocated data in the vdev, so it can take a
long time.
- A scrub is started after expansion to verify copied blocks.
Examples of what this can and cannot do:
OK: 6-wide RAIDZ2 -> 7-wide RAIDZ2
OK: 5-wide RAIDZ1 -> 6-wide RAIDZ1
NO: 5-wide RAIDZ1 -> 6-wide RAIDZ2
NO: 6-wide RAIDZ2 -> 7-wide RAIDZ3
To change parity level, create a new pool or new vdev with the desired RAIDZ level and move data with zfs send and zfs receive. Adding a RAIDZ2 vdev to a pool that already has a RAIDZ1 vdev does not make the old RAIDZ1 vdev safer; the pool is still limited by its weakest vdev.
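The OK and NO cases above follow one rule: the parity level must stay the same and the vdev can only get wider. A minimal sketch of that rule, using the hypothetical helper can_expand, where a larger target width is reached by attaching one disk at a time:

```shell
# can_expand FROM_LEVEL FROM_WIDTH TO_LEVEL TO_WIDTH
# Prints OK when RAIDZ expansion can reach the target layout, NO otherwise.
# can_expand is an illustrative name; it checks the rule, not the pool.
can_expand() {
    if [ "$1" = "$3" ] && [ "$4" -gt "$2" ]; then
        echo OK
    else
        echo NO
    fi
}
```

For example, can_expand raidz2 6 raidz2 7 and can_expand raidz1 5 raidz2 6 reproduce the first and third rows of the table above.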
Import And Export#
Export a pool before moving disks to another system:
zpool export pool0
List importable pools:
zpool import
Import by name:
zpool import pool0
Import with an alternate root:
zpool import -R /mnt pool0
Import without mounting datasets:
zpool import -N pool0
Import Using A Specific Device Directory#
The -d option tells ZFS where to search for pool devices. This is a device search path, not a mountpoint. Use -R when you want to change where datasets mount.
Import using stable long names:
zpool import -d /dev/disk/by-id pool0
Import using shorter /dev names:
zpool import -d /dev pool0
Using /dev can make zpool status output shorter, for example sdb instead of a long /dev/disk/by-id/... name. The tradeoff is that /dev/sdX names are not stable across reboots, controller changes, or disk reordering.
Search multiple directories:
zpool import -d /dev/disk/by-id -d /dev/disk/by-path pool0
List importable pools from a specific directory without importing:
zpool import -d /dev
Import with short device names but mount everything under /mnt:
zpool import -d /dev -R /mnt pool0
Import for recovery without mounting filesystems:
zpool import -d /dev -N -o readonly=on pool0
Best practice:
- Prefer /dev/disk/by-id/ for normal operation because it survives device renumbering.
- Use -d /dev deliberately when short names are more important than stable names, such as quick lab work or temporary recovery.
- Use -R /mnt for recovery environments so datasets do not mount over the live system paths.
Import By Pool ID Or Temporary Name#
If multiple importable pools have the same name, import by the numeric pool ID shown by zpool import. This happens after disk moves, lab tests, backup disk rotation, or attaching old replacement disks.
List importable pools and IDs:
zpool import
Example output shape:
pool: pool0
id: 1234567890123456789
state: ONLINE
action: The pool can be imported using its name or numeric identifier.
Import by ID:
zpool import 1234567890123456789
Import by ID under a different name. The new name is permanent unless you add -t, which keeps it only until the pool is exported:
zpool import 1234567890123456789 pool2
Import by ID read-only under /mnt:
zpool import -N -o readonly=on -R /mnt 1234567890123456789 pool2
Best practice:
- Import by ID when names collide.
- Use a temporary name such as pool2 for inspection or recovery.
- Keep old disks from previous pools offline unless they are intentionally part of the recovery.
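Name collisions are easy to miss in long zpool import listings. The following sketch reads that listing on stdin and prints each pool name with its numeric ID, then reports names that appear more than once. The function names are hypothetical, and the parsing assumes the "pool:" and "id:" line shape shown in the example output above.

```shell
# Read `zpool import` output on stdin; print "name id" per importable pool.
list_importable_pools() {
    awk '
        $1 == "pool:" { name = $2 }        # remember the most recent pool name
        $1 == "id:"   { print name, $2 }   # pair it with the numeric ID
    '
}

# Read `zpool import` output on stdin; print names seen more than once.
find_duplicate_pool_names() {
    list_importable_pools | awk '
        { count[$1]++ }
        END { for (n in count) if (count[n] > 1) print n }
    '
}
```

A possible use is zpool import | find_duplicate_pool_names. If it prints a name, import the wanted pool by ID rather than by name.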
Import read-only for recovery:
zpool import -o readonly=on pool0
Force import only when necessary, such as after a crashed system that cannot export the pool:
zpool import -f pool0
Best practice:
- Export cleanly before moving pools.
- Use read-only import when inspecting damaged systems.
- Avoid force import unless you know why it is needed.
Encryption#
ZFS native encryption is per dataset. Enable it when creating a dataset.
Create an encrypted dataset with a passphrase:
zfs create \
-o encryption=on \
-o keyformat=passphrase \
-o mountpoint=/pool0/volume0 \
pool0/volume0
Load key:
zfs load-key pool0/volume0
Mount encrypted dataset:
zfs mount pool0/volume0
Unload key:
zfs unload-key pool0/volume0
Check encryption:
zfs get encryption,keyformat,keystatus pool0/volume0
Encrypted Dataset Recovery Checklist#
An encrypted pool can be perfectly healthy while encrypted datasets remain unrecoverable without their keys. ZFS native encryption protects data by design; there is no backdoor if the key or passphrase is lost.
Before an incident:
- Store passphrases or key files in an offline password manager, sealed print,
or other tested recovery process.
- Document which datasets are encryption roots.
- Test key loading after reboot.
- Use raw encrypted sends when the backup host should not decrypt the data.
List encryption roots and key status:
zfs get -r encryptionroot,encryption,keyformat,keylocation,keystatus pool0
Import the pool without mounting datasets:
zpool import -N pool0
Dry-run a key load to test whether the key is correct:
zfs load-key -n pool0/volume0
Load keys recursively, then mount:
zfs load-key -r pool0/volume0
zfs mount pool0/volume0
Load all available encryption roots:
zfs load-key -a
Use a temporary key location without changing the dataset property:
zfs load-key -L file:///root/recovery-key0 pool0/volume0
Change a passphrase only after the current key is loaded:
zfs change-key pool0/volume0
Raw encrypted backup:
zfs snapshot pool0/volume0@secure-2026-06-30
zfs send -w pool0/volume0@secure-2026-06-30 | zfs receive -u pool1/volume0
Best practice:
- Test that backups can be received and mounted with the expected key process.
- Do not mix raw and non-raw incremental receives for the same encrypted
replication chain.
- Keep key backups separate from the encrypted pool.
- Treat lost keys as permanent data loss for that encrypted dataset.
Best practice:
- Enable encryption at dataset creation time; it cannot be turned on for an existing dataset.
- Keep recovery keys or passphrases offline.
- Use raw sends for encrypted backup when the backup host should not decrypt
the data.
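After a reboot or a recovery import, it helps to see at a glance which datasets still need a key. The sketch below filters tab-separated "name, keystatus" lines and prints the datasets whose key is not loaded. The function name list_locked_datasets is hypothetical. It assumes keystatus reports unavailable for a locked dataset and - for an unencrypted one.

```shell
# Read "name<TAB>keystatus" lines on stdin; print datasets with no key loaded.
list_locked_datasets() {
    awk -F '\t' '$2 == "unavailable" { print $1 }'
}
```

One way to feed it is zfs get -r -H -o name,value keystatus pool0 | list_locked_datasets. Each printed dataset needs zfs load-key before it can be mounted.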
Zvols#
A zvol is a block device backed by ZFS. Use zvols for VM disks, iSCSI targets, or applications that need a block device.
Create a 100G zvol:
zfs create -V 100G pool0/volume0
Create a sparse 100G zvol:
zfs create -s -V 100G pool0/volume1
Set zvol block size at creation time:
zfs create -o volblocksize=16K -V 100G pool0/volume2
Find the device:
ls -l /dev/zvol/pool0/
Snapshot a zvol:
zfs snapshot pool0/volume0@before-update-2026-05-14
Best practice:
- Choose volblocksize at creation time; it cannot be changed afterward.
- Use smaller volblocksize values for databases or random I/O.
- Use larger volblocksize values for sequential workloads.
- Do not overuse sparse zvols unless you monitor free pool space carefully.
- Keep enough free space for snapshots and writes.
Virtual Machines#
For VM image files stored in a filesystem dataset:
zfs create -o mountpoint=/pool0/volume3 pool0/volume3
zfs set recordsize=64K pool0/volume3
zfs set compression=zstd pool0/volume3
zfs set atime=off pool0/volume3
For VM zvols:
zfs create -o volblocksize=16K -V 200G pool0/volume4
Best practice:
- Prefer mirrors for VM pools.
- Avoid very wide RAIDZ for heavy VM random writes.
- Keep snapshots short-lived for busy VM disks unless you need them.
- Monitor snapshot growth.
- Consider a separate dataset or zvol per VM.
Databases#
Databases often need more deliberate tuning than ordinary file storage.
Example for a PostgreSQL-like dataset:
zfs create -o mountpoint=/pool0/volume4 pool0/volume4
zfs set recordsize=8K pool0/volume4
zfs set compression=zstd pool0/volume4
zfs set atime=off pool0/volume4
Example for a MySQL or MariaDB InnoDB-like dataset:
zfs create -o mountpoint=/pool0/volume5 pool0/volume5
zfs set recordsize=16K pool0/volume5
zfs set compression=zstd pool0/volume5
zfs set atime=off pool0/volume5
InnoDB commonly uses 16K pages, so recordsize=16K is a practical starting point. If innodb_page_size is different, match recordsize to that value when possible.
Example for a database backup dataset:
zfs create -o mountpoint=/pool0/volume6 pool0/volume6
zfs set recordsize=1M pool0/volume6
zfs set compression=zstd pool0/volume6
Best practice:
- Match recordsize to the database page size when possible.
- Set recordsize before initializing or loading the database; a later change applies only to newly written data.
- Prefer mirrors for write-heavy databases.
- Coordinate database-consistent snapshots with the database.
- Do not assume a filesystem snapshot is application-consistent unless the
application was flushed, paused, or designed for crash consistency.
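The recordsize choices in this section can be captured in a small lookup so provisioning scripts stay consistent. This is a sketch only: the function names and workload labels are hypothetical, and the values are the starting points used in the examples above, not tuned recommendations.

```shell
# Map a workload label to the recordsize used in this guide's examples.
suggest_recordsize() {
    case "$1" in
        postgresql) echo 8K ;;    # PostgreSQL-like dataset example
        innodb)     echo 16K ;;   # default InnoDB page size
        backup)     echo 1M ;;    # large sequential backup files
        *) echo "unknown workload: $1" >&2; return 1 ;;
    esac
}

# Convert a page size in bytes (for example innodb_page_size) to a
# recordsize string such as 16K. Assumes a multiple of 1024.
page_bytes_to_recordsize() {
    echo "$(( $1 / 1024 ))K"
}
```

For example, zfs set recordsize="$(suggest_recordsize innodb)" pool0/volume5 would apply the InnoDB starting point, and page_bytes_to_recordsize 32768 covers a non-default page size.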
Media And Archive Storage#
For large files:
zfs create -o mountpoint=/pool0/volume2 pool0/volume2
zfs set recordsize=1M pool0/volume2
zfs set compression=zstd pool0/volume2
zfs set atime=off pool0/volume2
For cold archives:
zfs set compression=zstd-6 pool0/volume2
Best practice:
- Use RAIDZ2 or RAIDZ3 for large archive pools.
- Use larger record sizes for large sequential files.
- Keep a second copy on another pool or system.
Shares#
ZFS can manage NFS and SMB sharing on some platforms, but exact behavior depends on the operating system. Many administrators prefer to manage shares with the native NFS or Samba configuration and let ZFS handle mountpoints.
Example dataset for sharing:
zfs create -o mountpoint=/pool0/volume0 pool0/volume0
zfs set compression=zstd pool0/volume0
zfs set atime=off pool0/volume0
Example NFS property where supported:
zfs set sharenfs=on pool0/volume0
Disable NFS sharing:
zfs set sharenfs=off pool0/volume0
Example SMB property where supported:
zfs set sharesmb=on pool0/volume1
Disable SMB sharing:
zfs set sharesmb=off pool0/volume1
Best practice:
- Use one dataset per share when permissions, quotas, or snapshots differ.
- Keep share configuration documented.
- Test permissions from a client machine.
Delegation#
ZFS delegation allows non-root users to perform selected ZFS operations.
Allow user user0 to create snapshots on pool0/volume0:
zfs allow user0 snapshot pool0/volume0
Allow user user0 to create and destroy snapshots:
zfs allow user0 snapshot,destroy pool0/volume0
View delegated permissions:
zfs allow pool0/volume0
Remove delegated permissions:
zfs unallow user0 snapshot,destroy pool0/volume0
Best practice:
- Delegate the minimum permissions needed.
- Be careful with destroy, mount, send, and receive.
- Do not delegate pool-level administration casually.
Cache, Log, And Special Vdevs#
L2ARC#
L2ARC is a read cache on fast devices. It does not replace RAM.
Add an L2ARC cache device:
zpool add pool0 cache /dev/disk/by-id/ssd-cache0
Best practice:
- Add RAM first when possible.
- Use L2ARC only when the working set benefits from read caching.
- Do not expect L2ARC to improve write performance.
SLOG#
SLOG is a separate intent log device for synchronous writes. It is useful only for sync write workloads such as NFS, databases, or virtualization where sync writes matter.
Add a mirrored SLOG:
zpool add pool0 log mirror /dev/disk/by-id/ssd-log0 /dev/disk/by-id/ssd-log1
Best practice:
- Use power-loss-protected SSDs.
- Mirror the SLOG for important pools.
- Do not add a cheap consumer SSD as SLOG.
- SLOG does not speed up normal asynchronous writes.
Special Vdev#
A special vdev can store metadata and optionally small blocks. It can greatly improve metadata-heavy workloads, but if it fails and is not redundant, the pool can fail.
Add a mirrored special vdev:
zpool add pool0 special mirror /dev/disk/by-id/ssd-special0 /dev/disk/by-id/ssd-special1
Best practice:
- Use redundancy for special vdevs.
- Treat special vdevs as critical pool members.
- Plan before adding one. Removal is not possible when the pool contains RAIDZ top-level vdevs, so treat the decision as permanent.
TRIM#
For SSD pools, enable autotrim if appropriate for your platform and devices.
zpool set autotrim=on pool0
Manual trim:
zpool trim pool0
Best practice:
- Enable TRIM for SSD-backed pools unless your environment has a reason not to.
- Monitor device behavior after enabling autotrim.
Monitoring#
Minimum monitoring:
zpool status
zpool list
zfs list
Useful health command:
zpool status -x
zpool status -x is the quickest daily check because it suppresses normal pool details and only reports pools with known problems. Healthy output usually says all pools are healthy. If it prints pool names, read the full status for each pool:
zpool status -v pool0
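The daily check is simple to wrap for cron or a monitoring agent. The sketch below reads zpool status -x output on stdin and fails unless the report is exactly the healthy message. The function name check_pools is hypothetical, and the exact healthy string is an assumption based on the description above.

```shell
# Read `zpool status -x` output on stdin.
# Return 0 when everything is healthy; otherwise echo the report to
# stderr and return 1 so a scheduler or monitor can raise an alert.
check_pools() {
    local report
    report=$(cat)
    if [ "$report" = "all pools are healthy" ]; then
        return 0
    fi
    printf 'ZFS needs attention:\n%s\n' "$report" >&2
    return 1
}
```

A possible cron line is zpool status -x | check_pools || zpool status -v, which prints full details only when something is wrong.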
Interpreting ZFS Errors#
ZFS reports device and data health through pool state, vdev state, scrub results, and per-device error counters. The three common device error counters are READ, WRITE, and CKSUM.
Example status columns:
NAME STATE READ WRITE CKSUM
pool0 ONLINE 0 0 0
mirror-0 ONLINE 0 0 0
/dev/disk/by-id/disk0 ONLINE 0 0 0
/dev/disk/by-id/disk1 ONLINE 0 0 0
What the counters mean:
- READ: the device failed or struggled to return requested data.
- WRITE: the device failed or struggled to write data.
- CKSUM: data was read but did not match the checksum ZFS expected.
When not to worry much:
- zpool status -x says all pools are healthy.
- A scrub reports 0 errors.
- Old nonzero counters appear after a known, fixed event and do not increase.
- A pool is ONLINE and a transient cabling or power problem was fixed, then verified by a clean scrub.
- zpool status shows a scrub or resilver in progress; that is normal during maintenance, but it should finish.
When to investigate soon:
- Any READ, WRITE, or CKSUM counter is nonzero.
- Counters increase over time.
- A scrub repairs data.
- zpool status -x reports degraded or unhealthy pools.
- SMART reports pending sectors, reallocated sectors, media errors, or CRC errors.
- A device shows DEGRADED, FAULTED, UNAVAIL, REMOVED, or repeated online/offline transitions.
When to worry immediately:
- The pool state is DEGRADED, FAULTED, or UNAVAIL.
- A non-redundant pool or vdev has any device problem.
- A RAIDZ1 vdev has one failed disk.
- A mirror has only one remaining good side.
- zpool status -v lists permanent data errors or specific damaged files.
- Multiple devices in the same vdev show errors at the same time.
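The counter checks above can be automated with a small filter. This sketch reads zpool status output on stdin and prints every config row whose READ, WRITE, or CKSUM column is not 0. The function name is hypothetical, and the parsing assumes the five-column layout shown in the example status columns.

```shell
# Read `zpool status` output on stdin; print rows with nonzero counters.
list_devices_with_errors() {
    awk '
        # Config rows look like: NAME STATE READ WRITE CKSUM
        NF >= 5 && $2 ~ /^(ONLINE|DEGRADED|FAULTED|OFFLINE|UNAVAIL|REMOVED)$/ {
            if ($3 != "0" || $4 != "0" || $5 != "0")
                print $1, "read=" $3, "write=" $4, "cksum=" $5
        }
    '
}
```

Running zpool status pool0 | list_devices_with_errors on a schedule and comparing the output over time shows whether counters are stable or still increasing.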
Recommended response:
zpool status -v pool0
zpool scrub pool0
zpool status pool0
smartctl -a /dev/disk/by-id/disk0
zpool events
If errors point to one disk, cable, HBA port, or enclosure slot, fix the hardware first. After the cause is fixed and a scrub is clean, clear stale counters:
zpool clear pool0
Do not clear errors just to hide them. Clearing is useful after you understand the cause, replace or repair the bad component, and verify the pool.
Watch pool I/O:
zpool iostat -v pool0 5
Watch dataset space:
zfs list -o name,used,avail,refer,mountpoint
Show pool events:
zpool events
Show detailed pool history:
zpool history pool0
Events And Alerting#
ZFS kernel events explain what happened before a pool reached its current state. They are useful during recovery because they can show whether a problem was a checksum error, I/O error, slow device, missing vdev, import failure, or configuration change.
Show recent events:
zpool events
Show full event payloads:
zpool events -v
Follow events while replacing hardware or testing a pool:
zpool events -f
Clear old events after you have documented them:
zpool events -c
Best practice:
- Enable your platform's ZFS event daemon or alerting system, often called
zed on OpenZFS/Linux systems.
- Send alerts to email, chat, monitoring, or another place that someone
actually reads.
- Alert on pool state changes, checksum errors, I/O errors, slow I/O, failed
imports, and vdev removals.
- Keep event output with incident notes before clearing it.
Best practice:
- Configure email or alerting for pool errors.
- Monitor SMART data separately.
- Monitor free space and snapshot growth.
- Alert before the pool reaches 80% usage.
- Treat checksum errors as serious.
SMART Checks#
ZFS checks data integrity, but disk firmware still reports useful health information.
Show SMART details:
smartctl -a /dev/disk/by-id/disk0
Run a short test:
smartctl -t short /dev/disk/by-id/disk0
Run a long test:
smartctl -t long /dev/disk/by-id/disk0
Best practice:
- Schedule SMART tests.
- Track reallocated sectors, pending sectors, CRC errors, and media errors.
- Replace suspect disks before they fail completely.
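Tracking the attributes listed above is easier with a filter over the SMART attribute table. The sketch below reads smartctl -A output on stdin and prints selected attributes whose raw value is above zero. The function name is hypothetical. The attribute names and the ten-column layout are assumptions that fit many ATA disks; NVMe and SAS devices report differently and would need their own parsing.

```shell
# Read `smartctl -A` output (ATA attribute table) on stdin.
# Print NAME=RAW for watched attributes with a raw value above zero.
list_smart_warnings() {
    awk '
        $2 ~ /^(Reallocated_Sector_Ct|Current_Pending_Sector|Offline_Uncorrectable|UDMA_CRC_Error_Count)$/ && $10 + 0 > 0 {
            print $2 "=" $10
        }
    '
}
```

For example, smartctl -A /dev/disk/by-id/disk0 | list_smart_warnings prints nothing for a disk whose watched attributes are all zero.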
Performance Basics#
Keep Free Space#
Performance usually drops as a pool gets full.
Best practice:
- Keep pools below 80% used.
- Start planning expansion before 80%.
- Avoid going above 90% except temporarily.
Check capacity:
zpool list -o name,capacity,free,fragmentation,health
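The 80% and 90% thresholds above translate directly into an alert level. This is a minimal sketch with hypothetical function names; it assumes capacity arrives as a whole-number percentage, with or without a trailing % sign.

```shell
# Classify a capacity percentage using the thresholds in this section.
classify_capacity() {
    local pct="${1%\%}"              # strip a trailing % if present
    if [ "$pct" -gt 90 ]; then
        echo critical                # above 90%: avoid except temporarily
    elif [ "$pct" -ge 80 ]; then
        echo warn                    # 80% or more: plan expansion now
    else
        echo ok
    fi
}

# Read "name capacity" lines on stdin; append the level to each line.
report_capacity() {
    local name cap
    while read -r name cap; do
        printf '%s %s %s\n' "$name" "$cap" "$(classify_capacity "$cap")"
    done
}
```

One way to use it is zpool list -H -o name,capacity | report_capacity, then alert on any line that ends in warn or critical.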
Match Layout To Workload#
Good defaults:
- General file server: mirrors or RAIDZ2.
- VM storage: mirrors.
- Database storage: mirrors.
- Media archive: RAIDZ2 or RAIDZ3.
- Backup target: RAIDZ2, RAIDZ3, or mirrors depending on restore needs.
Compression#
Usually keep compression enabled:
zfs set compression=zstd pool0
Compression can increase speed by reducing disk I/O.
Sync Writes#
Check sync behavior:
zfs get sync pool0/volume0
Default is usually correct:
zfs set sync=standard pool0/volume0
Dangerous setting:
# Can lose acknowledged synchronous writes during power loss or crash.
zfs set sync=disabled pool0/volume0
Best practice:
- Keep sync=standard unless you understand the application and risk.
- Use a proper SLOG for important sync write workloads.
Deduplication#
Do not enable deduplication casually.
zfs get dedup pool0/volume0
Best practice:
- Leave dedup off for most systems.
- Dedup needs large amounts of RAM and careful planning.
- Compression is usually the better choice.
Security And Permissions#
Set ownership after creating a dataset:
chown -R user0:group0 /pool0/volume0
Set basic permissions:
chmod 750 /pool0/volume0
Use ACLs when needed:
zfs set acltype=posixacl pool0/volume0
zfs set xattr=sa pool0/volume0
Best practice:
- Keep one dataset per permission boundary.
- Use encryption for data at rest when needed.
- Keep backup permissions as strict as primary permissions.
- Limit who can destroy snapshots.
Boot Pools#
Boot environments vary by operating system. The safest general practices are:
- Keep boot pool layouts simple.
- Use mirrors for boot disks when uptime matters.
- Do not use exotic feature flags if the bootloader cannot read them.
- Keep a tested rescue USB or recovery environment.
- Snapshot boot environments before major upgrades where supported.
- Confirm the system can boot after disk replacement.
Check pool features:
zpool get all pool0 | less
Upgrade pool features only after checking compatibility:
zpool upgrade pool0
Disaster Recovery#
Recovery Decision Tree#
Use this as the first pass during an incident. The goal is to choose the safest path before running commands that change pool state.
| Situation | First Safe Check | Preferred Recovery Path | Avoid |
|---|---|---|---|
| Deleted one file | zfs list -t snapshot -r pool0/volume0 | Copy from .zfs/snapshot or restore from backup | Rolling back the whole dataset unnecessarily |
| Bad package or app update | zfs list -t snapshot -r pool0 | Restore files, clone a snapshot, or rollback one dataset | Pool-wide rewind unless every later change can be lost |
| Ransomware or mass deletion | zpool export pool0 if safe, or stop clients | Import read-only, hold snapshots, restore into new datasets | Letting clients keep writing to the pool |
| Pool is DEGRADED | zpool status -gLPv pool0 | Fix cabling/power, replace failed disk, resilver, scrub | Replacing multiple disks at once without reason |
| Pool will not import | zpool import -d /dev/disk/by-id | Try read-only no-mount import, then documented recovery flags | Random -f, -F, -X, or label operations |
| Permanent file errors | zpool status -v pool0 | Restore named files from backup or healthy send stream | Clearing errors before copying evidence |
| Interrupted backup or restore | zfs get receive_resume_token pool1/volume0 | Resume with zfs send -t TOKEN | Restarting from zero when a valid token exists |
| Missing SLOG | zpool import -d /dev/disk/by-id | Prefer finding the SLOG; use -m only if loss is acceptable | Assuming recent sync writes survived |
| Missing special or dedup vdev | zpool status -gLPv pool0 | Restore missing vdev or restore from backup | Treating it like a disposable cache device |
| Encrypted dataset unavailable | zfs get -r keystatus,keylocation pool0 | Load correct key, mount, then restore if needed | Destroying or recreating encryption roots |
| Accidental zpool destroy | zpool import -D | Import destroyed pool read-only and copy data out | Reusing or relabeling the disks first |
Incident rules:
- Stop writes first when corruption, ransomware, or failing hardware is
suspected.
- Prefer read-only and no-mount imports during investigation.
- Clone failing disks before aggressive recovery attempts.
- Copy critical data out before starting heavy repairs if more hardware looks
weak.
- Do not clear errors until you have captured status, events, and SMART data.
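The last rule is easier to follow when evidence capture is a single step. The sketch below saves status, events, history, and SMART output into an incident directory using only read-only commands that appear elsewhere in this guide. The function name, argument order, and file names are hypothetical.

```shell
# capture_incident_state POOL OUTDIR [DEVICE...]
# Save read-only diagnostics before any clear, replace, or rewind.
capture_incident_state() {
    local pool="$1" outdir="$2" dev
    shift 2
    mkdir -p "$outdir" || return 1
    # Failures are recorded in the output files instead of aborting,
    # because a damaged pool may answer only some of these commands.
    zpool status -gLPv "$pool" > "$outdir/zpool-status.txt"  2>&1
    zpool events -v            > "$outdir/zpool-events.txt"  2>&1
    zpool history "$pool"      > "$outdir/zpool-history.txt" 2>&1
    for dev in "$@"; do
        smartctl -a "$dev" > "$outdir/smart-$(basename "$dev").txt" 2>&1
    done
    return 0
}
```

A possible call is capture_incident_state pool0 /root/incident-2026-06-30 /dev/disk/by-id/disk0 /dev/disk/by-id/disk1. Write the output to storage outside the affected pool.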
Accidental File Deletion#
- Check snapshots:
zfs list -t snapshot -r pool0/volume0
- Restore from .zfs/snapshot:
cp /pool0/volume0/.zfs/snapshot/daily-2026-05-14/file0.txt /pool0/volume0/file0.txt
Bad System Update#
Snapshot before the update:
zfs snapshot -r pool0@before-update-2026-05-14
Rollback one dataset if needed:
zfs rollback pool0/volume0@before-update-2026-05-14
If the risky change affects the whole pool rather than one dataset, create a pool checkpoint before starting. Rewinding to it loses all later changes, so use it only when a whole-pool undo is acceptable:
zpool checkpoint pool0
Pool Will Not Import#
List importable pools:
zpool import
Try read-only import:
zpool import -o readonly=on pool0
Try alternate root:
zpool import -R /mnt pool0
Use force only when you understand why:
zpool import -f pool0
Failed Pool Recovery Triage#
When a pool is failed, faulted, or non-importable, the first goal is to avoid making the situation worse. Do not destroy and recreate the pool, do not clear labels, do not run filesystem repair tools such as fsck, and do not repeatedly try random import flags. If the data is important and the disks may be failing, stop and make sector-level clones of the suspect disks first, for example with a recovery tool such as ddrescue, before further import attempts.
Capture the current state before changing anything:
zpool import
zpool import -d /dev/disk/by-id
zpool import -d /dev
lsblk -o NAME,SIZE,MODEL,SERIAL,TYPE
dmesg --ctime | grep -Ei 'zfs|i/o|error|reset|ata|scsi|nvme'
If the pool imports, collect detailed status:
zpool status -gLPv pool0
zpool events -v
zpool history pool0
Useful options while investigating:
- -g shows vdev GUIDs, useful when device names are missing or unstable.
- -L resolves symlinks to the current real device path.
- -P prints full paths instead of shortened names.
- -v shows known permanent data errors and affected files when ZFS can name them.
Start with the least invasive import:
zpool import -d /dev/disk/by-id -N -o readonly=on pool0
Use an alternate root in a rescue environment so datasets do not mount over the live system:
zpool import -d /dev/disk/by-id -N -o readonly=on -R /mnt pool0
Use -f only if the pool appears active because it was not exported cleanly and you are sure no other system is using it:
zpool import -d /dev/disk/by-id -f -N -o readonly=on pool0
If a pool imports read-only, copy the most important data out before repair attempts:
rsync -aHAX --info=progress2 /mnt/pool0/volume0/ /safe-copy/volume0/
If the pool is DEGRADED but importable, prefer recovery over experimentation:
- Verify all cables, HBAs, enclosures, and power before replacing disks.
- If a missing disk reappears, try
zpool online pool0 /dev/disk/by-id/disk0.
- If a disk is truly failed and redundancy remains, replace it with
zpool replace.
- Copy critical data before starting heavy operations if more disks look weak.
- After replacement or repair, let resilver finish and then scrub.
Example replacement:
zpool status -gLPv pool0
zpool replace pool0 1234567890123456789 /dev/disk/by-id/disk4
zpool status pool0
If zpool status -v lists permanent errors in files, restore those files from backup after the pool is stable. If it lists metadata objects or does not name a file, assume affected data may not be recoverable from that pool and prioritize backup restore.
Understand which missing device class matters:
- A missing normal top-level vdev usually means the pool cannot be recovered
without that vdev or a backup.
- A missing mirror side or RAIDZ member may be survivable if enough replicas
remain.
- A missing special or dedup vdev is critical and can make the whole pool
unavailable.
- A missing cache device should not lose pool data; remove or replace it after
the pool is stable.
- A missing separate log device may require
zpool import -m, but that can
discard recent synchronous writes.
Clone Failing Disks Before Recovery Attempts#
If the pool failure may involve physically failing disks, make sector-level clones before running heavy recovery operations. Scrubs, resilvers, repeated imports, and full-file copies can put enough load on a marginal disk to finish it off.
When to clone first:
- SMART shows pending sectors, reallocated sectors, media errors, or repeated
resets.
- dmesg shows I/O errors, link resets, timeouts, or NVMe errors.
- The disk clicks, drops offline, or disappears under load.
- Multiple disks in the same vdev are suspect.
- The pool contains data that has no tested backup.
Capture device identity:
lsblk -o NAME,SIZE,MODEL,SERIAL,TYPE
smartctl -a /dev/disk/by-id/disk0
zpool import -d /dev/disk/by-id
Example ddrescue workflow:
ddrescue -f -n /dev/disk/by-id/disk0 /safe-copy/disk0.img /safe-copy/disk0.map
ddrescue -f -r3 /dev/disk/by-id/disk0 /safe-copy/disk0.img /safe-copy/disk0.map
Use the map file so the recovery can resume. Work on cloned images or cloned replacement disks when possible, and keep the original disks unchanged until the recovery is complete.
If you clone to replacement disks, attach or import using the replacement devices, not the failing originals:
zpool import -d /dev/disk/by-id -N -o readonly=on pool0
Best practice:
- Clone the weakest disks first.
- Do not run zpool clear or label operations before imaging suspect disks.
- Do not write recovered data back onto the same failing pool.
- Keep notes mapping old serial numbers to cloned images or replacement disks.
Physical Pool Inspection With zdb#
zdb is the ZFS debugger. It can read vdev labels, pool configuration, uberblocks, and some dataset or object metadata directly from devices. That makes it useful when a pool will not import cleanly, when disk names changed after moving hardware, or when you need to prove which physical devices belong to which pool. It is not fsck, it is not a routine repair command, and much of its output assumes ZFS internals.
Use zdb for inspection, not as the first recovery action. If disks may be failing, clone them first and run zdb against the clones or images. If the pool imports read-only, copy or replicate the data out before spending time on deeper metadata analysis.
Useful read-only checks:
zdb -l /dev/disk/by-id/disk0
zdb -lu /dev/disk/by-id/disk0
zdb -lll /dev/disk/by-id/disk0
What these show:
- zdb -l reads ZFS labels from one device or partition.
- zdb -lu also shows uberblocks, including transaction group history.
- zdb -lll shows every label copy, including stale or duplicate configurations.
Look for:
- The expected pool name, pool GUID, vdev GUID, and top-level vdev GUID.
- Whether all mirror or RAIDZ members agree about the same pool layout.
- Whether labels mention old device paths, old hostnames, or old pool names.
- Whether a disk has no valid ZFS labels, labels from another pool, or labels
from an old destroyed pool.
- Whether only very old uberblocks remain, which can explain why rewind would
lose recent writes.
Inspect an exported or non-imported pool using a specific device directory:
zdb -e -p /dev/disk/by-id -C pool0
zdb -e -p /dev/disk/by-id -d pool0
This is useful after booting rescue media, moving disks to another system, or cloning disks to image files and loop devices. The -e option tells zdb to operate on an exported pool instead of relying on the normal cache file. The -p option limits the search path, similar in spirit to importing with zpool import -d /dev/disk/by-id.
If a pool checkpoint exists, zdb can inspect the checkpointed state without rolling the pool back:
zdb -e -p /dev/disk/by-id -k -C pool0
This helps compare the current on-disk configuration with the checkpointed configuration before deciding whether zpool import --rewind-to-checkpoint is appropriate.
When to worry:
- zdb -l shows I/O errors while reading labels from an original disk. Stop and image the disk before continuing.
- Different members of the same mirror or RAIDZ vdev report different pool GUIDs or incompatible top-level vdev GUIDs.
- A disk that should be part of pool0 reports labels from pool1.
- Labels are present, but the expected vdev is missing enough members that redundancy cannot reconstruct the data.
- zdb -lu only shows old transaction groups and zpool import -F -n reports a large rewind.
When not to worry immediately:
- Old path names in labels are common after moving disks between systems.
- Old hostnames are common after migration or rescue booting.
- One bad label copy is not automatically fatal if other label copies are
valid.
- A label from an old pool on an unused replacement disk matters only if you
are about to reuse that disk. Clear it only after verifying backups and disk identity.
Advanced salvage options exist, but they are last-resort work. zdb -B can generate a backup stream from a numeric objset ID when normal dataset metadata is damaged but the dataset is still readable. zdb -r can copy a path or object out of a dataset in some cases. Treat these as expert recovery tools: work from cloned media, write output to a different pool such as pool1, and document every command before running it.
Avoid this pattern on original disks unless you have accepted the risk:
zdb -F pool0
zdb -FX pool0
For ordinary recovery, prefer the documented zpool import -F -n dry run first, then an explicit import decision. zdb -F and zdb -X are deep debugging and rewind tools, not everyday pool administration commands.
Pool Rewind Recovery With -F#
zpool import -F is recovery mode for a non-importable pool. It tries to make the pool importable by discarding the last few transactions. This can recover a pool after damaged recent metadata, but any discarded transactions are lost permanently.
Always dry-run first:
zpool import -d /dev/disk/by-id -F -n pool0
If the dry run says recovery is possible and the data loss is acceptable, import with recovery mode. Use -N to avoid mounting filesystems immediately:
zpool import -d /dev/disk/by-id -F -N pool0
After a successful rewind, scrub the pool and then copy or replicate important data elsewhere:
zpool scrub pool0
zpool status pool0
zfs snapshot -r pool0/volume0@recovered-2026-05-18
zfs send -R pool0/volume0@recovered-2026-05-18 | zfs receive -u pool1/volume0
Use -X only as a last resort. It enables extreme transaction search and may roll back to a transaction group that is not guaranteed to be consistent.
Dry-run the extreme option first:
zpool import -d /dev/disk/by-id -F -X -n pool0
Actual extreme recovery should be reserved for cases where the alternative is restoring from backup or accepting loss:
zpool import -d /dev/disk/by-id -F -X -N pool0
If zpool import or zpool status prints a specific recovery command, prefer that exact command over guessing. If no recovery action is offered and the pool still cannot import, plan for backup restore or professional recovery rather than trying destructive commands.
If a separate log device is missing, -m may allow import by discarding the missing log device. Recent synchronous transactions can be lost.
zpool import -d /dev/disk/by-id -m -N pool0
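The import attempts in this guide form an escalation ladder, from a read-only no-mount import up to an extreme rewind. The sketch below only prints the command for each step so it can be reviewed and written into incident notes before anyone runs it. The function name and step numbering are hypothetical; the commands are the ones shown above.

```shell
# print_import_step STEP POOL
# Print (never run) the import command for one escalation step.
print_import_step() {
    local step="$1" pool="$2"
    local base="zpool import -d /dev/disk/by-id"
    case "$step" in
        1) echo "$base -N -o readonly=on $pool" ;;      # least invasive
        2) echo "$base -f -N -o readonly=on $pool" ;;   # pool not exported cleanly
        3) echo "$base -F -n $pool" ;;                  # rewind dry run
        4) echo "$base -F -N $pool" ;;                  # rewind, discards transactions
        5) echo "$base -F -X -n $pool" ;;               # extreme rewind dry run
        6) echo "$base -F -X -N $pool" ;;               # last resort
        *) echo "unknown step: $step" >&2; return 1 ;;
    esac
}
```

Printing the whole ladder with a loop over steps 1 to 6 gives a checklist. Stop at the first step that succeeds, and copy data out before moving further down.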
Recover A Destroyed Pool Entry#
If a pool was destroyed with zpool destroy, it may still be listed with -D until labels are overwritten.
List destroyed pools:
zpool import -D
Import a destroyed pool read-only and without mounting datasets:
zpool import -D -f -N -o readonly=on pool0
If this works, copy the data to another pool immediately. Do not treat this as a normal undo feature; overwritten labels or reused disks can make recovery impossible.
Handling Permanent Data Errors#
Permanent errors mean ZFS could not reconstruct some data from available replicas. They are different from device counters that were corrected during a scrub.
Start with verbose status:
zpool status -v pool0
If files are listed, restore those files from a snapshot, backup pool, or replication target:
cp /pool0/volume0/.zfs/snapshot/daily-2026-06-30/file0.txt /pool0/volume0/file0.txt
Or restore from a backup dataset:
rsync -aHAX /pool1/volume0/file0.txt /pool0/volume0/file0.txt
If the error is in metadata, a directory, or an object that ZFS cannot map to a file, prioritize copying readable data out and restoring the dataset from a clean backup.
After restoring or accepting loss, scrub again:
zpool scrub pool0
zpool status -v pool0
Clear stale errors only after the scrub is clean or after you have documented and accepted unrecoverable damage:
zpool clear pool0
Some OpenZFS versions support corrective receive, which can repair data blocks from a suitable healthy send stream for the affected dataset. It cannot repair metadata and it does not fix the hardware cause of corruption:
zfs send pool1/volume0@clean-2026-06-30 | zfs receive -c pool0/volume0
zpool scrub pool0
Best practice:
- Do not clear permanent errors before recording
zpool status -v. - Restore named files from backup instead of rolling back whole datasets when
possible.
- If permanent errors return after restore, suspect hardware, cabling, RAM, or
controller problems.
- Treat metadata permanent errors as high risk and restore the dataset or pool
from a clean backup.
Restore From Backup Pool#
Restore into a new dataset for inspection first:
zfs send -R pool1/volume0@backup-2026-05-14 | zfs receive -u pool0/volume1
After verifying the restored data, move applications or users to the restored dataset. Replacing an existing dataset should be a deliberate maintenance task, not an automatic first step.
Ransomware Or Mass Deletion Recovery#
If clients are actively deleting or rewriting files, stop the writes first. A perfect snapshot plan can still be damaged if the attacker or broken client has permission to destroy snapshots.
Immediate actions:
- Disconnect affected clients or stop the share service.
- Disable shares for the affected datasets.
- Preserve existing snapshots with holds.
- Avoid rolling back until you know which snapshot is clean.
- Restore into a new dataset first, then cut users over.
Disable ZFS-managed shares where supported:
zfs set sharenfs=off pool0/volume0
zfs set sharesmb=off pool0/volume0
Optionally make the dataset read-only while investigating:
zfs set readonly=on pool0/volume0
Snapshot the current damaged state for investigation:
zfs snapshot -r pool0@incident-2026-06-30
zfs hold -r incident pool0@incident-2026-06-30
List candidate clean snapshots:
zfs list -t snapshot -r pool0/volume0
Clone a known-good snapshot for inspection:
zfs clone pool0/volume0@daily-2026-06-29 pool0/volume1
zfs set mountpoint=/pool0/volume1 pool0/volume1
Copy known-good data into a new recovery dataset:
zfs create -o mountpoint=/pool0/volume2 pool0/volume2
rsync -aHAX --info=progress2 /pool0/volume1/ /pool0/volume2/
After validation, repoint shares or applications to the recovered dataset. Keep the incident snapshot and holds until investigation and backup verification are finished.
Recovery Practice Lab#
Practice recovery on a throwaway system before you need it. This lab uses file-backed vdevs under /tmp. Do not run it on a production host, and choose an unused pool name.
Create a small mirror pool:
mkdir -p /tmp/zfs-lab
truncate -s 512M /tmp/zfs-lab/disk0 /tmp/zfs-lab/disk1 /tmp/zfs-lab/disk2
zpool create -o ashift=12 -O compression=zstd -m /tmp/zfs-lab/mnt pool2 mirror /tmp/zfs-lab/disk0 /tmp/zfs-lab/disk1
zfs create pool2/volume0
Create a file and snapshot:
echo important-data > /tmp/zfs-lab/mnt/volume0/file0.txt
zfs snapshot pool2/volume0@before-delete
Delete and restore one file:
rm /tmp/zfs-lab/mnt/volume0/file0.txt
cp /tmp/zfs-lab/mnt/volume0/.zfs/snapshot/before-delete/file0.txt /tmp/zfs-lab/mnt/volume0/file0.txt
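To confirm that the restored file really matches the snapshot copy, a byte-for-byte comparison is enough for the lab. This is a small sketch; verify_restore is a hypothetical helper name, not a ZFS command.

```shell
# Hypothetical helper: report whether a restored file matches its source copy.
# $1 = copy inside .zfs/snapshot, $2 = restored file.
verify_restore() {
    if cmp -s "$1" "$2"; then
        echo "match"
    else
        echo "differ"
        return 1
    fi
}

# Example for the lab paths used above:
#   verify_restore /tmp/zfs-lab/mnt/volume0/.zfs/snapshot/before-delete/file0.txt \
#                  /tmp/zfs-lab/mnt/volume0/file0.txt
```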
Simulate one failed mirror side and replace it:
zpool offline pool2 /tmp/zfs-lab/disk1
zpool status pool2
zpool replace pool2 /tmp/zfs-lab/disk1 /tmp/zfs-lab/disk2
zpool status pool2
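While the mirror is degraded, the same state can be read in script-friendly form. The sketch below assumes input in the "name, tab, health" layout that zpool list -H -o name,health prints; unhealthy_pools is a hypothetical helper name.

```shell
# Hypothetical helper: print pools whose health is anything other than ONLINE.
# Expects "name<TAB>health" lines on stdin.
unhealthy_pools() {
    awk '$2 != "ONLINE" { print $1 ": " $2 }'
}

# Example:
#   zpool list -H -o name,health | unhealthy_pools
```

An empty result means every listed pool reported ONLINE. It does not replace reading zpool status, which also shows per-device error counters.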
Practice send and receive to another dataset:
zfs snapshot pool2/volume0@backup-test
zfs send pool2/volume0@backup-test | zfs receive pool2/volume1
zfs list pool2/volume1
Clean up when finished:
zpool destroy pool2
rm -rf /tmp/zfs-lab
Practice goals:
- Restore one file from .zfs/snapshot.
- Read zpool status during a degraded mirror.
- Replace a failed device and watch resilver progress.
- Send and receive a snapshot.
- Destroy the lab pool only after confirming it is the throwaway pool.
Community FAQ: Top 20 Recurring ZFS Questions#
These questions come up repeatedly on Reddit, on Stack Overflow and Stack Exchange sites such as Server Fault, Super User, and Unix & Linux, and on ZFS forums and NAS communities. The answers here are deliberately practical and conservative.
1. What Is The Difference Between A Pool, Vdev, Dataset, And Zvol?#
pool0 is the storage pool. A vdev is a top-level redundancy group inside that pool, such as a mirror or RAIDZ2 group. pool0/volume0 is usually a filesystem dataset. A zvol is a block device created with zfs create -V, for example pool0/volume1.
zpool status pool0
zfs list
zfs list -t volume
2. Why Does One Bad Vdev Endanger The Whole Pool?#
ZFS redundancy is at the vdev level. If pool0 has two mirror vdevs, each mirror must remain healthy enough to serve data. If any top-level data vdev is lost, the whole pool can be lost. There is no extra parity layer above vdevs.
Better:
zpool create pool0 \
mirror /dev/disk/by-id/disk0 /dev/disk/by-id/disk1 \
mirror /dev/disk/by-id/disk2 /dev/disk/by-id/disk3
Dangerous:
zpool create pool0 \
mirror /dev/disk/by-id/disk0 /dev/disk/by-id/disk1 \
/dev/disk/by-id/disk2
3. Should ZFS Be Used On Top Of Hardware RAID?#
Normally no. ZFS works best when it sees individual disks, serial numbers, errors, latency, and flush behavior directly. Hardware RAID can hide failing drives, reorder writes, block SMART visibility, and make recovery harder.
Use an HBA or controller in IT/JBOD mode:
zpool create pool0 mirror /dev/disk/by-id/disk0 /dev/disk/by-id/disk1
4. Can A Single-Disk Pool Become A Mirror Later?#
Yes. Use zpool attach, not zpool add.
zpool status pool0
zpool attach pool0 /dev/disk/by-id/disk0 /dev/disk/by-id/disk1
zpool status pool0
attach adds redundancy to an existing vdev. add creates a new top-level vdev.
5. I Accidentally Used zpool add Instead Of zpool attach. What Now?#
First, stop writing data and inspect the layout.
zpool status pool0
If you added a removable top-level mirror or single-disk vdev and your OpenZFS version supports removal for that topology, zpool remove may work:
zpool remove pool0 /dev/disk/by-id/disk2
If the added vdev cannot be removed, the clean recovery is usually to back up, destroy and recreate the pool correctly, then restore.
6. Can RAIDZ1 Be Converted To RAIDZ2 Or RAIDZ3 In Place?#
No. RAIDZ expansion can widen a RAIDZ vdev on newer OpenZFS versions, but it does not change the parity level. RAIDZ1 stays RAIDZ1, RAIDZ2 stays RAIDZ2, and RAIDZ3 stays RAIDZ3.
To change parity level, create a new pool or new vdev and move the data:
zfs snapshot -r pool0@move-2026-05-14
zfs send -R pool0@move-2026-05-14 | zfs receive -u pool1/volume0
7. What Is The Safest Way To Expand A Pool?#
For mirrors, add another mirror vdev:
zpool add pool0 mirror /dev/disk/by-id/disk4 /dev/disk/by-id/disk5
For RAIDZ, add another complete RAIDZ vdev or use RAIDZ expansion only if your OpenZFS version supports it:
zpool get feature@raidz_expansion pool0
zpool attach pool0 raidz2-0 /dev/disk/by-id/disk12
Do not add a lone disk to a redundant pool.
8. Can A Vdev Be Removed?#
Sometimes. Top-level mirror and single-disk vdev removal may be supported on modern OpenZFS, but RAIDZ vdev removal is not a normal design path. Special vdevs are pool-critical, and removing them may be unsupported or impractical in many layouts.
Check before assuming:
zpool status pool0
zpool remove pool0 mirror-1
Plan pool topology as if top-level vdevs are permanent.
9. Why Did Deleting Files Not Free Space?#
Common causes are snapshots, clones, open deleted files, reservations, zvols, or refreservations. Start with snapshots.
zfs list -t snapshot -o name,used,refer
zfs list -o name,used,avail,refer,usedsnap,usedds,usedrefreserv
Destroy old snapshots only when they are no longer needed:
zfs destroy pool0/volume0@daily-2026-04-14
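Before destroying anything, it can help to rank snapshots by the space each one holds on its own. A minimal sketch, assuming tab-separated "name, used in bytes" input such as zfs list -Hp -t snapshot -o name,used produces; largest_snapshots is a hypothetical helper name.

```shell
# Hypothetical helper: keep the N rows with the largest "used" value.
# Expects "name<TAB>used-in-bytes" lines on stdin; $1 = row count (default 5).
largest_snapshots() {
    sort -k2,2nr | head -n "${1:-5}"
}

# Example:
#   zfs list -Hp -t snapshot -o name,used -r pool0/volume0 | largest_snapshots 10
```

A snapshot's used value counts only the space unique to that snapshot. Space shared by several snapshots appears in the dataset's usedsnap column instead, so the per-snapshot numbers may add up to less than the total.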
10. Why Do zpool list, zfs list, And df Show Different Space?#
They measure different layers. zpool list reports pool-level allocation. zfs list reports dataset-level space after ZFS accounting. df reports what the mounted filesystem presents to applications. Snapshots, reservations, parity, metadata, refreservations, and slop space can make the numbers differ.
Use ZFS tools first:
zpool list pool0
zfs list -o name,used,avail,refer,mountpoint
zfs get quota,reservation,refquota,refreservation pool0/volume0
11. Are Snapshots Backups?#
No. Snapshots are excellent local recovery points, but they live on the same pool. If pool0 is destroyed, stolen, overwritten, or lost, its snapshots are lost too. Replicate snapshots to another pool or host.
zfs snapshot -r pool0/volume0@daily-2026-05-14
zfs send -R pool0/volume0@daily-2026-05-14 | zfs receive -u pool1/volume0
12. How Should I Use zfs send And zfs receive?#
Use a full send first, then incremental sends. Let the first receive create the destination dataset.
zfs snapshot -r pool0/volume0@backup-2026-05-14
zfs send -R pool0/volume0@backup-2026-05-14 | zfs receive -u pool1/volume0
zfs snapshot -r pool0/volume0@backup-2026-05-15
zfs send -R -I pool0/volume0@backup-2026-05-14 pool0/volume0@backup-2026-05-15 | zfs receive -u pool1/volume0
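An incremental send needs a base snapshot that exists on both sides. The sketch below finds the newest snapshot name present in both listings. It assumes two text files of snapshot names, with the source list ordered oldest first; last_common_snapshot and the file names are hypothetical.

```shell
# Hypothetical helper: print the newest snapshot name (the part after "@")
# that exists on both sides.
# $1 = file listing destination snapshots
# $2 = file listing source snapshots, oldest first
last_common_snapshot() {
    awk -F@ '
        NR == FNR { seen[$2] = 1; next }   # first file: remember destination names
        ($2 in seen) { last = $2 }         # second file: track the newest match
        END { if (last != "") print last }
    ' "$1" "$2"
}

# Example (the lists can be produced with):
#   zfs list -H -t snapshot -o name -s creation -d 1 pool1/volume0 > dst.txt
#   zfs list -H -t snapshot -o name -s creation -d 1 pool0/volume0 > src.txt
#   last_common_snapshot dst.txt src.txt
```

If nothing is printed, the two sides share no snapshot and a new full send is needed.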
For encrypted datasets where the receiver should not decrypt data, use raw sends:
zfs send -w pool0/volume0@backup-2026-05-14 | zfs receive -u pool1/volume0
13. Should Deduplication Be Enabled?#
Usually no. Deduplication needs careful memory, metadata, and workload planning. It can make writes, deletes, and pool recovery much slower. Compression is the right default for most systems.
zfs set compression=zstd pool0/volume0
zfs get dedup pool0/volume0
Only enable dedup after testing with representative data and a recovery plan.
14. What recordsize Should I Use?#
Set recordsize per dataset before writing data. Use larger records for large sequential files and smaller records for databases or VM image files.
zfs set recordsize=128K pool0/volume0 # general files
zfs set recordsize=1M pool0/volume2 # media, archives, torrents
zfs set recordsize=16K pool0/volume4 # some databases
Changing recordsize affects newly written blocks only. Existing data must be rewritten to adopt the new size.
15. Do I Need SLOG, L2ARC, Or A Special Vdev?#
Usually not at first. Add RAM, pick a good pool layout, and measure the workload before adding support vdevs.
- SLOG helps synchronous writes when backed by a fast, power-loss-protected
device.
- L2ARC is a read cache and does not replace RAM.
- Special vdevs can speed metadata and small blocks, but they are critical to
the pool and should be redundant.
Check sync behavior before buying a SLOG:
zfs get sync pool0/volume0
zpool iostat -v pool0 5
16. How Much RAM Does ZFS Need? Is ECC Required?#
There is no universal rule such as "1 GB RAM per 1 TB storage" for ZFS without deduplication. More RAM improves ARC caching, metadata-heavy workloads, and dedup-heavy systems. ECC is strongly recommended for important data because ZFS can repair bad on-disk copies, but it cannot make unreliable memory reliable.
On Linux, ARC limits can be tuned through module options, but tune only after observing real memory pressure.
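As an illustration of the module-option approach, the sketch below converts a size in GiB to the byte value that the OpenZFS zfs_arc_max parameter expects. The 8 GiB figure is only an example, arc_max_option is a hypothetical helper name, and the target file (commonly something like /etc/modprobe.d/zfs.conf) depends on the distribution.

```shell
# Hypothetical helper: print a modprobe options line that caps ARC at N GiB.
arc_max_option() {
    gib=$1
    bytes=$((gib * 1024 * 1024 * 1024))
    printf 'options zfs zfs_arc_max=%s\n' "$bytes"
}

arc_max_option 8
# prints: options zfs zfs_arc_max=8589934592
```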
17. Why Are Writes Slow?#
Common causes include sync writes without a suitable SLOG, an over-wide RAIDZ vdev, SMR disks, a nearly full pool, small random writes, wrong recordsize or volblocksize, encryption CPU limits, weak controllers, bad cables, snapshots on busy zvols, or simply expecting mirror-like IOPS from RAIDZ.
Start with:
zpool status pool0
zpool iostat -v pool0 5
zfs get recordsize,volblocksize,sync,compression pool0/volume0
zpool list -o name,capacity,fragmentation,health pool0
18. What Is ashift, And Why Does Everyone Recommend ashift=12?#
ashift is the sector-size exponent used by a vdev. ashift=12 means 4096-byte sectors and is a safe default for most modern HDDs and SSDs, including many devices that report 512-byte logical sectors. It is set when the vdev is created and cannot be changed for that vdev later.
zpool create -o ashift=12 pool0 mirror /dev/disk/by-id/disk0 /dev/disk/by-id/disk1
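Because ashift is an exponent, the sector size is 2 raised to that value. A short worked example; ashift_to_bytes is a hypothetical helper name.

```shell
# Convert an ashift value to the sector size in bytes (2 to the power ashift).
ashift_to_bytes() {
    echo $((1 << $1))
}

for a in 9 12 13; do
    printf 'ashift=%s -> %s bytes\n' "$a" "$(ashift_to_bytes "$a")"
done
# prints:
#   ashift=9 -> 512 bytes
#   ashift=12 -> 4096 bytes
#   ashift=13 -> 8192 bytes
```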
19. How Do I Import A Pool After Moving Disks?#
List importable pools:
zpool import
Import using stable names:
zpool import -d /dev/disk/by-id pool0
Import using short /dev names and mount below /mnt:
zpool import -d /dev -R /mnt pool0
For recovery, avoid mounting datasets immediately:
zpool import -d /dev -N -o readonly=on pool0
20. Why Is A Dataset Busy And Unable To Destroy, Export, Or Unmount?#
Something is still using it. Common causes are a shell with its current working directory inside the dataset, a running service, NFS or SMB sharing, a container mount, an open deleted file, a child dataset, a clone, or a snapshot hold.
Inspect before forcing anything:
zfs list -r pool0/volume0
zfs holds -r pool0/volume0
lsof +f -- /pool0/volume0
fuser -vm /pool0/volume0
Then stop the process, unshare the dataset, remove holds, or destroy dependent clones deliberately.
Common Mistakes#
Adding A Single Disk To A Redundant Pool#
Bad:
zpool add pool0 /dev/disk/by-id/disk4
Better:
zpool add pool0 mirror /dev/disk/by-id/disk4 /dev/disk/by-id/disk5
Assuming RAID Is Backup#
Redundancy protects against some disk failures. It does not protect against:
- Accidental deletion.
- Ransomware.
- Theft.
- Fire.
- Controller bugs.
- Admin mistakes.
- Silent application-level corruption already written to disk.
Use snapshots and backups.
Filling The Pool#
Bad:
zpool list pool0
# Capacity near 95%
Better:
zpool list -o name,capacity,free pool0
Plan expansion before the pool is full.
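A capacity check is easy to script. This minimal sketch flags pools at or above a threshold, using the 80% guideline from this guide as the default. It assumes "name, tab, capacity" input such as zpool list -H -o name,capacity prints; pools_over is a hypothetical helper name.

```shell
# Hypothetical helper: print pools whose capacity is at or above a limit.
# Expects "name<TAB>capacity" lines on stdin; $1 = limit in percent (default 80).
pools_over() {
    awk -v limit="${1:-80}" '{
        cap = $2
        sub(/%$/, "", cap)            # accept both "95%" and "95"
        if (cap + 0 >= limit + 0) print $1
    }'
}

# Example:
#   zpool list -H -o name,capacity | pools_over 80
```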
Ignoring Snapshots#
Snapshots can consume space when data changes.
Check snapshot space:
zfs list -t snapshot -o name,used,refer
Destroy old snapshots deliberately:
zfs destroy pool0/volume0@daily-2026-04-14
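A simple retention pass can be sketched as a dry run: list snapshots oldest first, keep the newest N, and print the rest as candidates. The helper name snapshots_beyond_retention and the default of 7 are illustrative assumptions, not a recommended policy.

```shell
# Hypothetical helper: print every snapshot except the newest N.
# Reads names from stdin, oldest first; $1 = how many to keep (default 7).
snapshots_beyond_retention() {
    awk -v keep="${1:-7}" '
        { name[NR] = $0 }
        END { for (i = 1; i <= NR - keep; i++) print name[i] }
    '
}

# Example (prints candidates only; nothing is destroyed):
#   zfs list -H -t snapshot -o name -s creation -d 1 pool0/volume0 \
#       | snapshots_beyond_retention 14
```

Review the printed list before destroying anything. Snapshots with holds cannot be destroyed until the hold is released.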
Enabling Dedup Without Planning#
Bad:
zfs set dedup=on pool0/volume0
Better:
zfs set compression=zstd pool0/volume0
Using Unstable Disk Names#
Bad:
zpool create pool0 mirror /dev/sdb /dev/sdc
Better:
zpool create pool0 mirror /dev/disk/by-id/disk0 /dev/disk/by-id/disk1
Example Build: General Home Or Small Server#
Create a mirrored pool:
zpool create \
-o ashift=12 \
-O compression=zstd \
-O atime=off \
-O xattr=sa \
-O acltype=posixacl \
-m /pool0 \
pool0 \
mirror /dev/disk/by-id/disk0 /dev/disk/by-id/disk1
Create datasets:
zfs create -o mountpoint=/pool0/volume0 pool0/volume0
zfs create -o mountpoint=/pool0/volume1 pool0/volume1
zfs create -o mountpoint=/pool0/volume2 pool0/volume2
Set properties:
zfs set recordsize=128K pool0/volume0
zfs set recordsize=128K pool0/volume1
zfs set recordsize=1M pool0/volume2
zfs set quota=500G pool0/volume1
Create initial snapshots:
zfs snapshot -r pool0@initial-2026-05-14
Check health:
zpool status pool0
zfs list
Example Build: Backup Pool#
Create pool1 as a RAIDZ2 backup pool:
zpool create \
-o ashift=12 \
-O compression=zstd \
-O atime=off \
-m /pool1 \
pool1 \
raidz2 \
/dev/disk/by-id/disk0 \
/dev/disk/by-id/disk1 \
/dev/disk/by-id/disk2 \
/dev/disk/by-id/disk3 \
/dev/disk/by-id/disk4 \
/dev/disk/by-id/disk5
Send a recursive backup from pool0/volume0. For the first full receive, pool1/volume0 should not already exist:
zfs snapshot -r pool0/volume0@backup-2026-05-14
zfs send -R pool0/volume0@backup-2026-05-14 | zfs receive -u pool1/volume0
Export when using removable disks:
zpool export pool1
Example Build: VM Pool#
Create a mirror-based VM pool:
zpool create \
-o ashift=12 \
-O compression=zstd \
-O atime=off \
-m /pool2 \
pool2 \
mirror /dev/disk/by-id/nvme0 /dev/disk/by-id/nvme1 \
mirror /dev/disk/by-id/nvme2 /dev/disk/by-id/nvme3
Create VM dataset:
zfs create -o mountpoint=/pool2/volume0 pool2/volume0
zfs set recordsize=64K pool2/volume0
Create VM zvol:
zfs create -o volblocksize=16K -V 200G pool2/volume1
Snapshot before maintenance:
zfs snapshot pool2/volume1@before-maintenance-2026-05-14
Quick Reference#
Pool health:
zpool status -x
Detailed status:
zpool status -v pool0
Detailed recovery status with GUIDs and paths:
zpool status -gLPv pool0
Inspect ZFS labels on a physical disk:
zdb -l /dev/disk/by-id/disk0
Inspect labels and uberblocks:
zdb -lu /dev/disk/by-id/disk0
Inspect an exported pool using a specific device directory:
zdb -e -p /dev/disk/by-id -C pool0
List pools:
zpool list
List datasets:
zfs list
Create dataset:
zfs create pool0/volume0
Create snapshot:
zfs snapshot pool0/volume0@daily-2026-05-14
List snapshots:
zfs list -t snapshot
Hold recovery snapshot:
zfs hold keep pool0/volume0@daily-2026-05-14
List snapshot holds:
zfs holds pool0/volume0@daily-2026-05-14
Rollback to the most recent snapshot (an older target needs -r, which destroys every newer snapshot):
zfs rollback pool0/volume0@daily-2026-05-14
Destroy snapshot:
zfs destroy pool0/volume0@daily-2026-05-14
Start scrub:
zpool scrub pool0
Stop scrub:
zpool scrub -s pool0
Export pool:
zpool export pool0
Import pool:
zpool import pool0
Import read-only without mounting:
zpool import -N -o readonly=on pool0
Import by pool ID under a new name (the rename persists; add -t to make the name temporary):
zpool import 1234567890123456789 pool2
Create pool checkpoint:
zpool checkpoint pool0
Discard pool checkpoint:
zpool checkpoint -d pool0
Rewind to pool checkpoint:
zpool export pool0
zpool import --rewind-to-checkpoint pool0
Send snapshot:
# First full receive; pool1/volume0 should not already exist.
zfs send pool0/volume0@daily-2026-05-14 | zfs receive pool1/volume0
Incremental send:
zfs send -I pool0/volume0@daily-2026-05-14 pool0/volume0@daily-2026-05-15 | zfs receive pool1/volume0
Resume interrupted receive:
zfs get receive_resume_token pool1/volume0
zfs send -t TOKEN | zfs receive -s -u pool1/volume0
List events:
zpool events -v
Load encryption key:
zfs load-key pool0/volume0
Maintenance Schedule#
Daily:
- Check alerts.
- Confirm free space is healthy.
- Confirm backups completed.
Weekly:
- Review zpool status.
- Review zpool events -v for new hardware or data errors.
- Review snapshot growth.
- Confirm backup replication.
Monthly:
- Run or verify scrub completion.
- Check SMART data.
- Test a small restore.
- Test loading encryption keys for encrypted recovery datasets.
- Review pool capacity trend.
- Confirm no old pool checkpoint was accidentally left behind.
Quarterly:
- Test a full restore path.
- Practice the recovery lab on a throwaway host or VM.
- Verify resumable replication and backup documentation.
- Review retention policy.
- Confirm recovery media works.
- Review whether feature upgrades are needed.
Final Best Practices Checklist#
- Use direct disk access, not hardware RAID.
- Use /dev/disk/by-id/ paths.
- Use mirrors for performance-sensitive workloads.
- Use RAIDZ2 or RAIDZ3 for large archive pools.
- Use ashift=12 for modern disks unless you know otherwise.
- Enable compression=zstd by default.
- Disable atime unless needed.
- Create separate datasets for separate policies.
- Set recordsize or volblocksize before writing data.
- Keep pools below 80% used.
- Schedule scrubs.
- Monitor SMART health.
- Snapshot automatically.
- Hold critical recovery snapshots during incidents.
- Replicate backups with zfs send.
- Use resumable receives for large backup or restore streams.
- Use pool checkpoints only for short whole-pool maintenance rollback windows.
- Import read-only and without mounting when investigating damaged pools.
- Clone failing disks before aggressive recovery attempts.
- Use zdb for offline inspection of labels, GUIDs, and uberblocks, not as a
routine repair command.
- Test restores.
- Avoid dedup unless carefully designed.
- Use native encryption where appropriate.
- Keep encryption keys or passphrases recoverable offline.
- Configure ZFS event alerting.
- Be careful with zpool upgrade.
- Replace failing disks promptly.
- Document pool layout, disk serials, and recovery steps.