Remote Management of ZFS servers with Puppet and RAD

A few months ago I had the chance to test an Oracle ZFS Storage Appliance (ZFS SA), and the appliance made a very good impression in many areas. It especially reminded me that ZFS shines even more when you use it as a NAS (Network Attached Storage), as a central file server that shares its storage capacity, for example via NFS.

But I did not really like how the storage configuration is distributed. For example, a database server needs the correct ZFS properties set on the ZFS Storage Appliance via the web interface or the custom CLI, plus the corresponding NFS mount options in /etc/vfstab on the database server itself. Maybe this sounds like no big issue to you, for example if you are also the admin responsible for the storage appliance, or if you have a perfect collaboration with the storage team. But especially if you want to automate the storage configuration, this split adds significant complexity.
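
To make the split concrete, here is roughly what ends up in two different places. This is only a sketch with made-up host, pool and property values, and it uses plain zfs commands rather than the appliance's own CLI syntax:

    # On the ZFS server / appliance side: share the filesystem and set the
    # properties the database needs (illustrative values)
    zfs set sharenfs=on pool1/oradata
    zfs set recordsize=8k pool1/oradata

    # On the database server: a matching /etc/vfstab entry (illustrative options)
    # nas:/pool1/oradata  -  /u02/oradata  nfs  -  yes  rw,bg,hard,rsize=1048576,wsize=1048576,vers=3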

Of course I wanted to manage this configuration with Puppet, just like a local ZFS filesystem. I don't yet have a ZFS SA at work to deal with, but the availability of the new RAD REST interface in Solaris 11.3 motivated me to experiment with my own Puppet resource type to manage the remote ZFS filesystems directly from the client server.

Read More

Puppet with Solaris RAD

Solaris 11.3 beta ships with a REST API for the Solaris Remote Administration Daemon (RAD), which finally makes RAD easy to use from Ruby and Puppet. The following is a small experiment to test its capabilities.

Puppet provides a nice abstraction layer over the configuration of a system. A Puppet manifest is usually easier and faster to understand than documentation full of CLI commands. But there is no magic involved: Puppet usually executes the same CLI commands in the background. With the debug option you can observe these command executions.
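
For example, a minimal sketch using the stock zfs resource type that Puppet ships for Solaris (the dataset name is made up), to watch which zfs commands Puppet runs under the hood:

    # Apply a tiny one-resource manifest with debug output; the log shows the
    # zfs(1M) commands Puppet executes in the background
    puppet apply --debug -e 'zfs { "rpool/export/demo": ensure => present }'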

Read More

Zfsdir: Simple ZFS management with Puppet

ZFS is a great filesystem with many, many features. For all that, it is still easy to manage, in my opinion easier than other filesystems. Managing storage is usually a high-risk task, which makes automation harder. Would you resize a critical filesystem with an automated method? If it is an ext3 filesystem on LVM and software RAID, maybe not. If it is on ZFS, a low-risk change of the quota can be enough, e.g. zfs set quota=800g rpool/criticalfilesystem. That's easy to automate. Nowadays automation even becomes necessary, because the number of ZFS filesystems keeps growing. And the more features you want to use, the more ZFS properties you have to set.
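
As a minimal sketch of that kind of automation (using Puppet's stock zfs type rather than the zfsdir type this post is about, with illustrative names and values), enforcing such a quota looks like this:

    # Keep the quota of a critical filesystem under Puppet control; re-running
    # this only issues a zfs command when the value has drifted
    puppet apply -e 'zfs { "rpool/criticalfilesystem": ensure => present, quota => "800g" }'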

Read More

Control the size of the ZFS ARC cache dynamically

Last updated on: 25th Dec 2014

Solaris 11.2 deprecates the zfs_arc_max kernel parameter in favor of user_reserve_hint_pct and that’s cool.

tl;dr
ZFS has a very smart cache, the so-called ARC (Adaptive Replacement Cache). In general the ARC consumes as much memory as is available, but it also takes care to free up memory when other applications need more.

In theory this works very well: ZFS simply uses available memory to speed up slow disk I/O. But there are side effects once the ARC has consumed almost all unused memory. Applications that request more memory have to wait until the ARC frees some up. For example, if you restart a big database, the startup may be significantly delayed, because in the meantime the ARC may already have grabbed the memory freed by the database shutdown. Additionally, such a database typically requests large memory pages; if the ARC holds on to scattered free segments, the memory easily becomes fragmented.
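
To watch the ARC and try the new parameter, something like the following should work. The mdb write format in the last line is an assumption based on how the runtime change is commonly documented; Oracle's set_user_reserve.sh script is the supported way and adjusts the value in small steps:

    # Watch what the ARC currently consumes
    kstat -p zfs:0:arcstats:size

    # Reserve e.g. 30% of physical memory for applications at runtime
    # (assumed mdb syntax; prefer Oracle's set_user_reserve.sh)
    echo "user_reserve_hint_pct/W0t30" | mdb -kw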

Read More

zpool-zones-vio.d - ZFS statistics per zpool and per zone

Another script that provides a missing feature for Oracle Solaris 10/11: ZFS statistics per zpool and per zone.

ZFS in Oracle Solaris 10 and 11 is still not completely ready for a single-pool setup; for some applications it makes sense to use multiple pools. In general, if one workload stresses the single zpool too much, the performance of the other workloads can degrade. Therefore it is still best practice to use multiple pools; for example, the "Configuring ZFS for an Oracle Database" white paper suggests using at least a separate pool for the redo logs.
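
The full script also digs out the pool name; as a much simpler sketch of the idea, this one-liner only breaks ZFS read/write activity down per zone (assuming the usual fbt entry points for zfs_read and zfs_write):

    # Count ZFS reads and writes per zone until Ctrl-C
    dtrace -n 'fbt::zfs_read:entry, fbt::zfs_write:entry { @[zonename, probefunc] = count(); }'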

Read More

oracle-pwrite-zfs-latency.d: Detect slow sync writes on ZFS

The first DTrace integration in the Oracle 12c database has finally been reverse engineered in great detail by Andrey Nikolaev. At last we have good documentation of how the V$KERNEL_IO_OUTLIER view really works.

The standalone kernel_io_outlier.d DTrace script works great even on Solaris 10 and with pre-12c Oracle databases, but only on raw devices and ASM. Oracle pushes ASM with good arguments, but sometimes one would rather use a filesystem.

We have some databases on ZFS, so I was a bit jealous that the script does not work with ZFS. Therefore I tried to write a similar script for ZFS.
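
As a rough starting point, much simpler than the final script (which also narrows things down to the Oracle processes and the ZFS write path), you can quantize pwrite(2) latency per process:

    # Distribution of pwrite latency in microseconds, per process name
    dtrace -n '
    syscall::pwrite:entry { self->ts = timestamp; }
    syscall::pwrite:return /self->ts/ {
        @[execname] = quantize((timestamp - self->ts) / 1000);
        self->ts = 0;
    }'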

Read More