Patching a live Solaris 10 system with LU, ZFS, and PCA
by Mark on Mar.03, 2010, under solaris
Sun have done some work in recent times with liveupgrade – the last time I looked at it, a few years back now, it was rubbish. I thought it was about time I took another look, since a lot of the updates in OpenSolaris were looking good.
The idea was to patch a Solaris 10 update 8 (10/09) machine to the most recent patch levels, whilst the machine was still up and running, going about its daily business. Other than the standard Solaris tools, I’d be using pca (Patch Check Advanced) to do the actual patching. The system was installed with a ZFS root, since this actually gets us some great features in LiveUpgrade (LU) – namely ZFS snapshots as boot environments (BEs).
First off, create a BE that will be patched:
solaris:~# lucreate -n patching
Checking GRUB menu...
System has findroot enabled GRUB
Analyzing system configuration.
Comparing source boot environment file systems with the file system(s) you specified for the new boot environment.
Determining which file systems should be in the new boot environment.
Updating boot environment description database on all BEs.
Updating system configuration files.
Creating configuration for boot environment .
Source boot environment is .
Creating boot environment .
Cloning file systems from boot environment to create boot environment .
Creating snapshot for on .
Creating clone for on .
Setting canmount=noauto for in zone on .
WARNING: split filesystem file system type cannot inherit mount point options <-> from parent filesystem file type <-> because the two file systems have different types.
Saving existing file in top level dataset for BE as //boot/grub/menu.lst.prev.
File propagation successful
Copied GRUB menu from PBE to ABE
No entry for BE in GRUB menu
Population of boot environment successful.
Creation of boot environment successful.
If we now take a look at the ZFS filesystems we can see the ‘patching’ snapshot…
solaris:~# zfs list
NAME                          USED  AVAIL  REFER  MOUNTPOINT
rpool                        2.08G  5.73G    33K  /rpool
rpool/ROOT                   1.03G  5.73G    21K  legacy
rpool/ROOT/install           1.03G  5.73G  1.03G  /
rpool/ROOT/install@patching  59.5K      -  1.03G  -
rpool/ROOT/patching           120K  5.73G  1.03G  /
rpool/dump                    560M  5.73G   560M  -
rpool/export                   44K  5.73G    23K  /export
rpool/export/home              21K  5.73G    21K  /export/home
rpool/swap                    512M  6.14G   100M  -
Let’s see what lustatus shows us now…
solaris:~# lustatus
Boot Environment           Is       Active Active    Can    Copy
Name                       Complete Now    On Reboot Delete Status
-------------------------- -------- ------ --------- ------ ----------
install                    yes      yes    yes       no     -
patching                   yes      no     no        yes    -
So we have two boot environments.
LU has a nice feature of letting you mount a BE to do ‘work’ on it. Let’s see what’s mounted, and then mount our newly created ‘patching’ BE…
solaris:~# lumount
install on /
solaris:~# lumount patching
/.alt.patching
solaris:~# lumount
install on /
patching on /.alt.patching
So now that our alternate boot environment is mounted as /.alt.patching, we can go ahead and patch it. pca supports patching an alternative root with the -R switch, much like the Solaris packaging tools…
solaris:~# pca -i -R /.alt.patching
[snip]
------------------------------------------------------------------------------
141505 04 < 07 RS- 28 SunOS 5.10_x86: ipf patch
Looking for 141505-07 (29/84)
Trying SunSolve
Trying https://sunsolve.sun.com/ (1/1)
Done
Installing 141505-07 (29/84)
Unzipping patch
Running patchadd
Done
Reboot recommended
------------------------------------------------------------------------------
[snip]
------------------------------------------------------------------------------
Download Summary: 84 total, 84 successful, 0 skipped, 0 failed
Install Summary : 84 total, 84 successful, 0 skipped, 0 failed
This could take a while.
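If you are scripting this rather than watching it scroll past, it is worth checking the summary at the end of the pca run before going any further. Here is a rough sketch of how that could look, assuming the "Install Summary" line has the same layout as in the run above (other pca versions may print it differently). The function name pca_install_ok is made up for this example; it reads saved pca output on stdin.

```shell
# pca_install_ok: hypothetical helper, not part of pca itself.
# Reads pca output on stdin and succeeds only if an "Install Summary"
# line is present and its "failed" count is zero.
pca_install_ok() {
    awk '
        /^Install Summary/ {
            found = 1
            # The failed count is the field just before the word "failed".
            if ($(NF - 1) + 0 != 0) bad = 1
        }
        END {
            if (found && !bad) exit 0
            exit 1
        }
    '
}
```

With the output saved to a file (say pca.log, a name chosen here purely for illustration), `pca_install_ok < pca.log && echo "safe to activate"` would only print the message when nothing failed.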
When the patching is complete, unmount the BE and set it to be the active one on the next reboot…
solaris:~# luumount patching
solaris:~# lumount
install on /
solaris:~# luactivate patching
System has findroot enabled GRUB
Generating boot-sign, partition and slice information for PBE
Saving existing file in top level dataset for BE as //etc/bootsign.prev.
A Live Upgrade Sync operation will be performed on startup of boot environment.
Generating boot-sign for ABE
Saving existing file in top level dataset for BE as //etc/bootsign.prev.
Generating partition and slice information for ABE
Copied boot menu from top level dataset.
Generating multiboot menu entries for PBE.
Generating multiboot menu entries for ABE.
Disabling splashimage
Re-enabling splashimage
No more bootadm entries. Deletion of bootadm entries is complete.
GRUB menu default setting is unaffected
Done eliding bootadm entries.

**********************************************************************

The target boot environment has been activated. It will be used when you
reboot. NOTE: You MUST NOT USE the reboot, halt, or uadmin commands. You
MUST USE either the init or the shutdown command when you reboot. If you
do not use either init or shutdown, the system will not boot using the
target BE.

**********************************************************************

In case of a failure while booting to the target BE, the following process
needs to be followed to fallback to the currently working boot environment:

1. Boot from Solaris failsafe or boot in single user mode from the Solaris
Install CD or Network.

2. Mount the Parent boot environment root slice to some directory (like
/mnt). You can use the following command to mount:

     mount -Fzfs /dev/dsk/c0d0s0 /mnt

3. Run utility with out any arguments from the Parent boot
environment root slice, as shown below:

     /mnt/sbin/luactivate

4. luactivate, activates the previous working boot environment and
indicates the result.

5. Exit Single User mode and reboot the machine.

**********************************************************************

Modifying boot archive service
Propagating findroot GRUB for menu conversion.
File propagation successful
File propagation successful
File propagation successful
File propagation successful
Deleting stale GRUB loader from all BEs.
File deletion successful
File deletion successful
File deletion successful
Activation of boot environment successful.
Notice the message about what to do to recover the old session should the boot fail. Personally I keep a copy of that notice to hand, just in case. Evernote is particularly handy I find.
So if we now look at lustatus, we can see our patching BE is the one that will be active on reboot…
solaris:~# lustatus
Boot Environment           Is       Active Active    Can    Copy
Name                       Complete Now    On Reboot Delete Status
-------------------------- -------- ------ --------- ------ ----------
install                    yes      yes    no        no     -
patching                   yes      no     yes       no     -
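The same check can be made from a script instead of reading the table by eye. This is only a sketch, assuming the six-column lustatus layout shown in this post; be_active_on_reboot is a name invented for the example, and on Solaris itself you would probably want nawk rather than the old /usr/bin/awk.

```shell
# be_active_on_reboot: hypothetical helper, not an LU command.
# Reads lustatus output on stdin and prints the name of every BE whose
# "Active On Reboot" column (the fourth field) says "yes".
be_active_on_reboot() {
    awk '
        /^---/ { body = 1; next }        # rows start after the dashed rule
        body && $4 == "yes" { print $1 }
    '
}
```

Typical use would be something like `lustatus | be_active_on_reboot`, which for the table above should print patching.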
So let’s go ahead and reboot at a time that suits us. When the system comes back up we can see ‘patching’ is now the active BE…
solaris:~# lustatus
Boot Environment           Is       Active Active    Can    Copy
Name                       Complete Now    On Reboot Delete Status
-------------------------- -------- ------ --------- ------ ----------
install                    yes      no     no        yes    -
patching                   yes      yes    yes       no     -
And pca shows us there are no patches to be applied, so we’re up to date…
solaris:~# pca -l
Using /var/tmp/patchdiag.xref from Mar/02/10
Host: solaris (SunOS 5.10/Generic_142901-05/i386/i86pc)
List: missing (0/0)
zfs list shows us that the patching snapshot is now using up space too…
solaris:~# zfs list
NAME                           USED  AVAIL  REFER  MOUNTPOINT
rpool                         2.82G  5.00G  36.5K  /rpool
rpool/ROOT                    1.77G  5.00G    21K  legacy
rpool/ROOT/install            31.3M  5.00G  1.05G  /
rpool/ROOT/patching           1.74G  5.00G  1.37G  /
rpool/ROOT/patching@patching   377M      -  1.03G  -
rpool/dump                     560M  5.00G   560M  -
rpool/export                    44K  5.00G    23K  /export
rpool/export/home               21K  5.00G    21K  /export/home
rpool/swap                     512M  5.40G   100M  -
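To keep an eye on just the snapshots as BEs pile up, the listing can be filtered down to the dataset names containing an @. A small sketch, assuming the zfs list column order shown above (NAME first, USED second); snapshot_usage is a made-up name for this example.

```shell
# snapshot_usage: hypothetical helper.
# Reads "zfs list" output on stdin and prints "name used" for snapshots
# only, i.e. datasets whose name contains an @.
snapshot_usage() {
    awk 'NR > 1 && $1 ~ /@/ { print $1, $2 }'
}
```

Fed the listing above, for example via `zfs list | snapshot_usage`, it should print the single line rpool/ROOT/patching@patching 377M.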
Using a recent Solaris 10 with ZFS root, LU and pca gives us a very realistic way of patching systems in a working production environment, without the pain of downtime and with a workable rollback strategy.
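Pulled together, the whole cycle from this post fits in a few lines of shell. What follows is a sketch rather than a tested production script: BE_NAME, ALT_ROOT, DRY_RUN, run and patch_cycle are all names invented here, and treating a non-zero exit from pca as "do not activate" is my assumption, not something verified against pca's documentation. By default it only prints the commands it would run.

```shell
# Hypothetical wrapper around the steps in this post.
BE_NAME=${BE_NAME:-patching}
ALT_ROOT=${ALT_ROOT:-/.alt.$BE_NAME}
DRY_RUN=${DRY_RUN:-1}   # 1 = only print the commands, 0 = really run them

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

patch_cycle() {
    run lucreate -n "$BE_NAME" || return 1
    run lumount "$BE_NAME" "$ALT_ROOT" || return 1
    # Assumption: a non-zero exit from pca means the BE should not be activated.
    run pca -i -R "$ALT_ROOT"
    pca_rc=$?
    # Always unmount the BE, even if patching failed.
    run luumount "$BE_NAME" || return 1
    [ "$pca_rc" -eq 0 ] || return "$pca_rc"
    run luactivate "$BE_NAME"
}
```

Calling patch_cycle with DRY_RUN left at 1 just lists the five commands in order. The reboot is deliberately left out: as the luactivate notice says, it has to be done with init or shutdown (for example init 6), at a time that suits you.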
Now, if I had time to write something to centralise this for many hosts, it would make a fantastic enterprise setup :)
8 comments for this entry:
June 24th, 2010 on 7:54 pm
This is fantastic! My only question is, what happens next? You’re now booting into the snapshot, “patching,” forever? How about the next time you need to patch?
-James
June 24th, 2010 on 8:11 pm
Hi James,
You can delete the original BE and even rename this one if you choose. See man ludelete and lurename.
Cheers! –Mark
June 24th, 2010 on 8:20 pm
Thanks Mark!
I think the best use of this, for me, would be to have my first step be something like this:
lucreate -n patched20100623
Then if I patch next month, have it be:
lucreate -n patched20100723
etc. that way I have specific places to roll back to. Does this make any sense? We have plenty of disk space, so i’m not worried about having zfs snapshots pile up. I was just a little confused about permanently booting off a snapshot. ZFS is a little new to us, and I thought I had it all figured out, but this struck me a little funny.
Bottom line, though, is that it’s perfectly fine to leave it booting to that “patching” snapshot? And the next time I do the same thing it will create a new patching snapshot from the one I just patched?
-James
June 24th, 2010 on 8:27 pm
Yep, perfectly sensible way of working James. The beauty of snapshots is they don’t take up much space either.
June 24th, 2010 on 8:31 pm
Thanks for the snappy replies, Mark! I’ll be reading through your other blog entries as you seem to be doing a lot of the same tasks as me.
thanks again for your help!
-james
June 24th, 2010 on 8:35 pm
my only problem now is that i’m getting “Error 403” on 297 of the patches that I need. I have two contracts listed in the sunsolve page that I added this afternoon. One of them, I think, is expired. I added that one first. Now it’s telling me that I have access to:
Solaris9SoftwareUpdates
Solaris10SoftwareUpdates
ContractRequired
OpenSolarisProductionPackage
Solaris8SoftwareUpdates
HardwareUpdates
SolarisSoftwareUpdates
Public
but i’m still getting 403. (I got 7, i think, successful the first time I ran it. These 297 were the ones that 403’ed the first time) Have any idea if it takes some time for Sun to update everything? The machine I’m working on does NOT have a contract, by the way. It’s our test machine (prepping for the patching on the contracted machine tomorrow) Perhaps sunsolve knows the machine’s serial? I know this isn’t your area, but figured i’d ask in case you had any experience with it.
-James
June 24th, 2010 on 8:40 pm
Unfortunately I can’t help you there James, sorry. I’m always working on machines with an enterprise subscription to SunSolve (lucky me).