Tuesday, August 9, 2011

Map the guest ldom name to the operating system hostname

Keeping IT infrastructure documentation up to date by hand is neither fun nor exciting, but it is still necessary. More automation in this area would be nice. After looking for an "enterprise solution" for some months, I concluded that no off-the-shelf product fulfills all our needs.

Therefore I started to develop a basic discovery script, which should discover all important information about our Solaris and Linux servers and feed it into our CMDB. Most of it is straightforward and at best an opportunity to improve my regex skills, but there are also some challenges which could be interesting for others.

One example is finding out the hostname that is configured inside a guest ldom (now called an Oracle VM Server for SPARC guest VM). The real hostname is much more interesting for us than the name of the virtual hardware.

Solaris 10 update 9 introduced the virtinfo utility, which is sometimes able to display the name of the control domain from inside the guest domain. You could use this information to create a mapping.

For our use case this method does not really help. The only solution I could find was to connect directly with telnet to the console of each guest ldom. As you can see in the following listing, the hostname "virtualserver1" is printed in the login prompt.

telnet localhost 5000
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.

Connecting to console "ldom1" in group "ldom1" ....
Press ~? for control options ..

virtualserver1 console login:

Telnet is not the friendliest or most stable interface to script, but with Python it's quickly done.

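# Written for Python 2 (print statements, telnetlib).
import re
import subprocess
import telnetlib

def exec_cl(command):
    # Assumed helper, not shown in this excerpt: run a shell command and return its output.
    return subprocess.Popen(command, shell=True, stdout=subprocess.PIPE).communicate()[0]

# Console ports of all active domains, taken from the parseable ldm output.
ldm_list_raw = exec_cl("ldm list -p | grep state=active")
consoles = re.findall(r'cons=([0-9]+)', ldm_list_raw)

print "Console: \tHostname of guest ldom"
print "-------- \t----------------------"
for console in consoles:
    tn = telnetlib.Telnet("127.0.0.1", int(console))

    # Send a few newlines to provoke a prompt, then wait up to 2 seconds for
    # either a login prompt (hostname is captured) or the OBP "ok" prompt.
    tn.write("\n\n\n")
    mo = tn.expect([r' (\S*) console login', r'} (ok)'], 2)
    tn.close()
    if mo[1]:
        ldom_hostname = mo[1].group(1)
        if ldom_hostname == 'ok':
            print "%s: \t\tOBP" % console
        else:
            print "%s: \t\t%s" % (console, ldom_hostname)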

This basic script generates the following simple output, if executed from the control domain:

python list-ldom-hostnames.py 

Console:        Hostname of guest ldom
--------        ----------------------
5000:           virtualserver1
5002:           OBP
5003:           virtualserver3
5004:           Virtualserver4

This also shows a limitation of the method: if the operating system is not running, you won't get the hostname. For example, the guest ldom on console 5002 is turned on but not booted; it is waiting at the OBP prompt.
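The two patterns passed to tn.expect can be tried offline, without a console to connect to. The following is only a minimal sketch in Python 3, assuming the console output has already been captured as text. The helper name classify_console_output and the sample strings are made up for illustration and are not part of the original script.

```python
import re

# The same two patterns the script hands to tn.expect():
# a login prompt preceded by the hostname, or the OBP "ok" prompt.
LOGIN_PROMPT = re.compile(r' (\S*) console login')
OBP_PROMPT = re.compile(r'} (ok)')


def classify_console_output(text):
    """Return the hostname, "OBP", or None if neither prompt was seen."""
    match = LOGIN_PROMPT.search(text)
    if match:
        return match.group(1)
    if OBP_PROMPT.search(text):
        return "OBP"
    return None


if __name__ == "__main__":
    # Made-up console snippets, only to exercise the three cases.
    for sample in (" virtualserver1 console login: ", "{0} ok ", "no prompt"):
        print(repr(sample), "->", classify_console_output(sample))
```

The None case corresponds to the situation where the script prints nothing for a console, for example when the two-second timeout expires without any prompt.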

Full proof-of-concept script: list-ldom-hostnames.py
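To get from the console port back to the ldom name, as the title promises, the parseable output of "ldm list -p" can be split into fields instead of only extracting cons=. This is a rough sketch in Python 3. It assumes that each domain is reported on one line starting with DOMAIN, followed by pipe-separated key=value pairs. The sample lines and the function name parse_ldm_list are invented for illustration, so check them against the output of your own ldm version.

```python
def parse_ldm_list(raw):
    """Map console port -> ldom name for active domains with a numeric console."""
    mapping = {}
    for line in raw.splitlines():
        if not line.startswith("DOMAIN|"):
            continue  # skip the version header and anything else
        fields = dict(
            item.split("=", 1) for item in line.split("|")[1:] if "=" in item
        )
        if fields.get("state") != "active":
            continue
        console = fields.get("cons", "")
        if console.isdigit():  # skips non-numeric consoles such as cons=SP below
            mapping[console] = fields.get("name")
    return mapping


# Made-up sample in the assumed format, not captured from a real system.
SAMPLE = """VERSION 1.5
DOMAIN|name=primary|state=active|flags=-n-cv-|cons=SP|ncpu=8|mem=4294967296
DOMAIN|name=ldom1|state=active|flags=-n----|cons=5000|ncpu=8|mem=8589934592
DOMAIN|name=ldom2|state=bound|flags=------|cons=5001|ncpu=8|mem=8589934592
"""

if __name__ == "__main__":
    for console, name in sorted(parse_ldm_list(SAMPLE).items()):
        print("%s\t%s" % (console, name))
```

Joining this mapping with the hostnames found on each console gives the ldom name next to the operating system hostname.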

Tuesday, September 7, 2010

Update on OpsCenter proxies in local zones

Some time ago (OK, a long time ago) I wrote about installing an OpsCenter proxy in a local zone on Solaris. I noted that it is a bad idea and ugly. Upgrading JET in particular is a no-go if you still want support for your application. My hope at the time was that version 2.5 would support local zones.

And indeed, support was added, but only for the satellite controller. So I looked for a way to improve the setup that runs the proxy in a zone. The technical limitation that prevents a standard installation of the proxy in a zone is the NFS server used by JET: it's simply not possible to run the Solaris NFS server in a local zone.

To overcome this limitation I tried unfs3, an open-source userspace NFS server. The installation was less tricky than the solution with the remote NFS server and JET 4.7, but there are some pitfalls. At runtime the proxy works really well; at the moment we use such a proxy setup with around 80 connected agents.

But there is one big disadvantage: if you upgrade OpsCenter, the upgrade scripts have problems with this setup. It seems that the Sun/Oracle engineers don't want to support this homegrown setup. I am tired of fixing and adapting these upgrade scripts, so we are slowly switching all proxies to LDOMs now. After a few months it seems that the proxy runs really well in its own LDOM, and this setup is also officially supported.

Lessons learned:

  • Don't run an OpsCenter proxy in a local zone, as long as it is not officially supported.
  • JET/Jumpstart: after troubleshooting some of its scripts, I'd like to say ... please re-implement this ancient software, please!!
  • unfs3 is pretty neat if you feel the need to run an NFS server in a Solaris local zone.
  • Guest LDOMs are fine to run OpsCenter proxies.

Saturday, October 10, 2009

Publishing my bachelor thesis on Solaris network virtualization

After finishing my bachelor thesis, my plan was to publish its interesting parts as separate blog entries. But so far I couldn't find the time to do this, so I'm just publishing the complete document now.

I hope it helps someone to get a quick overview of the Solaris/OpenSolaris network stack and its virtualization capabilities, especially Crossbow.

I'm open to and glad about any feedback.

So far the document is only available in German:
Solaris-Netzwerkvirtualisierung.pdf (pdf)

Abstract

The topic of this work is network virtualization on the Solaris OS. It starts with a brief look at the Solaris 10 network stack, with its features and capabilities. Its historical development over the years is also covered. The current generation of the stack, named "Crossbow", is the main topic of this work. Crossbow extends the stack with virtualization features while taking performance and simple management into account. The architecture and the technical implementation of Crossbow are also covered. To demonstrate the capabilities of the stack, several benchmarks are performed. The benchmarks look especially at throughput and CPU usage.

Crossbow was only released in June 2009 along with OpenSolaris 2009.06, so this is one of the first academic works covering this topic.

Wednesday, June 17, 2009

How to install an xVM Ops Center proxy in a local zone

xVM Ops Center 2.1 has a well-designed and flexible architecture with its satellite controller, proxies and agents. But the zone support is quite poor: only the satellite controller is supported in a local zone, while the proxy needs to be installed in a global zone.

If Solaris containers are your preferred virtualization technology, as in my company, you may not like having to deploy a proxy in the global zone alongside many local zones. The only alternative is then to put the proxy on dedicated server hardware. If you consider HA, multiple datacenters, costs and the very low resource utilization of the proxy services themselves, you will not love this solution either.

AFAIK, the proxy of 2.1 can't run in a local zone, because the shipped JET needs a local NFS-server and the Solaris NFS-Server can't run in a local zone. But luckily, the latest version of JET (4.7) now supports remote NFS-servers.

This Howto outlines how to install the xVM Ops Center proxy in a local zone, how to upgrade to JET 4.7 and how to use a remote NFS-server. This setup has worked quite well in our environment so far.

Please consider that this Howto is only a proof of concept; Ops Center is not intended to have its scripts edited by hand. Of course you will get into trouble if you open support requests at Sun for your modified Ops Center setup. If you want support for a zoned proxy, open a feature request at Sun. If enough people request this feature, it will get a higher priority and I guess it will be implemented soon.


HOWTO


Notes for reading

10.156.64.20 is the IP of the remote NFS-Server.

10.156.64.42 is the IP of the XVMOC Proxy.

10.156.64.41 is the IP of the XVMOC Enterprise Controller.


Install Zone


Some devices are needed in the zone.

The ISC dhcpd needs the NIC cloning device, e.g. /dev/e1000g for e1000g. Some JET scripts also need the devices for lofiadm (/dev/lofictl, /dev/lofi/*, /dev/rlofi/*).

DHCP also needs exclusive NICs: you need one NIC for each (V)LAN, e.g. e1000g2 for the "ILO" LAN and e1000g1 for the OC/JET/NFS LAN.

So create a zone config:

bash-3.00# zonecfg -z proxy info
zonename: proxy
zonepath: /zones/proxy
brand: native
autoboot: false
bootargs:
pool:
limitpriv:
scheduling-class:
ip-type: exclusive
net:
        address not specified
        physical: e1000g2
        defrouter not specified
net:
        address not specified
        physical: e1000g1
        defrouter not specified
device
        match: /dev/e1000g
device
        match: /dev/lofictl
device
        match: /dev/lofi/*
device
        match: /dev/rlofi/*
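The listing above is the result, not the commands that produce it. As a minimal sketch, the following Python 3 snippet builds a zonecfg command file for the same settings. The function name zonecfg_commands is hypothetical, and the snippet only generates text; the result could be passed to "zonecfg -z proxy -f <file>".

```python
def zonecfg_commands(zonepath, nics, devices):
    """Build zonecfg subcommands for an exclusive-IP zone with extra devices."""
    lines = [
        "create",
        "set zonepath=%s" % zonepath,
        "set autoboot=false",
        "set ip-type=exclusive",
    ]
    for nic in nics:
        # With ip-type=exclusive only the physical NIC is set, no address.
        lines += ["add net", "set physical=%s" % nic, "end"]
    for device in devices:
        lines += ["add device", "set match=%s" % device, "end"]
    lines += ["verify", "commit"]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(zonecfg_commands(
        "/zones/proxy",
        ["e1000g2", "e1000g1"],
        ["/dev/e1000g", "/dev/lofictl", "/dev/lofi/*", "/dev/rlofi/*"],
    ), end="")
```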


Install and boot it:


zoneadm -z proxy install
zoneadm -z proxy boot

Configure it:

zlogin -C proxy

Config for e1000g2

Use DHCP: No
Host name: proxy
IP address: 10.156.64.42
System part of a subnet: Yes
Netmask: 255.255.240.0
Enable IPv6: No
Default Route: Specify one
Router IP Address: 10.156.64.1

After configuring with "zlogin -C", also set up the second interface for the "ILO" LAN.

Configure it persistently, or temporarily with e.g.:

ifconfig e1000g1 10.156.16.42/20 up
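The netmask 255.255.240.0 entered for e1000g2 and the /20 used in the ifconfig call are the same prefix length. A quick check with Python's ipaddress module (a sketch using only the addresses from this Howto; the variable names are made up) shows that the two interfaces end up in different /20 networks, and that the router and the NFS server sit in the network of e1000g2:

```python
import ipaddress

# Addresses as configured above for the two interfaces of the proxy zone.
e1000g2 = ipaddress.ip_interface("10.156.64.42/255.255.240.0")
e1000g1 = ipaddress.ip_interface("10.156.16.42/20")

router = ipaddress.ip_address("10.156.64.1")
nfs_server = ipaddress.ip_address("10.156.64.20")

if __name__ == "__main__":
    print(e1000g2.network)             # 10.156.64.0/20
    print(e1000g1.network)             # 10.156.16.0/20
    print(router in e1000g2.network)   # True
    print(nfs_server in e1000g2.network)  # True
```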

Install TFTP-Server

The packages should already be installed. Copy the SMF manifest from the global zone to the proxy zone and import it:

svccfg import tftp-udp6.xml

Install Proxy

You need the XVMOC 2.1 installer. I don't know why, but you have to create some directories before executing the installer:

mkdir -p /opt/sun/n1gc/pkgs/var/sadm/install
mkdir -p /var/opt/sun/xvm/osp/web/pub/
mkdir -p /var/opt/sun/xvm/osp/ssh/

bash-3.00# ./install -p
Sun Microsystems, Inc. Binary Code License Agreement


xVM Ops Center Proxy Server Installer (version 2.1.0.900 on SunOS)


1. Install Expect. [Completed]
2. Install IPMI tool. [Completed]
3. Install Agent components. [Completed]
4. Install application packages. [Completed]
5. Install Core Channel components. [Completed]
6. Install Proxy Server components. [Completed]
7. Install UCE Http proxy. [Completed]
8. Install OS provisioning components. [Completed]
9. Initialize (but do not start) services. [Completed]

xVM Ops Center Proxy Server installation is complete.
xVM Ops Center Proxy Server is now ready to be configured.



Now upgrading to JET 4.7 is necessary

pkgrm JetFLASH
pkgrm SUNWjet

Download JET 4.7 and install only the SUNWjet and JetFLASH packages. Ignore the share_nfs errors.

pkgadd -d jet.pkg

Migrate to remote NFS-Shares

On the NFS-Server, create the following directories:

mkdir /opt/SUNWjet
mkdir /var/js
mkdir -p /var/opt/sun/xvm/osp/share/allstart

On the NFS-Server, share these directories. The proxy zone needs write access; the future JET-clients need only read access.

share -F nfs -o ro,anon=0,rw=proxy -d "Allstart Share" /var/opt/sun/xvm/osp/share/allstart
share -F nfs -o ro,anon=0,rw=proxy -d "Allstart Share" /var/js
share -F nfs -o ro,anon=0,rw=proxy -d "JET Framework" /opt/SUNWjet

On Proxy Zone:

  1. Mount these shares somewhere.

  2. Move the local content to the mounted shares.

  3. Delete the local content, e.g. rm -r /opt/SUNWjet/*

  4. Unmount the NFS-shares.

  5. Remount the NFS-shares on the now empty mountpoints (/var/opt/sun/xvm/osp/share/allstart, /var/js, /opt/SUNWjet):
    mount -F nfs nfsserver:/opt/SUNWjet /opt/SUNWjet
    mount -F nfs nfsserver:/var/js /var/js
    mount -F nfs nfsserver:/var/opt/sun/xvm/osp/share/allstart \
    /var/opt/sun/xvm/osp/share/allstart
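The five steps are the same for each of the three directories, so they can be written down once. The following Python 3 sketch only prints the commands as a dry run and executes nothing. The staging mountpoint /mnt/jetmigrate and the function name are assumptions for illustration, and step 2 is done as a copy (cp -pr) followed by the delete from step 3.

```python
PATHS = ["/opt/SUNWjet", "/var/js", "/var/opt/sun/xvm/osp/share/allstart"]


def migration_commands(server, path, staging="/mnt/jetmigrate"):
    """Return the shell commands for moving one local directory onto its NFS share."""
    share = "%s:%s" % (server, path)
    return [
        "mkdir -p %s" % staging,
        "mount -F nfs %s %s" % (share, staging),  # 1. mount the share somewhere
        "cp -pr %s/. %s/" % (path, staging),      # 2. copy the local content over
        "rm -rf %s/*" % path,                     # 3. delete the local content
        "umount %s" % staging,                    # 4. unmount the share
        "mount -F nfs %s %s" % (share, path),     # 5. remount on the emptied mountpoint
    ]


if __name__ == "__main__":
    for path in PATHS:
        for command in migration_commands("nfsserver", path):
            print(command)
        print()
```

Review the printed commands before running any of them by hand, especially the rm line.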



Configure and start xvmoc proxy

See also: XVMOC-WIKI http://wikis.sun.com/display/xvmOC2dot1/Configuring+an+Enterprise+Controller+for+Updates+%28Optional%29

/opt/SUNWxvmoc/bin/proxyadm configure -s 10.156.64.41 -u root -p /var/tmp/xVM/mypasswd -a 10.156.64.42



Edit file: /opt/SUNWjet/etc/jumpstart.conf

JS_CFG_SVR=10.156.64.20
JS_CLIENT_BOOT="remote"
JS_PKG_DIR="10.156.64.20:/var/js/pkg"
JS_PATCH_DIR="10.156.64.20:/var/js/patch"
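If the proxy has to be rebuilt more than once, the four settings can be applied with a small script instead of an editor. This is a sketch in Python 3 that works on the file content as text and assumes plain KEY=VALUE lines. The function name set_conf_values and the starting content used in the example are made up and do not show the real defaults of jumpstart.conf.

```python
import re

# The values from this Howto, quoted exactly as they should appear in the file.
JET_SETTINGS = {
    "JS_CFG_SVR": "10.156.64.20",
    "JS_CLIENT_BOOT": '"remote"',
    "JS_PKG_DIR": '"10.156.64.20:/var/js/pkg"',
    "JS_PATCH_DIR": '"10.156.64.20:/var/js/patch"',
}


def set_conf_values(text, values):
    """Replace the first KEY=... line per key; append keys that are missing."""
    remaining = dict(values)
    out = []
    for line in text.splitlines():
        match = re.match(r'^([A-Za-z_][A-Za-z0-9_]*)=', line)
        if match and match.group(1) in remaining:
            key = match.group(1)
            out.append("%s=%s" % (key, remaining.pop(key)))
        else:
            out.append(line)  # comments and unrelated settings stay untouched
    for key, value in remaining.items():
        out.append("%s=%s" % (key, value))
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    # Invented starting content, only to show the effect.
    before = 'JS_CFG_SVR=\n# some comment\nJS_CLIENT_BOOT="local"\n'
    print(set_conf_values(before, JET_SETTINGS), end="")
```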



For some reason the "add_install_client" script from the Solaris Update 6 DVD checks whether the JET filesystems are local, which breaks the installation, so we have to fix it. The fix manipulates the return value that the check evaluates. This procedure is necessary after every "OS image import".

File: /var/js//Solaris_10/Tools/add_install_client

Below the line "df -l ${IMAGE_PATH} > /dev/null 2>&1", insert:

echo "====FIX====="

so it looks like:

df -l ${IMAGE_PATH} > /dev/null 2>&1
echo "====FIX====="
if [ $? -ne 0 ] ; then
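The fix works because $? always holds the exit status of the most recent command: with the echo in between, the test sees the status of echo (0) instead of the status of df, so the "not local" branch is never taken. Since the edit has to be repeated after every image import, it can be scripted. The following Python 3 sketch operates on the script text and is safe to apply twice. The function name apply_fix is hypothetical, and reading and writing the actual file is left out.

```python
DF_LINE = "df -l ${IMAGE_PATH} > /dev/null 2>&1"
FIX_LINE = 'echo "====FIX====="'


def apply_fix(script_text):
    """Insert FIX_LINE after every DF_LINE unless it is already there."""
    lines = script_text.splitlines()
    out = []
    for index, line in enumerate(lines):
        out.append(line)
        if line.strip() != DF_LINE:
            continue
        following = lines[index + 1].strip() if index + 1 < len(lines) else ""
        if following != FIX_LINE:
            # Keep the indentation of the df line for the inserted echo.
            indent = line[:len(line) - len(line.lstrip())]
            out.append(indent + FIX_LINE)
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    original = DF_LINE + "\nif [ $? -ne 0 ] ; then\n"
    print(apply_fix(original), end="")
```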


Finish

Now create an OS-Profile for Solaris 10 with these extra custom JET-parameters:

base_config_media=10.156.64.20
base_config_client_boot=remote



You should now be able to deploy this OS-Profile.

Wednesday, May 27, 2009

Summary of the new features in OpenSolaris 2009.06

Peter Dennis created a nice presentation for CommunityOne about the new features in OpenSolaris 2009.06.

Have a look: wikis.sun.com

Sunday, May 17, 2009

3MobileTV now in HD

A small update on 3MobileTV on my G1. I mentioned two weeks ago that the 3MobileTV live streams work on my G1, but only the non-HD versions.

After upgrading the firmware to 1.5, I tried the HD streams again. I use the firmware from JF, so perhaps there is some additional magic in it, but with this firmware the HD streams work fine!
HD really makes a big difference; my G1 screen now seems a lot larger :)

My next wish is some kind of buffering, because at the moment there seems to be no buffering at all. If there is a performance issue, the player gets stuck and sometimes you have to restart the stream.

Wednesday, May 13, 2009

Oracle buys Virtual Iron

Oracle did it again: they bought one of the last available Xen specialists on the virtualization market. Today, they announced the acquisition of Virtual Iron.

Now Oracle owns three Xen-based virtualization solutions: Oracle VM, xVM-Server and Virtual Iron. It seems clear that Xen is their strategic virtualization technology. This could mean that Xen has a great future, although Red Hat, Ubuntu and some other Linux distributions use the rival technology KVM.

But I have to say, I don't feel very good after reading the announcement (FAQ):

...
How will Virtual Iron fit into Oracle’s overall virtualization management strategy?

The combination of Oracle and Virtual Iron supports Oracle’s strategy to provide complete, full stack management across the virtual and physical enterprise and is expected to provide customers with comprehensive and dynamic virtualization management. The combined suite of products is expected to simplify the deployment and configuration of physical servers, virtual machines, and applications while providing a highly available platform for hosting Oracle software and other enterprise applications. In addition, dynamic virtualization management software will help maintain virtual machine performance, improve data center utilization, and optimize power consumption. Virtual Iron combined with Oracle VM and Enterprise Manager, is expected to provide a functionally rich virtualization management product suite that can efficiently manage the entire software stack across virtual and physical environments to make applications easier to deploy, manage, and support.
...

Oracle Enterprise Manager should be extended to manage Xen virtual machines?
"... management across the virtual and physical enterprise ..."? Wait, I've heard about this ... Sun xVM Ops Center! Right? No?

I really don't know Oracle Enterprise Manager, but according to this press release, it sounds like Oracle Enterprise Manager is meant to become very similar to Sun xVM Ops Center.

xVM Ops Center 2.1 is in quite good shape today, but xVM-Server is only available as a beta; I guess it will take some time until it's ready.

Another interesting question for Oracle's virtualization strategy is: will they use Linux for the Xen dom0 in the future, or their own Solaris? At the moment, Linux is used by Oracle VM and Virtual Iron, and Solaris is used for xVM-Server. Solaris in dom0 would mean ZFS, DTrace, SMF, FM, Crossbow and a lot more great Solaris technologies, but does it matter for Oracle?

There are potential conflicts with the xVM portfolio, so a clear roadmap for virtualization, especially for xVM, is strongly needed. Oracle, Sun, are you listening?