It’s been about 2 months since my last post, so let me fill you in on something I’ve been slowly working on.
While doing research in my last semester as an undergrad at Marist College and working for EMC as an SE intern, I came up with what I thought would be a pretty neat idea. I had been working with the Ganglia Monitoring System when I came across a gentleman named Brian Bockelman from Nebraska. We had a brief conversation about GridFTP and how it would be nice to monitor the hosts running GridFTP and react to load on the network using the network controller. That is where host-aware networking came from.
“Ganglia is a scalable distributed monitoring system for high-performance computing systems such as clusters and Grids. It is based on a hierarchical design targeted at federations of clusters. It leverages widely used technologies such as XML for data representation, XDR for compact, portable data transport, and RRDtool for data storage and visualization. It uses carefully engineered data structures and algorithms to achieve very low per-node overheads and high concurrency. The implementation is robust, has been ported to an extensive set of operating systems and processor architectures, and is currently in use on thousands of clusters around the world. It has been used to link clusters across university campuses and around the world and can scale to handle clusters with 2000 nodes.”
Ganglia uses RRD (Round Robin Database) files to store time-series information about each host, so each host can record its network load, CPU usage, free memory, and so on. What I wanted to do was combine the SDN environment with the scale-out Ganglia-monitored network so I could make network decisions based on the data coming back from the monitored hosts.
I wanted to accomplish a few simple tasks:
- The controller of the network should be “aware” of which hosts are monitored by Ganglia.
- The controller should be able to “poll” the data from the hosts it knows are being monitored.
- The data should relate to “thresholds” set by a network admin, so when a threshold is met, the network reacts via the controller.
Off the bat I needed a controller. I’ve used Floodlight before and it has a great open community for open source development, so I pulled master from git and dropped it into my development environment, which consisted of two x86 boxes running KVM and Open vSwitch. (This could certainly work for monitoring compute nodes within an OpenStack environment, or even guests within a specific tenant.) Open vSwitch provides my OpenFlow connections to Floodlight as well as the data flows between virtual interfaces on KVM. Here is a diagram that should help visualize the dev environment.
This environment is pretty simple, but it does the trick. I ran the Floodlight controller inside the KVM hypervisor on one host and plain Ubuntu VMs for the rest. To install Ganglia, download it or use apt-get.
On nodes where I wanted only the monitor, I ran
sudo apt-get install ganglia-monitor
On nodes I wanted the monitor and the gmetad collector on, I ran
sudo apt-get install ganglia-monitor gmetad
Then for the controller node,
sudo apt-get install ganglia-monitor gmetad ganglia-webfrontend
Note that the node Floodlight sits on also runs the gmetad server. This is because Ganglia metrics from the different clusters are collected in /var/lib/ganglia/rrds/, and the Ganglia modules look in this directory. The location is configurable in case you set Ganglia to collect them somewhere else. If you don’t want to run gmetad on the Floodlight host, the gmetad server can instead export its directory via NFS to be mounted on the controller node. Eventually I want the controller to connect to an RRD socket, but I thought this was unnecessary for a PoC.
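To make the directory dependency concrete, here is a minimal sketch of how a module could locate RRD files under that root. It assumes gmetad's usual on-disk layout of <root>/<cluster>/<host>/<metric>.rrd, with a __SummaryInfo__ directory holding cluster-wide aggregates. The function names are hypothetical and this is Python for brevity, not the Java used in the Floodlight module.

```python
from pathlib import Path
from typing import List

# Default gmetad RRD root mentioned above; pass a different root if yours differs.
RRD_ROOT = Path("/var/lib/ganglia/rrds")


def metric_rrd_path(cluster: str, host: str, metric: str,
                    root: Path = RRD_ROOT) -> Path:
    """Return the path where one metric for one host is assumed to live."""
    # Assumed layout: <root>/<cluster>/<host>/<metric>.rrd
    return root / cluster / host / f"{metric}.rrd"


def monitored_hosts(cluster: str, root: Path = RRD_ROOT) -> List[str]:
    """List host directories under a cluster, skipping the summary directory."""
    cluster_dir = root / cluster
    if not cluster_dir.is_dir():
        return []
    return sorted(
        p.name for p in cluster_dir.iterdir()
        if p.is_dir() and p.name != "__SummaryInfo__"
    )
```

The same helpers work unchanged against an NFS mount of the gmetad directory, since only the root path differs.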
I configured the gmonds to speak unicast UDP to limit network traffic, but you can use either multicast or unicast. The Ganglia setup really doesn’t matter much, as long as a gmetad RRD directory is available where the controller resides.
Once the environment was set up I could start to dive into development, but there were a few major design choices I had to consider before I started.
1) How would I read RRD files from the underlying filesystem? Meaning what interfaces were out there, should I make my own, what RRD functions do I need?
2) What approach was I going to take to consistently poll the data? Are there priorities? Variable polling times? Timing?
My design decisions led me to these conclusions.
1) There are a few Java interfaces: rrd4j, jrobin, java-rrd-hg, and jrrd. Ultimately, I needed to be able to read RRD files with filters like average, max, and min in mind. jrrd was the right fit. I wound up using the interface and extending its methods into more useful ones in the module, and it was the choice that worked best at the time. It has a few dependencies.
I had thought about running cron jobs to dump the RRD files to XML every so often and reading them via streaming or DOM-based XML interfaces in Java. This wound up getting thrown out the window for a few different reasons.
2) I decided to represent “Monitored Hosts” and “Ganglia Rules” as objects within the modules. This abstraction allows me to associate rules with hosts. Rules can also provide a “pollingTime” variable, which tells the controller how often to poll the host for rule thresholds. Once a threshold is met, the “Action” is carried out, which could be to push a static flow, add a firewall rule, drop traffic, etc. Essentially anything the controller can do. Priorities and timing were a must for rules as well.
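The host and rule abstraction can be sketched roughly as follows. This is an illustration, not the module's code: the class and field names (GangliaRule, MonitoredHost, polling_time, priority), the "lower number runs first" ordering, and the "value at or above threshold" comparison are all assumptions, and Python stands in for the Java of the real module.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class GangliaRule:
    name: str
    metric: str                      # e.g. "load_one" or "bytes_in"
    threshold: float
    polling_time: float              # seconds between polls for this rule
    priority: int                    # assumption: lower number is evaluated first
    action: Callable[[str], None]    # stands in for "anything the controller can do"
    next_poll: float = 0.0           # earliest time this rule is due again

    def is_met(self, value: float) -> bool:
        # Assumption: a rule is met when the metric is at or above its threshold.
        return value >= self.threshold


@dataclass
class MonitoredHost:
    hostname: str
    ip: str
    domain: str
    rules: List[GangliaRule] = field(default_factory=list)


def poll_host(host: MonitoredHost,
              read_metric: Callable[[MonitoredHost, str], float],
              now: float) -> List[str]:
    """Evaluate the rules that are due, in priority order; return those that fired."""
    fired = []
    for rule in sorted(host.rules, key=lambda r: r.priority):
        if now < rule.next_poll:
            continue                              # not due yet
        rule.next_poll = now + rule.polling_time  # schedule the next poll
        if rule.is_met(read_metric(host, rule.metric)):
            rule.action(host.ip)                  # e.g. push a flow for this host
            fired.append(rule.name)
    return fired
```

Passing read_metric in as a function keeps the rule logic independent of how the RRD data is actually read. In this sketch each rule object belongs to a single host, because next_poll is stored on the rule itself.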
Metrics that can be monitored by default are:
- boottime: System boot timestamp (l,f)
- bread_sec
- bwrite_sec
- bytes_in: Number of bytes in per second (l,f)
- bytes_out: Number of bytes out per second (l,f)
- cpu_aidle: Percent of time since boot idle CPU (l)
- cpu_arm
- cpu_avm
- cpu_idle: Percent CPU idle (l,f)
- cpu_intr
- cpu_nice: Percent CPU nice (l,f)
- cpu_num: Number of CPUs (l,f)
- cpu_rm
- cpu_speed: Speed in MHz of CPU (l,f)
- cpu_ssys
- cpu_system: Percent CPU system (l,f)
- cpu_user: Percent CPU user (l,f)
- cpu_vm
- cpu_wait
- cpu_wio
- disk_free: Total free disk space (l,f)
- disk_total: Total available disk space (l,f)
- load_fifteen: Fifteen minute load average (l,f)
- load_five: Five minute load average (l,f)
- load_one: One minute load average (l,f)
- location: GPS coordinates for host (e)
- lread_sec
- lwrite_sec
- machine_type
- mem_buffers: Amount of buffered memory (l,f)
- mem_cached: Amount of cached memory (l,f)
- mem_free: Amount of available memory (l,f)
- mem_shared: Amount of shared memory (l,f)
- mem_total: Amount of available memory (l,f)
- mtu: Network maximum transmission unit (l,f)
- os_name: Operating system name (l,f)
- os_release: Operating system release (version) (l,f)
- part_max_used: Maximum percent used for all partitions (l,f)
- phread_sec
- phwrite_sec
- pkts_in: Packets in per second (l,f)
- pkts_out: Packets out per second (l,f)
- proc_run: Total number of running processes (l,f)
- proc_total: Total number of processes (l,f)
- rcache
- swap_free: Amount of available swap memory (l,f)
- swap_total: Total amount of swap memory (l,f)
- sys_clock: Current time on host (l,f)
- wcache
(And any added by you or your environment; this is a development effort to add it to RRD.)
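To give a feel for what reading one of these metrics involves, here is a small sketch that parses the text printed by the rrdtool command line (`rrdtool fetch <file>.rrd AVERAGE`) and computes average, max, and min over the known samples. This is a stand-in for illustration only: the module itself uses jrrd from Java, and the function names and sample values below are made up. It assumes the usual fetch output shape: a header row of data source names followed by "timestamp: value" rows, with unknown samples printed as nan.

```python
import math
from typing import Dict, List, Optional


def parse_fetch_output(text: str) -> Dict[str, List[float]]:
    """Parse `rrdtool fetch` text output into {data_source_name: [values]}."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    if not lines:
        return {}
    names = lines[0].split()                    # header row: data source names
    series: Dict[str, List[float]] = {name: [] for name in names}
    for line in lines[1:]:
        _timestamp, _, rest = line.partition(":")
        for name, raw in zip(names, rest.split()):
            series[name].append(float(raw))     # float() also accepts "nan"/"-nan"
    return series


def summarize(values: List[float]) -> Optional[Dict[str, float]]:
    """Average, max, min, and last known value, ignoring unknown (NaN) samples."""
    known = [v for v in values if not math.isnan(v)]
    if not known:
        return None
    return {
        "average": sum(known) / len(known),
        "max": max(known),
        "min": min(known),
        "last": known[-1],
    }


# Made-up sample in the shape rrdtool prints for a single data source named "sum".
SAMPLE = """\
                            sum

1395842400: 1.0000000000e+00
1395842415: 3.0000000000e+00
1395842430: -nan
"""

stats = summarize(parse_fetch_output(SAMPLE)["sum"])
```

The "last" value is what a threshold check would most likely compare against, while average, max, and min correspond to the filters mentioned earlier.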
The workflow is essentially this:
- Enable Host-Aware Networking
- Add the hosts you want monitored using the REST interface. The parameters needed will be IP, DOMAIN and Hostname.
- Add a Rule that defines the metrics to be monitored and a threshold for those metrics, and associate it with a Host that is actively monitored. A rule can also be required to be “met” a certain number of times before the controller action is carried out.
- You can then view the reactions to the metrics being polled at /hand/gangliahosts/messages (this is not the final URI). This will show INFO, WARN, and THRESHOLD_MET messages for your hosts and what the controller did.
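The "met several times before acting" behaviour and the message log can be sketched like this. It rests on assumptions that are mine rather than the module's: that a rule must be met on a number of consecutive polls, that WARN marks a met-but-not-yet-acted poll, and that INFO marks a poll below the threshold. The ThresholdTracker name and message text are hypothetical too.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class ThresholdTracker:
    threshold: float
    required_hits: int      # consecutive polls at/above threshold before acting
    hits: int = 0
    messages: List[Tuple[str, str]] = field(default_factory=list)

    def observe(self, host: str, metric: str, value: float) -> bool:
        """Record one polled value; return True when the action should run."""
        if value < self.threshold:
            self.hits = 0   # streak broken, start counting again
            self.messages.append(("INFO", f"{host} {metric}={value} below threshold"))
            return False
        self.hits += 1
        if self.hits < self.required_hits:
            self.messages.append(
                ("WARN", f"{host} {metric}={value} met {self.hits}/{self.required_hits}"))
            return False
        self.hits = 0       # reset so the action is not repeated on every poll
        self.messages.append(("THRESHOLD_MET", f"{host} {metric}={value} action taken"))
        return True
```

Requiring several consecutive hits keeps a single spike in a metric from triggering a flow change, at the cost of reacting a few polling intervals later.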
An example of how this would work is the following:
In the end I tried to code this project within a 2 month time frame. It is mostly done and still in test, but hopefully I can get it out to the community to share some of the neat things I was able to do by monitoring hosts within a Floodlight controller and reacting to metrics read by the controller.