Monday, December 02, 2013

Watching Accumulo Recover From a Killed Master Process In a Multi-Master Configuration.

Accumulo can easily run in a multiple-master configuration. This post shows how to watch it recover when a master process is killed.

The steps below show how to convert from a single-master cluster to a two-master cluster. Then you'll kill the active master and watch the monitor page to see Accumulo automatically switch to the backup master.
  1. Start a cluster with a master and two nodes using https://github.com/medined/Accumulo_1_5_0_By_Vagrant.
  2. vagrant ssh master
  3. cd accumulo_home/bin/accumulo 
  4. bin/stop-all.sh
  5. echo "affy-slave1" >> conf/masters
  6. bin/start-all.sh
  7. Visit http://affy-master:50095/master to see which node is the current master. Note that you are connecting to the monitor process, not the master process. Don't let the hostnames confuse you.
  8. Enable auto-refresh.
  9. SSH to whichever node is listed as the master.
  10. ps fax | grep app=master | grep -v grep | awk '{print $1}' | xargs kill -9
  11. Visit http://affy-master:50095/master and you should see a 'Master Server Not Running' message. Reload the page if needed.
  12. Within a few seconds, the alternate master process should be active.
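If you'd rather watch the failover from a terminal than from the browser, the check in steps 7 through 12 can be sketched as a small polling loop. This is only a sketch: it assumes curl is installed, that the monitor URL from step 7 is reachable from where you run it, and that the page contains the 'Master Server Not Running' text while no master is active. The function names master_state and watch_master are made up for this example.

```shell
#!/usr/bin/env bash

# Classify a monitor page read from stdin. Prints "down" when the page
# contains the 'Master Server Not Running' message, "up" otherwise.
master_state() {
  if grep -q 'Master Server Not Running'; then
    echo down
  else
    echo up
  fi
}

# Poll the monitor once per second and print a timestamped line each time
# the state changes. Defaults to the URL from step 7. Stop with Ctrl-C.
watch_master() {
  local url="${1:-http://affy-master:50095/master}"
  local prev="" state page
  while true; do
    if page="$(curl -sf --max-time 2 "$url")"; then
      state="$(printf '%s\n' "$page" | master_state)"
    else
      # curl failed, so the monitor itself could not be reached.
      state="monitor unreachable"
    fi
    if [ "$state" != "$prev" ]; then
      echo "$(date '+%H:%M:%S') master is: $state"
      prev="$state"
    fi
    sleep 1
  done
}
```

Run watch_master in one terminal, kill the master in another, and you should see the state change to down and then back to up once the alternate master takes over.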
Normally you'd copy conf/masters to all nodes. However, for this tiny demonstration, it is not needed.
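On a real cluster, copying conf/masters around might look something like the sketch below. It makes several assumptions: conf/slaves lists the other nodes one per line, passwordless SSH works between the nodes, and Accumulo lives at the same path relative to the home directory on every host. The function name masters_copy_commands is made up for this example. It only prints the scp commands so you can review them before running anything.

```shell
#!/usr/bin/env bash

# Print one scp command per host listed in a slaves file, skipping blank
# lines and comment lines. Nothing is copied; the commands are only printed.
#   $1: file listing hosts, one per line (assumed default: conf/slaves)
#   $2: destination path on each host (assumed to match the local layout)
masters_copy_commands() {
  local slaves_file="${1:-conf/slaves}"
  local dest="${2:-accumulo_home/bin/accumulo/conf/masters}"
  grep -v -e '^[[:space:]]*#' -e '^[[:space:]]*$' "$slaves_file" |
    while read -r host; do
      echo "scp conf/masters $host:$dest"
    done
}
```

Run it from the Accumulo directory and, once the output looks right, pipe it into sh to perform the copies: masters_copy_commands | sh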

Restarting the killed master process is easy. Follow the steps below:
  1. vagrant ssh master
  2. cd accumulo_home/bin/accumulo 
  3. bin/start-all.sh