Tuesday, October 02, 2012

How To Run Multiple Instances of Accumulo on One Hadoop Cluster

On the Accumulo User mailing list, Kristopher K. asked:
I built 1.5 from source last night and wanted to try it out on my existing Hadoop cluster without overwriting my current 1.4 set. Is there a way to specify the /accumulo directory in HDFS such that you can run multiple instances?

Eric N. replied:
From the monitoring user interface, follow the Documentation link, then Configuration; the first property listed is:

instance.dfs.dir

You'll also need to change all of the port numbers from their defaults. And there's a port number in conf/generic_logger.xml that points to the logging port on the monitor.
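So for a second instance you would point instance.dfs.dir at an unused HDFS path in that instance's conf/accumulo-site.xml. A minimal sketch (the directory name /accumulo-2 is just a hypothetical; any HDFS path not used by the existing install will do):

```xml
<!-- Hypothetical example: give the second instance its own HDFS directory
     instead of the default /accumulo, so it won't collide with the 1.4 install. -->
<property>
  <name>instance.dfs.dir</name>
  <value>/accumulo-2</value>
</property>
```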

For example, here are some entries from my conf/accumulo-site.xml file:

<property>
  <name>master.port.client</name>
  <value>10010</value>
</property>

<property>
  <name>tserver.port.client</name>
  <value>10011</value>
</property>

<property>
  <name>gc.port.client</name>
  <value>10101</value>
</property>

<property>
  <name>trace.port.client</name>
  <value>10111</value>
</property>

<property>
  <name>monitor.port.client</name>
  <value>11111</value>
</property>

<property>
  <name>monitor.port.log4j</name>
  <value>1560</value>
</property>

And here's the corresponding appender from conf/generic_logger.xml; its port must match monitor.port.log4j above:

<!-- Send all logging data to a centralized logger -->
<appender name="N1" class="org.apache.log4j.net.SocketAppender">
  <param name="remoteHost" value="${org.apache.accumulo.core.host.log}"/>
  <param name="port" value="1560"/>
  <param name="application" value="${org.apache.accumulo.core.application}:${org.apache.accumulo.core.ip.localhost.hostname}"/>
  <param name="Threshold" value="WARN"/>
</appender>