How to collect Hadoop metrics with Chukwa (Part I)

Chukwa is an Apache project built on top of HDFS and the MapReduce framework. According to the Chukwa site, it is an open source data collection system for monitoring distributed systems, and more specifically Hadoop clusters. In this post (the first of two), we’ll install and configure Chukwa in a standalone scheme over HDFS, and then on HBase. So before continuing with this post, it is highly recommended to read the Chukwa architecture documentation.

For this, we’re going to use:

  • Hadoop 1.0.4
  • HBase 0.94.1
  • Chukwa 0.5.0
  • Java SE 1.6 update 37 or later

Next, a description of the Chukwa components is presented:

  • Agents are processes that run on each host and emit data to collectors. The emitted data is generated by means of adaptors. Adaptors generally wrap some other data source, such as a file or a Unix command-line tool, from which the information is extracted.
  • Collectors receive data from the agents and write it to stable storage. According to the Chukwa site, rather than have each adaptor write directly to HDFS, data is sent across the network to a collector process that does the HDFS writes. Each collector receives data from up to several hundred hosts.
  • ETL Processes for parsing and archiving the data. Collectors can write data directly to HBase or to sequence files in HDFS. Chukwa has a toolbox of MapReduce jobs for organizing and processing incoming data. These jobs come in two kinds: Archiving and Demux.
  • Data Analytics Scripts for aggregating Hadoop cluster health. These scripts provide visualization and interpretation of the health of the Hadoop cluster.
  • HICC, the Hadoop Infrastructure Care Center; a web-portal-style interface for displaying data. Data is fetched from HBase, which in turn is populated by the collector or by data analytic scripts that run on the collected data after Demux.

This post is based on the Chukwa Administration Guide, to which a few comments have been added, such as configuration examples and compatibility options. In addition, it’s important to mention that we are working with Hadoop 1.0.4 and Chukwa 0.5.0, which can be downloaded at this link. Thus, in this first part we will describe the basic setup of Chukwa, which is made up of three components: agents, collectors and the ETL processes.

A. Agent configuration

  1. Obtain a copy of Chukwa. You can find the latest release on the Chukwa release page.
  2. Un-tar the release, via tar xzf.
  3. We refer to the directory containing Chukwa as CHUKWA_HOME. It may be helpful to set CHUKWA_HOME explicitly in your environment, but Chukwa does not require that you do so.
  4. Make sure that JAVA_HOME is set correctly and points to a Java 1.6 JRE. It’s generally best to set this in etc/chukwa/
  5. In etc/chukwa/, set CHUKWA_LOG_DIR and CHUKWA_PID_DIR to the directories where Chukwa should store its console logs and pid files. The pid directory must not be shared between different Chukwa instances: it should be local, not NFS-mounted.
  6. Optionally, set CHUKWA_IDENT_STRING. This string is used to name Chukwa’s own console log files. An example file is presented below.
# The java implementation to use. Required.
export JAVA_HOME=/usr/java/jdk1.6.0_37

# Optional
# The location of HBase Configuration directory. For writing data to
# HBase, you need to set environment variable HBASE_CONF to HBase conf
# directory.

# Hadoop Configuration directory
export HADOOP_CONF_DIR="/usr/local/hadoop-1.0.4/conf";

# The location of chukwa data repository (in either HDFS or your local
# file system, whichever you are using)
export chukwaRecordsRepository="/chukwa/repos/"

# The directory where pid files are stored. CHUKWA_HOME/var/run by default.
export CHUKWA_PID_DIR=/tmp/chukwa/pidDir

# The location of chukwa logs, defaults to CHUKWA_HOME/logs
export CHUKWA_LOG_DIR=/tmp/chukwa/log

# The location to store chukwa data, defaults to CHUKWA_HOME/data

# Instance name for chukwa deployment
# export CHUKWA_IDENT_STRING=chukwa

export JAVA_PLATFORM=Linux-i386-32

# Database driver name for storing Chukwa Data.

# Database URL prefix for Database Loader.

# HICC Jetty Server heap memory settings
# Specify min and max size of heap to JVM, e.g. 300M

# HICC Jetty Server port, defaults to 4080

Note. It is important to mention that in this first part we are NOT going to use HBase as the repository for the collected data; instead we are going to store it in HDFS.

  7. Agents send the collected data to a random collector taken from a list of collectors, so it’s necessary to indicate what that list is. The collector list is specified in the $CHUKWA_HOME/etc/chukwa/collectors file, one collector URL per line.
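The selection and failover behaviour can be sketched in a few lines of Python (an illustrative model only, not Chukwa’s actual code; the collector URLs and the is_reachable check are hypothetical):

```python
import random

def choose_collector(collectors, is_reachable):
    """Pick a collector at random; fail over to the others if it is down."""
    candidates = list(collectors)
    random.shuffle(candidates)          # agents pick a collector at random
    for url in candidates:
        if is_reachable(url):           # e.g. a successful HTTP POST
            return url
    return None                         # no working collector found

collectors = ["http://collector1:8080/", "http://collector2:8080/"]
# Pretend only collector2 is reachable:
print(choose_collector(collectors, lambda url: "collector2" in url))
# → http://collector2:8080/
```

If every collector in the list fails, the agent keeps retrying at a configurable interval (see chukwaAgent.collector.retryInterval below).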

Our collectors file only contains localhost, example:
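For this single-node setup, the file contains a single line pointing at the local collector (8080 matches the chukwaCollector.http.port setting used later in this post):

```
http://localhost:8080/
```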

  8. Another file that should be modified is $CHUKWA_HOME/etc/chukwa/chukwa-agent-conf.xml. The most important value to modify is the cluster/group name, which identifies the monitored source nodes. This value is stored in each Chunk of collected data and can be used to distinguish data coming from different clusters. Our chukwa-agent-conf.xml looks like:
<property>
  <name>chukwaAgent.tags</name>
  <value>cluster="chukwa"</value>
  <description>The cluster's name for this agent</description>
</property>

<property>
  <name>chukwaAgent.control.port</name>
  <value>9093</value>
  <description>The socket port number the agent's control interface can be contacted at.</description>
</property>

<property>
  <name>chukwaAgent.hostname</name>
  <value>localhost</value>
  <description>The hostname of the agent on this node. Usually localhost, this is used by the chukwa instrumentation agent-control interface library</description>
</property>

<property>
  <name>chukwaAgent.checkpoint.name</name>
  <value>chukwa_agent_checkpoint</value>
  <description>the prefix to prepend to the agent's checkpoint file(s)</description>
</property>

<property>
  <name>chukwaAgent.checkpoint.dir</name>
  <value>${CHUKWA_LOG_DIR}/</value>
  <description>the location to put the agent's checkpoint file(s)</description>
</property>

<property>
  <name>chukwaAgent.checkpoint.interval</name>
  <value>5000</value>
  <description>the frequency interval for the agent to do checkpoints, in milliseconds</description>
</property>

<property>
  <name>chukwaAgent.sender.fastRetries</name>
  <value>4</value>
  <description>the number of post attempts to make to a single collector, before marking it failed</description>
</property>

<property>
  <name>chukwaAgent.collector.retries</name>
  <value>144000</value>
  <description>the number of attempts to find a working collector</description>
</property>

<property>
  <name>chukwaAgent.collector.retryInterval</name>
  <value>20000</value>
  <description>the number of milliseconds to wait between searches for a collector</description>
</property>
Note. It’s important to mention that it’s necessary to open ports 9093, 9095 and 9097 in our firewall to be able to connect to the agent.

  9. Configuring Hadoop for monitoring. One of the key goals of Chukwa is to collect logs from Hadoop clusters. The Hadoop configuration files are located in HADOOP_CONF_DIR (HADOOP_HOME/conf in Hadoop 1.0.4). To set up Chukwa to collect logs from Hadoop, we need to change some of the Hadoop configuration files.
  • Copy CHUKWA_HOME/etc/chukwa/ file to HADOOP_CONF_DIR/
  • Copy CHUKWA_HOME/etc/chukwa/ file to HADOOP_CONF_DIR/
  • Edit HADOOP_HOME/conf/ file and change “hadoop.log.dir” to your actual Chukwa log directory (i.e., CHUKWA_HOME/var/log)
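For reference, the redirection of Hadoop’s daemon logs through Chukwa is done by pointing the log4j appender at ChukwaDailyRollingFileAppender. The fragment below is a sketch of the relevant properties; the exact property names should be verified against the log4j file shipped with your Chukwa release:

```properties
# Route Hadoop daemon logs through Chukwa's appender (sketch; verify names
# against the hadoop log4j file bundled with Chukwa 0.5.0)
log4j.appender.DRFA=org.apache.hadoop.chukwa.inputtools.log4j.ChukwaDailyRollingFileAppender
log4j.appender.DRFA.recordType=HadoopLog
log4j.appender.DRFA.chukwaClientHostname=localhost
log4j.appender.DRFA.chukwaClientPortNum=9093
log4j.appender.DRFA.DatePattern=.yyyy-MM-dd
log4j.appender.DRFA.layout=org.apache.log4j.PatternLayout
```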

Note. To avoid the following error: log4j:ERROR Could not instantiate class [org.apache.hadoop.chukwa.inputtools.log4j.ChukwaDailyRollingFileAppender], we should copy the chukwa-client-xx.jar and json-simple-xx.jar files into the hadoop/lib directory. These files must be available on all Hadoop nodes.

  10. Collector configuration. Since we are going to use HDFS for data storage in this tutorial, we must disable the HBase options and work only with the HDFS configuration parameters, such as writer.hdfs.filesystem, which should be set to the HDFS root URL on which Chukwa will store data. An example of the chukwa-collector-conf.xml file is presented below.


<property>
  <name>chukwaCollector.localOutputDir</name>
  <value>/tmp/chukwa/dataSink/</value>
  <description>Chukwa local data sink directory</description>
</property>

<property>
  <name>chukwaCollector.writerClass</name>
  <value>org.apache.hadoop.chukwa.datacollection.writer.localfs.LocalWriter</value>
  <description>Local chukwa writer</description>
</property>

<property>
  <name>chukwaCollector.pipeline</name>
  <value>org.apache.hadoop.chukwa.datacollection.writer.SocketTeeWriter,org.apache.hadoop.chukwa.datacollection.writer.SeqFileWriter</value>
</property>

<!-- HBaseWriter parameters, disabled for this HDFS-only setup
<property>
  <name>chukwaCollector.pipeline</name>
  <value>org.apache.hadoop.chukwa.datacollection.writer.SocketTeeWriter,org.apache.hadoop.chukwa.datacollection.writer.hbase.HBaseWriter,org.apache.hadoop.chukwa.datacollection.writer.SeqFileWriter</value>
</property>

<property>
  <name>hbase.demux.package</name>
  <value>org.apache.hadoop.chukwa.extraction.demux.processor</value>
  <description>Demux parser class package, HBaseWriter uses this package name to validate HBase for annotated demux parser classes.</description>
</property>

<property>
  <name>hbase.writer.verify.schema</name>
  <value>false</value>
  <description>Verify HBase Table schema with demux parser schema, log warning if there are mismatches between the hbase schema and demux parsers.</description>
</property>

<property>
  <name>hbase.writer.halt.on.schema.mismatch</name>
  <value>false</value>
  <description>If this option is set to true, and the HBase table schema is mismatched with the demux parser, the collector will shut itself down.</description>
</property>
-->

<property>
  <name>writer.hdfs.filesystem</name>
  <value>hdfs://localhost:9000</value>
  <description>HDFS to dump to</description>
</property>

<property>
  <name>chukwaCollector.outputDir</name>
  <value>/chukwa/logs/</value>
  <description>Chukwa data sink directory</description>
</property>

<property>
  <name>chukwaCollector.rotateInterval</name>
  <value>300000</value>
  <description>Chukwa rotate interval (ms)</description>
</property>

<property>
  <name>chukwaCollector.isFixedTimeRotatorScheme</name>
  <value>false</value>
  <description>A flag to indicate that the collector should close at a fixed
  offset after every rotateInterval. The default value is false, which uses
  the default scheme where collectors close after regular rotateIntervals.
  If set to true, then specify a chukwaCollector.fixedTimeIntervalOffset value.
  E.g., if isFixedTimeRotatorScheme is true, fixedTimeIntervalOffset is
  set to 10000 and rotateInterval is set to 300000, then the collector will
  close its files at 10 seconds past the 5-minute mark; if
  isFixedTimeRotatorScheme is false, collectors will rotate approximately
  once every 5 minutes.</description>
</property>

<property>
  <name>chukwaCollector.fixedTimeIntervalOffset</name>
  <value>30000</value>
  <description>Chukwa fixed time interval offset value (ms)</description>
</property>

<property>
  <name>chukwaCollector.http.port</name>
  <value>8080</value>
  <description>The HTTP port number the collector will listen on</description>
</property>

Note. Chukwa 0.5.0 includes the Hadoop libraries hadoop-core-1.0.0.jar and hadoop-test-1.0.0.jar to communicate with IPC Server version 4. So it’s necessary to replace those libraries, located in the chukwa-0.5.0/share/chukwa/lib directory, with the hadoop-core-1.0.4.jar and hadoop-test-1.0.4.jar files from our Hadoop installation.
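Assuming the installation paths used throughout this post (the jar locations are assumptions; adjust them to your layout), the swap might look like this:

```
$ cd /usr/local/chukwa-0.5.0/share/chukwa/lib
$ rm hadoop-core-1.0.0.jar hadoop-test-1.0.0.jar
$ cp /usr/local/hadoop-1.0.4/hadoop-core-1.0.4.jar .
$ cp /usr/local/hadoop-1.0.4/hadoop-test-1.0.4.jar .
```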

  11. Once our configuration files have been modified, we can start the services and collect data with Chukwa. To do so, we first start the agent and then the collector, as follows:
[bautista@Zen-UnderLinx chukwa-0.5.0]$ bin/chukwa agent
OK chukwaAgent.checkpoint.dir [File] = /tmp/chukwa/log/
OK chukwaAgent.checkpoint.interval [Time] = 5000
WARN: option chukwaAgent.collector.retries may not exist; val = 144000
chukwaAgent.connector.retryRate Time
chukwaAgent.sender.retries Integral
chukwaAgent.control.remote Boolean
WARN: option chukwaAgent.collector.retryInterval may not exist; val = 20000
chukwaAgent.sender.retryInterval Integral
chukwaAgent.connector.retryRate Time
chukwaCollector.rotateInterval Time
OK chukwaAgent.control.port [Portno] = 9093
WARN: option chukwaAgent.hostname may not exist; val = localhost
chukwaAgent.control.remote Boolean
chukwaAgent.checkpoint.enabled Boolean
chukwaAgent.sender.retries Integral
OK chukwaAgent.sender.fastRetries [Integral] = 4
WARN: option syslog.adaptor.port.9095.facility.LOCAL1 may not exist; val = HADOOP
adaptor.dirscan.intervalMs Integral
adaptor.memBufWrapper.size Integral
chukwaAgent.adaptor.context.switch.time Time
No checker rules for: chukwaAgent.tags
[bautista@Zen-UnderLinx chukwa-0.5.0]$
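Once the agent is up, it can be sanity-checked through its control port (9093): connecting with telnet and issuing the list command prints the currently running adaptors (the exact output depends on your configuration, so it is omitted here):

```
$ telnet localhost 9093
list
close
```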
  12. Next, we start the Hadoop services, like this:
[bautista@Zen-UnderLinx hadoop-1.0.4]$ bin/
starting namenode, logging to /usr/local/hadoop-1.0.4/bin/../logs/hadoop-bautista-namenode-Zen-UnderLinx.out
localhost: starting datanode, logging to /usr/local/hadoop-1.0.4/bin/../logs/hadoop-bautista-datanode-Zen-UnderLinx.out
localhost: starting secondarynamenode, logging to /usr/local/hadoop-1.0.4/bin/../logs/hadoop-bautista-secondarynamenode-Zen-UnderLinx.out
starting jobtracker, logging to /usr/local/hadoop-1.0.4/bin/../logs/hadoop-bautista-jobtracker-Zen-UnderLinx.out
localhost: starting tasktracker, logging to /usr/local/hadoop-1.0.4/bin/../logs/hadoop-bautista-tasktracker-Zen-UnderLinx.out
[bautista@Zen-UnderLinx hadoop-1.0.4]$
  13. Finally, we start the collector with the following command:
[bautista@Zen-UnderLinx chukwa-0.5.0]$ bin/chukwa collector
[bautista@Zen-UnderLinx chukwa-0.5.0]$ WARN: option may not exist; val = /chukwa
chukwaRootDir null URI
nullWriter.dataRate Time
WARN: option may not exist; val = /chukwa/temp
chukwaRootDir null
nullWriter.dataRate Time
chukwaCollector.tee.port Integral
WARN: option chukwaCollector.fixedTimeIntervalOffset may not exist; val = 30000
chukwaCollector.minPercentFreeDisk Integral
chukwaCollector.tee.keepalive Boolean
chukwaCollector.http.threads Integral
OK chukwaCollector.http.port [Integral] = 8080
WARN: option chukwaCollector.isFixedTimeRotatorScheme may not exist; val = false
chukwaCollector.writeChunkRetries Integral
chukwaCollector.showLogs.enabled Boolean
chukwaCollector.minPercentFreeDisk Integral
OK chukwaCollector.localOutputDir [File] = /tmp/chukwa/dataSink/
OK chukwaCollector.pipeline [ClassName list] = org.apache.hadoop.chukwa.datacollection.writer.SocketTeeWriter,org.apache.hadoop.chukwa.datacollection.writer.SeqFileWriter
OK chukwaCollector.rotateInterval [Time] = 300000
OK chukwaCollector.writerClass [ClassName] = org.apache.hadoop.chukwa.datacollection.writer.localfs.LocalWriter
OK writer.hdfs.filesystem [URI] = hdfs://localhost:9000
No checker rules for: chukwaCollector.outputDir
started Chukwa http collector on port 8080

[bautista@Zen-UnderLinx chukwa-0.5.0]$
  14. In a few minutes, we will see that Chukwa has collected some dataSinkArchives files, which include different metrics, as shown in the following screenshots:

In the next post, we are going to modify our configuration files to store the collected metrics in HBase.