LZO is licensed under the GPL, which is incompatible with Hadoop's Apache license.
LZO is therefore not bundled with Hadoop and must be installed separately on the
cluster to enable LZO compression in Hadoop and HBase. LZO is a splittable
compression format (once a file has been indexed) and offers fast compression and
decompression.
Perform the following steps to enable LZO compression in Hadoop and HBase:
1. Install the LZO development packages:
sudo yum install lzo lzo-devel
2. Download the hadoop-lzo release (0.4.17 in this example) using the following command:
wget https://github.com/twitter/hadoop-lzo/archive/release-0.4.17.zip
3. Unzip the downloaded bundle:
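The unzip command itself is missing from the source. A minimal sketch, assuming wget saved the archive under the last component of the URL (release-0.4.17.zip) and that the unzip utility is installed:

```shell
# Assumption: wget kept the URL's last path component as the file name.
archive="release-0.4.17.zip"

if [ -f "$archive" ]; then
    # Extracts into hadoop-lzo-release-0.4.17/, the folder used in the next step.
    unzip "$archive"
else
    echo "Archive $archive not found; check the file name wget reported." >&2
fi
```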
4. Change the current directory to the extracted folder:
cd hadoop-lzo-release-0.4.17
5. Run the build command to generate the native libraries:
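The build command is missing from the source. The build/ paths used in the next step match the output layout of hadoop-lzo's Ant build, so the step was probably the one sketched here. Treat the Ant targets as an assumption and check the README in the extracted folder; the build also needs a JDK (JAVA_HOME set) and the lzo-devel headers from step 1.

```shell
# Assumption: this release is built with Ant and provides the
# compile-native and tar targets.
build_cmd="ant compile-native tar"

if [ -f build.xml ] && command -v ant >/dev/null 2>&1; then
    # Compiles the JNI bindings against the system LZO headers, then
    # packages the jar and native libraries under build/.
    $build_cmd
else
    echo "Run '$build_cmd' inside hadoop-lzo-release-0.4.17 with Ant and a JDK installed." >&2
fi
```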
6. Copy the generated jar and native libraries to the Hadoop and HBase lib directories:
cp build/hadoop-lzo-0.4.17.jar $HADOOP_HOME/lib/
cp build/hadoop-lzo-0.4.17.jar $HBASE_HOME/lib/
cp build/hadoop-lzo-0.4.17/lib/native/Linux-amd64-64/* $HADOOP_HOME/lib/native/
cp build/hadoop-lzo-0.4.17/lib/native/Linux-amd64-64/* $HBASE_HOME/lib/native/
7. Add the following properties to Hadoop's core-site.xml file:
<property>
<name>io.compression.codecs</name>
<value>
org.apache.hadoop.io.compress.DefaultCodec,
org.apache.hadoop.io.compress.GzipCodec,
org.apache.hadoop.io.compress.BZip2Codec,
org.apache.hadoop.io.compress.DeflateCodec,
org.apache.hadoop.io.compress.SnappyCodec,
org.apache.hadoop.io.compress.Lz4Codec,
com.hadoop.compression.lzo.LzoCodec,
com.hadoop.compression.lzo.LzopCodec
</value>
</property>
<property>
<name>io.compression.codec.lzo.class</name>
<value>com.hadoop.compression.lzo.LzoCodec</value>
</property>
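One way to confirm that Hadoop picks up the new codec list is to read the property back. This is a sketch that is not part of the original steps: it assumes the hdfs launcher from $HADOOP_HOME/bin is on PATH, and has_lzo_codec is a hypothetical helper name introduced here.

```shell
# Hypothetical helper: reads a comma-separated codec list on stdin and
# succeeds only if LzoCodec is one of the entries.
has_lzo_codec() {
    tr -d ' \n\t' | tr ',' '\n' | grep -qxF 'com.hadoop.compression.lzo.LzoCodec'
}

# Assumption: the hdfs command is available on this node.
if command -v hdfs >/dev/null 2>&1; then
    if hdfs getconf -confKey io.compression.codecs | has_lzo_codec; then
        echo "LzoCodec is registered"
    else
        echo "LzoCodec is missing from io.compression.codecs"
    fi
fi
```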
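8. Sync the Hadoop and HBase home directories to all nodes of the Hadoop and HBase cluster. rsync accepts only one destination per invocation, so run it once per node:
rsync -az $HADOOP_HOME/ node1:$HADOOP_HOME/
rsync -az $HADOOP_HOME/ node2:$HADOOP_HOME/
rsync -az $HBASE_HOME/ node1:$HBASE_HOME/
rsync -az $HBASE_HOME/ node2:$HBASE_HOME/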
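For more than a couple of nodes, the copy can be wrapped in a loop. The following is a minimal sketch: the node list, the sync_dir function name, the DRY_RUN switch and the /opt/hadoop fallback path are all illustrative assumptions, and passwordless SSH to each node is assumed.

```shell
# Hypothetical node list; replace with the hosts in your cluster.
NODES="node1 node2"

# Copies one local directory to the same path on every node.
# With DRY_RUN=1 the rsync commands are only printed, not executed.
sync_dir() {
    dir=$1
    for node in $NODES; do
        if [ "${DRY_RUN:-0}" = "1" ]; then
            echo "rsync -az $dir/ $node:$dir/"
        else
            rsync -az "$dir/" "$node:$dir/"
        fi
    done
}

# Preview the commands for the Hadoop home directory without copying anything.
DRY_RUN=1
sync_dir "${HADOOP_HOME:-/opt/hadoop}"
```

Unset DRY_RUN (or set it to 0) to perform the copy, and call sync_dir again with $HBASE_HOME for the HBase nodes.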
9. Add the HADOOP_OPTS variable to the .bashrc file on all Hadoop nodes:
export HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib/native:$HADOOP_HOME/lib/"
10. Add the HBASE_OPTS variable to the .bashrc file on all HBase nodes:
export HBASE_OPTS="-Djava.library.path=$HBASE_HOME/lib/native/:$HBASE_HOME/lib/"
11. Verify LZO compression in Hadoop:
a. Create an LZO-compressed file with the lzop utility. This example compresses the LICENSE.txt file found in the HADOOP_HOME directory, which produces LICENSE.txt.lzo.
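The lzop command is missing from the source. A minimal sketch, assuming lzop is installed and the current directory is HADOOP_HOME:

```shell
# Assumption: run from $HADOOP_HOME, where LICENSE.txt lives.
input="LICENSE.txt"

if command -v lzop >/dev/null 2>&1 && [ -f "$input" ]; then
    # By default lzop keeps the input file and writes LICENSE.txt.lzo next to it.
    lzop "$input"
    ls -l "$input.lzo"
else
    echo "lzop or $input is not available in $(pwd)" >&2
fi
```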
b. Copy the generated LICENSE.txt.lzo file to the HDFS root path (/) using the following command:
bin/hadoop fs -copyFromLocal LICENSE.txt.lzo /
c. Index the LICENSE.txt.lzo file in HDFS using the following command:
bin/hadoop jar lib/hadoop-lzo-0.4.17.jar com.hadoop.compression.lzo.LzoIndexer /LICENSE.txt.lzo
Running the above command prints output like the following on the console. You can also verify that the index file was created in the HDFS browser of the Hadoop UI.
14/12/20 14:04:05 INFO lzo.GPLNativeCodeLoader: Loaded native gpl library
14/12/20 14:04:05 INFO lzo.LzoCodec: Successfully loaded & initialized native-lzo library [hadoop-lzo rev c461d77a0feec38a2bba31c4380ad60084c09205]
Java HotSpot(TM) 64-Bit Server VM warning: You have loaded library /data/repo/hadoop-2.4.1/lib/native/libhadoop.so.1.0.0 which might have disabled stack guard. The VM will try to fix the stack guard now.
It's highly recommended that you fix the library with 'execstack -c <libfile>', or link it with '-z noexecstack'.
14/12/20 14:04:06 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
14/12/20 14:04:08 INFO lzo.LzoIndexer: [INDEX] LZO Indexing file /LICENSE.txt.lzo, size 0.00 GB...
14/12/20 14:04:08 INFO Configuration.deprecation: hadoop.native.lib is deprecated. Instead, use io.native.lib.available
14/12/20 14:04:09 INFO lzo.LzoIndexer: Completed LZO Indexing in 0.61 seconds (0.01 MB/s). Index size is 0.01 KB.
12. Verify LZO compression in HBase:
You can verify LZO compression in HBase by creating a table that uses LZO compression from the HBase shell.
a. Create a table with LZO compression using the following command:
create 't1', { NAME => 'f1', COMPRESSION => 'lzo' }
b. Verify the compression type of the table using the describe command:
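The describe command itself is missing from the source. Inside the HBase shell it would presumably be describe 't1', matching the table created in the previous step. The sketch below runs it non-interactively from a regular shell, assuming the hbase launcher from $HBASE_HOME/bin is on PATH:

```shell
# Assumption: the table name 't1' comes from the create command in step 12a.
describe_cmd="describe 't1'"

if command -v hbase >/dev/null 2>&1; then
    # hbase shell reads commands from standard input when it is not a terminal.
    echo "$describe_cmd" | hbase shell
else
    echo "Start bin/hbase shell and run: $describe_cmd" >&2
fi
```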
After running the describe command, you will see console output like the following. LZO compression for the table can also be verified in the HBase UI.
DESCRIPTION                                                                ENABLED
 't1', {NAME => 'f1', DATA_BLOCK_ENCODING => 'NONE', BLOOMFILTER => 'ROW', REPLICATION_SCOPE => '0', VERSIONS => '1', COMPRESSION => 'LZO', MIN_VERSIONS => '0', TTL => 'FOREVER', KEEP_DELETED_CELLS => 'false', BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'}    true
1 row(s) in 0.8250 seconds