LZO is licensed under the GPL, which is incompatible with Hadoop's Apache license. LZO support is therefore not bundled with Hadoop and must be installed separately on the cluster before it can be used in Hadoop and HBase. LZO is a splittable compression format (once the compressed files are indexed) that offers fast compression and decompression.
Perform the following steps to enable LZO compression in Hadoop and HBase:
1. Install the LZO development packages:
sudo yum install lzo lzo-devel
2. Download the hadoop-lzo release (0.4.17 is used here) with the following command:
wget https://github.com/twitter/hadoop-lzo/archive/release-0.4.17.zip
3. Unzip the downloaded bundle:
unzip release-0.4.17.zip
4. Change the current directory to the extracted folder:
cd hadoop-lzo-release-0.4.17
5. Run the following command to build the native libraries:
ant compile-native
6. Copy the generated jar and native libraries to the Hadoop and HBase lib directories:
cp build/hadoop-lzo-0.4.17.jar $HADOOP_HOME/lib/
cp build/hadoop-lzo-0.4.17.jar $HBASE_HOME/lib/
cp build/hadoop-lzo-0.4.17/lib/native/Linux-amd64-64/* $HADOOP_HOME/lib/native/
cp build/hadoop-lzo-0.4.17/lib/native/Linux-amd64-64/* $HBASE_HOME/lib/native/
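The copy commands in step 6 can be wrapped in a small helper that refuses to copy when the build output is missing. This is only a minimal sketch: the function name `install_lzo` and its arguments are hypothetical, and the paths assume the build layout shown in the commands above.

```shell
# Hypothetical helper: copy the hadoop-lzo jar and native libraries from a
# build directory into one installation (Hadoop or HBase).
# Usage: install_lzo <build_dir> <version> <install_home>
install_lzo() {
  local build_dir="$1"
  local version="$2"
  local home="$3"
  local jar="$build_dir/hadoop-lzo-$version.jar"
  local native_dir="$build_dir/hadoop-lzo-$version/lib/native/Linux-amd64-64"

  # Fail early if the build did not produce the expected artifacts.
  [ -f "$jar" ] || { echo "missing jar: $jar" >&2; return 1; }
  [ -d "$native_dir" ] || { echo "missing native dir: $native_dir" >&2; return 1; }

  # Create the target directories if needed, then copy jar and native files.
  mkdir -p "$home/lib/native"
  cp "$jar" "$home/lib/" && cp "$native_dir"/* "$home/lib/native/"
}
```

Under these assumptions, calling `install_lzo build 0.4.17 "$HADOOP_HOME"` and then `install_lzo build 0.4.17 "$HBASE_HOME"` from the extracted folder would perform the same four copies.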
7. Add the following properties to Hadoop's core-site.xml file:
<property>
<name>io.compression.codecs</name>
<value>
org.apache.hadoop.io.compress.DefaultCodec,
org.apache.hadoop.io.compress.GzipCodec,
org.apache.hadoop.io.compress.BZip2Codec,
org.apache.hadoop.io.compress.DeflateCodec,
org.apache.hadoop.io.compress.SnappyCodec,
org.apache.hadoop.io.compress.Lz4Codec,
com.hadoop.compression.lzo.LzoCodec,
com.hadoop.compression.lzo.LzopCodec
</value>
</property>
<property>
<name>io.compression.codec.lzo.class</name>
<value>com.hadoop.compression.lzo.LzoCodec</value>
</property>
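A typo in a codec class name tends to surface only when a job runs, so it can help to check the file right after editing it. The sketch below is an assumption-laden convenience, not part of Hadoop: `has_codec` and `check_lzo_codecs` are hypothetical helper names, and the check is a plain text search rather than real XML parsing, so a commented-out entry would still match.

```shell
# Hypothetical helper: succeed if the class name appears anywhere in the file.
# Usage: has_codec <config_file> <class_name>
has_codec() {
  grep -qF -- "$2" "$1"
}

# Report whether both LZO codec classes from step 7 appear in the file.
# Returns non-zero if either class is missing.
check_lzo_codecs() {
  local conf="$1"
  local missing=0
  for cls in com.hadoop.compression.lzo.LzoCodec com.hadoop.compression.lzo.LzopCodec; do
    if has_codec "$conf" "$cls"; then
      echo "found: $cls"
    else
      echo "missing: $cls"
      missing=1
    fi
  done
  return "$missing"
}
```

The path to core-site.xml depends on the installation, so it is passed in as an argument rather than guessed.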
8. Sync the Hadoop and HBase home directories to all nodes of the Hadoop and HBase cluster. rsync accepts only one destination per invocation, so run it once per node:
rsync -az $HADOOP_HOME/ node1:$HADOOP_HOME/
rsync -az $HADOOP_HOME/ node2:$HADOOP_HOME/
rsync -az $HBASE_HOME/ node1:$HBASE_HOME/
rsync -az $HBASE_HOME/ node2:$HBASE_HOME/
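Since rsync takes a single destination per invocation, syncing to several nodes is usually written as a loop. The following is a minimal sketch under stated assumptions: `sync_home` is a hypothetical helper, the `RSYNC` override variable is invented here for previewing commands, and the node names are placeholders for the real hostnames.

```shell
# Hypothetical helper: push one directory to the same path on each node.
# Usage: sync_home <dir> <node>...
# RSYNC can be overridden (for example RSYNC=echo) to preview the commands
# instead of running them.
sync_home() {
  local dir="${1%/}"   # drop a trailing slash so it is not doubled below
  shift
  for node in "$@"; do
    # -a preserves permissions and timestamps, -z compresses during transfer.
    "${RSYNC:-rsync}" -az "$dir/" "$node:$dir/" || return 1
  done
}
```

With this in place, something like `sync_home "$HADOOP_HOME" node1 node2` followed by `sync_home "$HBASE_HOME" node1 node2` would cover both installations, assuming password-less SSH access to each node.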
9. Add the HADOOP_OPTS variable to the .bashrc file on all Hadoop nodes:
export HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib/native:$HADOOP_HOME/lib/"
10. Add the HBASE_OPTS variable to the .bashrc file on all HBase nodes:
export HBASE_OPTS="-Djava.library.path=$HBASE_HOME/lib/native/:$HBASE_HOME/lib/"
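When the .bashrc edits are scripted across many nodes, running the script twice should not leave duplicate export lines behind. The helper below is a hypothetical sketch (`append_once` is not a standard command) that appends a line only if that exact line is not already in the file.

```shell
# Hypothetical helper: append a line to a file only if that exact line
# is not already present.
# Usage: append_once <line> <file>
append_once() {
  local line="$1"
  local file="$2"
  touch "$file"   # make sure the file exists before searching it
  # -x matches whole lines only, -F treats the line as a fixed string.
  grep -qxF -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
}
```

The export line should be passed in single quotes so that `$HADOOP_HOME` or `$HBASE_HOME` is written to .bashrc literally and expanded at login time rather than when the helper runs.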
11. Verify LZO compression in Hadoop:
a. Create an LZO-compressed file with the lzop utility. The following command, run from the HADOOP_HOME directory, compresses the LICENSE.txt file found there:
lzop LICENSE.txt
b. Copy the generated LICENSE.txt.lzo file to the HDFS root path (/):
bin/hadoop fs -copyFromLocal LICENSE.txt.lzo /
c. Index the LICENSE.txt.lzo file in HDFS:
bin/hadoop jar lib/hadoop-lzo-0.4.17.jar com.hadoop.compression.lzo.LzoIndexer /LICENSE.txt.lzo
The command prints output similar to the following on the console. You can also confirm that the index file was created by using the HDFS browser in the Hadoop web UI.
14/12/20 14:04:05 INFO lzo.GPLNativeCodeLoader: Loaded native gpl library
14/12/20 14:04:05 INFO lzo.LzoCodec: Successfully loaded & initialized native-lzo library [hadoop-lzo rev c461d77a0feec38a2bba31c4380ad60084c09205]
Java HotSpot(TM) 64-Bit Server VM warning: You have loaded library /data/repo/hadoop-2.4.1/lib/native/libhadoop.so.1.0.0 which might have disabled stack guard. The VM will try to fix the stack guard now.
It's highly recommended that you fix the library with 'execstack -c <libfile>', or link it with '-z noexecstack'.
14/12/20 14:04:06 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
14/12/20 14:04:08 INFO lzo.LzoIndexer: [INDEX] LZO Indexing file /LICENSE.txt.lzo, size 0.00 GB...
14/12/20 14:04:08 INFO Configuration.deprecation: hadoop.native.lib is deprecated. Instead, use io.native.lib.available
14/12/20 14:04:09 INFO lzo.LzoIndexer: Completed LZO Indexing in 0.61 seconds (0.01 MB/s). Index size is 0.01 KB.
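The index file can also be checked from the command line with `hadoop fs -test -e`, which returns a zero exit status when the path exists. The sketch below rests on two assumptions that should be verified locally: that the indexer writes its output next to the input as `<file>.index`, and that the launcher is `bin/hadoop` unless the (hypothetical) `HADOOP` variable points elsewhere. The function names are illustrative only.

```shell
# Hypothetical helper: succeed if the index for an LZO file exists in HDFS.
# Assumes the index is stored as <file>.index next to the compressed file.
lzo_index_exists() {
  "${HADOOP:-bin/hadoop}" fs -test -e "$1.index"
}

# Print a one-line report and return non-zero when the index is missing.
report_lzo_index() {
  if lzo_index_exists "$1"; then
    echo "index present: $1.index"
  else
    echo "index missing: $1.index"
    return 1
  fi
}
```

Run from the HADOOP_HOME directory, `report_lzo_index /LICENSE.txt.lzo` would then report on the file indexed in step 11.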
12. Verify LZO compression in HBase:
You can verify LZO compression in HBase by creating a table that uses LZO compression from the HBase shell. Use straight quotes in the shell; typographic quotes cause a syntax error.
a. Create a table with LZO compression:
create 't1', { NAME => 'f1', COMPRESSION => 'LZO' }
b. Verify the compression type by describing the table:
describe 't1'
The describe command prints output similar to the following on the console. The LZO compression setting for the table can also be verified in the HBase web UI.
DESCRIPTION                                                                  ENABLED
't1', {NAME => 'f1', DATA_BLOCK_ENCODING => 'NONE', BLOOMFILTER => 'ROW',    true
REPLICATION_SCOPE => '0', VERSIONS => '1', COMPRESSION => 'LZO',
MIN_VERSIONS => '0', TTL => 'FOREVER', KEEP_DELETED_CELLS => 'false',
BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'}
1 row(s) in 0.8250 seconds
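Creating a table only records the compression setting; it does not by itself prove that every node can load the codec. HBase ships a `CompressionTest` utility class that writes and reads a small test file with a given codec. The wrapper below is a hedged sketch: `test_hbase_codec`, the `HBASE` override variable, and the default test path are all hypothetical choices made for illustration.

```shell
# Hypothetical wrapper around HBase's CompressionTest utility.
# Usage: test_hbase_codec [codec] [path]
# HBASE can be overridden (for example HBASE=echo) to preview the command
# instead of running it.
test_hbase_codec() {
  local codec="${1:-lzo}"
  local path="${2:-file:///tmp/compression-test}"
  "${HBASE:-hbase}" org.apache.hadoop.hbase.util.CompressionTest "$path" "$codec"
}
```

Running `test_hbase_codec` on each HBase node, with the `hbase` launcher on the PATH, is one way to catch a node where the native libraries from step 6 were not synced.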