[HD] Building an Open-Source Hadoop Environment (Cluster Mode): 5. Kylin

Versions: HDFS 2.7.3, Hive 2.1.1, HBase 1.2.4, Kylin 1.6.0

1. Download the Kylin package: http://www.apache.org/dyn/closer.cgi/kylin/apache-kylin-1.6.0/apache-kylin-1.6.0-hbase1.x-bin.tar.gz
My planned installation directory: /home/hadoop/BigData/kylin-1.6.0-hbase1.x-bin

$ tar zxvf apache-kylin-1.6.0-hbase1.x-bin.tar.gz -C /home/hadoop/BigData/
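
Note that the tarball extracts as apache-kylin-1.6.0-hbase1.x-bin; to match the planned directory name above, rename it:

$ mv /home/hadoop/BigData/apache-kylin-1.6.0-hbase1.x-bin /home/hadoop/BigData/kylin-1.6.0-hbase1.x-bin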

 

2. Configure environment variables for Hive and HBase (vi ~/.bash_profile); reference below. The key entries are HIVE_CONF, HBASE_CLASSPATH, and hive_dependency.

##############
### Env -- HD
##############
export HADOOP_HOME=/home/hadoop/hadoop-2.7.3
export PATH=$HADOOP_HOME/bin:$PATH
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
export YARN_CONF_DIR=$HADOOP_HOME/etc/hadoop

export HIVE_HOME=/home/hadoop/BigData/apache-hive-2.1.1-bin
export PATH=${HIVE_HOME}/bin:$PATH
export HIVE_CONF=${HIVE_HOME}/conf
export hive_dependency=${HIVE_HOME}/conf:${HIVE_HOME}/lib/*:${HIVE_HOME}/hcatalog/share/hcatalog/hive-hcatalog-core-2.1.1.jar

export ZOOKEEPER_HOME=/home/hadoop/BigData/zookeeper-3.5.1-alpha
export HBASE_HOME=/home/hadoop/BigData/hbase-1.2.4
export PATH=${HBASE_HOME}/bin:$PATH
export HBASE_CLASSPATH=$HBASE_HOME/lib/*
export HBASE_CLASSPATH=$HADOOP_CONF_DIR:$HADOOP_HOME/*:$HADOOP_HOME/lib/*:$ZOOKEEPER_HOME/*:$ZOOKEEPER_HOME/lib/*:$HBASE_CLASSPATH
export KYLIN_HOME=/home/hadoop/BigData/kylin-1.6.0-hbase1.x-bin
export KYLIN_CONF=${KYLIN_HOME}/conf
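
After editing, reload the profile and sanity-check one of the variables:

$ source ~/.bash_profile
$ echo $KYLIN_HOME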

 

3. Configure Hive
Create a symlink of hive-site.xml under ${KYLIN_HOME}/conf/:

$ ln -s /home/hadoop/BigData/apache-hive-2.1.1-bin/conf/hive-site.xml /home/hadoop/BigData/kylin-1.6.0-hbase1.x-bin/conf/hive-site.xml
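
To confirm the link resolves:

$ ls -l ${KYLIN_HOME}/conf/hive-site.xml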

 

4. Configure HBase
Merge the property entries from hbase-site.xml into ${KYLIN_HOME}/conf/kylin_job_conf.xml; reference below:

<configuration>

<property>
<name>mapreduce.job.split.metainfo.maxsize</name>
<value>-1</value>
<description>The maximum permissible size of the split metainfo file.
The JobTracker won't attempt to read split metainfo files bigger than
the configured value. No limits if set to -1.
</description>
</property>

<property>
<name>mapreduce.map.output.compress</name>
<value>true</value>
<description>Compress map outputs</description>
</property>

<!--
The default map outputs compress codec is org.apache.hadoop.io.compress.DefaultCodec,
if SnappyCodec is supported, org.apache.hadoop.io.compress.SnappyCodec could be used.
-->
<!--
<property>
<name>mapreduce.map.output.compress.codec</name>
<value>org.apache.hadoop.io.compress.SnappyCodec</value>
<description>The compression codec to use for map outputs
</description>
</property>
-->
<property>
<name>mapreduce.output.fileoutputformat.compress</name>
<value>true</value>
<description>Compress the output of a MapReduce job</description>
</property>
<!--
The default job outputs compress codec is org.apache.hadoop.io.compress.DefaultCodec,
if SnappyCodec is supported, org.apache.hadoop.io.compress.SnappyCodec could be used.
-->
<!--
<property>
<name>mapreduce.output.fileoutputformat.compress.codec</name>
<value>org.apache.hadoop.io.compress.SnappyCodec</value>
<description>The compression codec to use for job outputs
</description>
</property>
-->
<property>
<name>mapreduce.output.fileoutputformat.compress.type</name>
<value>BLOCK</value>
<description>The compression type to use for job outputs</description>
</property>
<!-- hbase-site.xml properties -->
<property>
<name>mapreduce.job.max.split.locations</name>
<value>2000</value>
<description>No description</description>
</property>

<property>
<name>dfs.replication</name>
<value>2</value>
<description>Block replication</description>
</property>

<property>
<name>mapreduce.task.timeout</name>
<value>3600000</value>
<description>Set task timeout to 1 hour</description>
</property>

<property>
<name>hbase.rootdir</name>
<value>hdfs://192.168.111.140:9000/hbase</value>
</property>
<property>
<name>hbase.cluster.distributed</name>
<value>true</value>
</property>
<property>
<name>hbase.zookeeper.quorum</name>
<value>192.168.111.140,192.168.111.141,192.168.111.142</value>
</property>
<property>
<name>hbase.zookeeper.property.clientPort</name>
<value>2181</value>
</property>
<property>
<name>hbase.zookeeper.property.dataDir</name>
<value>/home/hadoop/BigData/zookeeper-3.5.1-alpha/zookeeper-data</value>
</property>

</configuration>
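
An invalid property block here will break job configuration loading, so it is worth validating the XML after editing; for example, assuming xmllint is available on the machine:

$ xmllint --noout ${KYLIN_HOME}/conf/kylin_job_conf.xml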

 

5. Configure kylin.properties; reference below:

kylin.rest.servers=192.168.111.140:7070

kylin.job.jar=$KYLIN_HOME/lib/kylin-job-1.6.0.jar
kylin.coprocessor.local.jar=$KYLIN_HOME/lib/kylin-coprocessor-1.6.0.jar
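
Two related Kylin 1.x settings in the same file worth double-checking (the values below are the shipped defaults as I understand them; verify against your own kylin.properties):

kylin.metadata.url=kylin_metadata@hbase
kylin.hdfs.working.dir=/kylin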

 

6. Start Kylin. When kylin.sh runs, it automatically invokes check-env.sh, find-hbase-dependency.sh, and find-hive-dependency.sh.

$ sh /home/hadoop/BigData/kylin-1.6.0-hbase1.x-bin/bin/kylin.sh start
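
To watch the startup log, and to stop the server when needed:

$ tail -f /home/hadoop/BigData/kylin-1.6.0-hbase1.x-bin/logs/kylin.log
$ sh /home/hadoop/BigData/kylin-1.6.0-hbase1.x-bin/bin/kylin.sh stop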

 

7. Run the sample program

$ sh /home/hadoop/BigData/kylin-1.6.0-hbase1.x-bin/bin/sample.sh

If you hit permission errors due to your HDFS configuration, grant permissions on the relevant HDFS directories.
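
For example, a blunt fix on Kylin's HDFS working directory (this assumes the default /kylin path; use a tighter mode in production):

$ hdfs dfs -mkdir -p /kylin
$ hdfs dfs -chmod -R 777 /kylin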

 

8. Create a Cube

Open the Kylin URL http://192.168.111.140:7070/kylin (default username/password: ADMIN/KYLIN).

Step 8.1. Reload the metadata store (sample.sh loads the sample cube's metadata; reload it from the Web UI, or restart Kylin, so the cube shows up)

Step 8.2. Build the Cube

Step 8.3. Query
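
Once the build finishes, a sample query from the Kylin quick-start that the sample cube (kylin_sales) can serve:

select part_dt, sum(price) as total_sold, count(distinct seller_id) as sellers
from kylin_sales
group by part_dt
order by part_dt;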


Common problems and solutions:

1. At stage #17, Step Name: Convert Cuboid Data to HFile, the YARN log reports: log4j:ERROR setFile(null,true) call failed. java.io.FileNotFoundException: /home/hadoop/hadoop-2.7.3/logs/userlogs/application_1489454186796_0015/container_1489454186796_0015_01_000001 (Is a directory)
Cause: either the Hive jars were not picked up correctly (make sure hive_dependency is configured properly), or YARN is short on memory and cannot allocate new containers; in the latter case, increase YARN's memory or shrink the per-container allocation. My yarn-site.xml for reference; change it on all nodes and restart for it to take effect:

<property>
<name>yarn.nodemanager.resource.memory-mb</name>
<value>32840</value>
<description>Total memory the NodeManager on this node may allocate to containers</description>
</property>
<property>
<name>yarn.scheduler.minimum-allocation-mb</name>
<value>1024</value>
<description>Minimum memory a single container may request (default 1024 MB)</description>
</property>
<property>
<name>yarn.scheduler.maximum-allocation-mb</name>
<value>8192</value>
<description>Maximum memory a single container may request (default 8192 MB)</description>
</property>
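
After changing the file on every node, restart YARN; with the scripts shipped in Hadoop 2.7.3:

$ sh /home/hadoop/hadoop-2.7.3/sbin/stop-yarn.sh
$ sh /home/hadoop/hadoop-2.7.3/sbin/start-yarn.sh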

 

2. Error: the history server cannot be found.
Cause: the JobHistory Server is not running.
Solution: start (or restart) the history server on all nodes:
sh /home/hadoop/hadoop-2.7.3/sbin/mr-jobhistory-daemon.sh stop historyserver
sh /home/hadoop/hadoop-2.7.3/sbin/mr-jobhistory-daemon.sh start historyserver
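
Verify on each node that the process is up:

$ jps | grep JobHistoryServer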

 

3. Error: Method Not found org.apache.hbase.util.CheckSumError
Cause: the HBase jars bundled with Hive 2.1.1 are version 1.1.1, while this cluster runs HBase 1.2.4, and hbase-common-1.1.1.jar does not contain the method.
Solution: replace all of Hive's HBase jars with the HBase 1.2.4 ones. The jars in question:

hbase-annotations-1.1.1.jar
hbase-client-1.1.1.jar
hbase-common-1.1.1.jar
hbase-common-1.1.1-tests.jar
hbase-hadoop2-compat-1.1.1.jar
hbase-hadoop2-compat-1.1.1-tests.jar
hbase-hadoop-compat-1.1.1.jar
hbase-prefix-tree-1.1.1.jar
hbase-procedure-1.1.1.jar
hbase-protocol-1.1.1.jar
hbase-server-1.1.1.jar
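
A minimal sketch of the swap, assuming the HIVE_HOME and HBASE_HOME paths from step 2 (the old jars are moved aside rather than deleted; if your HBase lib also ships the *-tests.jar files from the list, copy them the same way):

$ mkdir ${HIVE_HOME}/lib/hbase-1.1.1-bak
$ mv ${HIVE_HOME}/lib/hbase-*1.1.1*.jar ${HIVE_HOME}/lib/hbase-1.1.1-bak/
$ for m in annotations client common hadoop2-compat hadoop-compat prefix-tree procedure protocol server; do cp ${HBASE_HOME}/lib/hbase-${m}-1.2.4.jar ${HIVE_HOME}/lib/; done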

 

4. UnknownProtocolException: org.apache.hadoop.hbase.exceptions.UnknownProtocolException: No registered coprocessor service found for name CubeVisitService in region KYLIN_RF44K1EOAC
Cause: Kylin leverages HBase coprocessors to optimize query performance. After a new version is released, the RPC protocol may change, so users need to redeploy the coprocessor to the HTables.
Solution: run the following command to redeploy the coprocessor:

$ $KYLIN_HOME/bin/kylin.sh org.apache.kylin.storage.hbase.util.DeployCoprocessorCLI $KYLIN_HOME/lib/kylin-coprocessor-*.jar all

