Continuing the installation series (hadoop, hive): setting up Hive and Kylin on top of the existing hadoop+hbase+hive+spark stack. For Hive, add environment variables to /etc/profile and ~/... as follows:
...=/itcast/hive/conf:/itcast/hive/lib/*:/itcast/hive/hcatalog/share/hcatalog/hive-hcatalog-pig-adapter-.../hive/hcatalog/share/hcatalog/hive-hcatalog-server-extensions-1.1.0-cdh5.5.1.jar:/itcast/hive/hcatalog...
Then, under itcast/kylin/conf, edit kylin.properties:
kylin.rest.servers=centos1:7070,centos2:7070,centos3:7070
kylin.hbase.cluster.fs=hdfs://mycluster/apps/hbase/data
kylin.route.hive.enabled=true
kylin.route.hive.url=jdbc:hive2://centos1
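After editing kylin.properties, the same file has to reach every node listed in kylin.rest.servers, and Kylin has to be restarted there. A minimal sketch, assuming Kylin lives under /itcast/kylin on all three machines:

# sync the edited config to the other two Kylin nodes
scp /itcast/kylin/conf/kylin.properties centos2:/itcast/kylin/conf/
scp /itcast/kylin/conf/kylin.properties centos3:/itcast/kylin/conf/
# restart Kylin on each node so the new properties take effect
/itcast/kylin/bin/kylin.sh stop
/itcast/kylin/bin/kylin.sh start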
HCatalog solves this problem for us: with it, we no longer need to care about the storage format of data in Hive. This post is mainly about using HCatalog to read and write Hive tables from MapReduce. ...HCatalog makes Hive's metadata available to other Hadoop tools, such as Pig, MapReduce, and Hive itself. ...HCatalog tables give users a relational view of data in HDFS and guarantee that users need not worry about where their data is stored or in what format, so there is no need to know whether the data sits in RCFile, plain text, or SequenceFile form. ...HCatalog provides HCatInputFormat/HCatOutputFormat so that MapReduce users can read and write data in Hive's warehouse, touching only the partitions of the tables and columns they actually need.
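As a sketch of how such a job is submitted (the jar and job class below are hypothetical; the jar paths follow this post's layout), the HCatalog and Hive jars must be on the classpath, and HCatInputFormat/HCatOutputFormat are pointed at the source and target tables:

# put the Hive/HCatalog jars and the Hive config on the MR classpath
export HADOOP_CLASSPATH=$HCAT_HOME/share/hcatalog/*:$HIVE_HOME/lib/*:$HIVE_HOME/conf
# run a ToolRunner-based job that reads mydb.src via HCatInputFormat
# and writes mydb.dst via HCatOutputFormat (jar and class are hypothetical)
hadoop jar my-hcat-job.jar com.example.HCatCopy \
  -libjars $HCAT_HOME/share/hcatalog/hive-hcatalog-core-1.1.0.jar \
  mydb.src mydb.dst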
When the HCatalog directory cannot be found, Hadoop tools print a warning like this on startup:
/hcatalog does not exist! HCatalog jobs will fail.
export HBASE_HOME=/developer/hbase-1.2.0
# 6. hive
export HIVE_HOME=/developer/apache-hive-1.1.0-bin
export HIVE_CONF_DIR=${HIVE_HOME}/conf
export HCAT_HOME=$HIVE_HOME/hcatalog
# 7. kylin
export KYLIN_HOME=/developer/apache-kylin-2.3.0-bin
export hive_dependency=$HIVE_HOME/conf:$HIVE_HOME/lib/*:$HCAT_HOME/share/hcatalog/hive-hcatalog-core-1.1.0.jar
#Path
# 1. big data
export PATH=$KYLIN_HOME/bin:$PATH
export PATH=$HIVE_HOME/bin:$HBASE_HOME/bin:$HADOOP_HOME/bin:$PATH
export PATH=$MAVEN_HOME/bin:$CATALINA_HOME/bin:$JAVA_HOME/bin:$PATH
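One way to sanity-check these variables before starting Kylin (check-env.sh ships in the Kylin binary package):

source /etc/profile
echo $hive_dependency            # should print the conf:lib:hcatalog-core classpath
$KYLIN_HOME/bin/check-env.sh     # Kylin's own environment check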
yum remove -y oozie-client.noarch
yum remove -y gweb.noarch
yum remove -y snappy-devel.x86_64
yum remove -y hcatalog.noarch
...
rm -rf hadoop-log
rm -rf hadoop-lib
rm -rf hadoop-default
rm -rf oozie-conf
rm -rf hcatalog-conf
...
rm -rf /etc/hcatalog
rm -rf /etc/hive
rm -rf /etc/ganglia
rm -rf /etc/nagios
rm -rf /etc/oozie
...
rm -rf /etc/sqoop
rm -rf /etc/zookeeper
rm -rf /var/run/hadoop
rm -rf /var/run/hbase
rm -rf /var/...
rm -rf /usr/lib/hcatalog
rm -rf /usr/lib/hive
rm -rf /usr/lib/oozie
rm -rf /usr/lib/sqoop
rm -rf ...
1. Hadoop environment setup — see: hadoop_学习_02_Hadoop环境搭建(单机)
2. HBase environment setup — see: hbase_学习_01_HBase环境搭建(单机)
3. Hive environment setup — see: hive...
Then set the same environment variables as above (HIVE_CONF_DIR, HCAT_HOME, KYLIN_HOME, hive_dependency, PATH).
(1) Start Hadoop: from Hadoop's sbin directory, run ./start-all.sh
(2) Start HBase: from HBase's bin directory, run ./start-hbase.sh
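After both scripts return, jps should list the expected daemons; roughly (the exact set varies with the deployment mode):

jps
# expect: NameNode, SecondaryNameNode, DataNode, ResourceManager,
#         NodeManager, HMaster, HRegionServer, HQuorumPeer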
/hbase does not exist! HBase imports will fail.
/hcatalog does not exist! HCatalog jobs will fail.
These warnings can be silenced by commenting out the corresponding echo lines in Sqoop's configure-sqoop script:
..."HCatalog jobs will fail." # echo 'Please set $HCAT_HOME to the root of your HCatalog installation.'
Ambari currently supports most Hadoop components, including HDFS, MapReduce, Hive, Pig, HBase, Zookeeper, Sqoop, and HCatalog. Apache Ambari provides centralized management for HDFS, MapReduce, Hive, Pig, HBase, Zookeeper, Sqoop, HCatalog, and more, and is one of the five leading Hadoop cluster management tools.
Component — Supported
HDFS — yes
HBase — yes
Hive — yes
Yarn — yes
Storm — yes
Kafka — yes
Knox — yes
Solr — yes
Druid — yes
More (custom) — yes
As for what Ambari does: it creates, manages, and monitors Hadoop clusters, where "Hadoop" is meant broadly, i.e. the whole Hadoop ecosystem (Hive, HBase, Sqoop, Zookeeper, and so on), not just Hadoop itself. Key operational metrics come preconfigured, and you can see directly whether Hadoop Core (HDFS and MapReduce) and related projects (such as HBase, Hive, and HCatalog) are healthy.
CentOS 6.4 virtual machines, with IP addresses:
192.168.56.101 master
192.168.56.102 slave1
192.168.56.103 slave2
hadoop 2.7.2, hbase...
Set the hive_dependency environment variable:
export hive_dependency=/home/grid/hive/conf:/home/grid/hive/lib/*:/home/grid/hive/hcatalog/share/hcatalog/hive-hcatalog-core-2.0.0.jar
6. Copy the Hive installation directory to the other nodes of the Hadoop cluster:
scp -r hive slave1:/home/grid
.../bin/start-hbase.sh
3. ...two tables were created, as shown in Figure 6.
Note: 1. The kylin, hadoop, hbase, and hive versions must match.
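Since version mismatches are the most common cause of trouble here, it is worth printing each component's version on every node before going further:

hadoop version | head -1
hbase version 2>&1 | head -1
hive --version 2>&1 | head -1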
...but a lot of data is not Tab-delimited, so below we show how to use Hive to load data into HBase.
.../* /tmp/hbase_splits;
c. Create hfiles.hql:
ADD JAR /usr/lib/hbase/hbase-0.94.6.1.3.0.0-104-security.jar;
...
put pagecounts-20081001-000000.gz /$Path_to_Input_Files_on_Hive_Client/wikistats/
3. Create the necessary tables. Note: $HCATALOG_USER is the user the HCatalog service runs as (hcat by default):
$HCATALOG_USER -f /$Path_to_Input_Files_on_Hive_Client/tables.ddl
After it runs, we see output like:
INFO streaming.StreamJob: Output: /tmp/hbase_splits_txt
Then run:
hadoop fs -cat /tmp/hbase_splits_txt
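The excerpt stops before the final step, which in this kind of workflow is handing the generated HFiles over to HBase with the standard completebulkload tool; a sketch, with hypothetical HFile path and table name:

# hand the HFiles produced by the Hive job to HBase
# (/tmp/hfiles and 'pagecounts' are hypothetical placeholders)
hbase org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles /tmp/hfiles pagecounts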
/hive-hcatalog does not exist! HCatalog jobs will fail.
...
--hbase-create-table --hbase-row-key id --column-family imei
hbase->mysql: as for getting HBase data back into MySQL, Sqoop...
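Those flags belong to a Sqoop import from MySQL into HBase; a minimal sketch, with hypothetical connection details and table names:

# import a MySQL table into HBase, creating the target table if needed
# (host, database, credentials, and table names are hypothetical)
sqoop import --connect jdbc:mysql://centos1:3306/test --username root -P \
  --table imei_info \
  --hbase-table imei_info --hbase-create-table \
  --hbase-row-key id --column-family imei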
... SUCCESS [  0.536 s]
[INFO] Hive HBase Handler ................................. SUCCESS [ 20.985 s]
[INFO] Hive HCatalog ...................................... SUCCESS [ 48.139 s]
[INFO] Hive HCatalog Core ................................. SUCCESS [  5.561 s]
[INFO] Hive HCatalog Pig Adapter .......................... SUCCESS [  4.961 s]
[INFO] Hive HCatalog Server Extensions .................... ...
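This reactor summary is what a successful source build of Hive prints; the usual invocation (standard Maven flags; the exact profile can differ by Hive version) is:

# build Hive from source without running tests
mvn clean package -DskipTests -Pdist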
/hbase does not exist! HBase imports will fail.
Please set $HBASE_HOME to the root of your HBase installation.
/hcatalog does not exist! HCatalog jobs will fail.
Please set $HCAT_HOME to the root of your HCatalog installation.
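The warnings disappear once both variables point at real installations, e.g. with the layout used earlier in this post:

# point Sqoop at the HBase and HCatalog installs
export HBASE_HOME=/developer/hbase-1.2.0
export HCAT_HOME=$HIVE_HOME/hcatalog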
I. Overview
1. HBase is a highly reliable, high-performance, column-oriented, scalable distributed database with real-time reads and writes.
II. HBase data model
2.1 ROW KEY (the counterpart of a relational database's primary key): determines a row; rows are sorted in lexicographic key order.
...HBase stores the data of one column family together in the same directory, kept in several files.
2.3 Timestamp (effectively the version!)
III. HBase architecture
3.1 Client: provides the interfaces for accessing HBase and maintains a cache to speed up access.
3.2 Zookeeper: guarantees that the cluster has only one active master at any time (HA), and stores the addressing entry point for all Regions.
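Row key, column family, and timestamp map directly onto HBase shell operations; a small sketch (table name and values are hypothetical):

hbase shell <<'EOF'
create 'person', {NAME => 'info', VERSIONS => 3}
put 'person', 'row1', 'info:name', 'alice'        # 'row1' is the row key
put 'person', 'row1', 'info:name', 'alicia'       # same cell, newer timestamp
get 'person', 'row1', {COLUMN => 'info:name', VERSIONS => 2}
EOF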
Packages:
hive-hbase hive-webhcat hue-beeswax hue-hbase hue-impala hue-pig hue-plugins hue-rdbms hue-search hue-spark
Directories:
/var/lib/solr /var/lib/sqoop* /var/lib/zookeeper
/usr/lib/hadoop /usr/lib/hadoop* /usr/lib/hive /usr/lib/hbase
/usr/lib/hue /usr/lib/oozie /usr/lib/sqoop* /usr/lib/zookeeper /usr/lib/bigtop* /usr/lib/flume-ng /usr/lib/hcatalog
/var/run/hbase /var/run/impala /var/run/hive /var/run/hdfs-sockets
Commands to remove the services:
rm -rf /usr/bin/hadoop* /usr...* /etc/hcatalog /etc/sentry /etc/solr /etc/spark*
rm -rf /etc/alternatives/avro-tools /etc/alternatives...
...and monitoring Apache Hadoop clusters, with support for Hadoop HDFS, Hadoop MapReduce, Hive, HCatalog, HBase, ZooKeeper, Oozie, Pig and Sqoop.
HBase: use Apache HBase when you need random, realtime read/write access to your Big Data. It also ships convenient base classes for backing Hadoop MapReduce jobs with Apache HBase tables.
HCatalog: Apache HCatalog is a table and storage management service for data created using Apache Hadoop.