
Comparing HDFS and YARN high availability

By SRE运维实践 · Published 2021-03-09 15:13:21

Preface

One day you will be able to speak with a smile about the things that once caused you pain. After all, some things, though not what you wanted, were things you brought on yourself: what looks like helplessness is really an unwillingness to choose. There is only one road to success, while failure arrives by many different roads.

We obsess over what we cannot have, yet fail to cherish it once we have it. Why is that? Have we forgotten why we set out in the first place, or have new desires arisen and come back to torment us?

High availability

1 Comparing the HA architectures

HDFS was created to solve the problem of storing massive amounts of data. It uses a distributed architecture: a large file is split into blocks, which are then spread across different machines and different racks. Because the data nodes can be scaled out horizontally, the cluster can keep absorbing and storing data at massive scale.

Since HDFS exists to store data, it has to guarantee data reliability, which is where the three-replica mechanism on the datanodes comes in. Writes go through a pipeline: under normal conditions the client only gets a success response once all three nodes have written the data, although in special cases a single successful write is enough, because HDFS has a built-in re-replication mechanism. Whenever the replica count falls below the configured value, it picks a nearby, lightly loaded node and copies the data over.
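
As a quick command-line illustration (a minimal sketch; the path /user/test/data.txt is only an example), the replica placement of a file can be inspected with fsck, and its replication factor adjusted with setrep:

Code language: shell
# Show the blocks, replica locations and replication factor of a file (example path)
hdfs fsck /user/test/data.txt -files -blocks -locations
# Raise the replication factor to 3 and wait until the extra copies exist
hdfs dfs -setrep -w 3 /user/test/data.txt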

HDFS is also meant to support analytics over massive data, such as MapReduce jobs. Storing multiple replicas of a file means that when three tasks run over the same data, they can be spread across three machines, making full use of the available compute.

Because HDFS is distributed storage, it needs an index, something like a dictionary: what data exists, how many blocks there are, what the permissions are, who the owner is. That is the namenode. Having a name server means its metadata must be persisted, and what gets saved are the fsimage and the edit log. When a client uploads data, the operation is first recorded in the edit log and then acknowledged to the client, while the fsimage is essentially a snapshot of the in-memory metadata; think of it as a baseline, much like a benchmark baseline.

As the HDFS HA architecture diagram above shows, there are quite a few components, including zkfc and a QJM cluster. Now look at the high availability of the YARN cluster.

A side-by-side comparison makes it obvious that the YARN HA architecture is far simpler than the HDFS one: no zkfc, no QJM cluster, just a ZooKeeper cluster responsible for electing the active resourcemanager.

Why such a big difference? It is the difference between high availability with persistent state and stateless high availability. For the HDFS namenode to be highly available, its data must stay in sync, so a shared storage layer (QJM) is needed to hold the edits log, which is then replayed on the standby node. The resourcemanager, by contrast, has nothing it must persist: it is stateless, like a container that can simply be deleted and recreated without any problem. So the difference comes down to whether some data has to be preserved, that is, stateful versus stateless.
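
On the YARN side, the elected state can be queried per resourcemanager (a minimal sketch, assuming the ids rm1 and rm2 from a typical yarn.resourcemanager.ha.rm-ids setting):

Code language: shell
# Ask each resourcemanager whether it is active or standby (rm1/rm2 are example ids)
yarn rmadmin -getServiceState rm1
yarn rmadmin -getServiceState rm2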

You can think of stateless as being married with no kids, and stateful as being married with a kid. Once there is a kid, someone has to watch it. Splitting up on a whim is fine in the stateless case; in the stateful case someone has to stay home with the kid, and that someone is the standby, while the other one goes out to earn a living as the active. The worst case is both going out to earn money, both becoming active, the so-called split brain. When that happens, one of them usually gets knocked down on the spot and demoted to standby, told to just watch the kid. If both stay home watching the kid, that is, both are standby, it is not so bad: there is temporarily no income, but at least the kid is not left unattended. For the stateless case, both going out to work is also not so bad: at most there is extra money, meaning some tasks get executed more than once, effectively rerun, which may produce duplicate data; if the jobs are designed well this is not a problem. But if neither goes out to work and both end up standby, everyone goes hungry, because right after the wedding there is already a pile of tasks waiting for resources.

2 A closer look at HDFS

Up close it looks like a flower; from a distance it looks like a mess. With many things, if you stare too deeply you lose sight of the big picture: bury your head watching the road, or drop your head trying to watch the sky, and you end up seeing nothing at all...

Code language: shell
# What a datanode stores: the block data plus a checksum file for each block
[root@KEL subdir11]# pwd
/$HADOOP_HOME/$DATADIR/dfs/data/current/BP-184102405-192.168.1.10-1612873956948/current/finalized/subdir0/subdir11
[root@KEL subdir11]# cat blk_1073744840
1,1613991123,admin,admin
[root@KEL subdir11]# cat blk_1073744840_4016.meta 
ʗE
[root@KEL subdir11]# file blk_1073744840_4016.meta 
blk_1073744840_4016.meta: raw G3 data, byte-padded

When looking at what the datanodes store, note that these blocks and their checksums are not stored on the namenode; the namenode learns about them through its communication with the datanodes. At startup the datanodes report all of their blocks, and while that is happening the namenode sits in the so-called safe mode (safemode), during which you can only read metadata, not modify or add it.
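
Safe mode can be checked and, if necessary, toggled by hand with dfsadmin (a minimal sketch of the relevant sub-commands):

Code language: shell
# Is the namenode currently in safe mode?
hdfs dfsadmin -safemode get
# Enter or leave safe mode manually
hdfs dfsadmin -safemode enter
hdfs dfsadmin -safemode leave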

Now look at what the namenode keeps on disk:

Code language: shell
# What the namenode stores: edits logs and fsimage files
edits_0000000000000053943-0000000000000053944  edits_inprogress_0000000000000053945
fsimage_0000000000000053730  fsimage_0000000000000053730.md5
fsimage_0000000000000053930  fsimage_0000000000000053930.md5
edits_0000000000000013106-0000000000000013107  seen_txid
edits_0000000000000013108-0000000000000013109  VERSION63
edits_0000000000000013110-0000000000000013111
[root@KEL current]# pwd
/$HADOOP_HOME/$DATA_DIR/dfs/name/current
[root@KEL current]# file fsimage_0000000000000053730
fsimage_0000000000000053730: data
[root@KEL current]# file fsimage_0000000000000053730.md5 
fsimage_0000000000000053730.md5: ASCII text
[root@KEL current]# cat fsimage_0000000000000053730.md5 
2c8359248cbcc504dca7e3020f8bb309 *fsimage_0000000000000053730

You can see a large number of edits files, which record the operations performed. The plain edits files can be regarded as history, while the edits_inprogress file is the one currently being written.

Code language: shell
# lsof shows which files the namenode process currently has open
[root@KEL current]# jps
2560 NameNode
[root@KEL current]# lsof -p 2560|grep -v jar
java    2560 root  233u   /edits_inprogress_0000000000000053951

The contents of the fsimage and edits files can be inspected with the following commands:

Code language: shell
# Convert the fsimage to XML
[root@KEL current]# hdfs oiv -p xml -i fsimage_0000000000000053730 -o fsimage.xml
[root@KEL current]# vim fsimage.xml
<?xml version="1.0"?>
<fsimage>
  <NameSection>
    <genstampV1>1000</genstampV1>
    <genstampV2>1002</genstampV2>
    <genstampV1Limit>0</genstampV1Limit>
    <lastAllocatedBlockId>1073741826</lastAllocatedBlockId>
    <txid>37</txid>
  </NameSection>
  <INodeSection>
    <lastInodeId>16400</lastInodeId>
    <inode>
      <id>16385</id>
      <type>DIRECTORY</type>
      <name></name>
      <mtime>1392772497282</mtime>
      <permission>theuser:supergroup:rwxr-xr-x</permission>
      <nsquota>9223372036854775807</nsquota>
      <dsquota>-1</dsquota>
    </inode>
  ...remaining output omitted..

Viewing the contents of an edits file:

Code language: shell
# Convert an edits log to XML for viewing; this particular segment has no real content
[root@KEL current]# hdfs oev -p xml -i edits_0000000000000013122-0000000000000013123 -o edits.xml
[root@KEL current]# vim edits.xml 
[root@KEL current]# cat edits.xml 
<?xml version="1.0" encoding="UTF-8"?>
<EDITS>
  <EDITS_VERSION>-63</EDITS_VERSION>
  <RECORD>
    <OPCODE>OP_START_LOG_SEGMENT</OPCODE>
    <DATA>
      <TXID>13122</TXID>
    </DATA>
  </RECORD>
  <RECORD>
    <OPCODE>OP_END_LOG_SEGMENT</OPCODE>
    <DATA>
      <TXID>13123</TXID>
    </DATA>
  </RECORD>
</EDITS>

Viewing what the journal nodes keep:

Code language: shell
# The journal nodes store edits files; the active namenode writes them, the standby reads them
[root@KEL current]# cat last-promised-epoch 
78
[root@KEL current]# cat last-writer-epoch 
78
[root@KEL current]# ls -l edits_*|wc -l
5652

In the HA architecture, zkfc is essentially the namenode's ZooKeeper client. When the namenode process dies, its zkfc process is the first to know; it then runs the fencing logic and has the old active set to standby. But if the zkfc process itself dies, things take a while, because each zkfc only holds the power of life and death over its own namenode.

When setting up HA there is also a format step, which creates a znode in ZooKeeper (a sketch of the command follows the listing below). You can also look at what is stored in ZooKeeper:

Code language: shell
# Connect with the zk client; the znodes record which namenode is active
[root@KEL bin]# ./zkCli.sh -server localhost:3001
[zk: localhost:3001(CONNECTED) 4] get /hadoop-ha/ns/ActiveStandbyElectorLock
nsnn1KEL �F(�>
cZxid = 0x2f00000007
ctime = Sat Mar 06 01:43:02 CST 2021
mZxid = 0x2f00000007
mtime = Sat Mar 06 01:43:02 CST 2021
pZxid = 0x2f00000007
cversion = 0
dataVersion = 0
aclVersion = 0
ephemeralOwner = 0x2000018c99c0000
dataLength = 20
numChildren = 0
[zk: localhost:3001(CONNECTED) 5] get /hadoop-ha/ns/ActiveBreadCrumb
nsnn1KEL �F(�>
cZxid = 0x300000008
ctime = Tue Feb 09 20:33:49 CST 2021
mZxid = 0x2f00000008
mtime = Sat Mar 06 01:43:03 CST 2021
pZxid = 0x300000008
cversion = 0
dataVersion = 148
aclVersion = 0
ephemeralOwner = 0x0
dataLength = 20
numChildren = 0
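
The format step mentioned above is a one-time initialization that creates the /hadoop-ha znode structure; a minimal sketch, run once on one of the namenode hosts before the zkfc daemons are started:

Code language: shell
# Initialize the HA state znode in ZooKeeper (one-time step during HA setup)
hdfs zkfc -formatZK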

3 Information on the namenode web UI

The namenode web UI displays plenty of information worth watching. For operations people these are the numbers to keep an eye on, since the focus is on the underlying architecture.

The overview includes: whether the cluster is in safe mode, its health, the number of files and directories, and the number of blocks. From these you can make a rough judgement about file sizes, because if HDFS stores a huge number of small files it consumes a lot of namenode memory, and looking up the corresponding block information during processing hurts performance as well. HDFS is written in Java, so the page also shows heap usage. Below that is a summary: configured capacity, the capacity used by the distributed file system, non-DFS usage, block pool usage, and the usage percentage of the data disks. Since this is a distributed storage system, all of these capacity figures matter. The "non DFS used" figure on this page is particularly handy: sometimes it is other data eating the disk that leaves DFS short of space.
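
The same file, directory and block counts can be pulled from the command line (a hedged sketch; the root path is used only as an example):

Code language: shell
# Directory count, file count and total bytes under /
hdfs dfs -count /
# fsck prints a summary with total blocks and average block size at the end
hdfs fsck / | tail -n 20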

This part shows the QJM information: which machines the QJM runs on, which edit file is currently in effect, and the corresponding transaction id (the standby shows different information here, only the QJM). After that come the namenode's own storage directories, with storage types image and edits.

The datanode tab shows per-node information: how much capacity each node uses, whether any disks have failed, and so on. "Decommissioning" marks nodes being retired, for example when a machine has to be taken offline for repair or replacement; you mostly see this during scale-out and scale-in.
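
Decommissioning itself is driven by the exclude file referenced by dfs.hosts.exclude (a hedged sketch; the file path and hostname below are examples, not taken from this cluster):

Code language: shell
# Add the datanode to the exclude file pointed to by dfs.hosts.exclude (example path)
echo "datanode-03.example.com" >> /etc/hadoop/conf/dfs.exclude
# Ask the namenode to re-read the include/exclude lists and start decommissioning
hdfs dfsadmin -refreshNodes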

Code language: shell
# The HDFS administration command; check its usage/help output
[root@KEL bin]# hdfs dfsadmin -h
h: Unknown command
Usage: hdfs dfsadmin
Note: Administrative commands can only be run as the HDFS superuser

The log below mainly reports disk failures:

Code language: text
# Datanode log
2021-03-06 01:23:34,954 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Ending block pool service for: Block pool <registering> (Datanode Uuid unassigned) service to KEL1/192.168.1.99:9000
2021-03-06 01:23:34,955 FATAL org.apache.hadoop.hdfs.server.datanode.DataNode: Initialization failed for Block pool <registering> (Datanode Uuid unassigned) service to KEL/192.168.1.10:9000. Exiting.
org.apache.hadoop.util.DiskChecker$DiskErrorException: Too many failed volumes - current valid volumes: 1, volumes configured: 2, volumes failed: 1, volume failures tolerated: 0
        at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.<init>(FsDatasetImpl.java:285)
        at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetFactory.newInstance(FsDatasetFactory.java:34)
        at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetFactory.newInstance(FsDatasetFactory.java:30)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.initStorage(DataNode.java:1371)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.initBlockPool(DataNode.java:1323)
        at org.apache.hadoop.hdfs.server.datanode.BPOfferService.verifyAndSetNamespaceInfo(BPOfferService.java:317)
        at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.connectToNNAndHandshake(BPServiceActor.java:223)
        at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:802)
        at java.lang.Thread.run(Thread.java:748)
2021-03-06 01:23:34,955 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Ending block pool service for: Block pool <registering> (Datanode Uuid unassigned) service to KEL/192.168.1.10:9000
2021-03-06 01:23:35,060 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Removed Block pool <registering> (Datanode Uuid unassigned)
2021-03-06 01:23:37,061 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Exiting Datanode
2021-03-06 01:23:37,062 INFO org.apache.hadoop.util.ExitUtil: Exiting with status 0
2021-03-06 01:23:37,063 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: SHUTDOWN_MSG:

Once too many volumes have failed, the datanode stops providing service.
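
The threshold in the log above ("volume failures tolerated: 0") comes from a datanode configuration key, which can be read back with getconf (a minimal sketch):

Code language: shell
# Number of failed data volumes a datanode tolerates before it shuts itself down
hdfs getconf -confKey dfs.datanode.failed.volumes.tolerated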

Snapshot information: snapshots are read-only, point-in-time copies used for backup and for recovering from user errors and disasters. Enabling the trash is another very useful feature; it is usually turned on when running a distributed file system to guard against accidental deletion, otherwise deleted data cannot be brought back. Before using snapshots, you first have to allow snapshot creation on the directory.

Code language: shell
[root@KEL hadoop-2.7.2]# hdfs dfs -createSnapshot / testSnapshot
createSnapshot: Failed to add snapshot: there are already 0 snapshot(s) and the snapshot quota is 0
[root@KEL hadoop-2.7.2]# hdfs dfsadmin -allowSnapshot /
Allowing snaphot on / succeeded
[root@KEL hadoop-2.7.2]# hdfs dfs -createSnapshot / testSnapshot
Created snapshot /.snapshot/testSnapshot
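
The trash mentioned above is governed by fs.trash.interval in core-site.xml; a hedged sketch (the file path below is only an example) of how to check it and how deletes behave once it is enabled:

Code language: shell
# 0 means the trash is disabled; a positive value is the retention time in minutes
hdfs getconf -confKey fs.trash.interval
# With trash enabled, a delete only moves the file under /user/<user>/.Trash
hdfs dfs -rm /tmp/example.txt
# Bypass the trash explicitly when you really mean it
hdfs dfs -rm -skipTrash /tmp/example.txt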

The namenode startup progress is shown in several phases: in the first phase the fsimage is loaded into memory, which took 21 seconds here; in the second the edits log is loaded, which took 2 seconds.

The third phase saves a checkpoint (nothing was saved in this case), and the fourth phase is safe mode, where the namenode mainly waits for all datanodes to report their block information. That wait can be long as well; with a large number of files and blocks the startup may take quite a while.

Most of the information on the page is the same as the dfsadmin report:

Code language: shell
[root@KEL bin]# hdfs dfsadmin -report
Configured Capacity: 66672975872 (62.09 GB)
Present Capacity: 51799486464 (48.24 GB)
DFS Remaining: 50167644160 (46.72 GB)
DFS Used: 1631842304 (1.52 GB)
DFS Used%: 3.15%
Under replicated blocks: 0
Blocks with corrupt replicas: 0
Missing blocks: 0
Missing blocks (with replication factor 1): 0
-------------------------------------------------
Live datanodes (3):

Name: 192.168.1.10:50010 (KEL)
Hostname: KEL
Decommission Status : Normal
Configured Capacity: 29180092416 (27.18 GB)
DFS Used: 543965184 (518.77 MB)

When the namenode fails to start with the error below, remember to check whether the ZooKeeper service has been started.

Code language: text
2021-03-06 01:26:29,525 INFO org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer: Triggering log roll on remote NameNode KEL1/192.168.1.99:9000
2021-03-06 01:26:29,531 WARN org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer: Unable to trigger a roll of the active NN
org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException): Operation category JOURNAL is not supported in state standby
        at org.apache.hadoop.hdfs.server.namenode.ha.StandbyState.checkOperation(StandbyState.java:87)
        at org.apache.hadoop.hdfs.server.namenode.NameNode$NameNodeHAContext.checkOperation(NameNode.java:1774)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkOperation(FSNamesystem.java:1313)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.rollEditLog(FSNamesystem.java:5824)
        at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.rollEditLog(NameNodeRpcServer.java:1121)
        at org.apache.hadoop.hdfs.protocolPB.NamenodeProtocolServerSideTranslatorPB.rollEditLog(NamenodeProtocolServerSideTranslatorPB.java:142)
        at org.apache.hadoop.hdfs.protocol.proto.NamenodeProtocolProtos$NamenodeProtocolService$2.callBlockingMethod(NamenodeProtocolProtos.java:12025)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:969)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2049)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2045)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
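
A quick way to confirm whether ZooKeeper is actually up (a hedged sketch, using the client port 3001 from the zkCli example earlier; the default port would be 2181):

Code language: shell
# Check the role of the local ZooKeeper server
zkServer.sh status
# Probe the client port with the 'ruok' four-letter word; a healthy server answers 'imok'
echo ruok | nc localhost 3001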

4 Namenode failover

In the HA architecture, zkfc is responsible for the switchover action, that is, killing its own namenode or changing its state to standby. A switchover happens not only when the namenode process runs into trouble; when the zkfc process dies, the namenodes are switched over as well:

Code language: shell
[root@KEL1 logs]# hdfs haadmin -getServiceState nn2
active
[root@KEL1 logs]# hdfs haadmin -getServiceState nn1
standby
[root@KEL1 logs]# jps
2503 NameNode
3626 DFSZKFailoverController
# Simulate the zkfc process dying
[root@KEL1 logs]# kill -9 3626
[root@KEL1 logs]# hdfs haadmin -getServiceState nn2
standby
[root@KEL1 logs]# hdfs haadmin -getServiceState nn1
active
[root@KEL1 logs]# jps
2503 NameNode
# The zkfc on the node that used to be standby tells the active namenode to switch state
[root@KEL logs]# tail -f hadoop-root-zkfc-KEL.log 
2021-03-06 06:25:32,521 INFO org.apache.hadoop.ha.ActiveStandbyElector: Checking for any old active which needs to be fenced...
2021-03-06 06:25:32,534 INFO org.apache.hadoop.ha.ActiveStandbyElector: Old node exists: 0a026e7312036e6e321a044b454c3120a84628d33e
2021-03-06 06:25:32,536 INFO org.apache.hadoop.ha.ZKFailoverController: Should fence: NameNode at KEL1/192.168.1.99:9000
2021-03-06 06:25:32,811 INFO org.apache.hadoop.ha.ZKFailoverController: Successfully transitioned NameNode at KEL1/192.168.1.99:9000 to standby state without fencing
2021-03-06 06:25:32,811 INFO org.apache.hadoop.ha.ActiveStandbyElector: Writing znode /hadoop-ha/ns/ActiveBreadCrumb to indicate that the local node is the most recent active...
2021-03-06 06:25:32,835 INFO org.apache.hadoop.ha.ZKFailoverController: Trying to make NameNode at KEL/192.168.1.10:9000 active...
2021-03-06 06:25:33,529 INFO org.apache.hadoop.ha.ZKFailoverController: Successfully transitioned NameNode at KEL/192.168.1.10:9000 to active state

When the journal nodes become unavailable, the standby namenode shuts itself down right away (its connection to the QJM times out):

Code language: text
2021-03-06 06:38:14,426 FATAL org.apache.hadoop.hdfs.server.namenode.FSEditLog: Error: flush failed for required journal (JournalAndStream(mgr=QJM to [192.168.1.10:8485, 192.168.1.99:8485, 192.168.1.199:8485], stream=QuorumOutputStream starting at txid 54187))
java.io.IOException: Timed out waiting 20000ms for a quorum of nodes to respond.
  at org.apache.hadoop.hdfs.qjournal.client.AsyncLoggerSet.waitForWriteQuorum(AsyncLoggerSet.java:137)
  at org.apache.hadoop.hdfs.qjournal.client.QuorumOutputStream.flushAndSync(QuorumOutputStream.java:107)
  at org.apache.hadoop.hdfs.server.namenode.EditLogOutputStream.flush(EditLogOutputStream.java:113)
  at org.apache.hadoop.hdfs.server.namenode.EditLogOutputStream.flush(EditLogOutputStream.java:107)
  at org.apache.hadoop.hdfs.server.namenode.JournalSet$JournalSetOutputStream$8.apply(JournalSet.java:533)
  at org.apache.hadoop.hdfs.server.namenode.JournalSet.mapJournalsAndReportErrors(JournalSet.java:393)
  at org.apache.hadoop.hdfs.server.namenode.JournalSet.access$100(JournalSet.java:57)
  at org.apache.hadoop.hdfs.server.namenode.JournalSet$JournalSetOutputStream.flush(JournalSet.java:529)
  at org.apache.hadoop.hdfs.server.namenode.FSEditLog.logSync(FSEditLog.java:647)
  at org.apache.hadoop.hdfs.server.namenode.FSEditLog.endCurrentLogSegment(FSEditLog.java:1266)
  at org.apache.hadoop.hdfs.server.namenode.FSEditLog.rollEditLog(FSEditLog.java:1203)
  at org.apache.hadoop.hdfs.server.namenode.FSImage.rollEditLog(FSImage.java:1294)
  at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.rollEditLog(FSNamesystem.java:5832)
  at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.rollEditLog(NameNodeRpcServer.java:1121)
  at org.apache.hadoop.hdfs.protocolPB.NamenodeProtocolServerSideTranslatorPB.rollEditLog(NamenodeProtocolServerSideTranslatorPB.java:142)
  at org.apache.hadoop.hdfs.protocol.proto.NamenodeProtocolProtos$NamenodeProtocolService$2.callBlockingMethod(NamenodeProtocolProtos.java:12025)
  at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
  at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:969)
  at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2049)
  at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2045)
  at java.security.AccessController.doPrivileged(Native Method)
  at javax.security.auth.Subject.doAs(Subject.java:422)
  at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
  at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2043)
2021-03-06 06:38:14,427 WARN org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager: Aborting QuorumOutputStream starting at txid 54187
2021-03-06 06:38:14,433 INFO org.apache.hadoop.util.ExitUtil: Exiting with status 1
2021-03-06 06:38:14,435 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: KEL1/192.168.1.99:8485. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=10000 MILLISECONDS)
2021-03-06 06:38:14,436 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: KEL/192.168.1.10:8485. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=10000 MILLISECONDS)
2021-03-06 06:38:14,455 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: SHUTDOWN_MSG: 
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at KEL/192.168.1.10
************************************************************/

The active node holds on for a while longer, then shuts its namenode down as well:

Code language: text
2021-03-06 06:40:32,914 WARN org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager: Waited 118229 ms (timeout=120000 ms) for a response for getJournalState(). Succeeded so far: [192.168.1.199:8485]
2021-03-06 06:40:33,918 WARN org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager: Waited 119233 ms (timeout=120000 ms) for a response for getJournalState(). Succeeded so far: [192.168.1.199:8485]
2021-03-06 06:40:34,687 FATAL org.apache.hadoop.hdfs.server.namenode.FSEditLog: Error: recoverUnfinalizedSegments failed for required journal (JournalAndStream(mgr=QJM to [192.168.1.10:8485, 192.168.1.99:8485, 192.168.1.199:8485], stream=null))
java.io.IOException: Timed out waiting 120000ms for a quorum of nodes to respond.
  at org.apache.hadoop.hdfs.qjournal.client.AsyncLoggerSet.waitForWriteQuorum(AsyncLoggerSet.java:137)
  at org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.createNewUniqueEpoch(QuorumJournalManager.java:182)
  at org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.recoverUnfinalizedSegments(QuorumJournalManager.java:436)
  at org.apache.hadoop.hdfs.server.namenode.JournalSet$8.apply(JournalSet.java:624)
  at org.apache.hadoop.hdfs.server.namenode.JournalSet.mapJournalsAndReportErrors(JournalSet.java:393)
  at org.apache.hadoop.hdfs.server.namenode.JournalSet.recoverUnfinalizedSegments(JournalSet.java:621)
  at org.apache.hadoop.hdfs.server.namenode.FSEditLog.recoverUnclosedStreams(FSEditLog.java:1439)
  at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startActiveServices(FSNamesystem.java:1112)
  at org.apache.hadoop.hdfs.server.namenode.NameNode$NameNodeHAContext.startActiveServices(NameNode.java:1710)
  at org.apache.hadoop.hdfs.server.namenode.ha.ActiveState.enterState(ActiveState.java:61)
  at org.apache.hadoop.hdfs.server.namenode.ha.HAState.setStateInternal(HAState.java:64)
  at org.apache.hadoop.hdfs.server.namenode.ha.StandbyState.setState(StandbyState.java:49)
  at org.apache.hadoop.hdfs.server.namenode.NameNode.transitionToActive(NameNode.java:1583)
  at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.transitionToActive(NameNodeRpcServer.java:1478)
  at org.apache.hadoop.ha.protocolPB.HAServiceProtocolServerSideTranslatorPB.transitionToActive(HAServiceProtocolServerSideTranslatorPB.java:107)
  at org.apache.hadoop.ha.proto.HAServiceProtocolProtos$HAServiceProtocolService$2.callBlockingMethod(HAServiceProtocolProtos.java:4460)
  at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
  at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:969)
  at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2049)
  at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2045)
  at java.security.AccessController.doPrivileged(Native Method)
  at javax.security.auth.Subject.doAs(Subject.java:422)
  at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
  at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2043)
2021-03-06 06:40:34,691 INFO org.apache.hadoop.util.ExitUtil: Exiting with status 1
2021-03-06 06:40:34,696 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: KEL1/192.168.1.99:8485. Already tried 3 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=10000 MILLISECONDS)
2021-03-06 06:40:34,700 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: SHUTDOWN_MSG:

zkfc reports that it has lost the connection to the namenode:

Code language: text
2021-03-06 06:47:25,839 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: KEL/192.168.1.10:9000. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=1, sleepTime=10000 MILLISECONDS)
2021-03-06 06:47:25,840 WARN org.apache.hadoop.ha.HealthMonitor: Transport-level exception trying to monitor health of NameNode at KEL/192.168.1.10:9000: java.net.ConnectException: Connection refused Call From KEL/192.168.1.10 to KEL:9000 failed on connection exception: java.net.ConnectException: Connection refused; For more details see:  http://wiki.apache.org/hadoop/ConnectionRefused

Error logs seen when ZooKeeper has not been started:

Code language: text
# One of the namenodes
2021-03-06 07:01:36,729 INFO org.apache.hadoop.ipc.Server: IPC Server handler 6 on 9000, call org.apache.hadoop.hdfs.server.protocol.NamenodeProtocol.rollEditLog from 192.168.1.99:48213 Call#21 Retry#0: org.apache.hadoop.ipc.StandbyException: Operation category JOURNAL is not supported in state standby
2021-03-06 07:02:36,558 INFO org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer: Triggering log roll on remote NameNode KEL1/192.168.1.99:9000
2021-03-06 07:02:36,567 WARN org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer: Unable to trigger a roll of the active NN
org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException): Operation category JOURNAL is not supported in state standby
  at org.apache.hadoop.hdfs.server.namenode.ha.StandbyState.checkOperation(StandbyState.java:87)
  at org.apache.hadoop.hdfs.server.namenode.NameNode$NameNodeHAContext.checkOperation(NameNode.java:1774)
  at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkOperation(FSNamesystem.java:1313)
  at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.rollEditLog(FSNamesystem.java:5824)
  at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.rollEditLog(NameNodeRpcServer.java:1121)
  at org.apache.hadoop.hdfs.protocolPB.NamenodeProtocolServerSideTranslatorPB.rollEditLog(NamenodeProtocolServerSideTranslatorPB.java:142)
  at org.apache.hadoop.hdfs.protocol.proto.NamenodeProtocolProtos$NamenodeProtocolService$2.callBlockingMethod(NamenodeProtocolProtos.java:12025)
  at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
  at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:969)
  at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2049)
  at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2045)
  at java.security.AccessController.doPrivileged(Native Method)
  at javax.security.auth.Subject.doAs(Subject.java:422)
  at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
  at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2043)
  at org.apache.hadoop.ipc.Client.call(Client.java:1475)
  at org.apache.hadoop.ipc.Client.call(Client.java:1412)
  at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:229)
  at com.sun.proxy.$Proxy15.rollEditLog(Unknown Source)
  at org.apache.hadoop.hdfs.protocolPB.NamenodeProtocolTranslatorPB.rollEditLog(NamenodeProtocolTranslatorPB.java:148)
  at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer.triggerActiveLogRoll(EditLogTailer.java:273)
  at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer.access$600(EditLogTailer.java:61)
  at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.doWork(EditLogTailer.java:315)
  at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.access$200(EditLogTailer.java:284)
  at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread$1.run(EditLogTailer.java:301)
  at org.apache.hadoop.security.SecurityUtil.doAsLoginUserOrFatal(SecurityUtil.java:415)
  at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.run(EditLogTailer.java:297)
2021-03-06 07:02:36,806 INFO org.apache.hadoop.ipc.Server: IPC Server handler 6 on 9000, call org.apache.hadoop.hdfs.server.protocol.NamenodeProtocol.rollEditLog from 192.168.1.99:48217 Call#25 Retry#0: org.apache.hadoop.ipc.StandbyException: Operation category JOURNAL is not supported in state standby
2021-03-06 06:56:27,083 WARN org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager: Waited 19055 ms (timeout=20000 ms) for a response for selectInputStreams. No responses yet.
2021-03-06 06:56:28,045 WARN org.apache.hadoop.hdfs.server.namenode.FSEditLog: Unable to determine input streams from QJM to [192.168.1.10:8485, 192.168.1.99:8485, 192.168.1.199:8485]. Skipping.
java.io.IOException: Timed out waiting 20000ms for a quorum of nodes to respond.
  at org.apache.hadoop.hdfs.qjournal.client.AsyncLoggerSet.waitForWriteQuorum(AsyncLoggerSet.java:137)
  at org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.selectInputStreams(QuorumJournalManager.java:471)
  at org.apache.hadoop.hdfs.server.namenode.JournalSet.selectInputStreams(JournalSet.java:278)
  at org.apache.hadoop.hdfs.server.namenode.FSEditLog.selectInputStreams(FSEditLog.java:1508)
  at org.apache.hadoop.hdfs.server.namenode.FSEditLog.selectInputStreams(FSEditLog.java:1532)
  at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:652)
  at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:294)
  at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:975)
  at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:681)
  at org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:584)
  at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:644)
  at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:811)
  at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:795)
  at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1488)
  at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1554)
2021-03-06 07:00:02,301 WARN org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Get corrupt file blocks returned error: Operation category READ is not supported in state standby
2021-03-06 07:00:36,663 INFO org.apache.hadoop.ipc.Server: IPC Server handler 1 on 9000, call org.apache.hadoop.hdfs.server.protocol.NamenodeProtocol.rollEditLog from 192.168.1.10:37953 Call#17 Retry#0: org.apache.hadoop.ipc.StandbyException: Operation category JOURNAL is not supported in state standby
2021-03-06 07:00:36,898 INFO org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer: Triggering log roll on remote NameNode KEL/192.168.1.10:9000

At this point ZooKeeper is started again, yet both namenodes turn out to be in standby state. Checking the logs:

Code language: text
2021-03-06 07:14:15,925 WARN org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Get corrupt file blocks returned error: Operation category READ is not supported in state standby
2021-03-06 07:14:17,290 WARN org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Get corrupt file blocks returned error: Operation category READ is not supported in state standby
2021-03-06 07:14:18,318 WARN org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Get corrupt file blocks returned error: Operation category READ is not supported in state standby
2021-03-06 07:14:19,361 WARN org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Get corrupt file blocks returned error: Operation category READ is not supported in state standby
2021-03-06 07:14:20,792 WARN org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Get corrupt file blocks returned error: Operation category READ is not supported in state standby
2021-03-06 07:14:21,855 WARN org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Get corrupt file blocks returned error: Operation category READ is not supported in state standby
2021-03-06 07:14:37,845 INFO org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer: Triggering log roll on remote NameNode KEL/192.168.1.10:9000
2021-03-06 07:14:47,854 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: KEL/192.168.1.10:9000. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=10000 MILLISECONDS)
2021-03-06 07:14:52,935 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Stopping services started for standby state
2021-03-06 07:14:52,936 WARN org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer: Unable to trigger a roll of the active NN
java.io.IOException: Failed on local exception: java.io.InterruptedIOException: Interrupted: action=RetryAction(action=RETRY, delayMillis=10000, reason=null), retry policy=RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=10000 MILLISECONDS); Host Details : local host is: "KEL1/192.168.1.99"; destination host is: "KEL":9000; 
  at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:773)
  at org.apache.hadoop.ipc.Client.call(Client.java:1479)
  at org.apache.hadoop.ipc.Client.call(Client.java:1412)
  at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:229)
  at com.sun.proxy.$Proxy15.rollEditLog(Unknown Source)
  at org.apache.hadoop.hdfs.protocolPB.NamenodeProtocolTranslatorPB.rollEditLog(NamenodeProtocolTranslatorPB.java:148)
  at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer.triggerActiveLogRoll(EditLogTailer.java:273)
  at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer.access$600(EditLogTailer.java:61)
  at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.doWork(EditLogTailer.java:315)
  at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.access$200(EditLogTailer.java:284)
  at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread$1.run(EditLogTailer.java:301)
  at org.apache.hadoop.security.SecurityUtil.doAsLoginUserOrFatal(SecurityUtil.java:415)
  at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.run(EditLogTailer.java:297)
Caused by: java.io.InterruptedIOException: Interrupted: action=RetryAction(action=RETRY, delayMillis=10000, reason=null), retry policy=RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=10000 MILLISECONDS)
  at org.apache.hadoop.ipc.Client$Connection.handleConnectionFailure(Client.java:868)
  at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:633)
  at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:712)
  at org.apache.hadoop.ipc.Client$Connection.access$2900(Client.java:375)
  at org.apache.hadoop.ipc.Client.getConnection(Client.java:1528)
  at org.apache.hadoop.ipc.Client.call(Client.java:1451)
  ... 11 more
Caused by: java.lang.InterruptedException: sleep interrupted
  at java.lang.Thread.sleep(Native Method)
  at org.apache.hadoop.ipc.Client$Connection.handleConnectionFailure(Client.java:866)
  ... 16 more

It turns out that no zkfc process is running. After starting the zkfc processes, one of the standbys becomes active again: without zkfc, the switchover cannot happen automatically.
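
If zkfc cannot be brought back for some reason, the transition can also be forced by hand with haadmin (a hedged sketch using the nn1/nn2 ids from the examples above; with automatic failover configured, haadmin refuses manual transitions unless --forcemanual is given, so use the flag with care):

Code language: shell
# Force one namenode to active when automatic failover is unavailable
hdfs haadmin -transitionToActive --forcemanual nn1
# Confirm the resulting states
hdfs haadmin -getServiceState nn1
hdfs haadmin -getServiceState nn2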
