
I am running Spark on a Hadoop cluster. I tried running a Spark job and noticed some issues; by looking at the logs I eventually realised that the file system of one of the datanodes is full.

I ran hdfs dfsadmin -report to identify this. The DFS Remaining category is 0 B because the non-DFS used is massive (155 GB of the 193 GB configured capacity).
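To illustrate which report fields matter here, the snippet below pulls the two relevant lines out of a saved report. The figures are made up but match the shape of real hdfs dfsadmin -report output; on a live cluster you would pipe the command itself instead of this sample.

```shell
# Fabricated sample of the per-node section of `hdfs dfsadmin -report`.
report='Configured Capacity: 207231721472 (193 GB)
DFS Used: 4194304 (4 MB)
Non DFS Used: 166429982720 (155 GB)
DFS Remaining: 0 (0 B)'

# Extract the two fields that matter when a datanode disk fills up:
# a huge "Non DFS Used" with "DFS Remaining" at zero means something
# outside HDFS accounting is eating the disk.
summary=$(printf '%s\n' "$report" | awk -F': ' '/Non DFS Used|DFS Remaining/ {print $1 " -> " $2}')
printf '%s\n' "$summary"
```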

When I looked at the file system on this datanode I could see that most of this comes from the /usr/local/hadoop_work/ directory. There are three block pools there, and one of them is very large (98 GB). The other datanode in the cluster only has one block pool.
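A quick way to see how the space is split between block pools is to du each BP-* directory under the datanode data directory. The sketch below assumes the default current/BP-&lt;id&gt; layout; the path and pool names are fabricated stand-ins (real block pool IDs look like BP-&lt;random&gt;-&lt;ip&gt;-&lt;timestamp&gt;).

```shell
# Stand-in for /usr/local/hadoop_work/hdfs/datanode/current on the full node;
# the three pools mimic the situation described above.
DATA_DIR=$(mktemp -d)
mkdir -p "$DATA_DIR/BP-111-old" "$DATA_DIR/BP-222-old" "$DATA_DIR/BP-333-current"
# Give one pool some actual block data so the sizes differ.
dd if=/dev/zero of="$DATA_DIR/BP-333-current/blk_0001" bs=1024 count=64 2>/dev/null

# Size each block pool, smallest to largest; the biggest one is the
# candidate to investigate before any deletion.
du -sk "$DATA_DIR"/BP-* | sort -n
```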

What I am wondering is: can I simply delete two of these block pools? I'm assuming (though I don't know enough about this) that the namenode (I have only one) is tracking the most recent block pool, which is smaller and corresponds to the one on the other datanode.

soundofsilence
  • See https://stackoverflow.com/questions/45157724/some-datanodes-still-showing-block-pool-used-after-clearing-the-hdfs. I would not recommend deleting just like that - the sound may be deafening as a consequence. – thebluephantom Sep 01 '19 at 16:12
  • @thebluephantom thanks for the pointer to your question. Why do you think it might cause a problem? – soundofsilence Sep 01 '19 at 19:15
  • Not sure as I am more a designer, architect – thebluephantom Sep 01 '19 at 19:53
  • Okay, fair enough. For posterity: I did just delete the two block pools that seemed redundant, and nothing terrible seems to have happened. I previously couldn't start the datanode on the node whose disk was full, but deleting the block pools seemed to enable the datanode to start. – soundofsilence Sep 01 '19 at 20:21
  • Interesting to know, I would have been very careful, but documentation is not so great in this space. – thebluephantom Sep 01 '19 at 21:03
  • Maybe answer your own question?... – thebluephantom Sep 01 '19 at 21:11

1 Answer


As outlined in the comments above, I eventually did just delete the two block pools. I did this because those block pool IDs didn't exist on the other datanode, and by looking through the local filesystem I could see the files under those IDs hadn't been updated in a while.
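One way to make this check less of a gamble is to compare the block pool directory names against the blockpoolID recorded in the namenode's VERSION file: the pool whose name matches is the live one, and the others are leftovers from an earlier namenode format. This is a hedged sketch with a fabricated path and pool ID, not the exact commands I ran.

```shell
# Stand-in for the namenode's dfs.namenode.name.dir/current directory;
# the IDs here are invented for the demo.
NN_DIR=$(mktemp -d)
printf 'namespaceID=123456\nblockpoolID=BP-333-current\n' > "$NN_DIR/VERSION"

# The live block pool is the one whose BP-* directory name on the datanode
# matches this blockpoolID; any other pool belongs to a defunct namespace.
live_bp=$(grep '^blockpoolID=' "$NN_DIR/VERSION" | cut -d= -f2)
echo "live block pool: $live_bp"
```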

soundofsilence