
I have a file on HDFS that is 11 GB in size. I want to split it into multiple files of 1 GB each. How can I do that? My Hadoop version is 2.7.3.

sathya
Vish

2 Answers


If you have Spark, you can use it to rewrite the file with a chosen number of partitions.

The example below splits the input file into 2 output files:

spark-shell

scala> sc.textFile("/xyz-path/input-file",2).saveAsTextFile("/xyz-path/output-file")
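To get parts of roughly 1 GB for the 11 GB file from the question rather than a fixed 2, the partition count can be derived from the file size. A minimal sketch, runnable in the same spark-shell session (where sc is predefined), reusing the example paths; it uses repartition instead of the minPartitions argument above, and the 1 GB target is only approximate since records, not bytes, are distributed across partitions:

import org.apache.hadoop.fs.{FileSystem, Path}

val inPath = "/xyz-path/input-file"           // example path reused from above
val fs = FileSystem.get(sc.hadoopConfiguration)
val sizeBytes = fs.getFileStatus(new Path(inPath)).getLen
val targetBytes = 1L * 1024 * 1024 * 1024     // aim for ~1 GB per output part
val numParts = math.max(1, math.ceil(sizeBytes.toDouble / targetBytes).toInt)

// repartition so each saved part file is roughly 1 GB of text
sc.textFile(inPath).repartition(numParts).saveAsTextFile("/xyz-path/output-file")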
Rahul Sharma

In HDFS, the file is already split into blocks, per the dfs.block.size setting (128 MB by default in Hadoop 2.x), so an 11 GB file is already stored as roughly 88 blocks across the datanodes.
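If you want to see how HDFS has already split the file, one option (a sketch, not the only way) is the Hadoop FileSystem API from a spark-shell session, reusing the hypothetical /xyz-path/input-file path from the other answer:

import org.apache.hadoop.fs.{FileSystem, Path}

// inspect the block layout HDFS already uses for the file
val fs = FileSystem.get(sc.hadoopConfiguration)
val status = fs.getFileStatus(new Path("/xyz-path/input-file"))
println(s"block size : ${status.getBlockSize / (1024 * 1024)} MB")
println(s"file length: ${status.getLen} bytes")
val blocks = fs.getFileBlockLocations(status, 0, status.getLen)
println(s"block count: ${blocks.length}")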

OneCricketeer