I have a problem with the write.csv2 function. I have a DB in an external CSV file, which I load into R with read.csv2. The script then does some normalization and writes the DB back out with write.csv2. The number of entries is about 24500.

When I load the normalized DB in another script (again with read.csv2), it has only about 23700 entries. At the same time, Excel shows all 24500 entries. If I re-save the file from Excel and reload it in R with read.csv2, the result is correct (24500 entries).

I suppose I am making a mistake when saving the DB.

The sketch of the scripts:

#SCRIPT 1

DB <- read.csv2("DB.csv", header = TRUE, 
                na.strings = c("MISSING_VALUE", "", " ", "NA", "#DIV/0!"),
                stringsAsFactors = FALSE) #24577 entries

#normalization

write.csv2(DB, "DB.csv", quote = FALSE, row.names = FALSE) #data.frame DB contains 24577 entries

#SCRIPT 2

DB <- read.csv2("DB.csv", na.strings = "NA", stringsAsFactors = FALSE) #23787 entries

I've also tried the fwrite function from the data.table library, but the result is the same.

Thank you so much!

  • At a minimum you need to suppress the comment character handling with `comment.char=""` because of the "#DIV/0!" values. The octothorpe is the default `comment.char` in `read.table`. I've explained this in more detail, with additional tips on how to use `count.fields`, at https://stackoverflow.com/questions/8568968/r-programming-read-csv-skips-lines-unexpectedly – IRTFM May 25 '17 at 03:48
  • With `count.fields` my script is more correct, thank you. Thank you so much for the link also! Now it works with `quote = ""`. – Pavel Erokhin May 25 '17 at 08:27
  • OK, I'll add it as an answer to keep it from lying in the unanswered questions queue. – IRTFM May 25 '17 at 16:07

1 Answer

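At a minimum you need to suppress the comment character handling with `comment.char=""` because of the "#DIV/0!" values. The octothorpe is the default `comment.char` in `read.table` and `count.fields`. (The `read.csv` family already defaults to `comment.char=""`, so with `read.csv2` the quote handling is the more likely culprit; as noted in the comments, `quote = ""` fixed it for the asker.) I've explained this in more detail, with additional tips on how to use `count.fields`, at: R Programming: read.csv() skips lines unexpectedly (https://stackoverflow.com/questions/8568968/r-programming-read-csv-skips-lines-unexpectedly)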
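
A minimal sketch of this diagnosis, assuming a small made-up file (the column names and values are hypothetical, not the asker's data). It uses `count.fields` to find lines where quote handling changes the field count, then re-reads the file with `quote = ""` and `comment.char = ""`. Per the R documentation, `read.csv2` already defaults to `comment.char = ""`, so in this sketch it is the stray double quotes, written out unescaped because of `quote = FALSE`, that swallow rows.

```r
# Hypothetical sample: two stray double quotes and one "#DIV/0!" value,
# as they would appear in a file written with quote = FALSE.
tmp <- tempfile(fileext = ".csv")
writeLines(c(
  "id;label;ratio",
  "1;plain;0,5",
  "2;5\" pipe;1,5",
  "3;other;#DIV/0!",
  "4;7\" pipe;2,5",
  "5;last;3,5"
), tmp)

# Default read.csv2 treats '"' as a quoting character, so the text between
# the two stray quotes (including line breaks) is read as a single field.
default_read <- read.csv2(tmp, stringsAsFactors = FALSE)

# count.fields() returns the number of fields found on each line. Comparing
# the result with and without quote handling points at the problem lines.
fields_default <- count.fields(tmp, sep = ";", quote = "\"", comment.char = "")
fields_plain   <- count.fields(tmp, sep = ";", quote = "",   comment.char = "")

# With quote and comment handling disabled, every physical line is one row.
fixed_read <- read.csv2(tmp, quote = "", comment.char = "",
                        na.strings = c("NA", "#DIV/0!"),
                        stringsAsFactors = FALSE)

nrow(default_read)  # fewer rows than lines of data
nrow(fixed_read)    # one row per line of data
```

On a real file, `table(count.fields(...))` gives a quick summary: anything other than a single field count (or any `NA`) suggests that quotes or comment characters are interfering. Writing with `quote = TRUE` (the default of `write.csv2`) also avoids the problem, because embedded quotes are then escaped on output.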

IRTFM