I'm trying to read just a couple of fields from each of a bunch of XML files. I wrote a little function that extracts the fields I need and returns them as a vector:
id_dir <- function(d) {
  xml <- read_xml(d)
  id <- xml_text(xml_node(xml, 'AwardID'))
  dir <- xml_text(xml_node(xml, 'Abbreviation'))
  phone <- xml_text(xml_node(xml, 'PhoneNumber'))
  return(c(id, phone, dir))
}
But when I wrap it with ldply, the following happens:
setwd('xmls/2017')
files <- list.files()[1:100]
sev_data <- plyr::ldply(files, id_dir)
Error in read_xml.character(d) : xmlParseEntityRef: no name [68]
This happens despite the fact that the following code works as intended:
id_dir(glue('xmls/2017/{files[1]}'))
"1700003" "5746317432" "MPS"
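Since the first file parses fine on its own, my guess is that only some of the 100 files trigger the parser error. A minimal sketch for narrowing that down, assuming xml2 is loaded; `parse_error` and `find_bad_files` are hypothetical helper names made up for this check, not part of any package:

```r
library(xml2)

# Hypothetical helper: returns NA if the file parses,
# otherwise the parser's error message as a string.
parse_error <- function(path) {
  tryCatch({
    read_xml(path)
    NA_character_
  }, error = function(e) conditionMessage(e))
}

# Hypothetical helper: named character vector holding the error
# message for every path that fails to parse (names are the paths).
find_bad_files <- function(paths) {
  msgs <- vapply(paths, parse_error, character(1))
  msgs[!is.na(msgs)]
}
```

Calling find_bad_files(files) from inside xmls/2017 should return only the files that fail, each with its own message, which would at least show whether the problem is in one file or in all of them.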
I've searched SO for quite a while now, but most of what I find is about PHP and is most likely irrelevant to my case.
For reproducibility, here are a couple of the files I'm reading in.