The problem is very simple, but I'm having no luck fixing it. strsplit() is a fairly simple function, and I am surprised I am struggling as much as I am:

# temp is the problem string, copied and pasted from my R code.
# I am hoping the third character (the space, which I think is the culprit) survives the paste.
temp = "GS PG"

# temp2 is typed directly into Stack Overflow, using an actual space
temp2 = "GS PG"

unlist(strsplit(temp, split = " "))
[1] "GS PG"
unlist(strsplit(temp2, split = " "))
[1] "GS" "PG"

Even if the example doesn't reproduce here, this is the issue I am running into: with temp, strsplit() isn't splitting the string on the space for some reason. Any thoughts would be appreciated!

Best,

EDIT - My example failed to recreate the issue. For reference, temp is created in my code by scraping text from a web page with rvest, and for some reason it seems to be picking up a character other than a normal space. I still need to split these strings on that space, though.

Canovice
  • I can post reproducible code, but that would mean posting the rvest scraping code as well. I don't mind doing that, but wanted to see if we could find a solution without it first – Canovice Sep 01 '16 at 19:07
  • What happens when you do `grep(" ", temp)`? Then you can try `grep("[ \t\n\r\v\f]", temp)` to see if any of these whitespace characters match. – USER_1 Sep 01 '16 at 19:08
  • `grep(" ", temp)` returns `integer(0)` – Canovice Sep 01 '16 at 19:16
  • You can see what your mystery space-like character is with e.g. `charToRaw` or `utf8ToInt` [How to convert characters into ASCII code?](https://stackoverflow.com/questions/32160958/how-to-convert-characters-into-ascii-code) – smci May 25 '18 at 00:10

2 Answers

Try the following:

unlist(strsplit(temp, "\\s+"))

The "\\s+" pattern is a regular expression that matches one or more whitespace characters of any kind, rather than just a standard space.
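
To illustrate the difference, here is a minimal sketch. The strings are made up for illustration, and the non-breaking space (U+00A0) is only an assumption about what rvest might have scraped; nothing in the question confirms it. Whether `\\s` covers such a character may depend on the platform and regex engine, so the last call names it explicitly in the pattern.

```r
# Hypothetical string with a space, a tab, and another space between the tokens.
messy <- "GS \t PG"

# Splitting on a single literal space leaves the tab behind as its own piece.
by_space <- unlist(strsplit(messy, split = " "))
print(by_space)

# "\\s+" treats the whole run of whitespace as one separator.
by_regex <- unlist(strsplit(messy, split = "\\s+"))
print(by_regex)

# Assumption: the scraped separator is a non-breaking space (U+00A0).
nbsp_string <- "GS\u00a0PG"
by_class <- unlist(strsplit(nbsp_string, split = "[ \u00a0]+"))
print(by_class)
```

Listing the suspect character in a bracket expression keeps the split explicit about what counts as a separator, which may be easier to reason about than relying on what `\\s` happens to include.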

ode2k

As in the comment,

It is likely that the "space" is not actually a space but some other whitespace character. Try any of the following to narrow it down:

whitespace <- c(" ", "\t" , "\n", "\r", "\v", "\f")
grep(paste(whitespace,collapse="|"), temp)

Related question here: How to remove all whitespace from a string?
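
A minimal sketch of narrowing it down to a specific character, assuming a made-up `temp` that contains a non-breaking space (U+00A0); the real character in the scraped data is unknown. The `candidates` vector and its names are hypothetical, and U+00A0 is added to the whitespace list as a guess. `utf8ToInt`, mentioned in the comments on the question, shows the code points directly.

```r
# Hypothetical stand-in for the scraped string; the real separator is unknown.
temp <- "GS\u00a0PG"

# Named candidates so the result says which one matched.
candidates <- c(space = " ", tab = "\t", newline = "\n",
                carriage_return = "\r", vertical_tab = "\v",
                form_feed = "\f", nbsp = "\u00a0")

# fixed = TRUE compares literal characters, so no regex escaping is involved.
found <- vapply(candidates, function(ch) grepl(ch, temp, fixed = TRUE), logical(1))
matched <- names(found)[found]
print(matched)

# Code points of every character: 32 is a regular space, 160 is U+00A0.
codes <- utf8ToInt(temp)
print(codes)
```

If none of the candidates match, the code points from `utf8ToInt` should still reveal which character sits between the two tokens.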

USER_1