I have the following text string:

string <- "['CBOE SHORT-TERM VIX FUTURE DEC 2016', 81.64],\n\n    ['CBOE SHORT-TERM VIX FUTURE JAN 2017', 18.36]"

Is there a simple way of extracting the numerical elements from the text without having to use:

string_table <- strsplit(string, " ")

and then selecting the n-th element and continuing to strsplit until I have what I need?

The result should be:

result <- c(2016, 81, 64, 2017, 18, 36)

Thank you.

Alex Bădoi

1 Answer

We can use str_extract_all from the stringr package with a pattern that matches one or more digits ([0-9]+). The output is a list of length 1, so extract the vector with [[1]] and convert it to numeric.

library(stringr)
as.numeric(str_extract_all(string, "[0-9]+")[[1]])
#[1] 2016   81   64 2017   18   36

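If we are using strsplit, split on runs of non-digit characters (\\D+). Because the string starts with a non-digit character, the first element of the split is an empty string, which [-1] drops.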

as.numeric(strsplit(string, "\\D+")[[1]][-1])
#[1] 2016   81   64 2017   18   36
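
If loading stringr is not an option, the same extraction can probably be done in base R. The following is a minimal sketch, not part of the original answer, that pairs gregexpr with regmatches; the variable names `matches` and `result` are only illustrative.

```r
string <- "['CBOE SHORT-TERM VIX FUTURE DEC 2016', 81.64],\n\n    ['CBOE SHORT-TERM VIX FUTURE JAN 2017', 18.36]"

# gregexpr() locates every run of digits; regmatches() returns the matched text
# as a list with one character vector per input string
matches <- regmatches(string, gregexpr("[0-9]+", string))

# Take the vector for the single input string and convert it to numeric
result <- as.numeric(matches[[1]])
result
```

Unlike the strsplit version, this approach returns only the matches, so there is no leading empty element to drop.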
akrun
  • In the first one you are searching for all digits 0 to 9. Could you please explain what the + is for? Same question for "\\D+": if you could explain the logic there, that would be great. Thank you very much for the quick answer. – Alex Bădoi Nov 22 '16 at 14:33
  • @AlexBădoi The `\\D+` specifies one or more non-digit characters – akrun Nov 22 '16 at 14:35
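
To illustrate the question about `+` raised in the comments, here is a small sketch in base R on a shortened, made-up input `x` (an assumption for the demo, not the original string). It compares the pattern with and without the quantifier and shows what a split on `\\D+` leaves behind.

```r
# Shortened sample input, chosen only for illustration
x <- "DEC 2016', 81.64"

# Without +, [0-9] matches exactly one digit, so every digit is its own match
single <- regmatches(x, gregexpr("[0-9]", x))[[1]]

# With +, [0-9]+ matches one or more consecutive digits, so whole numbers stay together
runs <- regmatches(x, gregexpr("[0-9]+", x))[[1]]

# \\D+ matches one or more consecutive non-digits; splitting on it keeps the digit runs.
# x starts with non-digits, so the first piece is an empty string
pieces <- strsplit(x, "\\D+")[[1]]

single
runs
pieces
```

The empty first element of `pieces` is the reason the strsplit version in the answer ends with `[-1]`.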