Say I have a .Rnw file containing the usual LaTeX mixed in with R code chunks. (I'm especially interested in converting a .Rnw slides document, but this question applies to any .Rnw document.) I want to convert it to a file that contains all of the R code, plus all of the text that LaTeX would normally generate, as R comments. In other words, the functionality I want is similar to what Stangle() does, but with the text portion of the LaTeX also converted to plain text and commented out in the resulting .R file.

This would be a very convenient way to automatically generate a commented R file that's easy to read in your favorite syntax-highlighting editor (e.g. Emacs). It might not sound like a great idea for an Sweave document that's a long article with only a bit of R code, but it starts to look appealing when the .Rnw document is a slide presentation (e.g. using beamer): the text portion of the slides would then make perfect comments for the R code.

Anyone have any ideas on how to do this? Thanks in advance.

Prasad Chalasani

1 Answer

Here is one approach using regex. Some issues remain; I will keep a list of them below and update it as they are resolved.

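# READ LINES FROM RNW FILE
lines <- readLines('http://users.stat.umn.edu/~charlie/Sweave/foo.Rnw')

# DETECT CODE LINES USING SWEAVE CHUNK DEFINITIONS
# A chunk starts at a "<<options>>=" header and ends at a line starting with "@".
start_chunk <- grep("^<<.*>>=", lines)
end_chunk   <- grep("^@", lines)
# SIMPLIFY = FALSE keeps a list even when all chunks have the same length;
# seq_len() yields nothing for an empty chunk instead of counting backwards.
r_lines <- unlist(mapply(function(s, e) s + seq_len(e - s - 1),
                         start_chunk, end_chunk, SIMPLIFY = FALSE))

# COMMENT OUT NON CODE LINES AND WRITE TO FILE
# Logical indexing also works when the file contains no code lines at all.
is_code <- seq_along(lines) %in% r_lines
lines[!is_code] <- paste("##", lines[!is_code])
writeLines(lines, con = 'codefile.R')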

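ISSUES REMAINING:

  1. Does not handle chunks that are referenced inside other chunks using <<chunk_name>>. The reference line is treated as code and left uncommented, so the resulting .R file will not parse at that point.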
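
One possible way to work around the chunk-reference issue listed above is to treat any `<<chunk_name>>` line inside a chunk as a comment rather than as code. The following is only a sketch under that assumption: the function name `rnw_to_commented_r` and the small `example` vector are made up for illustration, and it does not expand the referenced chunk in place the way Sweave would.

```r
# Hypothetical helper: comment out everything that is not R code,
# including <<chunk_name>> references that appear inside code chunks.
rnw_to_commented_r <- function(lines) {
  is_header <- grepl("^<<.*>>=", lines)  # chunk header, e.g. <<foo, echo=FALSE>>=
  is_end    <- grepl("^@", lines)        # chunk terminator
  in_chunk  <- FALSE
  is_code   <- logical(length(lines))
  for (i in seq_along(lines)) {
    if (is_header[i]) {
      in_chunk <- TRUE
    } else if (is_end[i]) {
      in_chunk <- FALSE
    } else if (in_chunk) {
      is_code[i] <- TRUE
    }
  }
  # A reference to another chunk is not valid R, so treat it as a comment.
  is_ref <- is_code & grepl("^[[:space:]]*<<.*>>[[:space:]]*$", lines)
  is_code[is_ref] <- FALSE
  lines[!is_code] <- paste("##", lines[!is_code])
  lines
}

# Made-up input: the second chunk refers to the first one.
example <- c("\\section{Demo}", "<<setup>>=", "x <- 1", "@",
             "Some text.", "<<main>>=", "<<setup>>", "y <- x + 1", "@")
writeLines(rnw_to_commented_r(example))
```

With this version the reference line is kept as a `##` comment, so the output file still parses, at the cost of not inlining the referenced code.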
Ramnath
  • You may want to define a more complex regex to fix the chunk naming issue, like `"^<<.*>>=?$"` for the start of a chunk. But it's a very neat approach anyway... nice usage of `mapply`. – aL3xa Nov 10 '11 at 00:10
  • That's a neat approach, and yes, nice use of `mapply`. One more thing that would be nice is to get rid of all the `LaTeX` markup (things like `\begin{frame}`, `\frametitle`, ...) to produce clean, purely textual comments -- At least, getting rid of all LaTeX keywords would be a start. I suppose one could write a `regex` to replace all the reserved words of LaTeX with empty strings. That would be a start, but I'm hoping there's some way to leverage the LaTeX parser, and somehow capture the *text* that latex would have generated. – Prasad Chalasani Nov 10 '11 at 01:39
  • Combining @Ramnath's idea with one of the LaTeX-to-text solutions from another SO question (http://stackoverflow.com/questions/530121/how-do-i-convert-latex-to-plain-text-ascii) may get me what I want. – Prasad Chalasani Nov 10 '11 at 13:51
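
The regex idea raised in the comments above could look roughly like the following. This is a crude sketch, not a LaTeX parser: the function name `strip_latex` is hypothetical, and it assumes simple markup only (environments, commands with one brace argument, bare commands). Math, LaTeX comments, and user-defined macros are not handled, so a dedicated LaTeX-to-text tool is likely to give cleaner results.

```r
# Hypothetical helper: remove common LaTeX markup from a character vector,
# keeping the text inside single-argument commands such as \frametitle{...}.
strip_latex <- function(text) {
  # Drop \begin{env} and \end{env}, with an optional [options] part.
  text <- gsub("\\\\(begin|end)\\{[^}]*\\}(\\[[^\\]]*\\])?", "", text, perl = TRUE)
  # Replace \cmd{arg} or \cmd[opt]{arg} by arg (innermost braces only).
  text <- gsub("\\\\[A-Za-z]+\\*?(\\[[^\\]]*\\])?\\{([^{}]*)\\}", "\\2", text, perl = TRUE)
  # Drop remaining bare commands such as \item.
  text <- gsub("\\\\[A-Za-z]+\\*?", "", text, perl = TRUE)
  # Drop leftover braces and surrounding whitespace.
  text <- gsub("[{}]", "", text)
  trimws(text)
}

# Made-up beamer fragment, cleaned and then turned into R comments.
slide <- c("\\begin{frame}", "\\frametitle{Linear models}",
           "\\item Fit with \\texttt{lm}", "\\end{frame}")
writeLines(paste("##", strip_latex(slide)))
```

In the answer's script, such a function would be applied to the non-code lines only, just before they are prefixed with `##`.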