As LIWC software and dictionaries are proprietary, I was pleased to see they seemed to play well with the still-in-development but excellent R package Quanteda.

The documentation for the R package Quanteda demonstrates its use with a LIWC-format dictionary, as does this SO post.

I purchased LIWC 2015 but can't figure out how to export the dictionary outside the application other than as a PDF.

Joshua Rosenberg
  • It seems that LIWC can now be used from the command line, thus can be used from R indirectly. Here is the official example: https://github.com/ryanboyd/liwc-22-cli-r/blob/main/LIWC-22-cli_Example.R – Tamas Nagy Mar 24 '22 at 11:23

1 Answer

Edited by request of Receptiviti ("the commercial side of LIWC")

I will not advise you on how to extract the English LIWC 2015 dictionary from the Java Archive (.jar) file that contains the software which requires a purchase.

Unlike in previous versions of the LIWC software, the dictionary files are not distributed directly with the software. However, by logging in with your legally purchased serial number, you can download the non-English dictionaries from LIWC2007 and LIWC2001 (depending on the language) from http://dictionaries.liwc.net. These include German, Dutch, Italian, Russian, French, and Spanish versions.

If you have a dictionary formatted in the same manner as the LIWC dictionaries, for instance the Moral Foundations dictionary, then this will work:

library(quanteda)
# read the LIWC-formatted .dic file and convert it to a quanteda dictionary
mfdict <- dictionary(file = "http://www.moralfoundations.org/sites/default/files/files/downloads/moral%20foundations%20dictionary.dic",
                    format = "LIWC")

which loads and converts the Moral Foundations dictionary into the quanteda format. You can use the dictionary in constructing a document-feature matrix using

dfm(x, dictionary = mfdict)
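
For readers on a newer quanteda: as far as I know, the dictionary argument to dfm() was deprecated in version 3 in favour of tokens_lookup() and dfm_lookup(). Here is a minimal sketch of the same idea under that assumption. The names toy_dict and txt, the two categories, and the sample texts are made up for illustration and are not part of any LIWC or Moral Foundations dictionary.

```r
library(quanteda)

# Hypothetical two-category dictionary; the entries are glob patterns,
# which is the default matching mode for dictionary lookup in quanteda.
toy_dict <- dictionary(list(
  care = c("care*", "protect*"),
  harm = c("harm*", "hurt*")
))

# Made-up example documents.
txt <- c(d1 = "We care for and protect the children.",
         d2 = "Careless words can hurt and harm.")

# Tokenize, replace matching tokens with their dictionary keys,
# then count the keys per document.
toks <- tokens(txt, remove_punct = TRUE)
dict_dfm <- dfm(tokens_lookup(toks, dictionary = toy_dict))
print(dict_dfm)
```

A dictionary loaded with dictionary(file = ..., format = "LIWC") should be usable in place of toy_dict here.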
Ken Benoit
  • Thanks so much, I was able to do this and download the non-English dictionaries as well. When I attempted to load the dictionary (in Quanteda), the following warnings appeared: Warning messages: `1: In readLIWCdict(file, maxcats = maxcats, enc = enc) : NAs introduced by coercion 2: In unique(c(as.numeric(x), as.numeric(y))) : NAs introduced by coercion 3: In unique(c(as.numeric(x), as.numeric(y))) : NAs introduced by coercion` - any thoughts on where I might be going wrong? – Joshua Rosenberg Dec 11 '15 at 01:56
  • I don't think it's an error, but I will investigate why there is a warning. The problem is that the LIWC dictionary files don't always strictly follow their own formatting rules! – Ken Benoit Dec 11 '15 at 09:35
  • Wondering if I'm saving or reading something incorrectly but the following errors appear when I use dfm(): `Error in which(stringi::stri_detect_regex(uniqueFeatures, paste(x, collapse = "|"), : error in evaluating the argument 'x' in selecting a method for function 'which': Error in stringi::stri_detect_regex(uniqueFeatures, paste(x, collapse = "|"), : Incorrectly nested parentheses in regexp pattern. (U_REGEX_MISMATCHED_PAREN)` – Joshua Rosenberg Dec 11 '15 at 13:14
  • Please file an issue on Github and we will get it solved. – Ken Benoit Dec 11 '15 at 14:02
  • As the CTO of Receptiviti (we are the commercial side of LIWC), I would like to clarify that what Ken Benoit has recommended is a violation of the terms of use. To quote specifically from the terms and conditions "The LIWC Software and Dictionaries cannot be integrated or codified into other computer programs or systems, and automation cannot be applied to the LIWC Software or Dictionaries." [Receptiviti](http://www.receptiviti.com) provides an api for this purpose. – Sharan Karanth Dec 11 '15 at 19:43
  • Here I have simply suggested the same as Provalis, whose [LIWC page](http://provalisresearch.com/products/content-analysis-software/wordstat-dictionary/linguistic-inquiry-and-word-count/) for WordStat suggests doing exactly what I proposed, which is purchasing the license and then applying the licensed dictionary with a different tool. But the LIWC dictionaries are not the only ones using this format. For instance, the [Moral Foundations project](http://www.moralfoundations.org/othermaterials) distributes its dictionary (freely) in the LIWC format. – Ken Benoit Dec 11 '15 at 23:46