
How can I log in to a MediaWiki with RCurl (or curl, which I can then adapt to the R package)?

I just want to parse a page, but I need to log in first; otherwise I can't access it.

RockScience
  • Did you google it? [This](http://www.wikihow.com/Use-the-MediaWiki-API) seems like a good first step. – Rom1 Jun 16 '11 at 10:38
  • Check this question: [How to analyse Wikipedia article's data base with R?](http://stackoverflow.com/q/6095952/168747) – Marek Jun 16 '11 at 13:08
  • @Marek: Thanks. My problem is a bit trickier, as I'm not using Wikipedia but a private MediaWiki that requires authentication. Still, using the MediaWiki API with the function ?getForm definitely seems to be a good idea :) – RockScience Jun 17 '11 at 02:34
  • Nobody's linked to http://www.mediawiki.org/wiki/API:Login yet, so let me do that. It doesn't provide explicit sample code, but as long as you know how to send HTTP POST requests and parse the results (which can be obtained in [a bunch of different formats](http://www.mediawiki.org/wiki/API:Data_formats) besides XML), it's not very hard to figure out. – Ilmari Karonen Oct 10 '11 at 19:45

1 Answer

The MediaWiki API has a login action that returns cookies and a token. You have to save both and send them back to the API to authenticate the session and log in. Here's a way to do it with curl and XMLstarlet in bash:

Send a request for a login token, saving the cookies in cookies.txt and the output in output.xml.

    curl -c cookies.txt -d "lgname=YOURNAME&lgpassword=YOURPASS&action=login&format=xml" http://your.mediawikiinstall.com/w/api.php -o output.xml

Then pull the token out of the XML using XMLstarlet, and save it in a bash variable.

    YOURTOKEN=$(xml sel -t -m '//login' -v '//@token' output.xml)
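
If XMLstarlet isn't installed, something like the following might do for pulling the token out of output.xml. It's only a rough sketch: `extract_login_token` is a made-up helper name, and it assumes the `<login>` element and its `token` attribute sit on a single line of the response, which a real XML parser would not need.

```shell
# Hypothetical helper (not part of curl, XMLstarlet or MediaWiki):
# print the value of the token attribute of the <login> element.
# $1 is the path to the XML file saved by the first curl call.
extract_login_token() {
  # The leading space in ' token="' keeps attributes such as lgtoken="..."
  # from matching; head keeps only the first hit.
  sed -n 's/.*<login[^>]* token="\([^"]*\)".*/\1/p' "$1" | head -n 1
}
```

You would then use it in place of the XMLstarlet call, e.g. `YOURTOKEN=$(extract_login_token output.xml)`.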

Then send the login request, including the cookie file and the token.

    curl -b cookies.txt -d "action=login&lgname=YOURNAME&lgpassword=YOURPASS&format=xml&lgtoken=$YOURTOKEN" http://your.mediawikiinstall.com/w/api.php
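
For convenience, the steps above could be wrapped into one function, roughly as sketched below. The names `mw_login` and `login_succeeded` and the file login.xml are my own placeholders. I'm also assuming that a successful login is reported as `result="Success"` on the `<login>` element, and I've swapped in curl's `--data-urlencode` so that user names or passwords containing `&` or spaces survive the POST. Treat it as a starting point, not a tested client.

```shell
#!/usr/bin/env bash
# Hypothetical helpers; function names and file names are placeholders.

# Exit status 0 if the API response saved in file $1 reports a successful login.
login_succeeded() {
  grep -q '<login[^>]* result="Success"' "$1"
}

# Two-step login. $1 = URL of api.php, $2 = user name, $3 = password.
mw_login() {
  local api=$1 user=$2 pass=$3 token
  # Step 1: ask for a login token; the session cookie goes to cookies.txt.
  curl -s -c cookies.txt \
       --data-urlencode "lgname=$user" --data-urlencode "lgpassword=$pass" \
       -d "action=login&format=xml" "$api" -o output.xml || return 1
  # Read the token attribute with XMLstarlet.
  token=$(xml sel -t -v '//login/@token' output.xml) || return 1
  # Step 2: repeat the request, sending the cookie and the token back.
  curl -s -b cookies.txt -c cookies.txt \
       --data-urlencode "lgname=$user" --data-urlencode "lgpassword=$pass" \
       --data-urlencode "lgtoken=$token" \
       -d "action=login&format=xml" "$api" -o login.xml || return 1
  login_succeeded login.xml
}
```

Usage would look like `mw_login http://your.mediawikiinstall.com/w/api.php YOURNAME YOURPASS && echo "logged in"`. After that, any request that sends `-b cookies.txt` should be treated as coming from the logged-in session, so the same cookie handling is what you'd port to RCurl.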

meetar