
I have a Python script that is intended to run on my local machine every night. Its goal is to pull data from a third-party server, do some processing on it, and run a bulk upload to the GAE datastore.

My issue, though, is how to run a bulk upload from a Python script. All the examples I have seen (including Google's documentation) use the command line "appcfg.py upload_data ...", and as far as I can see appcfg.py and bulkloader.py do not expose any API that is guaranteed not to change.

As I see it, I have two options. I can execute the "appcfg.py upload_data ..." command from my Python script, which seems a roundabout way of doing things. Or I can call appcfg.py's internal methods directly, which means I have to recode things if they change.
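For illustration, the first option (shelling out to appcfg.py) could look roughly like the sketch below. The function names `build_upload_command` and `run_upload` are hypothetical, the file names and URL in the usage comment are placeholders, and the flags (`--config_file`, `--filename`, `--kind`, `--url`) are the ones commonly shown for `upload_data`; check them against the SDK version you have installed before relying on them.

```python
import subprocess


def build_upload_command(csv_path, kind, config_file, url, app_dir,
                         appcfg="appcfg.py"):
    """Build the argument list for one 'appcfg.py upload_data' run.

    Assumption: the flags below match the installed SDK; verify with
    'appcfg.py help upload_data'.
    """
    return [
        appcfg,
        "upload_data",
        "--config_file=" + config_file,
        "--filename=" + csv_path,
        "--kind=" + kind,
        "--url=" + url,
        app_dir,
    ]


def run_upload(command):
    """Run the command and return its exit code (0 usually means success)."""
    return subprocess.call(command)


# Hypothetical usage, one call per CSV file:
# cmd = build_upload_command("albums.csv", "Album", "bulkloader.yaml",
#                            "http://example.appspot.com/_ah/remote_api", ".")
# exit_code = run_upload(cmd)
```

Passing the arguments as a list avoids shell quoting problems, and keeping the command construction separate from the call means only one small function needs updating if the command line changes.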

Cœur
sili

2 Answers


App Engine can run cron jobs. All you need to write is a single script that pulls the data from the third-party server and uploads it to the datastore; App Engine will do the rest for you. The App Engine cron documentation has everything you need to know about running a cron job on App Engine.

Abdul Kader
  • The reason I decided to use bulkloader is that the data I get comes in multiple CSV files. With some minimal setup, bulkloader does all the work of uploading them for me. As far as I understand, if I use a cron job, I would have to parse the files and create and save the entity objects myself. I would probably bite the bullet and use cron if there is no way around it. – sili Jun 01 '11 at 14:20
  • That's an interesting idea. But we come full circle. Does bulkloader provide an API that would not change in the future; is doing something like "bulkloader.main(argv_that_I_create)" safe? – sili Jun 01 '11 at 15:54
  • You cannot trust any API to *not* change in the future. However, good APIs only change when truly necessary, and they don't change suddenly without notice or documentation. – Botond Béres Jun 01 '11 at 19:52
  • Parsing CSV files is not difficult - the Python `csv` module does it for you. Writing your own code is probably the best approach here - bulkloader is designed for 'one off' and backup bulkloading/downloading. Running it from within App Engine certainly won't work. – Nick Johnson Jun 02 '11 at 00:44
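As a rough sketch of the "write your own code" approach suggested in the last comment, the standard `csv` module can read each file and hand the rows over in batches. The names `read_rows`, `batches`, `upload_files`, and the `save_batch` callback are hypothetical; `save_batch` stands in for whatever code creates and stores the entities, which is not shown here. The sketch assumes Python 3 and CSV files with a header row.

```python
import csv


def read_rows(csv_path):
    """Yield each data row of a CSV file as a dict keyed by the header row."""
    with open(csv_path, newline="") as handle:
        for row in csv.DictReader(handle):
            yield row


def batches(rows, size):
    """Group an iterable of rows into lists of at most `size` items."""
    batch = []
    for row in rows:
        batch.append(row)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch


def upload_files(csv_paths, save_batch, batch_size=100):
    """Pass every row of every file to save_batch, one batch at a time.

    save_batch is a placeholder for the code that builds and saves
    entities. Returns the total number of rows handed over.
    """
    total = 0
    for path in csv_paths:
        for batch in batches(read_rows(path), batch_size):
            save_batch(batch)
            total += len(batch)
    return total
```

Batching keeps the number of write calls small, and the callback keeps the parsing code independent of how the entities are eventually stored.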

This answer is now outdated. Please see the link below for my latest answer on bulk uploading data to App Engine.

How to upload data in bulk to the appengine datastore? Older methods do not work

Community
Sriram