
I have an Elasticsearch environment configured on GCE (Google Compute Engine) with two nodes, therefore two VMs, and I need to create a backup strategy for it. My first thought was to use the Elasticsearch snapshot API to back up all my data, as the API supports a few ways to store snapshots:

  • Shared filesystem, such as a NAS
  • Amazon S3
  • HDFS (Hadoop Distributed File System)
  • Azure Cloud

I tried to use the shared filesystem option, but it requires that the store location be shared between nodes. Is there a way I can do this on GCE?

curl -XPUT http://x.x.x.x:9200/_snapshot/backup -d '{
    "type": "fs",
    "settings": {
        "compress": true,
        "location": "/elasticsearch/backup"
    }
}'

nested: RepositoryVerificationException[[backup] store location [/elasticsearch/backup] is not shared between node

I know there is an AWS plugin for Elasticsearch for storing backups. Is there any plugin for Google Cloud Storage? Is it possible to do that?

If none of those alternatives is possible, is there any other recommended strategy for backing up my data?

Edmar Miyake
  • Any updates on this, @Edmar Miyake? Did you figure out a way to back up ES on GCE? – chrishiestand May 03 '15 at 09:41
  • I found that this is a feature currently scheduled to be supported in elasticsearch-cloud-gce 2.5.1 (the next release of the plugin): github.com/elastic/elasticsearch-cloud-gce/issues/11 – chrishiestand May 04 '15 at 02:25
  • @chrishiestand Unfortunately I ended up using Amazon S3. It's nonsense since Amazon and GCE are competitors. – Edmar Miyake May 06 '15 at 19:26

3 Answers


Elasticsearch now has a plugin for Google Cloud Storage, so this is natively supported.
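For reference, registering a GCS repository looks roughly like the following. This is a minimal sketch: the repository name and bucket name are placeholders, and the exact install command depends on your Elasticsearch version, so check the documentation for the repository-gcs plugin matching your cluster.

# install the GCS repository plugin on every node (command varies by ES version)
sudo bin/elasticsearch-plugin install repository-gcs

# register a snapshot repository backed by a GCS bucket
curl -XPUT http://x.x.x.x:9200/_snapshot/gcs_backup -d '{
    "type": "gcs",
    "settings": {
        "bucket": "my-es-snapshots",
        "compress": true
    }
}'

# take a snapshot of all indices into the bucket
curl -XPUT 'http://x.x.x.x:9200/_snapshot/gcs_backup/snapshot_1?wait_for_completion=true'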

Will Hayworth

You may be able to use the S3 plugin with Google Cloud Storage by way of interoperability. See this page for more details.
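If you go that route, the idea would be to generate GCS interoperability (HMAC) keys and point the S3 plugin at the GCS endpoint. The sketch below is untested and assumes the cloud-aws/S3 repository plugin honours an endpoint override; the setting names, bucket and credentials are illustrative only.

# elasticsearch.yml on every node, using GCS interoperability credentials (illustrative)
cloud.aws.access_key: YOUR_GCS_INTEROP_ACCESS_KEY
cloud.aws.secret_key: YOUR_GCS_INTEROP_SECRET
cloud.aws.s3.endpoint: storage.googleapis.com

# register an "s3" repository that actually points at a GCS bucket
curl -XPUT http://x.x.x.x:9200/_snapshot/gcs_via_s3 -d '{
    "type": "s3",
    "settings": {
        "bucket": "my-es-snapshots",
        "compress": true
    }
}'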

Alternatively, you can just create a normal backup on the filesystem and then upload it to Cloud Storage using gsutil.
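A rough sketch of that approach (repository and bucket names are placeholders; note that in a multi-node cluster each node writes its own shards to the repository path, so you would have to sync from every node unless the path is shared):

# snapshot into the local fs repository registered in the question
curl -XPUT 'http://x.x.x.x:9200/_snapshot/backup/snapshot_1?wait_for_completion=true'

# copy the repository contents to a GCS bucket
gsutil -m rsync -r /elasticsearch/backup gs://my-es-snapshots/backup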

Jon Wayne Parrott
  • Thanks for your suggestion. I understand the interoperability between S3 and GCS, but that doesn't necessarily mean Elasticsearch is interoperable with GCS through the Amazon S3 plugin. By a normal backup, do you mean creating a disk snapshot? That would make me create a backup for every Elasticsearch node I have, right? – Edmar Miyake Feb 04 '15 at 19:22
  • @EdmarMiyake not so much a full disk snapshot, just do a normal, local filesystem backup with Elasticsearch and then copy that to Cloud Storage. Hopefully this is making sense, as I'm not overly familiar with how Elasticsearch works. – Jon Wayne Parrott Feb 04 '15 at 20:05

I am having the same problem with my ES cluster (5 nodes) on Google Cloud. We can't use local backups on the actual disk, as Jon mentioned above, since in my case not every node has all the data.

It seems to me that the only way is to create a small machine with a large disk and mount that disk as a shared drive, at the same path, on all 5 ES nodes, so that we can use the "Shared filesystem" option.
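A rough sketch of what that could look like, assuming an NFS export from the small VM; hostnames, paths and the internal network range are placeholders, and depending on your Elasticsearch version you may also need to whitelist the path with path.repo:

# on the small backup VM: export the large disk over NFS
sudo apt-get install nfs-kernel-server
echo "/mnt/es-backup 10.0.0.0/24(rw,sync,no_subtree_check)" | sudo tee -a /etc/exports
sudo exportfs -ra

# on every ES node: mount the export at the same path
sudo apt-get install nfs-common
sudo mount backup-vm:/mnt/es-backup /elasticsearch/backup

# whitelist the path (newer ES versions), restart, then register the "fs" repository as in the question
echo 'path.repo: ["/elasticsearch/backup"]' | sudo tee -a /etc/elasticsearch/elasticsearch.yml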

fsamara