64

I'd like to know if it's possible to check if there are certain files in a certain bucket.

This is what I've found:

Checking if a file is in a S3 bucket using the s3cmd

It should fix my problem, but for some reason it keeps returning that the file doesn't exist, while it does. This solution is also a little dated and doesn't use the doesObjectExist method.

Summary of all the methods that can be used in the Amazon S3 web service

This gives the syntax of how to use this method, but I can't seem to make it work.

Do they expect you to make a boolean variable to save the status of the method, or does the function directly give you an output / throw an error?

This is the code I'm currently using in my bash script:

existBool=doesObjectExist(${BucketName}, backup_${DomainName}_${CurrentDate}.zip)

if $existBool ; then
        echo 'No worries, the file exists.'
fi

I tested it using only the name of the file, instead of giving the full path. But since the error I'm getting is a syntax error, I'm probably just using it wrong.

Hopefully someone can help me out and tell me what I'm doing wrong.

!Edit

I ended up looking for another way to do this since using doesObjectExist isn't the fastest or easiest.

J. Swaelen
  • 2
    Isn't [this](http://stackoverflow.com/a/18645756/1535071) what you are looking for? – imTachu Jan 26 '17 at 10:58
  • @TachúSalamanca Kind of yes, thank you! I've quickly read the answers and I think I'm going to look for another way to check if files exist. There are probably ways to do it faster and easier than using the `doesBucketExist` method. – J. Swaelen Jan 26 '17 at 11:04

11 Answers

70

The last time I saw performance comparisons, getObjectMetadata was the fastest way to check whether an object exists. With the AWS CLI, the equivalent is the head-object command, for example:

aws s3api head-object --bucket www.codeengine.com --key index.html

which returns:

{
    "AcceptRanges": "bytes",
    "ContentType": "text/html; charset=utf-8",
    "LastModified": "Sun, 08 Jan 2017 22:49:19 GMT",
    "ContentLength": 38106,
    "ContentEncoding": "gzip",
    "ETag": "\"bda80810592763dcaa8627d44c2bf8bb\"",
    "StorageClass": "REDUCED_REDUNDANCY",
    "CacheControl": "no-cache, no-store",
    "Metadata": {}
}
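To tie this back to the `if` in the question, here is a minimal sketch that relies only on the exit status of head-object. The helper name `s3_object_exists` is made up for illustration and is not part of the AWS CLI; the variables in the usage comment are the ones from the question and are assumed to be set already.

```shell
# Sketch: succeed (exit status 0) when head-object finds the object.
# s3_object_exists is a hypothetical helper name, not an AWS CLI command.
s3_object_exists() {
  local bucket="$1" key="$2"
  # Discard the JSON output and any error message; only the exit status
  # of head-object is used as the result of this function.
  aws s3api head-object --bucket "$bucket" --key "$key" >/dev/null 2>&1
}

# Usage with the variables from the question (assumed to be set):
# if s3_object_exists "$BucketName" "backup_${DomainName}_${CurrentDate}.zip"; then
#   echo 'No worries, the file exists.'
# fi
```

Note that this treats every failure (missing object, wrong credentials, network problems) as "does not exist", which may or may not be what you want.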
Dave Maple
  • 5
    i like this because it also validates that what you are checking is an object. using aws s3 ls is a bit too forgiving in this respect. – Karl Rosaen Jun 08 '18 at 20:49
  • For those of you who are looking for `IF` statement here it is: `not_exist=$(aws s3api head-object --bucket "bucket_name" --key "file/path.ext" >/dev/null 2>1; echo $?) if [ $not_exist == 255 ]; then echo "it does not exist" else echo "it exists" fi` – Dimitry Orgonov Apr 11 '22 at 07:17
  • 1
    Testing for 255 didn't work for me, but 254 did. – Elifarley Nov 21 '22 at 08:28
  • Yes, as of June 2023 it's returning `254` when the object doesn't exist, not `255`. – Garret Wilson Jun 03 '23 at 14:19
  • @DimitryOrgonov I think you wanted to use `2>&1` and not `2>1`, right? Otherwise, wouldn't that redirect to a file named `1`? See https://stackoverflow.com/a/818284 . Nevertheless, `2>&1` and even `2>'&1'` didn't work for me with Git Bash on Windows 10 (git version 2.40.1.windows.1), so I switched to `>/dev/null 2>/dev/null`, which I think does the same thing and should be even more compatible. – Garret Wilson Jun 03 '23 at 17:08
43

Following @DaveMaple's and @MichaelGlenn's answers, here is the condition I'm using:

aws s3api head-object --bucket <some_bucket> --key <some_key> || not_exist=true
if [ $not_exist ]; then
  echo "it does not exist"
else
  echo "it exists"
fi
ItayB
  • 1
    This seems to echo out the response, is there a way to just assign the $not_exists variable without displaying the result or the error? – John Mellor Nov 16 '21 at 06:57
  • what about replacing the `echo` commands with any assignment that you like? – ItayB Nov 16 '21 at 08:18
  • I'm not sure I understand your point, or perhaps you misunderstood mine? I'm not saying it echo's "it does not exist" which could easily be changed obviously. I'm saying it prints "An error occurred (404) when calling the HeadObject operation: Not Found" to the terminal before echoing "it does not exist". Is there a way to prevent it printing the 404 message? – John Mellor Nov 18 '21 at 00:10
  • 2
    @JohnMellor add `> /dev/null 2>&1` to the first command: `aws s3api head-object --bucket <some_bucket> --key <some_key> > /dev/null 2>&1 || not_exist=true` – ItayB Nov 18 '21 at 16:54
  • 1
    @ItayB this makes the if state not work, looks like it assumes it's false always since there's no error output. adding this doesn't work for me, so that's my assumption -- i could be wrong – sojim2 Aug 03 '22 at 23:13
28

Note that "aws s3 ls" does not quite work, even though the answer was accepted. It searches by prefix, not by a specific object key. I found this out the hard way when someone renamed a file by adding a '1' to the end of the filename, and the existence check would still return True.

(Tried to add this as a comment, but do not have enough rep yet.)
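If you still want to build on `aws s3 ls`, one possible workaround is to compare the listed name exactly instead of trusting any non-empty output. The sketch below assumes the usual listing layout of date, time, size and name on each line; `s3_ls_exact` is a hypothetical helper name.

```shell
# Sketch: exact-name check on top of "aws s3 ls".
# s3_ls_exact is a hypothetical helper; it assumes each listing line looks
# like "DATE TIME SIZE NAME" and compares NAME with the last path component.
s3_ls_exact() {
  local uri="$1"
  local name="${uri##*/}"
  aws s3 ls "$uri" 2>/dev/null | awk -v name="$name" '
    {
      # Drop the date, time and size columns; the rest is the object name.
      sub(/^[^ ]+ +[^ ]+ +[0-9]+ /, "")
      # Appending "" forces a string comparison rather than a numeric one.
      if (($0 "") == (name "")) found = 1
    }
    END { exit found ? 0 : 1 }
  '
}

# Usage:
# s3_ls_exact "s3://my-bucket/path/file.txt" && echo "it exists"
```

With this, a listing that only contains `file.txt1` no longer counts as a match for `file.txt`.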

Michael Glenn
  • 2
    I just noticed this exact same behavior and this is what brought me to this question. – BiBi Jun 01 '21 at 08:47
8

One simple way is using aws s3 ls

exists=$(aws s3 ls $path_to_file)
if [ -z "$exists" ]; then
  echo "it does not exist"
else
  echo "it exists"
fi
traceformula
  • 10
    Sorry if I sound too harsh, but this should **not** be accepted as an answer, due to the reasons explained in the other two posts. – nodakai Oct 16 '18 at 02:28
  • 2
    This solution does not work correctly if you have files with the same prefix. `s3://bucket/file.txt` would be treated as existing when the bucket contains `s3://bucket/file.txt.gz`. The head-object approach is probably the correct one, but it forces you to split the `s3://` URI into its parts. – Marius Grigaitis Jan 14 '19 at 16:24
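Splitting the URI does not need any external tools. A minimal sketch using plain parameter expansion (the URI and the variable names are only examples):

```shell
# Sketch: split "s3://bucket/some/key" into bucket and key using only
# POSIX parameter expansion. The URI and variable names are illustrative.
uri="s3://my-bucket/path/to/file.txt"
rest="${uri#s3://}"    # strip the scheme: my-bucket/path/to/file.txt
bucket="${rest%%/*}"   # everything before the first slash: my-bucket
key="${rest#*/}"       # everything after the first slash: path/to/file.txt
echo "$bucket"
echo "$key"
```

The two values can then be passed to `aws s3api head-object --bucket "$bucket" --key "$key"`.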
8

I usually use set -eufo pipefail and the following works better for me because I do not need to worry about unset variables or the entire script exiting.

object_exists=$(aws s3api head-object --bucket "$bucket" --key "$key" || true)
if [ -z "$object_exists" ]; then
  echo "it does not exist"
else
  echo "it exists"
fi
Amri
5

This statement will return a true or false response:

aws s3api list-objects-v2 \
  --bucket <bucket_name> \
  --query "contains(Contents[].Key, '<object_name>')"

So, in case of the example provided in the question:

aws s3api list-objects-v2 \
  --bucket ${BucketName} \
  --query "contains(Contents[].Key, 'backup_${DomainName}_${CurrentDate}.zip')"

I like this approach, because:

  • The --query option uses the JMESPath syntax for client-side filtering and it is well documented here how to use it.

  • Since the --query option is built into the AWS CLI, no additional dependencies need to be installed.

  • You can first run the command without the --query option, like:

      aws s3api list-objects-v2 --bucket <bucket_name> 
    

    That returns a nicely formatted JSON, something like:

{
    "Contents": [
        {
            "Key": "my_file_1.tar.gz",
            "LastModified": "----",
            "ETag": "\"-----\"",
            "Size": -----,
            "StorageClass": "------"
        },
        {
            "Key": "my_file_2.txt",
            "LastModified": "----",
            "ETag": "\"----\"",
            "Size": ----,
            "StorageClass": "----"
        },
        ...
    ]
}
  • This then allows you to design an appropriate query. In this case you want to check if the JSON contains a list Contents and that an item in that list has a Key equal to your file (object) name:

    --query "contains(Contents[].Key, '<object_name>')"
    
Arjaan Buijk
  • https://awscli.amazonaws.com/v2/documentation/api/latest/reference/s3api/list-objects-v2.html says "Returns some or all (up to 1,000) of the objects in a bucket with each request." So I would assume that this might give you `false` if you have a bucket with more than 1000 items, even if the object exists, just because of the pagination. – karfau Mar 30 '23 at 05:18
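One way to keep the listing small, sketched here under assumptions, is to narrow it with `--prefix` and filter for the exact key, so far fewer keys come back than in a full-bucket listing. `s3_key_listed` is a hypothetical helper name, and the sketch assumes the key contains no single quote (it is pasted into the JMESPath expression as-is).

```shell
# Sketch: exact-key lookup via list-objects-v2, narrowed with --prefix.
# s3_key_listed is a hypothetical helper. Assumption: the key contains no
# single quote, because it is embedded in the JMESPath expression below.
s3_key_listed() {
  local bucket="$1" key="$2" found
  # The query yields the matching key as a JSON string, or null if
  # no listed object has exactly this key.
  found=$(aws s3api list-objects-v2 \
    --bucket "$bucket" \
    --prefix "$key" \
    --query "Contents[?Key=='$key'].Key | [0]" \
    --output json 2>/dev/null)
  # Empty output means the call itself failed; treat that as "not found".
  [ -n "$found" ] && [ "$found" != "null" ]
}

# Usage:
# s3_key_listed "$BucketName" "backup_${DomainName}_${CurrentDate}.zip" && echo "it exists"
```

This still lists every key that shares the prefix, so head-object is likely the cheaper check when you already know the exact key.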
3

A simpler solution, though not as sophisticated as the other s3api-based ones, is to use the exit code of aws s3 ls:

aws s3 ls <full path to object>

The command returns a non-zero exit code if the object doesn't exist, and 0 if it does.

Soundararajan
2

With the AWS CLI, we run ls together with grep, for example:

aws s3 ls s3://<bucket_name> | grep 'filename'

This can be included in the bash script.

henrycarteruk
Sandy
1

Inspired by the answers above, I use this to also check the file size, because my bucket was trashed by some script that wrote 404 responses into it. It requires jq, though.

minsize=100
# Fall back to a zero-length stub when head-object fails (e.g. object missing).
s3objhead=$(aws s3api head-object \
  --bucket "$BUCKET" --key "$KEY" \
  --output json || echo '{"ContentLength": 0}')

if [ "$(printf "%s" "$s3objhead" | jq '.ContentLength')" -lt "$minsize" ]; then
  echo "missing or small"
else
  echo "exists and big"
fi
Colin
0

Here's a simple POSIX shell function (so it also works in Bash) based on @Dimitry Orgonov's comment:

s3_key_exists() {
  aws >/dev/null 2>&1 s3api head-object --bucket "$1" --key "$2"
  test $? != 254
}

And here's how to use it:

s3_key_exists myBucket path/to/my/file.txt \
  && echo "It's there!" \
  || echo "Not found..."

Now, if what you have is an S3 path instead of a bucket and a key:

s3_file_exists() {
  local bucketAndKey="$(s3_bucket_and_key "$1")"
  s3_key_exists "${bucketAndKey%:*}" "${bucketAndKey#*:}"
}
s3_bucket_and_key() {
  local input="${1#/}"; local bucket="${input%%/*}"; local key="${input#$bucket}"
  echo "$bucket:${key#/}"
}

And here's a usage example:

s3_file_exists /myBucket/path/to/my/file.txt \
  && echo "It's there!" \
  || echo "Not found..."

Or...

s3_file_exists myBucket/path/to/my/other-file.txt \
  && echo "It's there too!" \
  || echo "Not found either..."
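Since `test $? != 254` reports every other exit status as "exists", a variant that keeps the three outcomes apart might look like the sketch below. The name `s3_object_state` is made up, and the value 254 for a missing object is taken from the comments in this thread; it may differ between AWS CLI versions, so treat it as an assumption.

```shell
# Sketch: print "exists", "missing" or "error" for an object.
# s3_object_state is a hypothetical helper. The value 254 for "not found"
# comes from the comments in this thread and is an assumption that may
# not hold for every AWS CLI version.
s3_object_state() {
  local bucket="$1" key="$2" status=0
  # "|| status=$?" records the exit status without aborting under "set -e".
  aws s3api head-object --bucket "$bucket" --key "$key" >/dev/null 2>&1 || status=$?
  case "$status" in
    0)   echo "exists" ;;
    254) echo "missing" ;;
    *)   echo "error" ;;
  esac
}

# Usage:
# [ "$(s3_object_state myBucket path/to/my/file.txt)" = "exists" ] && echo "It's there!"
```

Other service-side failures, such as access denied, may end up with the same status as a missing object, so "missing" here is best read as "the request was rejected".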
Elifarley
-1

The cheapest way I found is:

if aws s3 ls s3://mybucket
then  
    echo "exists"
else
    echo "does not exist"
fi
  • Please read the other answers to a question before adding your own. This solution was already posted multiple times, and it was pointed out it [doesn’t work](https://stackoverflow.com/a/50440725/735926). – bfontaine Mar 27 '23 at 08:51