Today, we look at how we can perform the following terminal command in Python.
aws s3 ls s3://bucket/obj_in_bucket/
PRE subobj_1/
PRE subobj_2/
PRE subobj_3/
...
Why do we even care?
Often, you just want to list the subobjects of a given object without fetching their contents. A typical use case is listing all objects in a bucket, where the bucket itself is viewed as an object (the root object). In the terminal, this is done with the simple aws s3 ls command.
But what if we want to do it natively as part of a module in Python? For example, we may want to trigger a callback on the server whenever a new subobject appears in our bucket. A fast way is to perform the equivalent of aws s3 ls inside our Python module, and this is our goal today.
To ensure clarity, let’s kick off with the definition of a path, a bucket and an object prefix.
Definition: A path is a string that consists of an S3 tag, a bucket and an object prefix. For example, s3://bucket/object/subobject/... is a generic path where s3:// is the S3 tag, bucket is the bucket and object/subobject/... is the object prefix. Note the position of / carefully: the bucket contains no slash, and the object prefix does not start with one.
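To make the definition concrete, here is a minimal sketch of splitting a path into its bucket and object prefix. The helper name split_s3_path and the constant S3_TAG are hypothetical, introduced only for illustration; they are not part of boto3.

```python
from typing import Tuple

# The "S3 tag" from the definition above (name chosen for this sketch).
S3_TAG = "s3://"


def split_s3_path(path: str) -> Tuple[str, str]:
    """Split 's3://bucket/object/subobject/' into (bucket, object_prefix)."""
    if not path.startswith(S3_TAG):
        raise ValueError(f"not an S3 path: {path!r}")
    # Everything up to the first "/" after the tag is the bucket;
    # the rest (without a leading "/") is the object prefix.
    bucket, _, obj_prefix = path[len(S3_TAG):].partition("/")
    if not bucket:
        raise ValueError(f"missing bucket in: {path!r}")
    return bucket, obj_prefix
```

For the path used in this post, split_s3_path("s3://bucket/obj_in_bucket/") gives the bucket "bucket" and the object prefix "obj_in_bucket/".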
aws s3 ls in Python
I assume that you’ve already set up your AWS credentials in the standard way, for example by storing them at ~/.aws/credentials. We can then initialize an S3 client in Python using boto3.session.Session; I hope this step is familiar to you.
import boto3
session = boto3.session.Session()
client = session.client("s3")
Now that we have our S3 client, define our bucket and object prefix of interest.
bucket = "bucket_name"
obj_prefix = "obj_in_bucket/"
Then we can simply list all the existing subobjects of our given object prefix using the following function:
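def aws_s3_ls(bucket: str, obj_prefix: str):
    # Delimiter="/" groups keys one level below obj_prefix, like aws s3 ls
    params = dict(Bucket=bucket, Prefix=obj_prefix, Delimiter="/")
    paginator = client.get_paginator("list_objects_v2")
    for page in paginator.paginate(**params):
        # subobjects are reported under "CommonPrefixes", each with a "Prefix" key
        for obj in page.get("CommonPrefixes", []):
            # strip obj_prefix so only the subobject name is printed
            print("PRE", obj["Prefix"][len(obj_prefix):])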
aws_s3_ls(bucket, obj_prefix)
PRE subobj_1/
PRE subobj_2/
PRE subobj_3/
...
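For the callback use case mentioned earlier, printing is less useful than getting the names back. The following is a rough sketch under a few assumptions: the function names list_subobjects and find_new_subobjects are hypothetical, the client is passed in as an argument instead of being read from a global, and it relies on list_objects_v2 reporting subobjects under the "CommonPrefixes" key of each page, with each entry holding a "Prefix" field.

```python
from typing import Any, List, Set


def list_subobjects(client: Any, bucket: str, obj_prefix: str) -> List[str]:
    """Return subobject names directly under obj_prefix, e.g. ['subobj_1/']."""
    params = dict(Bucket=bucket, Prefix=obj_prefix, Delimiter="/")
    paginator = client.get_paginator("list_objects_v2")
    names = []
    for page in paginator.paginate(**params):
        # Pages without subobjects simply have no "CommonPrefixes" key.
        for entry in page.get("CommonPrefixes", []):
            # Drop the leading obj_prefix to keep only the subobject name.
            names.append(entry["Prefix"][len(obj_prefix):])
    return names


def find_new_subobjects(seen: Set[str], current: List[str]) -> List[str]:
    """Return the names in current that are not in seen, keeping their order."""
    return [name for name in current if name not in seen]


# Possible usage with the client from above (not executed here):
#   seen = set()
#   current = list_subobjects(client, bucket, obj_prefix)
#   for name in find_new_subobjects(seen, current):
#       ...  # trigger your callback here
#   seen.update(current)
```

Passing the client explicitly keeps the function easy to test with a stand-in object, and how often to poll is left to the caller.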
And that’s it! That was a short one and I hope to cover more AWS things in the future.