How To Upload and Download Files in AWS S3 with Python and Boto3
Introduction
In this How To tutorial I demonstrate how to perform file storage management with AWS S3 using Python's boto3 AWS library. Specifically, I provide examples of configuring boto3, creating S3 buckets, as well as uploading and downloading files to and from S3 buckets.
Creating a Boto3 User in AWS for Programmatic Access
As a first step I make a new user in AWS's management console that I'll use in conjunction with the boto3 library to access my AWS account programmatically. It's considered a best practice to create a separate and specific user for use with boto3 as it makes it easier to track and manage.
To start, I enter IAM in the search bar of the services menu and select the menu item.
Following that I click the Add user button.
On the following screen I enter a username of boto3-demo, make sure only the Programmatic access item is selected, and click the Next button.
On the next screen I attach a permission policy of AmazonS3FullAccess, then click the Next button.
Then I click Next until the credentials screen is shown, as seen below. On this screen I click the Download .csv button. I will need these credentials to configure Boto3 to allow me to access my AWS account programmatically.
Installing Boto3
Before writing any Python code I must install the AWS Python library named Boto3, which I will use to interact with the AWS S3 service. To accomplish this I set up a Python 3 virtual environment, as I feel that is a best practice for any new project regardless of size and intent.
$ mkdir aws_s3
$ cd aws_s3
$ python3 -m venv venv
$ source venv/bin/activate
(venv) $ pip install boto3
Configuring Boto3 and Boto3 User Credentials
With the boto3-demo user created and the Boto3 package installed I can now set up the configuration to enable authenticated access to my AWS account. There are a few different ways to handle this, and the one I like best is to store the access key id and secret access key values as environment variables, then use the Python os module from the standard library to feed them into the boto3 library for authentication. There is a handy Python package called python-dotenv which allows you to put environment variables in a file named .env and load them into your Python source code so, I'll begin this section by installing it.
(venv) $ pip install python-dotenv
Following this I make a .env file and place the two variables in it as shown below but, obviously, you'll want to put in your own values for these, which you downloaded in the earlier step for creating the boto3 user in the AWS console.
AWS_ACCESS_KEY_ID=MYACCESSKEYID
AWS_ACCESS_KEY_SECRET=MYACCESSKEYSECRET
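One thing worth noting: AWS_ACCESS_KEY_SECRET is this tutorial's own variable name. Boto3 itself automatically picks up credentials from the standard environment variable names AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY, in which case no explicit keyword arguments are needed at all. A minimal sketch of that alternative, assuming those standard names are set:

import boto3

# If AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY are set under their
# standard names, boto3 locates them without being passed explicitly.
session = boto3.session.Session(region_name='us-east-1')
s3_resource = session.resource('s3')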
Next I make a Python module named file_manager.py and inside it I import the os and boto3 modules as well as the load_dotenv function from the python-dotenv package. Following that I call the load_dotenv() function, which will auto-find a .env file in the same directory and read the variables into the environment, making them accessible via the os module.
Then I create a function named aws_session(...) for generating an authenticated Session object, accessing the environment variables with the os.getenv(...) function and returning a session object.
# file_manager.py
import os

import boto3
from dotenv import load_dotenv

load_dotenv(verbose=True)

def aws_session(region_name='us-east-1'):
    return boto3.session.Session(aws_access_key_id=os.getenv('AWS_ACCESS_KEY_ID'),
                                 aws_secret_access_key=os.getenv('AWS_ACCESS_KEY_SECRET'),
                                 region_name=region_name)
I will then use this session object to interact with the AWS platform via a high-level abstraction object Boto3 provides known as the AWS Resource. When used in conjunction with my aws_session() function I can create an S3 resource like so.
session = aws_session()
s3_resource = session.resource('s3')
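As an aside, the Resource abstraction sits on top of a lower-level Client interface that maps one-to-one onto the raw S3 API operations. This tutorial sticks with Resources, but for reference here is a minimal sketch of creating a client from the same session and listing buckets with it:

session = aws_session()
s3_client = session.client('s3')

# list_buckets() returns a dict; each bucket entry carries a Name key.
response = s3_client.list_buckets()
print([bucket['Name'] for bucket in response['Buckets']])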
Creating an S3 Bucket Programmatically with Boto3
I can now move on to making a publicly readable bucket, which will serve as the top level container for file objects within S3. I will do this within a function named make_bucket as shown below.
def make_bucket(name, acl):
    session = aws_session()
    s3_resource = session.resource('s3')
    return s3_resource.create_bucket(Bucket=name, ACL=acl)

s3_bucket = make_bucket('tci-s3-demo', 'public-read')
The key point to note here is that I've used the Resource class's create_bucket method to create the bucket, passing it a string name which conforms to AWS naming rules along with an ACL parameter, which is a string representing an Access Control List policy, in this case for public reading.
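One caveat worth flagging: create_bucket as written only succeeds when the session targets the us-east-1 region. For any other region, S3 requires an explicit CreateBucketConfiguration argument. A region-aware variant might look like the following sketch (make_bucket_in_region is a hypothetical helper, not part of the original module):

def make_bucket_in_region(name, acl, region_name='us-east-2'):
    # Regions other than us-east-1 require a LocationConstraint.
    session = aws_session(region_name=region_name)
    s3_resource = session.resource('s3')
    return s3_resource.create_bucket(
        Bucket=name,
        ACL=acl,
        CreateBucketConfiguration={'LocationConstraint': region_name}
    )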
Uploading a File to S3 Using Boto3
At this point I can upload files to this newly created bucket using the Boto3 Bucket resource class. Below is a demo file named children.csv that I'll be working with.
name,age
Kallin,3
Cameron,0
In keeping with the good practice of reusability I'll again make a function to upload files given a file path and bucket name, as shown below.
def upload_file_to_bucket(bucket_name, file_path):
    session = aws_session()
    s3_resource = session.resource('s3')
    file_dir, file_name = os.path.split(file_path)

    bucket = s3_resource.Bucket(bucket_name)
    bucket.upload_file(
      Filename=file_path,
      Key=file_name,
      ExtraArgs={'ACL': 'public-read'}
    )

    s3_url = f"https://{bucket_name}.s3.amazonaws.com/{file_name}"
    return s3_url

s3_url = upload_file_to_bucket('tci-s3-demo', 'children.csv')
print(s3_url)
# https://tci-s3-demo.s3.amazonaws.com/children.csv
Here I use the Bucket resource class's upload_file(...) method to upload the children.csv file. The parameters to this method are a little confusing, so let me explain them a little. First you have the Filename parameter, which is really the path to the file you wish to upload, and then there is the Key parameter, which is a unique identifier for the S3 object and must conform to AWS object naming rules, similar to S3 buckets.
The upload_file_to_bucket(...) function uploads the given file to the specified bucket and returns the AWS S3 resource URL to the calling code.
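If you want to verify an upload beyond eyeballing the returned URL, S3 can be asked for the object's metadata. Below is a small sketch of a hypothetical helper (not part of the original tutorial) that uses Object.load(), which issues a HEAD request and raises a ClientError for a missing key:

import botocore.exceptions

def object_exists(bucket_name, s3_key):
    # Hypothetical helper: load() raises ClientError with a 404 error
    # code when the key is absent from the bucket.
    session = aws_session()
    obj = session.resource('s3').Object(bucket_name, s3_key)
    try:
        obj.load()
        return True
    except botocore.exceptions.ClientError as err:
        if err.response['Error']['Code'] == '404':
            return False
        raise

print(object_exists('tci-s3-demo', 'children.csv'))  # True after the upload above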
Uploading In-Memory Data to S3 using Boto3
While uploading a file that already exists on the filesystem is a very common use case when writing software that utilizes S3 object based storage, there is no need to write a file to disk just for the sole purpose of uploading it to S3. You can instead upload any byte-serialized data using the put(...) method on a Boto3 Object resource.
Below I am showing another new reusable function that takes bytes data, a bucket name, and an S3 object key, which it then uploads and saves to S3 as an object.
def upload_data_to_bucket(bytes_data, bucket_name, s3_key):
    session = aws_session()
    s3_resource = session.resource('s3')
    obj = s3_resource.Object(bucket_name, s3_key)
    obj.put(ACL='private', Body=bytes_data)

    s3_url = f"https://{bucket_name}.s3.amazonaws.com/{s3_key}"
    return s3_url

data = [
  'My name is Adam',
  'I live in Lincoln',
  'I have a beagle named Doc Holiday'
]
bytes_data = '\n'.join(data).encode('utf-8')

s3_url = upload_data_to_bucket(bytes_data, 'tci-s3-demo', 'about.txt')
print(s3_url)
# 'https://tci-s3-demo.s3.amazonaws.com/about.txt'
Downloading a File from S3 using Boto3
Next I'll demonstrate downloading the same children.csv S3 file object that was just uploaded. This is very similar to uploading, except you use the download_file method of the Bucket resource class.
def download_file_from_bucket(bucket_name, s3_key, dst_path):
    session = aws_session()
    s3_resource = session.resource('s3')
    bucket = s3_resource.Bucket(bucket_name)
    bucket.download_file(Key=s3_key, Filename=dst_path)

download_file_from_bucket('tci-s3-demo', 'children.csv', 'children_download.csv')

with open('children_download.csv') as fo:
    print(fo.read())
Which outputs the following from the downloaded file.
name,age
Kallin,3
Cameron,0
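One practical note: if the requested key does not exist in the bucket, download_file raises a botocore ClientError rather than producing an empty file. A minimal sketch of guarding against that, reusing the function above:

import botocore.exceptions

try:
    download_file_from_bucket('tci-s3-demo', 'missing.csv', 'missing_download.csv')
except botocore.exceptions.ClientError as err:
    # A missing key surfaces as a 404 error code on the underlying HEAD request.
    if err.response['Error']['Code'] == '404':
        print('No such object in the bucket')
    else:
        raise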
Downloading a File from S3 to Memory using Boto3
There will likely be times when you are only downloading S3 object data to immediately process and then throw away, without ever needing to save the data locally. Downloading data in this manner still requires using some sort of file-like object in binary mode but, luckily, the Python language provides the helpful streaming class BytesIO from the io module, which handles in-memory streams like this.
To download the S3 object data in this way you will want to use the download_fileobj(...) method of the S3 Object resource class, as demonstrated below by downloading the about.txt file uploaded from in-memory data previously.
def download_data_from_bucket(bucket_name, s3_key):
    session = aws_session()
    s3_resource = session.resource('s3')
    obj = s3_resource.Object(bucket_name, s3_key)
    io_stream = io.BytesIO()
    obj.download_fileobj(io_stream)

    io_stream.seek(0)
    data = io_stream.read().decode('utf-8')

    return data

about_data = download_data_from_bucket('tci-s3-demo', 'about.txt')
print(about_data)
Prints the following data.
My name is Adam
I live in Lincoln
I have a beagle named Doc Holiday
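Because download_data_from_bucket returns a plain string, it pairs naturally with standard-library parsing, with no temporary files required. As a quick usage example (assuming the children.csv object uploaded earlier), the CSV content can be processed entirely in memory:

import csv
import io

csv_text = download_data_from_bucket('tci-s3-demo', 'children.csv')

# Parse the downloaded text directly from memory.
for row in csv.DictReader(io.StringIO(csv_text)):
    print(row['name'], row['age'])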
Resources for Learning More
- Python Tricks: A Buffet of Awesome Python Features is truly a fantastic collection of Python productivity hacks
- Fluent Python: Clear, Concise, and Effective Programming is a must read for any Python developer serious about code craftsmanship
- Amazon Web Services in Action is a pragmatic book full of excellent use cases of the many services provided by AWS
thecodinginterface.com earns commission from sales of linked products such as the books above. This enables providing continued free tutorials and content so, thank you for supporting the authors of these resources as well as thecodinginterface.com
Conclusion
In this How To article I have demonstrated how to set up and use the Python Boto3 library to access files, transferring them to and from AWS S3 object storage.
For completeness here is the complete source code for the file_manager.py module that was used in this tutorial.
# file_manager.py
import os
import io

import boto3
from dotenv import load_dotenv

load_dotenv(verbose=True)

def aws_session(region_name='us-east-1'):
    return boto3.session.Session(aws_access_key_id=os.getenv('AWS_ACCESS_KEY_ID'),
                                 aws_secret_access_key=os.getenv('AWS_ACCESS_KEY_SECRET'),
                                 region_name=region_name)

def make_bucket(name, acl):
    session = aws_session()
    s3_resource = session.resource('s3')
    return s3_resource.create_bucket(Bucket=name, ACL=acl)

def upload_file_to_bucket(bucket_name, file_path):
    session = aws_session()
    s3_resource = session.resource('s3')
    file_dir, file_name = os.path.split(file_path)

    bucket = s3_resource.Bucket(bucket_name)
    bucket.upload_file(
      Filename=file_path,
      Key=file_name,
      ExtraArgs={'ACL': 'public-read'}
    )

    s3_url = f"https://{bucket_name}.s3.amazonaws.com/{file_name}"
    return s3_url

def download_file_from_bucket(bucket_name, s3_key, dst_path):
    session = aws_session()
    s3_resource = session.resource('s3')
    bucket = s3_resource.Bucket(bucket_name)
    bucket.download_file(Key=s3_key, Filename=dst_path)

def upload_data_to_bucket(bytes_data, bucket_name, s3_key):
    session = aws_session()
    s3_resource = session.resource('s3')
    obj = s3_resource.Object(bucket_name, s3_key)
    obj.put(ACL='private', Body=bytes_data)

    s3_url = f"https://{bucket_name}.s3.amazonaws.com/{s3_key}"
    return s3_url

def download_data_from_bucket(bucket_name, s3_key):
    session = aws_session()
    s3_resource = session.resource('s3')
    obj = s3_resource.Object(bucket_name, s3_key)
    io_stream = io.BytesIO()
    obj.download_fileobj(io_stream)

    io_stream.seek(0)
    data = io_stream.read().decode('utf-8')

    return data
As always, I thank you for reading and feel free to ask questions or critique in the comments section below.
Source: https://thecodinginterface.com/blog/aws-s3-python-boto3/