Amazon S3 (Simple Storage Service) is a popular object storage service. To interact with S3 programmatically using Python, you need two things: AWS access keys (for authentication) and the boto3 library (the AWS SDK for Python). This guide walks through the process step by step.
Step 1: Obtaining AWS Access Keys
Access keys consist of an Access Key ID and a Secret Access Key. They work like a username and password for AWS API requests.
How to generate them:
Sign in to AWS Console – Use your root account or an IAM user with admin privileges.
Navigate to IAM (Identity and Access Management) – Search for "IAM" in the AWS Management Console.
Select Users – Choose the user who needs access to S3 (or create a new user).
Go to Security Credentials tab – Scroll down to "Access keys."
Create New Access Key – Click "Create access key." Choose "Local code" or "Application running outside AWS" as the use case.
Download or copy – AWS shows the keys only once. Save them securely. Never share or commit them to code repositories.
Best practices:
Never use root account access keys for applications
Attach a least-privilege policy (ideally a custom policy limited to the buckets and actions you need; AmazonS3FullAccess grants far more than an upload script requires)
Rotate keys regularly
Use IAM roles instead of keys if your code runs on AWS (e.g., EC2, Lambda)
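As an illustration of a custom restricted policy, the sketch below builds a policy document that allows only uploads to a single bucket. The function name, the bucket name my-example-bucket, and the uploads/ prefix are hypothetical placeholders, not values from this guide; adjust the actions to whatever your application really needs.

```python
import json


def build_upload_only_policy(bucket_name, prefix=""):
    """Return an IAM policy document that allows s3:PutObject on one bucket.

    bucket_name and prefix are illustrative inputs; an empty prefix
    covers every key in the bucket.
    """
    resource = f"arn:aws:s3:::{bucket_name}/{prefix}*"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AllowUploads",
                "Effect": "Allow",
                "Action": ["s3:PutObject"],
                "Resource": resource,
            }
        ],
    }


# Hypothetical bucket and prefix, used only to show the resulting JSON.
policy = build_upload_only_policy("my-example-bucket", prefix="uploads/")
print(json.dumps(policy, indent=2))
```

The printed JSON can be pasted into the IAM policy editor and attached to the user that owns the access keys.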
Step 2: Setting Up AWS Credentials Locally
Once you have access keys, configure them so boto3 can find them automatically.
Methods (from most to least recommended):
AWS CLI configuration – Run aws configure and enter your keys, region (e.g., us-east-1), and output format. Credentials are saved to ~/.aws/credentials.
Environment variables – Set AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, and AWS_DEFAULT_REGION in your terminal. Boto3 does not read .env files on its own, so if you keep the variables in a .env file, load them into the environment before creating a client.
Hardcoded in scripts – Insecure and not recommended for production. Use only for throwaway local testing, if at all.
Why this matters:
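Boto3 searches for credentials in a fixed order: values passed explicitly in code first, then environment variables, then the shared credentials and config files written by aws configure, and finally IAM role credentials when the code runs on AWS. Proper setup keeps keys out of your Python code.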
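A quick way to see which of these sources exist on your machine is a small diagnostic like the one below. It is only a sketch using the standard library: the function name is made up for illustration, it checks just the environment variables and the default ~/.aws/credentials path, and it never prints the secret values themselves.

```python
import os
from pathlib import Path


def find_credential_sources(environ=None, home=None):
    """List the credential sources that appear to be configured.

    environ and home are injectable for testing; by default the real
    process environment and the current user's home directory are used.
    """
    environ = os.environ if environ is None else environ
    home = Path.home() if home is None else Path(home)

    sources = []
    # Both variables must be set for environment credentials to be usable.
    if environ.get("AWS_ACCESS_KEY_ID") and environ.get("AWS_SECRET_ACCESS_KEY"):
        sources.append("environment variables")
    # Default location written by `aws configure`.
    if (home / ".aws" / "credentials").is_file():
        sources.append("shared credentials file")
    return sources
```

Calling find_credential_sources() with no arguments inspects the current machine. An empty list means boto3 will most likely fail with a missing-credentials error unless the code runs under an IAM role.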
Step 3: Installing Required Library
You need boto3 – the official AWS SDK for Python.
Install it with pip: pip install boto3
Verify installation by checking the version in a Python shell.
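One possible version check, using only the standard library so it also gives a clear answer when boto3 is missing (the helper name is illustrative):

```python
from importlib import metadata


def installed_version(package_name):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package_name)
    except metadata.PackageNotFoundError:
        return None


version = installed_version("boto3")
if version:
    print(f"boto3 {version} is installed")
else:
    print("boto3 is not installed; run: pip install boto3")
```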
Step 4: Uploading Files to S3
The upload process follows this logical flow:
Create an S3 client – This object holds your connection to AWS.
Specify bucket name – The target S3 bucket must already exist. Create one via Console or CLI if needed.
Choose upload method – For single files, use the upload_file() method. For folders, you need to iterate through local directory contents.
Handle paths – Provide the local file path and the desired S3 object key (path within the bucket).
For a single file:
Point to the local file path
Define the destination bucket and key name
Call the upload method (boto3 handles multipart uploads automatically for large files)
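The single-file flow can be sketched as follows. The helper name, report.pdf, my-example-bucket, and the reports/ key are placeholders chosen for illustration; the only boto3 calls used are boto3.client("s3") and the client's upload_file() method.

```python
def upload_single_file(s3_client, local_path, bucket, key):
    """Upload one local file and return its s3:// location.

    s3_client is expected to be a boto3 S3 client (or anything with a
    compatible upload_file method).
    """
    s3_client.upload_file(Filename=local_path, Bucket=bucket, Key=key)
    return f"s3://{bucket}/{key}"


def main():
    # Imported here so the helper above has no hard dependency on boto3.
    import boto3

    s3 = boto3.client("s3")  # credentials come from the configured sources
    # Placeholder file, bucket, and key: replace with your own values.
    location = upload_single_file(
        s3, "report.pdf", "my-example-bucket", "reports/report.pdf"
    )
    print(f"Uploaded to {location}")
```

Call main() once the placeholder names point at a real file and an existing bucket.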
For a folder (directory):
Use Python’s os.walk() to traverse all subfolders and files
For each local file, construct a corresponding S3 key (preserving folder structure or flattening as needed)
Upload each file individually with the same upload_file() method
Consider parallel uploads using threads for large numbers of small files
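A minimal sketch of the folder flow, assuming you want to preserve the folder structure under an optional key prefix. The function names, the prefix, and the default of 8 worker threads are illustrative choices rather than recommendations from this guide. The same client is shared across threads, which boto3 documents as safe for low-level clients.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def iter_upload_pairs(root_dir, key_prefix=""):
    """Yield (local_path, s3_key) for every file under root_dir."""
    for dirpath, _dirnames, filenames in os.walk(root_dir):
        for name in filenames:
            local_path = os.path.join(dirpath, name)
            relative = os.path.relpath(local_path, root_dir)
            # S3 keys always use forward slashes, even on Windows.
            yield local_path, key_prefix + relative.replace(os.sep, "/")


def upload_directory(s3_client, root_dir, bucket, key_prefix="", max_workers=8):
    """Upload every file under root_dir in parallel; return the uploaded keys."""
    pairs = list(iter_upload_pairs(root_dir, key_prefix))
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [
            pool.submit(s3_client.upload_file, local_path, bucket, key)
            for local_path, key in pairs
        ]
        for future in futures:
            future.result()  # re-raises the first upload error, if any
    return [key for _local_path, key in pairs]
```

To flatten the structure instead, replace the relative path with just the file name, keeping in mind that files with the same name in different folders would then overwrite each other.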
Step 5: Error Handling & Verification
Common issues to watch for:
Invalid credentials – Double-check your Access Key ID and Secret Key
Bucket does not exist – Create it first; bucket names are globally unique
Permission denied – Your IAM policy must allow s3:PutObject on the specific bucket
Region mismatch – Create the client with the bucket's region (the region_name argument or AWS_DEFAULT_REGION); a mismatch can surface as redirect or signature errors
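The issues above map onto error codes that boto3 exposes on botocore's ClientError. The sketch below translates a few of them into hints and runs a pre-flight check with head_bucket(). The hint texts and function names are illustrative, and the code table is not exhaustive. Note that upload_file() reports failures as boto3.exceptions.S3UploadFailedError rather than ClientError, which is one reason to check the bucket first.

```python
# Hints keyed by S3 error code; head_bucket() reports bare HTTP status codes.
HINTS = {
    "InvalidAccessKeyId": "The Access Key ID is not recognized; re-check your credentials.",
    "SignatureDoesNotMatch": "The Secret Access Key is probably wrong.",
    "NoSuchBucket": "The bucket does not exist; create it first.",
    "404": "The bucket does not exist; create it first.",
    "AccessDenied": "The IAM policy does not allow this action on the bucket.",
    "403": "The IAM policy does not allow access to this bucket.",
    "PermanentRedirect": "The bucket lives in another region; set region_name to match.",
}


def explain_error(exc):
    """Return (code, hint) for an exception carrying a botocore-style response."""
    response = getattr(exc, "response", None) or {}
    code = str(response.get("Error", {}).get("Code", "Unknown"))
    return code, HINTS.get(code, "Unexpected error; inspect the full exception.")


def check_bucket(s3_client, bucket):
    """Pre-flight check: return (ok, message) without uploading anything."""
    # Imported lazily so explain_error() stays usable without botocore.
    from botocore.exceptions import ClientError

    try:
        s3_client.head_bucket(Bucket=bucket)
    except ClientError as exc:
        code, hint = explain_error(exc)
        return False, f"{code}: {hint}"
    return True, "bucket is reachable"
```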
After upload:
List objects in the bucket to confirm
Check S3 Console or use head_object() method to verify metadata (size, ETag)
Set appropriate S3 permissions – Objects are private by default. Prefer bucket policies for granting access; ACLs are disabled by default on newly created buckets
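The verification steps can be sketched with list_objects_v2() and head_object(). The helper names are hypothetical, and comparing sizes is only a basic sanity check, not an integrity guarantee. list_objects_v2() returns at most 1,000 keys per call, so larger buckets need pagination.

```python
import os


def list_keys(s3_client, bucket, prefix=""):
    """Return the keys under prefix (first page only, up to 1,000 keys)."""
    response = s3_client.list_objects_v2(Bucket=bucket, Prefix=prefix)
    # "Contents" is absent when nothing matches the prefix.
    return [item["Key"] for item in response.get("Contents", [])]


def verify_upload(s3_client, bucket, key, local_path):
    """Compare the object's size in S3 with the local file's size."""
    head = s3_client.head_object(Bucket=bucket, Key=key)
    remote_size = head["ContentLength"]
    local_size = os.path.getsize(local_path)
    return {
        "key": key,
        "etag": head.get("ETag"),
        "remote_size": remote_size,
        "local_size": local_size,
        "size_matches": remote_size == local_size,
    }
```

Both helpers take an existing boto3 S3 client as their first argument, for example the one created for the upload itself.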
Security Reminder
Never hardcode access keys in your Python scripts
Use .gitignore to exclude .env files and AWS credential folders
Consider AWS Secrets Manager for production applications
Enable MFA delete on critical buckets