Using AWS S3 on a Vercel-hosted NextJS app
A detailed journey from concept to implementation, highlighting challenges and solutions

When I built my website, I decided not to use any CMS because I wanted to reduce complexity and have more flexibility. I chose MDX instead. So, I was doing this:
As you can see, NextJS used the static MDX files at build time, and Vercel then served the static pages and MDX content generated during the build.
I always considered the possibility of moving those source files somewhere else. Like this:
Why AWS S3?
Vercel hosts my website, and I want to keep that part (for now). I love the ease of their CI/CD setup: when I set it up, it was a matter of a few clicks. Automagically, I had webhooks to trigger feature-branch builds and smooth deployments without worrying about CDNs, scalability, etc.
Nowadays, I want to leverage some of AWS's capabilities. AWS launched in 2006 and offers hundreds of well-integrated services, giving you total control over your setup. Vercel, on the other hand, started in 2015 and offers a much more limited set of services.
Let's explore some of the benefits and challenges this multi-cloud configuration presents.
Benefit | Description |
---|---|
S3 Scalable Storage | Easily handles increasing storage needs for static files |
S3 High Availability and Durability | Data is stored across multiple data centers for reliability |
S3 Security | Encryption at rest |
S3 Version Control | Lets you manage and roll back static assets as needed |
Challenge | Description |
---|---|
Multi-Cloud | Learning two different clouds, configurations, and rules |
Cost Management | Monitoring and optimizing costs across two providers can be challenging |
Learning Curve | Requires learning AWS services and S3 specifics |
Access Management | Managing access rights and permissions can be intricate |
After weighing the benefits and challenges, S3 is still a good choice for the upcoming features and for learning purposes.
Let's dive into the solution
Prerequisites
- AWS root account
- Two-factor authentication (MFA) enabled on the root account
- A regular user account for day-to-day work - avoid using the root account
S3 bucket and access policies
- Create an S3 bucket and activate versioning
- Create a new IAM user
- Generate access keys (only used by Vercel through environment variables)
- Define a specific policy for my needs. Here is the policy for my bucket:
```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "VisualEditor0",
      "Effect": "Allow",
      "Action": "s3:ListBucket",
      "Resource": "arn..." // my S3 bucket
    },
    {
      "Sid": "VisualEditor1",
      "Effect": "Allow",
      "Action": "s3:GetObject",
      "Resource": "arn..." // my S3 bucket
    }
  ]
}
```
- Assign this policy to the IAM user, as sketched below
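If you prefer to script this step instead of using the console, here is a minimal sketch using the SDK's IAM client. This is my own illustration, not part of the original setup; the user and policy names are placeholders:

```typescript
import { IAMClient, PutUserPolicyCommand } from "@aws-sdk/client-iam";

// The read-only policy from above, expressed as a JS object.
const policyDocument = {
  Version: "2012-10-17",
  Statement: [
    { Sid: "VisualEditor0", Effect: "Allow", Action: "s3:ListBucket", Resource: "arn..." },
    { Sid: "VisualEditor1", Effect: "Allow", Action: "s3:GetObject", Resource: "arn..." },
  ],
};

const iam = new IAMClient({ region: process.env.AWS_REGION });

// "vercel-reader" and "s3-read-only" are placeholder names.
await iam.send(
  new PutUserPolicyCommand({
    UserName: "vercel-reader",
    PolicyName: "s3-read-only",
    PolicyDocument: JSON.stringify(policyDocument),
  }),
);
```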
I have some concerns about this approach. I know that only Vercel will use the access keys, and only during build time, and I can monitor the inbound traffic in AWS S3; still, a more secure connection between both networks would be ideal. Consider leveraging AWS Site-to-Site VPN, switching to temporary credentials for the user, or restricting the allowed sources in the policy.
This part bothers me a bit because having everything within AWS would let me hide the S3 bucket in a private subnet, with only my final site in a public subnet, completely isolating access to S3. I mean having this:
Setting Up Environment Variables
I don't want to include any references to AWS information within the code, both for security and for reusability. So, I defined the following environment variables:
- AWS_REGION
- AWS_ACCESS_KEY_ID
- AWS_SECRET_ACCES_KEY
- AWS_BUCKET_NAME
- AWS_BUCKET_BLOG_PREFIX
Learn how to use env variables with NextJS and Vercel.
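A missing variable would otherwise surface as a confusing error during the build, so a small guard can help. This is a sketch of my own, not something Vercel or AWS requires; requireEnv is a hypothetical helper:

```typescript
// Hypothetical helper: fail the build fast if a required variable is missing.
const requireEnv = (name: string): string => {
  const value = process.env[name];
  if (!value) throw new Error(`Missing environment variable: ${name}`);
  return value;
};

// Validate everything up front instead of failing at first use.
const AWS_REGION = requireEnv("AWS_REGION");
const AWS_BUCKET_NAME = requireEnv("AWS_BUCKET_NAME");
```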
Using AWS SDK v3
At the time of writing, the latest version of the AWS SDK is v3. You can find it here. I'm using it through its npm package:
```bash
npm install @aws-sdk/client-s3
```
I split the code into a few small functions:
```typescript
import { S3Client, S3ClientConfig } from "@aws-sdk/client-s3";

// Build a client from the environment variables defined above.
const createS3Client = () =>
  new S3Client({
    region: process.env.AWS_REGION,
    credentials: {
      accessKeyId: process.env.AWS_ACCESS_KEY_ID,
      secretAccessKey: process.env.AWS_SECRET_ACCES_KEY,
    },
  } as S3ClientConfig);
```
```typescript
import { ListObjectsV2Command } from "@aws-sdk/client-s3";

// List the objects stored under a given prefix ("folder") in the bucket.
const readFolderContent = async (client: S3Client, Prefix: string) => {
  const command = new ListObjectsV2Command({
    Bucket: process.env.AWS_BUCKET_NAME,
    Prefix,
  });

  try {
    const response = await client.send(command);
    return response.Contents;
  } catch (err) {
    console.error(err);
  }
};
```
```typescript
import { GetObjectCommand } from "@aws-sdk/client-s3";
import { Readable } from "stream";

// Fetch a single object and decode its body with the given encoding.
const readFile = async (
  client: S3Client,
  Key: string,
  format: BufferEncoding,
) => {
  const command = new GetObjectCommand({
    Bucket: process.env.AWS_BUCKET_NAME,
    Key,
  });

  // Collect the response stream into a single string.
  const streamToString = async (stream: Readable) =>
    new Promise<string>((resolve, reject) => {
      const chunks: Uint8Array[] = [];
      stream.on("data", (chunk: Uint8Array) => chunks.push(chunk));
      stream.on("error", reject);
      stream.on("end", () => resolve(Buffer.concat(chunks).toString(format)));
    });

  try {
    const response = await client.send(command);
    return await streamToString(response.Body as Readable);
  } catch (err) {
    console.error(err);
  }
};
```
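As a side note, newer releases of SDK v3 also expose a transformToString() helper on the response Body, which could replace the manual stream handling above; I kept the explicit version because it makes the encoding handling visible.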
```typescript
const readS3BlogFolder = async () => {
  const client = createS3Client();
  return await readFolderContent(client, process.env.AWS_BUCKET_BLOG_PREFIX!);
};

const readS3File = async (Key: string) => {
  const client = createS3Client();
  const file = await readFile(client, Key, "utf8");
  return file!;
};

const readS3Img = async (Key: string) => {
  const client = createS3Client();
  return await readFile(client, Key, "base64");
};
```
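To show how these helpers fit into the site, here is a hypothetical wiring from a Next.js page; the file layout and slug parameter are assumptions for illustration, not my exact code:

```typescript
import type { GetStaticProps } from "next";
// Assumes the helpers above are exported from a shared module, e.g. lib/s3.
import { readS3File } from "../lib/s3";

// Fetch one MDX post from S3 at build time, keyed by the page slug.
export const getStaticProps: GetStaticProps = async ({ params }) => {
  const slug = params?.slug as string; // assumed dynamic route parameter
  const source = await readS3File(
    `${process.env.AWS_BUCKET_BLOG_PREFIX}/${slug}.mdx`,
  );
  return { props: { source } };
};
```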
How much time did it take me?
It took me about three and a half hours on the first day and another two hours and fifteen minutes the next day: roughly six hours in total.
In addition, I needed to spend more time adjusting my MDX content because, unfortunately, some URLs were incorrect, and I had problems reading images now that I had to fetch them from S3. So, I had to adjust a few components.
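For example, one way to adapt an image component (a sketch of the idea, assuming PNG content and the readS3Img helper above; s3ImageSrc is a hypothetical name) is to inline the base64 payload as a data URL:

```typescript
// Hypothetical adaptation: turn an S3 object key into a data URL that a
// plain <img> element (or next/image with a static src) can consume.
const s3ImageSrc = async (Key: string) => {
  const base64 = await readS3Img(Key);
  return `data:image/png;base64,${base64}`; // assumes PNG images
};
```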
Summary
This setup is, without a doubt, a complex, Frankenstein-like one. Cloud providers make your life easy if you stick with them. Unfortunately, more freedom and a multi-cloud approach bring challenges: the need for global monitoring, security risks, and different deployment methods.
You can solve those challenges by leveraging cloud-native solutions, using a shared infrastructure as code language, and hooking stats to a global monitoring system. But is it worth it?
Any side project, startup, blog, or online store could leverage the simplicity provided by Vercel and speed up its go-to-market by focusing on coding the actual product.
But what about scaling the product? Is staying with them sustainable over time? How many data centers do they have? And how many services do they offer?
I didn't need to get into this for my website. I did it because my website is a multi-purpose project: my blog, personal portfolio, and real-life playground where I can learn and explore solutions with a small project.
In any case, I'm thrilled with the results, especially with the learning. I'm more confident about this topic and have a deeper understanding of the benefits and challenges of a solution like this one.