Paul Bradley, Solutions Architect & Software Developer



Using SQLite on AWS EFS for a Serverless Database/Application

Can SQLite be used on Amazon's Elastic File System to create a serverless database application?

midjourney ai - mechanical structures in the clouds, steam punk style, golden hour lighting

Table of Contents
  1. SQLite as a Serverless Database in AWS?
  2. Using EFS (Elastic File System)
  3. Go Code Examples
  4. Conclusion

SQLite as a Serverless Database in AWS?

SQLite is a solid database choice for specific projects. It's extremely fast when its role in the broader application architecture is considered carefully.

Projects like litestream offer streaming replication for SQLite, which means you could run it on a single AWS EC2 instance and stream the database changes to S3. If you need more resilience than a single server, you could have an autoscaling group that enforces at least one healthy server. If the primary server fails, the auto-scaling group will spin up another server in another availability zone. If the AMI used for the auto-scaling group does a litestream restore on start-up, the application could continue working with minimal downtime.

The litefs project by fly.io takes the litestream concept one step further. It allows you to create a cluster of servers with one writer node and any number of reader nodes. It uses leases to determine the primary node in your cluster. By default, it uses Hashicorp's Consul.

With Consul, any node marked as a candidate can become the primary node by obtaining a time-based lease and is the sole node that can write to the database during that time. This fits well in SQLite’s single-writer paradigm. When you deploy your application and need to shut down the primary, that node can release its lease, and the “primary” status will instantly move to another node.

Using EFS (Elastic File System)

While litestream and litefs are great products, depending on the application's requirements, you can probably get away with using EFS (Elastic File System).

I’m currently working on a serverless dashboard application with the architecture shown below.

AWS Cloud Architecture Diagram

The concept relies on a single collector application that collects and collates data and stores it in an SQLite database. The collector application, in this case, is a Lambda function invoked periodically from the Event Bridge scheduler. The collector application is your single-writer thread. Once it has gathered the needed data, it pushes the updated SQLite database to an EFS mount.

The dashboard application is another Lambda function with the same EFS mounted at start-up. Lambros Petrou did some experiments in 2020 on the read speed from an SQLite database on EFS. The findings showed an additional latency of 100ms when reading data from EFS. While Amazon Web Services announced lower latencies for Amazon Elastic File System in November 2022, my recent tests showed an average of 64.66ms to retrieve ten records from an SQLite database on EFS.

This level of latency might be acceptable for your application; that's a decision you'll need to make. By comparison, retrieving the same ten records took 517.022µs (microseconds) when the SQLite database had been copied to the Lambda function's ephemeral storage.

In this use case, as the database file is only changed periodically by the collector application, we can copy it from the EFS mount to Lambda's ephemeral storage during initialisation. Lambda functions now support up to 10,240MB of ephemeral storage, up from the original 512MB limit. Depending on the size of your database, it makes sense to copy the file locally to the function to get the best performance.

Go Code Examples

I use Go as my programming language for writing serverless Lambda functions. I also compile my Go programs for the ARM64 architecture, targeting the Graviton2 Lambda execution environment, and I use the cgo-free port of the SQLite driver library (modernc.org/sqlite).

Within our Go Lambda function, we have code to periodically check the file sizes of the databases to see if the master copy has been updated. We check the size of the latest copy on the EFS mount and compare it with the copy in the /tmp/ folder of the ephemeral storage.

// getFileSize returns the size of the named file in bytes,
// or zero if the file cannot be accessed.
func getFileSize(s string) int64 {
    fi, err := os.Stat(s)
    if err != nil {
        return 0
    }
    return fi.Size()
}

// Refresh the local copy when the master on EFS differs in size.
if getFileSize("/mnt/efs/sqlite.db") !=
    getFileSize("/tmp/sqlite.db") {

    copyFilesToTmp("/mnt/efs/sqlite.db", "/tmp/sqlite.db")
}

If the file sizes are different, we copy the SQLite database.

I initially tried to use the io.Copy function to copy the database file across; however, this resulted in the copy being the encrypted version from the EFS file system, which rendered it unusable.

To get around this, I had to read the file into a byte array and then write the file to the destination folder.

// copyFilesToTmp reads the whole database into memory and writes
// it out to the destination path.
func copyFilesToTmp(s, d string) error {
    input, err := os.ReadFile(s)
    if err != nil {
        return err
    }
    return os.WriteFile(d, input, 0644)
}

Conclusion

In conclusion, yes, SQLite is a good database choice for certain serverless applications. As long as care is taken with the architectural design, it shouldn't cause any problems for suitable workloads.

Thinking about it, SQLite could also be used with EFS to replace most small, read-heavy LAMP-stack websites. For example, I have a friend who does pet portraits. Her website is a typical LAMP site, using MySQL as the database. This is probably overkill, as she is the only person who logs in to update or add new portraits. It could be replaced with an SQLite database application running on Lambda and EFS, which would reduce her hosting bill to almost nothing.

A small writer Lambda function could be woken up when the website's login page is displayed. While the user enters their credentials, the writer function's execution environment can cold-boot and initialise. By the time the user has logged in, the writer function is ready to accept any database updates.

If anyone is interested in streamlining their LAMP website in this manner, please reach out to me via Twitter: send me a direct message at @nhsdeveloper and we can arrange a short meeting to discuss requirements.