Paul Bradley • Solutions Architect & Software Developer


Creating a serverless video transcoding Lambda function using Go

How to create a cost-effective video transcoding function on top of S3. The function converts Apple's MOV movie files to the international-standard MP4 format.

Table of Contents
  1. Introduction
  2. The Function Specification
  3. Compiled for Amazon Linux 2
  4. Provisioning the Function with Terraform
  5. Data Structure
  6. Lambda Handler
  7. Downloading File From S3
  8. Uploading File To S3

 Introduction

While designing and developing the medical imaging acquisition application, it became clear that I would need to transcode uploaded video files. Transcoding is the process of converting video files between different formats. The medical imaging application makes extensive use of the HTML5 video control for playback of uploaded videos. The HTML5 video control natively supports videos in the MP4 format, but can't play back videos in Apple's MOV format.

As patients can use either Android or Apple phones, I needed a way to convert video files uploaded by Apple users.

I could have used Amazon Elastic Transcoder to convert the video files between the different formats. However, after looking at the costs, I decided to use a pre-built Lambda layer which includes a static build of FFMPEG for Amazon Linux. FFMPEG is the leading multimedia framework, able to decode, encode, transcode, mux, demux, stream, filter and play pretty much any video ever created. Since AWS announced in March 2022 that all Lambda functions can now be configured with up to 10GB of ephemeral storage, it made sense to combine these two capabilities and transcode the videos within a custom function.

 The Function Specification

The transcoding function is written in the Go programming language and imports the AWS Go SDK. The function is invoked from an S3 object-creation event: when a .MOV file is uploaded to the S3 bucket, the function is triggered. When invoked, it performs the following actions:

  1. Downloads the MOV file from the S3 bucket into the function's ephemeral storage.
  2. Transcodes the file to MP4 using the FFMPEG binary provided by the Lambda layer.
  3. Uploads the resulting MP4 file back to the same S3 bucket, under the same encounter prefix.
  4. Removes the temporary files from the ephemeral storage.

 Compiled for Amazon Linux 2

The Go function is compiled for the Amazon Linux 2 execution environment rather than the default go1.x environment. By compiling for Amazon Linux 2, we ensure the function runs in the most up-to-date execution environment available. The pattern for compiling the function is slightly different; below are the commands from the makefile I use to compile and package the zip file.

GOARCH=amd64 GOOS=linux go build \
    -tags="lambda.norpc" \
    -ldflags="-w -s" \
    -o ./bootstrap *.go

upx -9 bootstrap
zip -9 main.zip bootstrap

The main difference is the use of the -tags command-line switch to instruct the Go build process to exclude the RPC control plane, which is only needed by the go1.x runtime. We take the finished binary, bootstrap, and compress it further with upx, a free, portable, extendable, high-performance executable packer, to make the resulting binary as small as possible. The smaller the ZIP file, the lower the Lambda cold-start time, as there is less code to deploy to the Lambda execution environment.

 Provisioning the Function with Terraform

Below is the sample Terraform provisioning script for the Lambda function. The main points to note are the layers statement, which points to the ARN (Amazon Resource Name) of the FFMPEG layer already deployed to our AWS account, and the runtime statement, which has been changed from go1.x to provided.al2. We've also increased the ephemeral storage from the default of 512MB to the maximum of 10,240MB (10GB).

The code also demonstrates how to use the aws_s3_bucket_notification resource to configure invoking the function when a MOV file is persisted to the S3 bucket.

resource "aws_lambda_function" "video-transcode" {
    function_name = "${var.name_prefix}video-transcoder"
    description   = "video transcoder - triggered from S3 actions"

    package_type  = "Zip"
    filename = "/aws/paulx030/video-transcode/main.zip"
    role = var.lambda_role_arn
    handler = "bootstrap"
    runtime = "provided.al2"
    memory_size = "1024"
    timeout = "300"
    publish = true
    layers = ["arn:aws:lambda:eu-west-2:xxxxxxxxxx:layer:ffmpeg:1"]

    vpc_config {
        subnet_ids         = var.private_subnets
        security_group_ids = var.lambda_app_security_group_id
    }

    ephemeral_storage {
        size = 10240
    }
}

resource "aws_s3_bucket_notification" "video-transcode-trigger" {
    bucket = var.patientobjs_bucket_id
    lambda_function {
        lambda_function_arn = aws_lambda_function.video-transcode.arn
        events = ["s3:ObjectCreated:*"]
        filter_suffix = ".mov"
    }
}
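
For the S3 notification to work, S3 must also be allowed to invoke the function. If that permission isn't already granted elsewhere in the stack, a minimal sketch could look like the following; the resource name and the var.patientobjs_bucket_arn variable are assumptions rather than part of the configuration above.

resource "aws_lambda_permission" "video-transcode-allow-s3" {
    # allow the S3 service to invoke the transcoding function,
    # but only for events originating from the patient objects bucket
    statement_id  = "AllowExecutionFromS3"
    action        = "lambda:InvokeFunction"
    function_name = aws_lambda_function.video-transcode.function_name
    principal     = "s3.amazonaws.com"
    source_arn    = var.patientobjs_bucket_arn  # assumed variable holding the bucket ARN
}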

 Data Structure

Before looking at the Go code, here is the data structure used within the main program. The files stored on S3 use S3 prefixes: each patient uploads their files under a prefix that matches their encounter reference from the application's database. So, as part of downloading a video file, we need to split the object key into the encounter ID and the filename, and then further split the filename to extract the file extension. This lets us generate temporary filenames while processing the video with FFMPEG, and it gives us the encounter ID so that we can update the database once the conversion has completed.

// given an S3 object named:
// e5fa44f2b31/7448d8798a4380162d4b56f9b452e2f6f9e24e7a.mov

type s3Document struct {
    bucketName          string   // holds the name of the S3 bucket
    encounterID         string   // holds the S3 object prefix - i.e. e5fa44f2b31
    fileNameParts       []string // holds the parts of the object key
                                 // [0] e5fa44f2b31
                                 // [1] 7448d8798a4380162d4b56f9b452e2f6f9e24e7a.mov
    localInputFile      string   // holds the temporary input filename
    localOutputFile     string   // holds the temporary output filename
    s3Filename          string   // holds the name with schema and bucket name
                                 // s3://bucketname/object key
    s3File              string   // holds the S3 object key
    localInputFileParts []string // holds the parts of the local input filename
                                 // [0] /tmp/7448d8798a4380162d4b56f9b452e2f6f9e24e7a
                                 // [1] mov
}

 Lambda Handler

package main

// imports cover the handler plus the S3 download and
// upload helpers shown in the following sections
import (
    "bytes"
    "context"
    "io"
    "log"
    "net/http"
    "net/url"
    "os"
    "os/exec"
    "path/filepath"
    "strings"

    "github.com/aws/aws-lambda-go/events"
    "github.com/aws/aws-lambda-go/lambda"
    "github.com/aws/aws-sdk-go/aws"
    "github.com/aws/aws-sdk-go/aws/session"
    "github.com/aws/aws-sdk-go/service/s3"
    "github.com/aws/aws-sdk-go/service/s3/s3manager"
)

func main() {
    lambda.Start(handler)
}

func handler(ctx context.Context, s3Event events.S3Event) {

    // configure logging
    log.SetFlags(log.Ldate | log.Ltime | log.Lshortfile)
    log.Println("*** START ***")

    var s3doc = s3Document{}

    for _, record := range s3Event.Records {
        s3 := record.S3

        // initialise the variables within the data structure
        s3doc.bucketName = s3.Bucket.Name
        s3doc.s3File, _ = url.QueryUnescape(s3.Object.Key)
        s3doc.s3Filename = "s3://" + s3.Bucket.Name + "/" + s3doc.s3File
        s3doc.fileNameParts = strings.Split(s3doc.s3File, `/`)
        s3doc.encounterID = s3doc.fileNameParts[0]
        s3doc.localInputFile = "/tmp/" + s3doc.fileNameParts[1]
        s3doc.localOutputFile = "/tmp/" + strings.TrimSuffix(s3doc.fileNameParts[1], filepath.Ext(s3doc.fileNameParts[1])) + ".mp4"
        s3doc.localInputFileParts = strings.Split(s3doc.localInputFile, `.`)

        // download file from the S3 bucket
        s3doc.getDataFromS3()

        // if the file has been downloaded to the ephemeral storage
        // then invoke the FFMPEG command from the Lambda layer
        if fileExists(s3doc.localInputFile) {

            cmd := exec.Command("/opt/bin/ffmpeg",
                "-i", s3doc.localInputFile, 
                "-vcodec", "h264",
                "-acodec", "aac",
                s3doc.localOutputFile)

            err := cmd.Run()
            if err != nil {
                log.Println(err.Error())
            }
        }

        // if the MP4 was generated by FFMPEG then
        // upload the file back to the S3 bucket
        if fileExists(s3doc.localOutputFile) {
            s3doc.putDataToS3()
        }

        // remove the local files from the ephemeral storage
        if fileExists(s3doc.localInputFile) {
            os.Remove(s3doc.localInputFile)
        }
        if fileExists(s3doc.localOutputFile) {
            os.Remove(s3doc.localOutputFile)
        }
    }

    log.Println("*** END ***")
}
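
The handler relies on a small fileExists helper that isn't shown in the listing above. A minimal version, which simply checks that the path exists and refers to a regular file rather than a directory, could look like this:

func fileExists(filename string) bool {
    // stat the file; any error (including "file does not exist")
    // is treated as the file not being present
    info, err := os.Stat(filename)
    if err != nil {
        return false
    }
    return !info.IsDir()
}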

 Downloading File From S3

The getDataFromS3 function downloads the MOV file from S3 and stores it in the /tmp folder, which is the only writable location within a Lambda function and is backed by the ephemeral storage configured earlier.

func (s *s3Document) getDataFromS3() {
    var err error
    var file *os.File
    var s3session *session.Session

    file, err = os.Create(s.localInputFile)
    if err != nil {
        log.Printf("WARN: Unable to open file %q, %v", s.localInputFile, err)
    }
    defer file.Close()

    s3session, err = session.NewSession(
        &aws.Config{Region: aws.String("eu-west-2")})

    if err != nil {
        log.Printf("WARN: Unable to set region: %v", err)
    }

    downloader := s3manager.NewDownloader(s3session)
    _, err = downloader.Download(
        file,
        &s3.GetObjectInput{
            Bucket: aws.String(s.bucketName),
            Key:    aws.String(s.s3File),
        })
    if err != nil {
        log.Printf("WARN: Unable to download s3File %q, %v", s.s3File, err)
    }
}

 Uploading File To S3

The putDataToS3 function uploads the converted MP4 file to the same S3 bucket as the original MOV file. Besides uploading the file, the function sets some specific S3 object properties: it sets the ACL (access control list) to private, and it sets the server-side encryption to use AWS's AES256 encryption algorithm.

func (s *s3Document) putDataToS3() {
    var err error
    var file *os.File
    var s3session *session.Session

    file, err = os.Open(s.localOutputFile)
    if err != nil {
        log.Println(err.Error())
    }
    defer file.Close()

    // get the file size and read the file content into a buffer;
    // io.ReadFull ensures the whole file is read
    fileInfo, _ := file.Stat()
    buffer := make([]byte, fileInfo.Size())
    if _, err = io.ReadFull(file, buffer); err != nil {
        log.Printf("WARN: Unable to read %q, %v", s.localOutputFile, err)
    }

    s3session, err = session.NewSession(
        &aws.Config{Region: aws.String("eu-west-2")},
    )

    if err != nil {
        log.Printf("WARN: Unable to set region: %v", err)
    }

    uploader := s3manager.NewUploader(s3session)
    _, err = uploader.Upload(&s3manager.UploadInput{
        Bucket:               aws.String(s.bucketName),
        // store the object under the encounter prefix using just the base
        // filename, rather than the full /tmp path of the local output file
        Key:                  aws.String(s.encounterID + "/" + filepath.Base(s.localOutputFile)),
        Body:                 bytes.NewReader(buffer),
        ACL:                  aws.String("private"),
        ContentType:          aws.String(http.DetectContentType(buffer)),
        ContentDisposition:   aws.String("attachment"),
        ServerSideEncryption: aws.String("AES256"),
    })

    if err != nil {
        log.Println("WARN: " + err.Error())
    }
}