Creating a serverless video transcoding Lambda function using Go

Written by Paul Bradley

How to create a cost-effective video transcoding function on top of S3. This function converts Apple's MOV movie files to the international standard MP4 format.


Table of Contents
  1. Introduction
  2. The Function Specification
  3. Compiled for Amazon Linux 2
  4. Provisioning the Function with Terraform
  5. Data Structure
  6. Lambda Handler
  7. Downloading File From S3
  8. Uploading File To S3

Introduction

While designing and developing the medical imaging acquisition application, it became clear that I would need to transcode uploaded video files. Transcoding is the process of converting video files from one format to another. The medical imaging application makes extensive use of the HTML5 video control to play uploaded videos. The HTML5 video control natively supports videos in the MP4 format, but can’t play back videos in Apple’s MOV format.

As patients can use either Android or Apple phones, I needed a way to convert video files uploaded by Apple users.

I could have used Amazon Elastic Transcoder to convert the video files between the different formats. However, after looking at the costs I decided to use a pre-built Lambda layer which includes a static build of FFMPEG for Amazon Linux. FFMPEG is the leading multimedia framework, able to decode, encode, transcode, mux, demux, stream, filter and play pretty much any video ever created. Since AWS announced in March 2022 that all Lambda functions could be configured with up to 10GB of ephemeral storage, it made sense to combine these two capabilities and transcode the videos within a custom function.

The Function Specification

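The transcoding function is written in the Go programming language and imports the AWS SDK for Go. The function is invoked from an S3 object creation event: when a .MOV file is uploaded to the S3 bucket, the function is triggered. For each uploaded file it performs the following actions.

  1. Downloads the MOV file from S3 into the function's ephemeral storage.
  2. Runs FFMPEG from the Lambda layer to transcode the file to MP4.
  3. Uploads the resulting MP4 file back to the same S3 bucket.
  4. Removes the temporary files from the ephemeral storage.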

Compiled for Amazon Linux 2

The Go function is compiled for the Amazon Linux 2 execution environment rather than the default go1.x environment. Compiling for Amazon Linux 2 means the function runs in the most recent execution environment available at the time of writing. The pattern for compiling the function is slightly different. Below are the commands from the makefile that I use to compile and package the zip file.

GOARCH=amd64 GOOS=linux go build \
    -tags="lambda.norpc" \
    -ldflags="-w -s" \
    -o ./bootstrap *.go

upx -9 bootstrap
zip -9 main.zip bootstrap

The main difference is the use of the -tags command-line switch, which instructs the Go build process to exclude the RPC control plane. We take the finished binary, bootstrap, and compress it further with upx, a free, portable, extendable, high-performance executable packer. This makes the resulting binary as small as possible. The smaller the ZIP file, the shorter the Lambda cold start will be, as there is less code to deploy to the Lambda execution environment.

Provisioning the Function with Terraform

Below is a sample Terraform script that provisions the Lambda function. The main points to note include the layers statement, which points to the ARN (Amazon Resource Name) of the FFMPEG layer already deployed to our AWS account. The runtime statement has been changed from go1.x to provided.al2. We’ve also raised the ephemeral storage from the default of 512MB to 10,240MB (10GB).

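The code also demonstrates how to use the aws_s3_bucket_notification resource to invoke the function when a MOV file is persisted to the S3 bucket. S3 object keys are case-sensitive, so the .mov suffix filter will not match files uploaded with an upper-case .MOV extension. S3 also needs permission to invoke the function, which is normally granted with a separate aws_lambda_permission resource that is not shown here.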

resource "aws_lambda_function" "video-transcode" {
    function_name = "${var.name_prefix}video-transcoder"
    description   = "video transcoder - triggered from S3 actions"

    package_type  = "Zip"
    filename      = "/aws/paulx030/video-transcode/main.zip"
    role          = var.lambda_role_arn
    handler       = "bootstrap"
    runtime       = "provided.al2"
    memory_size   = 1024
    timeout       = 300
    publish       = true
    layers        = ["arn:aws:lambda:eu-west-2:xxxxxxxxxx:layer:ffmpeg:1"]

    vpc_config {
        subnet_ids         = var.private_subnets
        security_group_ids = var.lambda_app_security_group_id
    }

    # size is given in MB
    ephemeral_storage {
        size = 10240
    }
}

resource "aws_s3_bucket_notification" "video-transcode-trigger" {
    bucket = var.patientobjs_bucket_id

    lambda_function {
        lambda_function_arn = aws_lambda_function.video-transcode.arn
        events              = ["s3:ObjectCreated:*"]
        filter_suffix       = ".mov"
    }
}

Data Structure

Before looking at the Go code, here is the data structure used within the main program. The files stored on S3 use S3 prefixes. Each patient uploads their files under a prefix that matches their encounter reference from the application's database. So, as part of downloading the video files, we need to split the object key into encounter ID and filename, and then further split the filename to extract the file extension. This allows us to generate temporary filenames while processing the video with FFMPEG. It also gives us the encounter ID, so that we can update the database once the conversion has completed.

// given an S3 object named:
// e5fa44f2b31/7448d8798a4380162d4b56f9b452e2f6f9e24e7a.mov

type s3Document struct {
    bucketName         string   // holds the name of the S3 bucket
    encounterID        string   // holds the S3 object prefix - i.e. e5fa44f2b31
    fileNameParts      []string // holds the parts of the object key
                                // [0] e5fa44f2b31
                                // [1] 7448d8798a4380162d4b56f9b452e2f6f9e24e7a.mov
    localInputFile     string   // holds the temporary input filename
    localOutputFile    string   // holds the temporary output filename
    s3Filename         string   // holds the name with scheme and bucket name
                                // s3://bucketname/object key
    s3File             string   // holds the S3 object key
    localIputFileParts []string // holds the parts of the local input filename
                                // [0] /tmp/7448d8798a4380162d4b56f9b452e2f6f9e24e7a
                                // [1] mov
}
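
The key-splitting step described above can be sketched on its own, using only the standard library. This is an illustration rather than the code used in the application: keyParts and splitObjectKey are hypothetical names, and the sketch assumes every object key has the form encounter ID, a slash, then the filename.

```go
package main

import (
	"fmt"
	"net/url"
	"path"
	"strings"
)

// keyParts is a hypothetical helper type used only in this sketch.
type keyParts struct {
	encounterID string // S3 prefix, e.g. e5fa44f2b31
	fileName    string // everything after the first slash
	baseName    string // fileName without its extension
	extension   string // extension without the leading dot
}

// splitObjectKey decodes the URL-encoded key delivered in an S3 event and
// splits it into its parts. It returns an error instead of panicking when
// the key does not contain a prefix and a filename.
func splitObjectKey(rawKey string) (keyParts, error) {
	key, err := url.QueryUnescape(rawKey)
	if err != nil {
		return keyParts{}, fmt.Errorf("unescape key %q: %w", rawKey, err)
	}

	parts := strings.SplitN(key, "/", 2)
	if len(parts) != 2 || parts[0] == "" || parts[1] == "" {
		return keyParts{}, fmt.Errorf("key %q is not in <encounter>/<file> form", key)
	}

	ext := path.Ext(parts[1])
	return keyParts{
		encounterID: parts[0],
		fileName:    parts[1],
		baseName:    strings.TrimSuffix(parts[1], ext),
		extension:   strings.TrimPrefix(ext, "."),
	}, nil
}

func main() {
	p, err := splitObjectKey("e5fa44f2b31/7448d8798a4380162d4b56f9b452e2f6f9e24e7a.mov")
	if err != nil {
		fmt.Println(err)
		return
	}
	// temporary file names derived from the key
	fmt.Println("/tmp/" + p.fileName)
	fmt.Println("/tmp/" + p.baseName + ".mp4")
}
```

Returning an error for malformed keys lets the caller skip that record and carry on with the rest of the event.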

Lambda Handler

package main

// imports for the whole program, including the S3 helpers
// shown in the following sections
import (
    "bytes"
    "context"
    "log"
    "net/http"
    "net/url"
    "os"
    "os/exec"
    "path/filepath"
    "strings"

    "github.com/aws/aws-lambda-go/events"
    "github.com/aws/aws-lambda-go/lambda"
    "github.com/aws/aws-sdk-go/aws"
    "github.com/aws/aws-sdk-go/aws/session"
    "github.com/aws/aws-sdk-go/service/s3"
    "github.com/aws/aws-sdk-go/service/s3/s3manager"
)

func main() {
    lambda.Start(handler)
}

func handler(ctx context.Context, s3Event events.S3Event) {

    // configure logging
    log.SetFlags(log.Ldate | log.Ltime | log.Lshortfile)
    log.Println("*** START ***")

    for _, record := range s3Event.Records {
        entity := record.S3
        var s3doc = s3Document{}

        // object keys in S3 events are URL-encoded, so decode first
        key, err := url.QueryUnescape(entity.Object.Key)
        if err != nil {
            log.Printf("WARN: Unable to decode key %q, %v", entity.Object.Key, err)
            continue
        }

        // initialise the variables within the data structure
        s3doc.bucketName = entity.Bucket.Name
        s3doc.s3File = key
        s3doc.s3Filename = "s3://" + entity.Bucket.Name + "/" + key
        s3doc.fileNameParts = strings.Split(key, `/`)

        // skip keys without an encounter prefix; indexing
        // fileNameParts[1] below would otherwise panic
        if len(s3doc.fileNameParts) < 2 {
            log.Printf("WARN: Unexpected object key %q", key)
            continue
        }

        s3doc.encounterID = s3doc.fileNameParts[0]
        s3doc.localInputFile = "/tmp/" + s3doc.fileNameParts[1]
        s3doc.localOutputFile = "/tmp/" + strings.TrimSuffix(s3doc.fileNameParts[1], filepath.Ext(s3doc.fileNameParts[1])) + ".mp4"
        s3doc.localIputFileParts = strings.Split(s3doc.localInputFile, `.`)

        // download file from the S3 bucket
        s3doc.getDataFromS3()

        // if the file has been downloaded to the ephemeral storage
        // then invoke the FFMPEG command from the Lambda layer
        if fileExists(s3doc.localInputFile) {

            cmd := exec.Command("/opt/bin/ffmpeg",
                "-i", s3doc.localInputFile,
                "-vcodec", "h264",
                "-acodec", "aac",
                s3doc.localOutputFile)

            if err := cmd.Run(); err != nil {
                log.Println(err.Error())
            }
        }

        // if the MP4 was generated by FFMPEG then
        // upload the file back to the S3 bucket
        if fileExists(s3doc.localOutputFile) {
            s3doc.putDataToS3()
        }

        // remove the local files from the ephemeral storage
        if fileExists(s3doc.localInputFile) {
            os.Remove(s3doc.localInputFile)
        }
        if fileExists(s3doc.localOutputFile) {
            os.Remove(s3doc.localOutputFile)
        }
    }

    log.Println("*** END ***")
}

Downloading File From S3

The getDataFromS3 function downloads the MOV file from S3 and stores it in the /tmp directory, which is the only writable location within a Lambda function.

func (s *s3Document) getDataFromS3() {
    var err error
    var file *os.File
    var s3session *session.Session

    // create the local file that will receive the download
    file, err = os.Create(s.localInputFile)
    if err != nil {
        log.Printf("WARN: Unable to create file %q, %v", s.localInputFile, err)
        return
    }
    defer file.Close()

    s3session, err = session.NewSession(
        &aws.Config{Region: aws.String("eu-west-2")})

    if err != nil {
        log.Printf("WARN: Unable to create session: %v", err)
        os.Remove(s.localInputFile)
        return
    }

    downloader := s3manager.NewDownloader(s3session)
    _, err = downloader.Download(
        file,
        &s3.GetObjectInput{
            Bucket: aws.String(s.bucketName),
            Key:    aws.String(s.s3File),
        })
    if err != nil {
        log.Printf("WARN: Unable to download s3File %q, %v", s.s3File, err)
        // remove the partial file so it is not passed to FFMPEG
        os.Remove(s.localInputFile)
    }
}

Uploading File To S3

The putDataToS3 function uploads the converted MP4 file to the same S3 bucket as the original MOV file. Besides uploading the file, the function sets some specific S3 object properties. It sets the ACL (access control list) to private. It also sets the server-side encryption to use AWS’s AES256 encryption algorithm.

func (s *s3Document) putDataToS3() {
    // read the file content into a buffer; this holds the whole video
    // in memory, so the function's memory must exceed the output size
    buffer, err := os.ReadFile(s.localOutputFile)
    if err != nil {
        log.Println("WARN: " + err.Error())
        return
    }

    s3session, err := session.NewSession(
        &aws.Config{Region: aws.String("eu-west-2")},
    )
    if err != nil {
        log.Printf("WARN: Unable to create session: %v", err)
        return
    }

    // filepath.Base strips the local /tmp/ directory so the MP4 is
    // stored under the same encounter prefix as the original MOV
    uploader := s3manager.NewUploader(s3session)
    _, err = uploader.Upload(&s3manager.UploadInput{
        Bucket:               aws.String(s.bucketName),
        Key:                  aws.String(s.encounterID + "/" + filepath.Base(s.localOutputFile)),
        Body:                 bytes.NewReader(buffer),
        ACL:                  aws.String("private"),
        ContentType:          aws.String(http.DetectContentType(buffer)),
        ContentDisposition:   aws.String("attachment"),
        ServerSideEncryption: aws.String("AES256"),
    })

    if err != nil {
        log.Println("WARN: " + err.Error())
    }
}