Creating a serverless video transcoding Lambda function using Go
How to create a cost-effective video transcoding function on top of S3. This function converts Apple’s MOV movie files to the international standard MP4 format.
Introduction
While designing and developing the medical imaging acquisition application, it became clear that I would need to transcode uploaded video files. Transcoding is the process of converting video files between different formats. The medical imaging application makes extensive use of the HTML5 video control for playback of uploaded videos. The HTML5 video control natively supports videos in the MP4 format, but can’t play back videos in Apple’s MOV format.
As patients can use either Android or Apple phones, I needed a way to convert video files uploaded by Apple users.
I could have used Amazon Elastic Transcoder to convert the video files. However, after looking at the costs, I decided to use a pre-built Lambda layer that includes a static build of FFMPEG for Amazon Linux. FFMPEG is the leading multimedia framework, able to decode, encode, transcode, mux, demux, stream, filter and play pretty much any video ever created. Since AWS announced in March 2022 that Lambda functions can be configured with up to 10 GB of ephemeral storage, it made sense to combine these two capabilities and transcode the videos within a custom function.
The Function Specification
The transcoding function is written in the Go programming language and imports the AWS SDK for Go. It is invoked by an S3 object creation event: when a MOV file is uploaded to the S3 bucket, the function is triggered. When the function is invoked it performs the following actions:
- Splits the object name into its component parts, extracting the prefix and generating a new output filename.
- Downloads the original MOV file from the S3 bucket to the function’s ephemeral storage for processing.
- Invokes the FFMPEG command from the Lambda layer to convert the file to an MP4. The output MP4 is also saved to the function’s ephemeral storage.
- Uploads the newly created MP4 file back to the S3 bucket for safekeeping.
- Finally, tidies up the ephemeral storage by deleting the input and output files.
Compiled for Amazon Linux 2
The Go function is compiled for the Amazon Linux 2 execution environment rather than the default go1.x environment. Compiling for Amazon Linux 2 ensures the function runs in the most recent execution environment available. The pattern for compiling the function is slightly different. Below are the commands from the makefile that I use to compile and package the zip file.
GOARCH=amd64 GOOS=linux go build \
    -tags="lambda.norpc" \
    -ldflags="-w -s" \
    -o ./bootstrap *.go

upx -9 bootstrap
zip -9 main.zip bootstrap
The main difference is the -tags command-line switch, which passes the lambda.norpc build tag so that the Go build excludes the RPC component that the custom runtime does not need. We then take the finished binary, bootstrap, and compress it further with upx, a free, portable, extendable, high-performance executable packer, to make it as small as possible. The smaller the ZIP file, the less code there is to deploy to the Lambda execution environment, which helps keep the cold start time down.
Provisioning the Function with Terraform
Below is a sample Terraform script that provisions the Lambda function. The main points to note include the layers argument, which points to the ARN (Amazon Resource Name) of the FFMPEG layer already deployed to our AWS account. The runtime argument has been changed from go1.x to provided.al2. We’ve also raised the ephemeral storage from the default of 512 MB to 10,240 MB (10 GB).
The code also demonstrates how to use the aws_s3_bucket_notification resource to invoke the function when a MOV file is written to the S3 bucket. S3 also needs permission to invoke the function (an aws_lambda_permission resource), which is not shown here.
resource "aws_lambda_function" "video-transcode" {
  function_name = "${var.name_prefix}video-transcoder"
  description   = "video transcoder - triggered from S3 actions"

  package_type = "Zip"
  filename     = "/aws/paulx030/video-transcode/main.zip"
  role         = var.lambda_role_arn
  handler      = "bootstrap"
  runtime      = "provided.al2"
  memory_size  = "1024"
  timeout      = "300"
  publish      = true
  layers       = ["arn:aws:lambda:eu-west-2:xxxxxxxxxx:layer:ffmpeg:1"]

  vpc_config {
    subnet_ids         = var.private_subnets
    security_group_ids = var.lambda_app_security_group_id
  }

  ephemeral_storage {
    size = 10240
  }
}

resource "aws_s3_bucket_notification" "video-transcode-trigger" {
  bucket = var.patientobjs_bucket_id
  lambda_function {
    lambda_function_arn = aws_lambda_function.video-transcode.arn
    events              = ["s3:ObjectCreated:*"]
    filter_suffix       = ".mov"
  }
}
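S3 notification suffix filters compare the object key case-sensitively, so the `.mov` filter above would not fire for a file named `clip.MOV`. One option is to add a second `lambda_function` block for the upper-case suffix; another is to broaden the trigger and check the extension inside the function. The following is a minimal sketch of the second approach; `isMOV` is a hypothetical helper and not part of the original function.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// isMOV reports whether an S3 object key ends in a .mov extension,
// ignoring case, so "clip.mov", "clip.MOV" and "clip.Mov" all match.
func isMOV(key string) bool {
	return strings.EqualFold(path.Ext(key), ".mov")
}

func main() {
	keys := []string{
		"e5fa44f2b31/clip.mov",
		"e5fa44f2b31/clip.MOV",
		"e5fa44f2b31/clip.mp4",
	}
	for _, key := range keys {
		fmt.Println(key, isMOV(key))
	}
}
```

A guard like this also protects the function from transcoding its own MP4 output if the trigger is ever widened to all object creation events.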
Data Structure
Before looking at the Go code, here is the data structure used within the main program. The files stored on S3 use S3 prefixes. Each patient uploads their files with a prefix that matches their encounter reference from the application’s database. So as part of downloading the video files we need to split the name into encounter ID and filename, and then further split the filename to extract the file extension. This allows us to generate temporary filenames while processing the video with FFMPEG. It also gives us the encounter ID, so that we can update the database once the conversion has completed.
// given an S3 object named:
// e5fa44f2b31/7448d8798a4380162d4b56f9b452e2f6f9e24e7a.mov

type s3Document struct {
	bucketName         string   // holds the name of the S3 bucket
	encounterID        string   // holds the S3 object prefix - i.e. e5fa44f2b31
	fileNameParts      []string // holds the parts of the object key
	                            // [0] e5fa44f2b31
	                            // [1] 7448d8798a4380162d4b56f9b452e2f6f9e24e7a.mov
	localInputFile     string   // holds the temporary input filename
	localOutputFile    string   // holds the temporary output filename
	s3Filename         string   // holds the name with schema and bucket name
	                            // s3://bucketname/object key
	s3File             string   // holds the S3 object key
	localIputFileParts []string // holds the parts of the local input filename
	                            // [0] /tmp/7448d8798a4380162d4b56f9b452e2f6f9e24e7a
	                            // [1] mov
}
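To make the field comments concrete, here is a minimal sketch of how an object key might be broken into those parts. `keyParts` and `parseKey` are hypothetical names used for illustration only; the sketch assumes keys of the form `<encounterID>/<filename>` and returns an error for anything else rather than indexing past the end of the slice.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// keyParts is a hypothetical, trimmed-down view of the data structure above.
type keyParts struct {
	encounterID string // the S3 prefix, e.g. "e5fa44f2b31"
	fileName    string // the filename including its extension
	baseName    string // the filename without its extension
	extension   string // the extension without the dot, e.g. "mov"
}

// parseKey splits an object key of the form <encounterID>/<filename>.
func parseKey(key string) (keyParts, error) {
	parts := strings.Split(key, "/")
	if len(parts) != 2 || parts[0] == "" || parts[1] == "" {
		return keyParts{}, fmt.Errorf("key %q is not of the form <encounterID>/<filename>", key)
	}
	ext := path.Ext(parts[1]) // includes the leading dot, e.g. ".mov"
	return keyParts{
		encounterID: parts[0],
		fileName:    parts[1],
		baseName:    strings.TrimSuffix(parts[1], ext),
		extension:   strings.TrimPrefix(ext, "."),
	}, nil
}

func main() {
	p, err := parseKey("e5fa44f2b31/7448d8798a4380162d4b56f9b452e2f6f9e24e7a.mov")
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println("encounter:", p.encounterID)
	fmt.Println("temporary output file:", "/tmp/"+p.baseName+".mp4")
}
```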
Lambda Handler
// Imports used by this listing: context, log, net/url, os, os/exec,
// path/filepath and strings from the standard library, plus events and
// lambda from github.com/aws/aws-lambda-go.

func main() {
	lambda.Start(handler)
}

func handler(ctx context.Context, s3Event events.S3Event) {

	// configure logging
	log.SetFlags(log.Ldate | log.Ltime | log.Lshortfile)
	log.Println("*** START ***")

	for _, record := range s3Event.Records {
		entity := record.S3 // not named s3, to avoid shadowing the s3 package

		// start each record with a fresh data structure
		var s3doc = s3Document{}
		var err error

		// initialise the variables within the data structure;
		// object keys in S3 events are URL-encoded, so decode them first
		s3doc.bucketName = entity.Bucket.Name
		s3doc.s3File, err = url.QueryUnescape(entity.Object.Key)
		if err != nil {
			log.Printf("WARN: Unable to decode key %q, %v", entity.Object.Key, err)
			continue
		}
		s3doc.s3Filename = "s3://" + entity.Bucket.Name + "/" + s3doc.s3File
		s3doc.fileNameParts = strings.Split(s3doc.s3File, `/`)
		if len(s3doc.fileNameParts) < 2 {
			log.Printf("WARN: Key %q has no encounter prefix, skipping", s3doc.s3File)
			continue
		}
		s3doc.encounterID = s3doc.fileNameParts[0]
		s3doc.localInputFile = "/tmp/" + s3doc.fileNameParts[1]
		s3doc.localOutputFile = "/tmp/" + strings.TrimSuffix(s3doc.fileNameParts[1], filepath.Ext(s3doc.fileNameParts[1])) + ".mp4"
		s3doc.localIputFileParts = strings.Split(s3doc.localInputFile, `.`)

		// download file from the S3 bucket
		s3doc.getDataFromS3()

		// if the file has been downloaded to the ephemeral storage
		// then invoke the FFMPEG command from the Lambda layer
		if fileExists(s3doc.localInputFile) {

			cmd := exec.Command("/opt/bin/ffmpeg",
				"-i", s3doc.localInputFile,
				"-vcodec", "h264",
				"-acodec", "aac",
				s3doc.localOutputFile)

			// capture FFMPEG's output so that failures can be diagnosed
			out, err := cmd.CombinedOutput()
			if err != nil {
				log.Printf("WARN: FFMPEG failed: %v\n%s", err, out)
				// don't upload a partial output file
				os.Remove(s3doc.localOutputFile)
			}
		}

		// if the MP4 was generated by FFMPEG then
		// upload the file back to the S3 bucket
		if fileExists(s3doc.localOutputFile) {
			s3doc.putDataToS3()
		}

		// remove the local files from the ephemeral storage;
		// os.Remove simply returns an error if a file is already gone
		os.Remove(s3doc.localInputFile)
		os.Remove(s3doc.localOutputFile)
	}

	log.Println("*** END ***")
}

// fileExists is not shown in the original listing; a minimal version
// is assumed here. It reports whether path exists and is not a directory.
func fileExists(path string) bool {
	info, err := os.Stat(path)
	return err == nil && !info.IsDir()
}
Downloading File From S3
The getDataFromS3 function downloads the MOV file from S3 and stores it in the /tmp directory, which is the only writable directory within a Lambda function’s file system.
// Imports used by this listing: log and os from the standard library,
// plus aws, aws/session, service/s3 and service/s3/s3manager from
// github.com/aws/aws-sdk-go.

func (s *s3Document) getDataFromS3() {
	file, err := os.Create(s.localInputFile)
	if err != nil {
		log.Printf("WARN: Unable to create file %q, %v", s.localInputFile, err)
		return
	}
	defer file.Close()

	s3session, err := session.NewSession(
		&aws.Config{Region: aws.String("eu-west-2")})
	if err != nil {
		log.Printf("WARN: Unable to create session: %v", err)
		os.Remove(s.localInputFile)
		return
	}

	downloader := s3manager.NewDownloader(s3session)
	_, err = downloader.Download(
		file,
		&s3.GetObjectInput{
			Bucket: aws.String(s.bucketName),
			Key:    aws.String(s.s3File),
		})
	if err != nil {
		log.Printf("WARN: Unable to download s3File %q, %v", s.s3File, err)
		// remove the empty or partial file so the caller does not transcode it
		os.Remove(s.localInputFile)
	}
}
Uploading File To S3
The putDataToS3 function uploads the converted MP4 file to the same S3 bucket as the original MOV file. Besides uploading the file, the function sets some specific S3 object properties. It sets the ACL (access control list) to private. It also sets server-side encryption to AES256, which encrypts the object with S3-managed keys.
// Imports used by this listing: bytes, log, net/http, os and
// path/filepath from the standard library, plus aws, aws/session and
// service/s3/s3manager from github.com/aws/aws-sdk-go.

func (s *s3Document) putDataToS3() {
	// read the file content into a buffer;
	// note that this holds the whole video in memory
	buffer, err := os.ReadFile(s.localOutputFile)
	if err != nil {
		log.Println("WARN: " + err.Error())
		return
	}

	s3session, err := session.NewSession(
		&aws.Config{Region: aws.String("eu-west-2")},
	)
	if err != nil {
		log.Printf("WARN: Unable to create session: %v", err)
		return
	}

	uploader := s3manager.NewUploader(s3session)
	_, err = uploader.Upload(&s3manager.UploadInput{
		Bucket: aws.String(s.bucketName),
		// use only the base name: localOutputFile includes the /tmp/ path
		Key:                  aws.String(s.encounterID + "/" + filepath.Base(s.localOutputFile)),
		Body:                 bytes.NewReader(buffer),
		ACL:                  aws.String("private"),
		ContentType:          aws.String(http.DetectContentType(buffer)),
		ContentDisposition:   aws.String("attachment"),
		ServerSideEncryption: aws.String("AES256"),
	})

	if err != nil {
		log.Println("WARN: " + err.Error())
	}
}