Defensive Functions - Azure Blob Virus Scanning

This is the first in a series on writing simple, security-focused functions that can help you protect your business without a lot of work. This first article covers virus scanning in Azure Storage.


Every business I've worked for has had to do a lot of file processing for various reasons, and Azure Storage is a great way to store and process vast amounts of data. Unfortunately, it doesn't do anything to check that uploaded files are safe. These days we should assume everything is bad and do our best to be secure by design.

I am aware that virus scanning isn't a guarantee that something is safe, but it is still an essential component of a defense-in-depth strategy. If you have customers uploading files through your application, it is a good idea to scan them for viruses. Don't trust anyone.

The good news is that there is an open-source virus scanner called ClamAV (https://www.clamav.net/) that is easy to manage, has an awesome API, and does a great job at virus scanning.

There are a couple of ways to use the ClamAV API, but ClamAV is GPL-licensed, and if you link directly against it your code will be subject to the GPL's licensing restrictions. The method described here does not link against it. Instead, the function talks to ClamAV running as a separate server.

Step 1: Set up a container structure with a staging container for uploads and separate containers for blobs that have been scanned. In this example your customers upload blobs to the staging container. Each upload fires an event that triggers the function, which then processes the uploaded file.
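As a rough illustration of that layout, the routing decision can be written down as a small lookup. This is only a sketch in Python, not the skeleton from my repo. Only the staging container comes from this article; the "clean" and "quarantine" container names and the Verdict type are hypothetical names chosen for the example.

```python
from enum import Enum


class Verdict(Enum):
    """Possible outcomes of a scan (hypothetical type for this sketch)."""
    CLEAN = "clean"
    INFECTED = "infected"
    ERROR = "error"


# Assumed container names; only "staging" is named in the article.
STAGING_CONTAINER = "staging"
DESTINATIONS = {
    Verdict.CLEAN: "clean",
    Verdict.INFECTED: "quarantine",
}


def destination_for(verdict):
    """Return the container a scanned blob should move to.

    Returns None when the scan did not produce a verdict, so the blob
    stays in staging and can be retried later.
    """
    return DESTINATIONS.get(verdict)
```

Leaving failed scans in staging rather than guessing is a deliberate choice here: a blob should only reach the clean container after a scanner has actually said it is clean.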

Step 2:  Implement the function trigger

You will notice that I am providing bindings that are not used in the actual function sample. They are included purely for illustration, to show how to get at other properties of the blob.

Step 3: Add the nClam package and add some functionality. My skeleton is below for reference. Your needs may vary. Be aware that functions currently have a memory limit of 1.5 GB and ClamAV has a file size limit of 4 GB, although the ClamAV documentation recommends a limit far lower than that. I don't expect files in the multi-gigabyte range to work very well with any virus scanner because of memory considerations.
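Given those limits, it can be worth rejecting oversized blobs before streaming them to the scanner at all. The following is a minimal sketch in Python, not part of my skeleton. MAX_SCAN_BYTES and can_scan are hypothetical names, and the 25 MB value is only an assumed example; whatever you choose should not exceed the StreamMaxLength setting in your server's clamd.conf.

```python
# Assumed example limit; keep it at or below StreamMaxLength in clamd.conf.
MAX_SCAN_BYTES = 25 * 1024 * 1024


def can_scan(blob_length, limit=MAX_SCAN_BYTES):
    """Return True if a blob of blob_length bytes is small enough to scan.

    Empty blobs are rejected too, since there is nothing to send.
    """
    return 0 < blob_length <= limit
```

A blob that fails this check should be treated as unscanned, not as clean, so it should not be moved to wherever you keep trusted files.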

The rest of this function is pretty self-explanatory. Do some checks to ensure the server is healthy, pass the stream to the ClamAV server using nClam, and await the result. Once you have the result you can move the blob to the appropriate container. Please reference https://docs.microsoft.com/en-us/azure/azure-functions/functions-bindings-error-pages to ensure you have proper error handling and management if something goes awry.
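nClam hides the wire protocol, but it can help to see roughly what "pass the stream to the ClamAV server" involves. The Python sketch below is not nClam and not my skeleton; it only illustrates clamd's INSTREAM command, where the data is sent as length-prefixed chunks followed by a zero-length chunk. The function names, the chunk size, and the host and port defaults are assumptions for the example (3310 is clamd's usual TCP port).

```python
import socket
import struct

CHUNK_SIZE = 64 * 1024  # assumed chunk size for this sketch


def frame_chunks(stream, chunk_size=CHUNK_SIZE):
    """Yield INSTREAM frames: a 4-byte big-endian length, then the data.

    A final zero-length frame tells clamd the stream is complete.
    """
    while True:
        data = stream.read(chunk_size)
        if not data:
            break
        yield struct.pack("!I", len(data)) + data
    yield struct.pack("!I", 0)


def parse_reply(reply):
    """Turn a clamd reply such as b'stream: OK\\0' into (status, detail)."""
    text = reply.decode("utf-8", errors="replace").strip("\x00\n ")
    if text.endswith("OK"):
        return ("clean", None)
    if text.endswith("FOUND"):
        # Reply looks like "stream: <signature name> FOUND".
        name = text[: -len("FOUND")].split(":", 1)[-1].strip()
        return ("infected", name)
    return ("error", text)


def scan_stream(stream, host="localhost", port=3310, timeout=30.0):
    """Send a binary file-like object to clamd and return (status, detail)."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(b"zINSTREAM\0")
        for frame in frame_chunks(stream):
            sock.sendall(frame)
        reply = b""
        while not reply.endswith(b"\0"):
            part = sock.recv(4096)
            if not part:
                break
            reply += part
    return parse_reply(reply)
```

Anything that is neither a clean nor an infected reply is reported as an error, so the caller can leave the blob where it is and retry instead of treating an unknown answer as safe.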

Step 4: Implement a ClamAV Server

Here you have a lot of options. For local testing of this solution I used a simple public container from Docker Hub: https://hub.docker.com/r/mkodockx/docker-clamav/. There are a number of these available. When moving to production you could stand this up in a VM or continue to use containers with AKS. I'll leave that up to you. ClamAV is pretty easy to set up and maintain. Just run freshclam daily to keep your virus signatures up to date.
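Once a server is running, a quick way to confirm that it is reachable is clamd's PING command, which should be answered with PONG. Here is a minimal Python sketch of that check. The function names and the host, port, and timeout defaults are assumptions for the example and should be adjusted to wherever your server actually listens.

```python
import socket


def is_pong(reply):
    """Return True if a raw clamd reply is the expected PONG."""
    return reply.strip(b"\x00\n ") == b"PONG"


def clamd_is_alive(host="localhost", port=3310, timeout=5.0):
    """Send PING to clamd and report whether it answered with PONG.

    Any connection problem is treated as "not alive" rather than raised.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            sock.sendall(b"zPING\0")
            return is_pong(sock.recv(64))
    except OSError:
        return False
```

The same kind of check works as a pre-flight step inside the function, so a blob is not pulled into memory when the scanner is down.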

Testing:

Testing can be done with the EICAR test file to ensure everything is working as you expect. Just do a quick search for "EICAR virus test file" and grab it from any reputable AV vendor.

Happy scanning!

The full skeleton source is in my public repo: https://github.com/JTAtkins/Defensive.Functions