A client of mine operates a fairly large trading website that allows users to upload media (e.g. images, videos and documents) to accompany their listings, and respondents to do the same with their responses. The uploaded files are stored on disk, i.e. not in a database. Following some operational re-architecture, it has been decided that the architecture and development of the application will also be tidied up a bit. As a good portion of the application is already in Amazon, it has been suggested that one option is to store the flat files in S3, which is where my input on how this could be achieved comes in.
S3SecureUpload is a single-page MVC4 project, developed to show my client how the AWS .NET APIs can be used to upload and download files to and from S3 storage efficiently and securely. The benefits of using S3 are savings on hardware costs (no need for a SAN) and bandwidth costs (as files travel directly between S3 and the browser), excellent availability (99.99% according to Amazon) and security (encryption at rest by default, secure datacentres, etc.).
Note: this is a proof-of-concept, and by no means a solution suitable for rolling straight into production applications.
The upload process that Amazon uses is fairly well designed. Each AWS user you grant access to an S3 bucket has a secret access key (referred to below as the private key). Before an upload can be performed, two things must be in place:
- Policy: A base64-encoded JSON string that defines, at a minimum, the S3 bucket name, the file ACL (e.g. public-read, private), the content length restrictions and the filename.
- Signature: An HMAC-SHA1 hash of the policy, generated using the private key.
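To make the two prerequisites concrete, here is a minimal sketch in Python (chosen only to keep the example short; it is not the demo's .NET code). The function names, the default size limit and the ten-minute expiry are my own assumptions for illustration. The policy layout and the HMAC-SHA1 signing follow the browser-based POST scheme described above.

```python
import base64
import hashlib
import hmac
import json
from datetime import datetime, timedelta, timezone


def build_policy(bucket, key, acl="private", max_bytes=10 * 1024 * 1024, ttl_minutes=10):
    """Return the base64-encoded policy document for one browser upload.

    The defaults (10 MB limit, 10 minute lifetime) are illustrative assumptions.
    """
    expiration = datetime.now(timezone.utc) + timedelta(minutes=ttl_minutes)
    policy = {
        "expiration": expiration.strftime("%Y-%m-%dT%H:%M:%SZ"),
        "conditions": [
            {"bucket": bucket},                       # which bucket may be written to
            {"key": key},                             # the exact filename allowed
            {"acl": acl},                             # e.g. "private" or "public-read"
            ["content-length-range", 0, max_bytes],   # size limits in bytes
        ],
    }
    return base64.b64encode(json.dumps(policy).encode("utf-8")).decode("ascii")


def sign_policy(encoded_policy, secret_key):
    """HMAC-SHA1 the base64 policy with the secret key, then base64 the digest."""
    digest = hmac.new(
        secret_key.encode("utf-8"),
        encoded_policy.encode("ascii"),
        hashlib.sha1,
    ).digest()
    return base64.b64encode(digest).decode("ascii")
```

The browser then sends both values as form fields alongside the file. Because the signature covers the policy, any attempt to alter the policy client-side (for example, raising the size limit) invalidates it.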
As the transfer takes place directly between the browser and S3, client-side validation of file size and file type cannot be relied upon on its own (although it is still performed), so it’s recommended that a bucket policy is also enforced within S3.
In the case of my demo application, requests are first made to an upload handler (uploadHandler.ashx) which, upon receiving a filename (referred to as a ‘key’), responds with a valid policy and signature, allowing the user to continue and XHR POST the file directly to S3. Filenames are prefixed with a timestamp so that duplicates cannot be created and so that the filename is harder to guess, whether manually or by an automated tool. The timestamp could of course be replaced by a totally random string.
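A rough sketch of the key naming, again in Python rather than the demo's .NET code. The function name, the `uploads` prefix and the extra random component are assumptions of mine; the demo itself only prefixes a timestamp.

```python
import secrets
import time


def make_object_key(filename, prefix="uploads"):
    """Build an S3 key that is unique and hard to guess (illustrative only)."""
    # Keep only the final path component so a client cannot inject directories.
    safe_name = filename.replace("\\", "/").rsplit("/", 1)[-1]
    stamp = int(time.time() * 1000)   # millisecond timestamp, as in the demo
    nonce = secrets.token_hex(8)      # assumption: extra randomness on top of the timestamp
    return f"{prefix}/{stamp}_{nonce}_{safe_name}"
```

A timestamp alone is fairly predictable, which is why the sketch adds a random component from a cryptographically secure source.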
To download a file, the file ‘key’ (i.e. the filename, which may include additional subdirectories) is specified and used to query the S3 bucket, producing a ‘pre-signed’ URL that is valid for a pre-defined length of time. Visiting the URL will allow the requester to download the file directly from S3.
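In practice the AWS SDK generates the pre-signed URL for you, but it can help to see roughly what goes into one. The sketch below, in Python, assumes the legacy query-string authentication scheme (signature version 2, HMAC-SHA1), which matches the signing used for uploads above; newer regions require signature version 4, which is more involved. The function and parameter names are hypothetical.

```python
import base64
import hashlib
import hmac
import time
from urllib.parse import quote


def presign_get_url(bucket, key, access_key_id, secret_key, ttl_seconds=300, now=None):
    """Return a time-limited download URL (legacy signature version 2 sketch)."""
    # 'now' is only a hook for testing; normally the current time is used.
    expires = int(now if now is not None else time.time()) + ttl_seconds
    encoded_key = quote(key, safe="/")
    # Verb, Content-MD5, Content-Type, expiry, then the resource path.
    string_to_sign = f"GET\n\n\n{expires}\n/{bucket}/{encoded_key}"
    digest = hmac.new(
        secret_key.encode("utf-8"),
        string_to_sign.encode("utf-8"),
        hashlib.sha1,
    ).digest()
    signature = quote(base64.b64encode(digest).decode("ascii"), safe="")
    return (
        f"https://{bucket}.s3.amazonaws.com/{encoded_key}"
        f"?AWSAccessKeyId={access_key_id}&Expires={expires}&Signature={signature}"
    )
```

Once the `Expires` time has passed, S3 rejects the URL, so a short lifetime limits the damage if a link is shared or leaked.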
Around the standard upload/download mechanism I have wrapped additional CSRF mitigation. When a user loads the main page or successfully completes a request, the server generates a 64-character, cryptographically secure random string and assigns it as the value of a hidden form field on the page. Server-side, an object is also built that holds several strings:
- The random string.
- The requester’s user agent.
- The requester’s IP.
The object is serialised as JSON, encrypted using machine key encryption (i.e. AES256 with HMAC-SHA512) and set as a cookie in the browser of the requester.
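The token and cookie payload can be sketched as follows, in Python for brevity. The field names and the alphanumeric alphabet are assumptions of mine, and the encryption step is deliberately left out: in the demo that is handled by the machine key, which has no direct equivalent here.

```python
import json
import secrets
import string

# Assumption: an alphanumeric alphabet; the demo may use a different character set.
ALPHABET = string.ascii_letters + string.digits


def new_csrf_token(length=64):
    """Return a random string drawn from a cryptographically secure source."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


def build_cookie_payload(token, user_agent, ip_address):
    """Serialise the three values as JSON, ready to be encrypted and set as a cookie."""
    return json.dumps({"token": token, "userAgent": user_agent, "ip": ip_address})
```

The same token goes into the hidden form field, while the JSON payload is what gets encrypted and stored in the cookie.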
When a request is made to upload or download a file, the server checks that:
- The cookie decrypts successfully.
- The random string in the form is the same as the random string in the cookie.
- The requester’s user agent is the same as the user agent in the cookie.
- The requester’s IP is the same as the IP in the cookie.
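The checks above could look something like this in Python. It is a sketch rather than the demo's code: `cookie_data` stands for the already-decrypted cookie contents (or `None` if decryption failed), and the field names are hypothetical.

```python
import hmac


def is_request_valid(cookie_data, form_token, user_agent, ip_address):
    """Return True only if every check against the decrypted cookie passes."""
    if cookie_data is None:
        # The cookie was missing or did not decrypt successfully.
        return False
    # Constant-time comparison avoids leaking how much of the token matched.
    token_ok = hmac.compare_digest(
        str(cookie_data.get("token", "")).encode("utf-8"),
        (form_token or "").encode("utf-8"),
    )
    agent_ok = cookie_data.get("userAgent") == user_agent
    ip_ok = cookie_data.get("ip") == ip_address
    return token_ok and agent_ok and ip_ok
```

If any single check fails the request is refused, so a stolen cookie is of little use without the matching form token, user agent and IP.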