4.1 KiB
S3Proxy
Encryption
Motivation
The motivation behind this implementation is to provide a fully transparent and secure encryption to the s3 client while having the ability to write into different clouds.
Cipher mode
The chosen cipher is AES/CFB/NoPadding
because it provides the ability to read from an offset like in the middle of a Blob
.
While reading from an offset the decryption process needs to consider the previous 16 bytes of the AES block.
Key generation
The encryption uses a 128-bit key that will be derived from a given password and salt in combination with random initialization vector that will be stored in each part padding.
How a blob is encrypted
Every uploaded part get a padding of 64 bytes that includes the necessary information for decryption. The input stream from a s3 client is passed through CipherInputStream
and piped to append the 64 byte part padding at the end the encrypted stream. The encrypted input stream is then processed by the BlobStore
to save the Blob
.
Name | Byte size | Description |
---|---|---|
Delimiter | 8 byte | The delimiter is used to detect if the Blob is encrypted |
IV | 16 byte | AES initialization vector |
Part | 4 byte | The part number |
Size | 8 byte | The unencrypted size of the Blob |
Version | 2 byte | Version can be used in the future if changes are necessary |
Reserved | 26 byte | Reserved for future use |
Multipart handling
A single Blob
can be uploaded by the client into multiple parts. After the completion all parts are concatenated into a single Blob
.
This procedure will result in multiple parts and paddings being held by a single Blob
.
Single blob example
-------------------------------------
| ENCRYPTED BYTES | PADDING |
-------------------------------------
Multipart blob example
-------------------------------------------------------------------------------------
| ENCRYPTED BYTES | PADDING | ENCRYPTED BYTES | PADDING | ENCRYPTED BYTES | PADDING |
-------------------------------------------------------------------------------------
How a blob is decrypted
The decryption is way more complex than the encryption. Decryption process needs to take care of the following circumstances:
- decryption of the entire
Blob
- decryption from a specific offset by skipping initial bytes
- decryption of bytes by reading from the end (tail)
- decryption of a specific byte range like middle of the
Blob
- decryption of all previous situation by considering a underlying multipart
Blob
Single blob decryption
First the BlobMetadata
is requested to get the encrypted Blob
size. The last 64 bytes of PartPadding
are fetched and inspected to detect if a decryption is necessary.
The cipher is than initialized with the IV and the key.
Multipart blob decryption
The process is similar to the single Blob
decryption but with the difference that a list of parts is computed by fetching all PartPadding
from end to the beginning.
Blob suffix
Each stored Blob
will get a suffix named .s3enc
this helps to determine if a Blob
is encrypted. For the s3 client the .s3enc
suffix is not visible and the Blob
size will always show the unencrypted size.
Tested jClouds provider
- S3
- Minio
- OBS from OpenTelekomCloud
- AWS S3
- Azure
- GCP
- Local
Limitation
- All blobs are encrypted with the same key that is derived from a given password
- No support for re-encryption
- Returned eTag always differs therefore clients should not verify it
- Decryption of a
Blob
will always result in multiple calls against the backend for instance a GET will result in a HEAD + GET because the size of the blob needs to be determined