s3proxy/docs/Encryption.md

77 wiersze
4.1 KiB
Markdown

S3Proxy
# Encryption
## Motivation
The motivation behind this implementation is to provide a fully transparent and secure encryption to the s3 client while having the ability to write into different clouds.
## Cipher mode
The chosen cipher is ```AES/CFB/NoPadding``` because it provides the ability to read from an offset like in the middle of a ```Blob```.
While reading from an offset the decryption process needs to consider the previous 16 bytes of the AES block.
### Key generation
The encryption uses a 128-bit key that will be derived from a given password and salt in combination with random initialization vector that will be stored in each part padding.
## How a blob is encrypted
Every uploaded part get a padding of 64 bytes that includes the necessary information for decryption. The input stream from a s3 client is passed through ```CipherInputStream``` and piped to append the 64 byte part padding at the end the encrypted stream. The encrypted input stream is then processed by the ```BlobStore``` to save the ```Blob```.
| Name | Byte size | Description |
|-----------|-----------|----------------------------------------------------------------|
| Delimiter | 8 byte | The delimiter is used to detect if the ```Blob``` is encrypted |
| IV | 16 byte | AES initialization vector |
| Part | 4 byte | The part number |
| Size | 8 byte | The unencrypted size of the ```Blob``` |
| Version | 2 byte | Version can be used in the future if changes are necessary |
| Reserved | 26 byte | Reserved for future use |
### Multipart handling
A single ```Blob``` can be uploaded by the client into multiple parts. After the completion all parts are concatenated into a single ```Blob```.
This procedure will result in multiple parts and paddings being held by a single ```Blob```.
### Single blob example
```
-------------------------------------
| ENCRYPTED BYTES | PADDING |
-------------------------------------
```
### Multipart blob example
```
-------------------------------------------------------------------------------------
| ENCRYPTED BYTES | PADDING | ENCRYPTED BYTES | PADDING | ENCRYPTED BYTES | PADDING |
-------------------------------------------------------------------------------------
```
## How a blob is decrypted
The decryption is way more complex than the encryption. Decryption process needs to take care of the following circumstances:
- decryption of the entire ```Blob```
- decryption from a specific offset by skipping initial bytes
- decryption of bytes by reading from the end (tail)
- decryption of a specific byte range like middle of the ```Blob```
- decryption of all previous situation by considering a underlying multipart ```Blob```
### Single blob decryption
First the ```BlobMetadata``` is requested to get the encrypted ```Blob``` size. The last 64 bytes of ```PartPadding``` are fetched and inspected to detect if a decryption is necessary.
The cipher is than initialized with the IV and the key.
### Multipart blob decryption
The process is similar to the single ```Blob``` decryption but with the difference that a list of parts is computed by fetching all ```PartPadding``` from end to the beginning.
## Blob suffix
Each stored ```Blob``` will get a suffix named ```.s3enc``` this helps to determine if a ```Blob``` is encrypted. For the s3 client the ```.s3enc``` suffix is not visible and the ```Blob``` size will always show the unencrypted size.
## Tested jClouds provider
- S3
- Minio
- OBS from OpenTelekomCloud
- AWS S3
- Azure
- GCP
- Local
## Limitation
- All blobs are encrypted with the same key that is derived from a given password
- No support for re-encryption
- Returned eTag always differs therefore clients should not verify it
- Decryption of a ```Blob``` will always result in multiple calls against the backend for instance a GET will result in a HEAD + GET because the size of the blob needs to be determined