Latest commit (7191123bad, Musab Shakeel): Docker: Use dumb-init as init system
PID 1 is special in Linux kernels; do not start the `java` process as
PID 1. Instead, use dumb-init [1] as the init system and start it at
PID 1.

This is generally done for two main reasons [1]:

- Signal handling
When k8s terminates a Pod, it sends a SIGTERM to PID 1 in each of its
containers. When a non-PID-1 process receives SIGTERM and has not
registered a SIGTERM handler, the kernel falls back to the default
behavior for SIGTERM: killing the process.

However, this is not the case for PID 1; the kernel does not fall back
to killing the PID 1 process if no SIGTERM handler is registered. This
means that the SIGTERM has no effect on the process. k8s will wait for
`terminationGracePeriodSeconds` (default: 30 seconds) and only then
SIGKILL the process, having wasted 30 seconds in which nothing useful
was happening.

With `java` as PID 1, though, I don't run into this issue. Even though
S3Proxy does not register a SIGTERM handler, the JVM appears to react
to the SIGTERM and terminates S3Proxy immediately.

Still, let's use dumb-init as a best practice. Init systems like
dumb-init take responsibility for properly registering signal handlers
and passing signals on to child processes.

- Orphaned processes
Any running process that becomes an orphan (its parent process dies)
is adopted by PID 1. This means PID 1 is responsible for cleaning up
(reaping) the orphaned process after it has terminated (become a
zombie/defunct process). However, Java as PID 1 will not know about
these zombie processes and will not reap them. Using dumb-init will
reap such processes.

This is a non-breaking change; users of the Docker image do not need
to take any action.
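
For illustration, the general Dockerfile pattern is to make dumb-init
PID 1 and run java as its child; the paths and file names below are
placeholders, not the actual entrypoint of this repository:

# dumb-init runs as PID 1, forwards signals to its child, and reaps
# orphaned processes; java runs as a child and is killed by SIGTERM
ENTRYPOINT ["dumb-init", "--", "java", "-jar", "/opt/s3proxy/s3proxy", "--properties", "/opt/s3proxy/s3proxy.conf"]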

[1] https://engineeringblog.yelp.com/2016/01/dumb-init-an-init-for-docker.html
[2] https://daveiscoding.hashnode.dev/why-do-you-need-an-init-process-inside-your-docker-container-pid-1

S3Proxy


S3Proxy implements the S3 API and proxies requests, enabling several use cases:

  • translation from S3 to Backblaze B2, EMC Atmos, Google Cloud, Microsoft Azure, and OpenStack Swift
  • testing without Amazon by using the local filesystem
  • extension via middlewares
  • embedding into Java applications

Usage with Docker

Docker Hub hosts a Docker image and has instructions on how to run it.
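
For reference, the typical invocation from the Docker Hub instructions maps container port 80 to a local port and selects anonymous authorization via an environment variable (check Docker Hub for the current image name and options):

# Run S3Proxy with anonymous access on http://localhost:8080
docker run --publish 8080:80 --env S3PROXY_AUTHORIZATION=none andrewgaul/s3proxy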

Usage without Docker

Users can download releases from GitHub. Developers can build the project by running mvn package, which produces a binary at target/s3proxy. S3Proxy requires Java 11 or newer to run.

Configure S3Proxy via a properties file. An example using the local file system as the storage backend with anonymous access:

s3proxy.authorization=none
s3proxy.endpoint=http://127.0.0.1:8080
jclouds.provider=filesystem
jclouds.filesystem.basedir=/tmp/s3proxy

First create the filesystem basedir:

mkdir /tmp/s3proxy

Next, run S3Proxy. Linux and Mac OS X users can run the executable jar:

chmod +x s3proxy
s3proxy --properties s3proxy.conf

Windows users must explicitly invoke java:

java -jar s3proxy --properties s3proxy.conf

Finally, test by creating a bucket and then listing all the buckets:

$ curl --request PUT http://localhost:8080/testbucket

$ curl http://localhost:8080/
<?xml version="1.0" ?><ListAllMyBucketsResult xmlns="http://s3.amazonaws.com/doc/2006-03-01/"><Owner><ID>75aa57f09aa0c8caeab4f8c24e99d10f8e7faeebf76c078efc7c6caea54ba06a</ID><DisplayName>CustomersName@amazon.com</DisplayName></Owner><Buckets><Bucket><Name>testbucket</Name><CreationDate>2015-08-05T22:16:24.000Z</CreationDate></Bucket></Buckets></ListAllMyBucketsResult>

Usage with Java

Maven Central hosts S3Proxy artifacts and the wiki has instructions on Java use.
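
A minimal embedding sketch, assuming the builder-style API documented on the wiki (a jclouds BlobStoreContext wrapped by S3Proxy.builder); consult the wiki for the authoritative example and the required Maven dependencies:

import java.net.URI;
import java.util.Properties;

import org.gaul.s3proxy.S3Proxy;
import org.jclouds.ContextBuilder;
import org.jclouds.blobstore.BlobStoreContext;

public final class EmbeddedS3Proxy {
    public static void main(String[] args) throws Exception {
        // Back the proxy with the local filesystem, matching the example above
        Properties properties = new Properties();
        properties.setProperty("jclouds.filesystem.basedir", "/tmp/s3proxy");

        BlobStoreContext context = ContextBuilder
                .newBuilder("filesystem")
                .credentials("identity", "credential")
                .overrides(properties)
                .build(BlobStoreContext.class);

        // Serve the blob store over the S3 API on port 8080
        S3Proxy s3Proxy = S3Proxy.builder()
                .blobStore(context.getBlobStore())
                .endpoint(URI.create("http://127.0.0.1:8080"))
                .build();
        s3Proxy.start();
    }
}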

Supported storage backends

  • atmos
  • aws-s3 (Amazon-only)
  • azureblob
  • azureblob-sdk (newer but lacks multi-part upload, see Azure/azure-sdk-for-java#42603)
  • b2
  • filesystem (on-disk storage, deprecated)
  • filesystem-nio2 (on-disk storage, recommended)
  • google-cloud-storage
  • openstack-swift
  • rackspace-cloudfiles-uk and rackspace-cloudfiles-us
  • s3 (all implementations)
  • transient (in-memory storage, deprecated)
  • transient-nio2 (in-memory storage, recommended)

See the wiki for examples of configurations.
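
For instance, a configuration that proxies requests to Amazon S3 while requiring clients to authenticate against S3Proxy itself could look like the following sketch (property names follow the wiki examples; all identities and credentials shown are placeholders):

# Credentials that clients present to S3Proxy (placeholders)
s3proxy.authorization=aws-v2-or-v4
s3proxy.identity=local-identity
s3proxy.credential=local-credential
s3proxy.endpoint=http://127.0.0.1:8080

# Credentials that S3Proxy uses against the Amazon S3 backend (placeholders)
jclouds.provider=aws-s3
jclouds.identity=remote-identity
jclouds.credential=remote-credential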

Assigning buckets to backends

S3Proxy can be configured to assign buckets to different backends with the same credentials. The configuration in the properties file is as follows:

s3proxy.bucket-locator.1=bucket
s3proxy.bucket-locator.2=another-bucket

In addition to the explicit names, glob syntax can be used to configure many buckets for a given backend.

A bucket (or a glob) cannot be assigned to multiple backends.
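
For example, an explicit name can route one bucket to the first backend while a glob routes a whole family of bucket names to the second (the name and pattern are illustrative):

s3proxy.bucket-locator.1=bucket
s3proxy.bucket-locator.2=prod-*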

Middlewares

S3Proxy can modify its behavior based on middlewares; see the wiki for the available middlewares and their configuration.

SSL Support

S3Proxy can listen on HTTPS by setting the secure-endpoint and configuring a keystore. You can read more about how to configure S3Proxy for SSL support in the dedicated wiki page, whether running with Docker, Kubernetes, or plain Java.
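
A sketch of the relevant properties (the endpoint, keystore path, and password are placeholders; the wiki page has the full walkthrough, including keystore generation):

s3proxy.secure-endpoint=https://127.0.0.1:8443
s3proxy.keystore-path=/path/to/keystore.jks
s3proxy.keystore-password=changeit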

Limitations

S3Proxy has broad compatibility with the S3 API; however, it does not support:

  • ACLs other than private and public-read
  • BitTorrent hosting
  • bucket logging
  • bucket policies
  • CORS bucket operations like getting or setting the CORS configuration for a bucket. S3Proxy only supports a static configuration (see below).
  • hosting static websites
  • object server-side encryption
  • object tagging
  • object versioning, see #74
  • POST upload policies, see #73
  • requester pays buckets
  • select object content

S3Proxy emulates the following operations:

  • copy multi-part objects, see #76

S3Proxy has basic CORS preflight and actual request/response handling. It can be configured within the properties file (and corresponding ENV variables for Docker):

s3proxy.cors-allow-origins=https://example\.com https://.+\.example\.com https://example\.cloud
s3proxy.cors-allow-methods=GET PUT
s3proxy.cors-allow-headers=Accept Content-Type
s3proxy.cors-allow-credential=true

CORS cannot be configured per bucket. s3proxy.cors-allow-all=true will accept any origin and header. Actual CORS requests are supported for GET, PUT, POST, HEAD and DELETE methods.
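
For example, a preflight request against a local S3Proxy can be exercised with curl (the bucket, key, and origin are placeholders):

curl -i -X OPTIONS http://localhost:8080/testbucket/some-key \
  -H "Origin: https://example.com" \
  -H "Access-Control-Request-Method: GET"

If the origin and method are allowed, the response includes the corresponding Access-Control-Allow-* headers.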

The wiki collects compatibility notes for specific storage backends.

Support

References

  • Apache jclouds provides storage backend support for S3Proxy
  • Ceph s3-tests help maintain and improve compatibility with the S3 API
  • fake-s3, gofakes3, minio, S3 ninja, and s3rver provide functionality similar to S3Proxy when using the filesystem backend
  • GlacierProxy and SwiftProxy provide similar functionality for the Amazon Glacier and OpenStack Swift APIs
  • s3mock is Adobe's S3 mock implementation
  • sbt-s3 runs S3Proxy via the Scala Build Tool
  • swift3 provides an S3 middleware for OpenStack Swift
  • Zenko provides similar multi-cloud functionality

License

Copyright (C) 2014-2025 Andrew Gaul

Licensed under the Apache License, Version 2.0