kopia lustrzana https://github.com/jupyterhub/repo2docker
Move to docs/source/design.md
rodzic
d13980d422
commit
1f0cbe0511
72
README.md
72
README.md
|
@ -100,75 +100,3 @@ be as they are released.
|
||||||
|
|
||||||
This will be treated as a regular Dockerfile and a regular Docker build will be performed. The presence
|
This will be treated as a regular Dockerfile and a regular Docker build will be performed. The presence
|
||||||
of a Dockerfile will cause all other building behavior to not be triggered.
|
of a Dockerfile will cause all other building behavior to not be triggered.
|
||||||
|
|
||||||
## Design
|
|
||||||
|
|
||||||
Two primary use cases for `repo2docker` drive most design decisions:
|
|
||||||
|
|
||||||
1. Automated image building used by projects like
|
|
||||||
[BinderHub](http://github.com/jupyterhub/binderhub)
|
|
||||||
2. Manual image building and running the image from the command line client,
|
|
||||||
`jupyter-repo2docker`, by users interactively on their workstations
|
|
||||||
|
|
||||||
We share our guiding design principles here. This is not an exhaustive
|
|
||||||
list :)
|
|
||||||
|
|
||||||
### Deterministic output
|
|
||||||
|
|
||||||
The core of `repo2docker` can be considered a
|
|
||||||
[deterministic algorithm](https://en.wikipedia.org/wiki/Deterministic_algorithm).
|
|
||||||
When given an input directory which has a particular repository checked out, it
|
|
||||||
deterministically produces a Dockerfile based on the contents of the directory.
|
|
||||||
So if we run `repo2docker` on the same directory multiple times, we get the
|
|
||||||
exact same Dockerfile output.
|
|
||||||
|
|
||||||
This provides a few advantages:
|
|
||||||
|
|
||||||
1. Reuse of cached built artifacts based on a repository's identity increases
|
|
||||||
efficiency and reliability. For example, if we had already run `reop2docker`
|
|
||||||
on a git repository at a particular commit hash, we know we can just re-use
|
|
||||||
the old output, since we know it is going to be the same. This provides
|
|
||||||
massive performance & architectural advantages when building additional
|
|
||||||
tools (like BinderHub) on top of `reop2docker`.
|
|
||||||
2. We produce Dockerfiles that have as much in common as possible across
|
|
||||||
multiple repositories, enabling better use of the Docker build cache. This
|
|
||||||
also provides massive performance advantages.
|
|
||||||
|
|
||||||
### Unix principles "do one thing well"
|
|
||||||
|
|
||||||
`repo2docker` should do one thing, and do it well. This one thing is:
|
|
||||||
|
|
||||||
> Given a repository, deterministically build a docker image from
|
|
||||||
> it.
|
|
||||||
|
|
||||||
There's also some convenience code (to run the built image) for users, but
|
|
||||||
that's separated out cleanly. This allows easy use by other projects (like
|
|
||||||
BinderHub).
|
|
||||||
|
|
||||||
There is additional (and very useful) design advice on this in
|
|
||||||
the [Art of Unix Programming](http://www.faqs.org/docs/artu/ch01s06.html) which
|
|
||||||
is a highly recommended quick read.
|
|
||||||
|
|
||||||
### Composability
|
|
||||||
|
|
||||||
Although other projects, like
|
|
||||||
[s2i](https://github.com/openshift/source-to-image), exist to convert source to
|
|
||||||
Docker images, `repo2docker` provides the additional functionality to support
|
|
||||||
*composable* environments. We want to easily have an image with
|
|
||||||
Python3+Julia+R-3.2 environments, rather than just one single language
|
|
||||||
environment. While generally one language environment per container works well,
|
|
||||||
in many scientific / datascience computing environments you need multiple
|
|
||||||
languages working together to get anything done. So all buildpacks are
|
|
||||||
composable, and need to be able to work well with other languages.
|
|
||||||
|
|
||||||
### [Pareto principle](https://en.wikipedia.org/wiki/Pareto_principle) (The 80-20 Rule)
|
|
||||||
|
|
||||||
Roughly speaking, we want to support 80% of use cases, and provide an escape
|
|
||||||
hatch (raw Dockerfiles) for the other 20%. We explicitly want to provide support
|
|
||||||
only for the most common use cases - covering every possible use case never ends
|
|
||||||
well.
|
|
||||||
|
|
||||||
An easy process for getting support for more languages here is to demonstrate
|
|
||||||
their value with Dockerfiles that other people can use, and then show that this
|
|
||||||
pattern is popular enough to be included inside `reop2docker`. Remember that 'yes'
|
|
||||||
is forever (very hard to remove features!), but 'no' is only temporary!
|
|
||||||
|
|
|
@ -0,0 +1,71 @@
|
||||||
|
# Design
|
||||||
|
|
||||||
|
Two primary use cases for `repo2docker` drive most design decisions:
|
||||||
|
|
||||||
|
1. Automated image building used by projects like
|
||||||
|
[BinderHub](http://github.com/jupyterhub/binderhub)
|
||||||
|
2. Manual image building and running the image from the command line client,
|
||||||
|
`jupyter-repo2docker`, by users interactively on their workstations
|
||||||
|
|
||||||
|
We share our guiding design principles here. This is not an exhaustive
|
||||||
|
list :)
|
||||||
|
|
||||||
|
## Deterministic output
|
||||||
|
|
||||||
|
The core of `repo2docker` can be considered a
|
||||||
|
[deterministic algorithm](https://en.wikipedia.org/wiki/Deterministic_algorithm).
|
||||||
|
When given an input directory which has a particular repository checked out, it
|
||||||
|
deterministically produces a Dockerfile based on the contents of the directory.
|
||||||
|
So if we run `repo2docker` on the same directory multiple times, we get the
|
||||||
|
exact same Dockerfile output.
|
||||||
|
|
||||||
|
This provides a few advantages:
|
||||||
|
|
||||||
|
1. Reuse of cached built artifacts based on a repository's identity increases
|
||||||
|
efficiency and reliability. For example, if we had already run `reop2docker`
|
||||||
|
on a git repository at a particular commit hash, we know we can just re-use
|
||||||
|
the old output, since we know it is going to be the same. This provides
|
||||||
|
massive performance & architectural advantages when building additional
|
||||||
|
tools (like BinderHub) on top of `reop2docker`.
|
||||||
|
2. We produce Dockerfiles that have as much in common as possible across
|
||||||
|
multiple repositories, enabling better use of the Docker build cache. This
|
||||||
|
also provides massive performance advantages.
|
||||||
|
|
||||||
|
## Unix principles "do one thing well"
|
||||||
|
|
||||||
|
`repo2docker` should do one thing, and do it well. This one thing is:
|
||||||
|
|
||||||
|
> Given a repository, deterministically build a docker image from
|
||||||
|
> it.
|
||||||
|
|
||||||
|
There's also some convenience code (to run the built image) for users, but
|
||||||
|
that's separated out cleanly. This allows easy use by other projects (like
|
||||||
|
BinderHub).
|
||||||
|
|
||||||
|
There is additional (and very useful) design advice on this in
|
||||||
|
the [Art of Unix Programming](http://www.faqs.org/docs/artu/ch01s06.html) which
|
||||||
|
is a highly recommended quick read.
|
||||||
|
|
||||||
|
## Composability
|
||||||
|
|
||||||
|
Although other projects, like
|
||||||
|
[s2i](https://github.com/openshift/source-to-image), exist to convert source to
|
||||||
|
Docker images, `repo2docker` provides the additional functionality to support
|
||||||
|
*composable* environments. We want to easily have an image with
|
||||||
|
Python3+Julia+R-3.2 environments, rather than just one single language
|
||||||
|
environment. While generally one language environment per container works well,
|
||||||
|
in many scientific / datascience computing environments you need multiple
|
||||||
|
languages working together to get anything done. So all buildpacks are
|
||||||
|
composable, and need to be able to work well with other languages.
|
||||||
|
|
||||||
|
## [Pareto principle](https://en.wikipedia.org/wiki/Pareto_principle) (The 80-20 Rule)
|
||||||
|
|
||||||
|
Roughly speaking, we want to support 80% of use cases, and provide an escape
|
||||||
|
hatch (raw Dockerfiles) for the other 20%. We explicitly want to provide support
|
||||||
|
only for the most common use cases - covering every possible use case never ends
|
||||||
|
well.
|
||||||
|
|
||||||
|
An easy process for getting support for more languages here is to demonstrate
|
||||||
|
their value with Dockerfiles that other people can use, and then show that this
|
||||||
|
pattern is popular enough to be included inside `reop2docker`. Remember that 'yes'
|
||||||
|
is forever (very hard to remove features!), but 'no' is only temporary!
|
Ładowanie…
Reference in New Issue