When I first proposed using Docker for development, and began doing my work that way, there were some doubts:
- Doesn’t it seem like a lot of trouble to set up Docker to get my work done?
- Isn’t it easier to use Homebrew to install the services and database servers I need?
At Artsy, our main API, aka Gravity, uses MongoDB, Solr, Elasticsearch, and memcached. In development, we use Mailcatcher so we can view emails. When a new software engineer starts, that person studies a big Getting Started document and spends most of a day getting everything installed and configured. Beyond installing the software, figuring out all of the environment variables that need to be set can take some time too. While we have good documentation, it is still a tedious and repetitive process that takes up the time of both our new employee and the more experienced developers who need to answer questions.
Now that Gravity has been dockerized, getting set up consists of a one-time install of Docker Toolbox followed by running
docker-compose build && docker-compose up
in the root directory of the checked-out repo. Here is a simplified version of our docker-compose setup. Because we run a web server and a delayed_job process, docker-compose.yml uses a common.yml file for shared setup:
gravity:
  build: .
  environment:
    MEMCACHE_SERVERS: memcached
    SOLR_URL: http://solr4:8983/solr/gravity
    MONGO_HOST: mongodb
    ELASTICSEARCH_URL: elasticsearch:9200
    SMTP_PORT: 1025
    SMTP_ADDRESS: mailcatcher
  env_file: .env
The .env file is used for secrets, such as Amazon Web Services credentials, that we don’t want to put into the git repository.
docker-compose.yml looks like this:
mongodb:
  image: mongo:2.4
  command: bash -c "rm -f /data/db/mongod.lock; mongod --smallfiles --quiet --logpath=/dev/null"
  ports:
    - "27017:27017"
solr4:
  image: artsy/solr4
memcached:
  image: memcached
elasticsearch:
  image: artsy/elasticsearch
  ports:
    - "9200:9200"
    - "9300:9300"
web:
  extends:
    file: common.yml
    service: gravity
  command: script/rails s -b 0.0.0.0 -p 80
  ports:
    - "80:80"
  volumes:
    - .:/app
  links:
    - elasticsearch
    - mongodb
    - memcached
    - solr4
    - mailcatcher
dj:
  extends:
    file: common.yml
    service: gravity
  command: bundle exec rake jobs:work
  volumes:
    - .:/app
  links:
    - elasticsearch
    - mongodb
    - memcached
    - solr4
mailcatcher:
  image: zolweb/docker-mailcatcher
  ports:
    - "1080:1080"
The command for the MongoDB section removes a lock file that can sometimes remain in place when the container is killed. Do not use that in production! We mount the local directory into the container with a volumes: entry, so that local changes are picked up by the running containers.
Recently, Ashkan Nasseri began to move our delayed jobs from delayed_job_mongoid to sidekiq, which brings in Redis and another process that needs to run during development. Since we are using Docker, all we have to do is add a couple of new sections to our docker-compose setup:
redis:
  image: redis
  ports:
    - "6379:6379"
sidekiq:
  extends:
    file: common.yml
    service: gravity
  command: bundle exec sidekiq
  volumes:
    - .:/app
  links:
    - elasticsearch
    - mongodb
    - memcached
    - solr4
    - redis
and add this line to
The next time someone runs docker-compose up, this triggers a one-time download of a redis image and then brings up the additional sidekiq service.
For development that involves multiple applications in separate git repositories, we use Dusty, which was created by GameChanger. Some of the advantages of using Dusty include the use of NFS (which performs much better than shared volumes in VirtualBox), and a built-in nginx proxy along with modifications to your /etc/hosts file so that you can more easily connect to your applications.
With Dusty, you set up services, apps, and bundles of apps with YAML files. Here is a repo with sample Dusty specs.
Our MongoDB service is defined as:
# services/mongo2.yml
image: mongo:2.4
volumes:
  - /persist/persistentMongo:/data/db
entrypoint: ["sh", "-c", "rm -f /data/db/mongod.lock; mongod --smallfiles --quiet --logpath=/dev/null"]
ports:
  - "27017:27017"
Exposing the ports is not strictly necessary, but it lets us connect directly to the MongoDB instance with the mongo command, without shelling into a container, because the port is then available on our Docker VM’s IP address.
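For comparison, here is a sketch of the same service spec with the port mapping left out. It is an illustration rather than one of our actual specs. Assuming Dusty wires up container hostnames for declared dependencies, apps that depend on the service should still be able to reach it, but the mongo command on the host would have nothing to connect to.

```yaml
# services/mongo2.yml (illustrative variant without published ports)
image: mongo:2.4
volumes:
  - /persist/persistentMongo:/data/db
entrypoint: ["sh", "-c", "rm -f /data/db/mongod.lock; mongod --smallfiles --quiet --logpath=/dev/null"]
# No "ports:" key, so 27017 is not mapped onto the Docker VM's IP address
```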
Our Gravity app’s Dusty YAML file is:
# apps/gravity.yml
repo: github.com/artsy/gravity
mount: /app
build: .
depends:
  services:
    - mongo2
    - memcached
    - solr4
    - es15
    - mailcatcher
    - redis
  apps:
    - radiation
host_forwarding:
  - host_name: gravity
    host_port: 80
    container_port: 80
compose:
  environment:
    RADIATION_URL: http://radiation
    MONGO_HOST: mongo2
    MEMCACHE_SERVERS: memcached
    SOLR_URL: http://solr4:8983/solr/gravity
    ELASTICSEARCH_URL: es15:9200
    SMTP_ADDRESS: mailcatcher
    SMTP_PORT: 1025
    REDIS_URL: redis://redis
commands:
  once:
    - bundle install -j 10
    - bundle exec rake db:client_applications:create_all
    - bundle exec rake db:admin:create
  always:
    - rails s -b 0.0.0.0 -p 80
The depends: configuration is similar to the links functionality of docker-compose. It makes sure that those services and applications (as defined in services/*.yml and apps/*.yml) are running, and sets up /etc/hosts in the containers to allow your applications to refer to other services using their hostnames.
For now, Dusty doesn’t have a way of sharing common setup like common.yml above, so there are similar configurations for our Sidekiq and Delayed Job workers.
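To make that duplication concrete, here is a rough sketch of what a separate worker spec could look like. The file name apps/gravity-sidekiq.yml is hypothetical and the environment is abbreviated. The keys mirror the apps/gravity.yml spec above, with the always command swapped for the Sidekiq command from the docker-compose example.

```yaml
# apps/gravity-sidekiq.yml (hypothetical file name, not one of our actual specs)
repo: github.com/artsy/gravity
mount: /app
build: .
depends:
  services:
    - mongo2
    - memcached
    - solr4
    - es15
    - redis
compose:
  environment:
    # Repeated from apps/gravity.yml, since the two specs cannot share a common file
    MONGO_HOST: mongo2
    MEMCACHE_SERVERS: memcached
    SOLR_URL: http://solr4:8983/solr/gravity
    ELASTICSEARCH_URL: es15:9200
    REDIS_URL: redis://redis
commands:
  once:
    - bundle install -j 10
  always:
    # Run the worker process instead of the web server
    - bundle exec sidekiq
```

The worker has no host_forwarding section in this sketch because, unlike the web app, nothing needs to reach it over HTTP.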
Dusty uses bundles for clusters of applications that need to work together. An example bundle, for a CMS application that needs many APIs, is:
# bundles/volt.yml
description: Volt
apps:
  - tangentApi
  - radiation
  - superposition
  - gravity
  - volt
We bring up that cluster of applications with
dusty bundles activate volt
dusty up
As we have added new services over time, using Docker and Dusty to bring clusters of apps together has made it much easier for developers to work on projects without having to spend time on installation and configuration. Having Docker configuration in the repo also serves as good (and up-to-date) documentation of how a given application is configured and what it depends on. It is also much less resource-intensive than using virtual machines configured with Vagrant or another provisioning tool. All of our Docker applications and services can run in a single VM. If you are developing on Linux, you don’t even need a VM!
We are also starting to use Docker to run integrated testing across multiple applications using Selenium. That will be covered in a future blog post.