Artsy's Technology Stack, 2017

By Orta Therox, The Artsy Engineering Team

History

Artsy was launched in 2012 as the "Art Genome Project" and grew exponentially ever since.

By 2014 we had 230,000 works of art from 600 museums and institutions and launched our first business, a subscription service for commercial galleries, bringing over 80,000 works for sale and partnerships with 37 art fairs and a handful of benefit auctions. That year collectors from 82 countries inquired on over $5.5B of art.

By 2015 we doubled our "for sale" inventory and aggregated 4,000 of the world's leading galleries and 60 art fairs. We also launched two new businesses: commercial auctions and online media.

Finally, in 2016 we, again, doubled our paid gallery network size to become the largest gallery network in the world and grew to become the most-read online art publication as our highly engaging editorial traffic ballooned 320%. We also launched a platform to bid in live auctions and a consignments service with all major auction houses.

The Artsy Business in 2017

Artsy in 2017 is a very wide platform and it can be challenging to characterize simply. But when you boil it down to its essence, Artsy offers information and a marketplace. Our written content and fair coverage keep people informed about the art world, and the Art Genome powers our tools for exploration. Through our partnerships with the major player in the art market, galleries and auction houses, we offer our users a unified platform for buying and selling art.

Internally we consider Artsy to have three businesses: Auctions, Content and Listings.

  • Auctions: Auction houses and charities use Artsy as a sales channel for a commission because collectors want to discover and buy art in a single, central platform that excels at surfacing the art they want from a global market.

  • Content: Brands pay Artsy to reach the first art audience at scale by enabling evergreen content online and for offline engagement during art world events.

  • Listings: Galleries, Fairs and Institutions subscribe to Artsy for a fee because we bring a very large audience of art collectors and enthusiasts to their virtual doors.

The Artsy team is now 166 employees across three offices in New York, Berlin and London. The Engineering organization is now 28 engineers, including 4 leads, 3 directors and a CTO. In this post, we'd like to comprehensively cover what, and how we make the technical and human sides of Artsy businesses work.

Organizational Structures

In 2016, we updated the Engineering organization to be oriented around product verticals for businesses. We used to focus more on practices to group engineers working with the same technologies across product teams to facilitate knowledge sharing and avoid redundant efforts.

Since then, web and mobile "practices" have largely been subsumed into the separate product teams. Mobile's increasing reliance on React Native has aligned nicely with web tooling. It no longer made sense to keep the teams separate, so where product teams used to have 2 separate sub-teams of engineers, they've now merged into 1.

The Platform "practice" has remained as a way to coordinate and share work among product teams, as well as monitor and upgrade Artsy's platform over time. Most platform engineers operate from within product teams, while a few focusing on data and infrastructure form a core, dedicated Platform team.

Artsy Technology Infrastructure 2017 - Splitting the Monolith

The Artsy Tech Stack 2017

User Facing

A lot of the user-facing focus is on being able to present interfaces with a quality worthy of art.

What you see today when you go to www.artsy.net is a website built with Ezel.js, which is a boilerplate for Backbone projects running on Node and using Express and Browserify. We used to have separate projects for mobile and desktop web, but they are now merged. The combined app is hosted on Heroku and uses Redis for caching. Assets, including artwork images, are served from Amazon S3 via the CloudFront CDN. This code is open-source.

What you see today when you open the Artsy iOS app is a mix of Objective-C, Swift and React Native. Objective-C and Swift continue to provide a lot of over-arching cross-View Controller code. While individual representations of Artsy resources tend to be built in React Native. All of our React Native code uses Relay to handle API integration. This code is open-source.

You can also find Artsy on Alexa and Google Home, which are both open-source Node.js applications. There is also an open-source Apple TV app built in Swift.

Our core API serves the public facets of our product, many of our own internal applications, and even some of your own projects. It's built with Ruby, Rack, Rails, and Grape serving primarily JSON. The API is hosted on AWS OpsWorks and retrieves data from several MongoDB databases hosted with Compose. It also uses Memcached for caching and Redis for background queues with Sidekiq. It runs background jobs with delayed_job. We used to employ Apache Solr and even Google Custom Search for the many search functions, but have since consolidated on Elasticsearch.

Most modern code for both the website and the iOS app use an orchestration layer which is powered by GraphQL to streamline their data fetching and reduce front-end complexity. Our GraphQL server is an Express app, using express-graphql to provide a single API end-point. The API does not access our data directly, but forwards requests to the core API or other services. We have been migrating shared display logic into the GraphQL server, to make it easier to build consistent clients. This code is open-source.

Consistently, our front-end code has moved towards using React across all platforms along with introducing stricter JavaScript languages like TypeScript over CoffeeScript in order to provide better tooling.

We continue to have a public HAL+JSON API for external developers. This API is in active use for contemporary production services inside Artsy and the website is open-source, too.

Partner-Facing

The vast customer-facing business is powered by a Content Management System (CMS) for gallery and institutional partners. This CMS lets them upload and manage gallery shows, fair booths, create artists, and edit artwork metadata. All CMS components talk to our core API. We also have a number of CMS-like internal applications to manage partners, auctions, art genomes, configuring fairs or performing recurrent billing (we use Stripe for storing and charging credit cards and ACH) with invoicing.

CMS applications are based on stable, mature technologies like Rails, Bootstrap, Turbolinks and CoffeeScript, and gradually adopts modern client-side technologies like React and Browserify. They share a lot of common infrastructure.

We have a generic image-processing service in-house, which uses Rails, Sidekiq, Redis, and RMagick with ImageMagick. It receives image processing requests from many Artsy applications and generates thumbnails, tiles and watermarks images on S3.

Collector-Facing

Collectors inquire on artworks and engage in conversations with partners. For this purpose we have built a generic messaging system that manages communications between different parties. It receives messages via API or e-mail, finds or creates a conversation based on the recipients and forwards them to the proper addresses in that conversation. Its doesn't assume anything about the contents of the messages, which makes it a generic system for any type of conversation. The conversations surface to our partners via CMS.

Running Auctions

The Auctions business began with doing the occasional benefit auctions for charities. Most of these auctions are online-only, timed sales. The initial version of our auction systems came together before we began our move to microservices, and so it is baked into our core API. Last year, we launched a live auction integration product to allow users to bid on works at commercial sales at the actual auction house sale rooms. The real-time requirements of this system required a rethinking of how we process our bids.

The core API for a commercial auction is a Scala micro-service that uses Akka for distributed computing. It stores information in an append-only storage engine, based on Akka Persistence, with a small library we developed called atomic-store. Communication with external clients can either be done via a REST API, or via WebSockets. People visiting a Live Auction on the web are interacting with a universal React+Redux JavaScript app, served from an Express server. Bidders visiting a Live Auction on iOS are interacting with a Swift application built with Interstellar, Starscream and SwiftyJSON.

A more detailed overview of the Auctions stack can be found in The Tech Behind Live Auction Integration.

Publishing

Our in-house editorial team and partners use an open-source platform called "Writer" (which we've built) to publish rich content across the web. Writer is split in two parts: the editorial-focused CMS and a JSON API that stores and distributes content separately from the rest of Artsy's stack.

Writer's frontend is built with Ezel.js, which is a boilerplate for Backbone projects running on Node and using Express and Browserify. We also heavily use React and write in CoffeeScript. Writer's backend exposes REST-based and GraphQL APIs that are consumed by our applications.

You can see Writer being put to work when you see articles on www.artsy.net, Facebook Instant Articles, Google AMP, RSS, Apple News, and email. We handle the distribution and display in all of these channels. We also support brand sponsorship deals and produce front-end heavy projects such as Year in Art 2016, and Year in Art 2015.

Data Pipeline

Data generally flows from consumer applications and services into AWS RedShift. We use a set of rake tasks run on Jenkins to move data from our several MongoDB and PostgreSQL databases to Redshift via S3. These rake tasks shell out to psql or mongo-export to generate CSV files for a list of services and upload them to an S3 bucket, then load those CSV files plus others found in that bucket (placed there by other services) into Redshift. If a Redshift copy fails due to data changes we sample the CSV and generate a working schema from its contents.

We also store application usage data provided by Segment Warehouses as well as data from vendors such as Salesforce and Sailthru.

For production data processing (such as recommendations), large-scale machine learning or even simpler parallel processing such as generating website sitemaps, we have our own Hadoop cluster configured and managed by Cloudera Manager and running on EC2. We leverage Apache Spark and Hadoop with some Ooozie workflow scheduling. The same data pipeline that writes data to S3 also pumps data to HDFS with either Ruby code or Sqoop and is read by Spark jobs written in Scala using Hive. Spark has improved performance and capacity tenfold over our older in-house systems and we will be moving all lengthy processing implemented in Ruby to this system gradually.

Analytics

For general data access and dashboards we rely on Looker. This system empowers all non-engineers to access all of our data. At the time of writing, there are 50 users running 3,500 queries a day against Redshift via Looker. We've found it expedient to pre-compute common denormalized views, and to create our own session rollups from raw pageviews and events for the additional flexibility it gives us in understanding user behavior.

For more in-depth work, we use Jupyter Notebooks to connect to our Redshift cluster and by default import pandas, sci-kit learn, and pyplot for data analysis.

Search

We completed our full migration from Solr to Elasticsearch in the last 18 months, and now use Elasticsearch across all front-ends. This ranges from our artwork filter interfaces through to our real-time artwork similarity features. Elasticsearch gives us high availability clustering features out of the box and easy horizontal scaling on demand.

Platform Services

As Artsy's business has grown more complex, so has the data and concepts handled by its core API. We've begun supporting certain product areas with separate, dedicated API services, and even extracting existing API domains into separate services when practical. These services tend to expose simple REST-ful HTTP APIs, maintain separate data sources, and even do their own authentication. This has certain advantages:

  • Each system can be deployed and scaled independently.
  • Each chooses the best-suited languages and technologies for its purpose.
  • Code bases remain more focused and developers' cognitive overhead is minimized.

Balancing these out are some very real disadvantages:

  • Development must sometimes touch multiple systems.
  • Some data is copied between services. These can become out-of-sync, though we always try to have a single authoritative source in such cases.
  • Deploys must be coordinated.

At our size and complexity, a single code base is simply impractical. So, we've tried to be consistent in the coding, deployment, monitoring, and logging practices of these services. The more repeatable and disciplined our process, the less overhead is introduced by additional systems.

We've also explored alternate communication patterns, so systems aren't as dependent on each other's APIs. Recently we've begun publishing a stream of data events from our core systems that other systems can consume. Other systems can simply subscribe to the notifications they care about, so the source system doesn't need to be concerned about integrating with one more destination. After experimenting with Kafka but finding it hard to manage, we switched to RabbitMQ for this purpose. To provide consistency when publishing events we have our own gem.

Operations

All our recent AWS infrastructure is configured in code using Terraform. This approach has allowed us to quickly replicate entire deployments along with their dependencies and has increased visibility into the state of our infrastructure across our teams. We started developing an open source Docker workflow toolkit named Hokusai in order to manage a containerized workflow, CI and deployment to Kubernetes. Our Kubernetes clusters are managed using Kops and similarly provisioned using Terraform. This new workflow is reducing our dependence on Heroku, giving us more flexibility in our deployments and a more efficient use of server resources.

Closing Remarks

Like any attempts at mapping something as large as the daily work for a thirty-ish person engineering team, the map is not the territory. However, the exploration is worth the time it takes to keep notes for reading again in the next two years.

If you're interested in helping us make this an even longer post in two more years, or more interestingly shorter - we nearly always have a position open for engineers.