Practical Time Series Visualization using D3 + OM


At LiquidLandscape, we believe interactive visuals are the best way to explore and understand multi-dimensional data. This holds especially true when dealing with time series data, where being able to travel back and forth in time is crucial for understanding how other dimensions relate. When we were tasked with creating an insightful visualization for portfolio (time series) data, we knew that something highly interactive would be necessary.

We wanted to present multiple interactive visualizations on a single page, showing the data and relationships from different perspectives. We found the combination of D3 and OM to be perfect for this task. D3’s ability to create beautiful visualizations and OM’s sensible approach to state make for an awesome marriage of design and engineering. Realizing this benefit isn’t free – the two libraries come from different worlds, with conflicting philosophies. However, by following a few guidelines and by looking at our code example, developing in (and building on top of) this model becomes fairly simple.

In order of importance, our goals are:

  1. data consistency/correctness – ’nuff said
  2. responsiveness – the benefits of an interactive visualization greatly diminish with decreasing responsiveness. When we can’t interact with the data at human-speed, much is lost.
  3. object constancy – with so many changing variables, visual assistance is helpful for following changes in the data

Tools

We’re big fans of how much D3 simplified the creation of beautiful visualizations and knew that it would play a big role.

Clojure(Script) was already a key part of our technical arsenal. The language is natural for data manipulation, and its persistent data structures are a great fit for snapshotting real-time data and making it available as a time series. Since accessing the time series data becomes trivial, this opens up the possibility of having playback functionality (play, pause, rewind) for your visualization app.
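
To make the snapshotting idea concrete, here is a minimal sketch that assumes snapshots are kept in a sorted map keyed by a millisecond timestamp. The names history, record! and snapshot-at are hypothetical and are not taken from our source; they only illustrate why playback becomes a simple lookup.

```clojure
;; Hypothetical history: sorted map of timestamp (ms) -> immutable snapshot.
(def history (atom (sorted-map)))

(defn record!
  "Stores snapshot under ts. Earlier snapshots are untouched because each
  one is an immutable value."
  [ts snapshot]
  (swap! history assoc ts snapshot))

(defn snapshot-at
  "Returns the latest snapshot recorded at or before ts, or nil if none."
  [hist ts]
  (some-> (rsubseq hist <= ts) first val))

(record! 1000 {"TX" 100 "ND" 30})
(record! 2000 {"TX" 110 "ND" 45})

(snapshot-at @history 1500) ; the snapshot recorded at 1000
(snapshot-at @history 500)  ; nil, nothing was recorded that early
```

Rewinding or pausing then amounts to choosing a different ts; no snapshot ever needs to be copied or restored.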

As for the UI itself, Facebook’s React, and its ClojureScript interface, OM, seemed ideal for developing rich UIs that are easy to reason about.

OM from 10,000 ft

At its core, OM is about having a single, atomic representation of application state, and having hierarchical components render the latest snapshot of that state. DOM manipulation is expensive, so when application state changes, OM calculates and executes the minimum set of corresponding DOM updates. A big value-add of OM over React (besides the fact that it’s part of the Clojure ecosystem) is that changes in application state are detected very efficiently, because you can simply do reference equality checks on persistent data structures.
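
The reference-equality point can be illustrated with plain Clojure data. This is only a sketch of the idea, not OM's internals; changed? is a hypothetical helper and the numbers are made up.

```clojure
(def state-v1 {:prices     {"WTI" 50}
               :production {"TX" 100 "ND" 30}})

;; Only the :prices branch changes; everything else is shared with state-v1.
(def state-v2 (assoc-in state-v1 [:prices "WTI"] 48))

(defn changed?
  "Cheap change detection: compares references instead of walking the data."
  [old-state new-state path]
  (not (identical? (get-in old-state path) (get-in new-state path))))

(changed? state-v1 state-v2 [:prices])     ; true
(changed? state-v1 state-v2 [:production]) ; false
```

A component that depends only on :production can therefore be skipped without inspecting a single production value.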

D3 + OM, I’m familiar with both. This should be a no-brainer, right?

This is fine and well if you’re having OM render static HTML to reflect the data; OM will take care of rendering it efficiently. However, we’re interested in creating highly interactive visuals using D3. D3 has a very powerful, declarative way of expressing how visual elements relate to data using selections. In my opinion, this is beautifully done, and we don’t want OM components to take over that responsibility. In addition, object constancy (via transitions) is crucial for understanding time series data, hence we don’t want to simply replace DOM elements.

That said, a mix of D3 and OM sounds like a good way to approach this. But, OM/cljs works with (immutable) persistent data structures while D3 works with (mutable) native js data structures. Some data marshalling would be necessary for this to work, right? Where’s the data demarcation?

This is the approach that worked for us given our goals:

  • Raw/canonical time series data should be held atomically in OM – to take advantage of efficiently representing time series data in persistent data structures.
  • D3 works with native javascript data structures, and data marshalling (via clj->js and js->clj) is expensive, so we want to minimize this translation.
  • Therefore, it is reasonable to represent a (time) snapshot of the data as a cursor
  • AND, be comfortable with having some mutable data (stored in component state) necessary for a dynamic visualization.

There is a performance hit when naively using multiple visual OM components. Remember that OM renders consistent snapshots, meaning that it will want to render all applicable components on the page in a single render pass. With sophisticated visuals, this may result in an unacceptable refresh-rate. The problem and proposed solution are detailed below in the Inter-Component Communication section.

An Example

Let’s now go through an example to see how building something using D3 + OM would look. The advent of fracking and the recent fall in the price of crude oil have been important macro-economic drivers of the economy. As an example, we’ll create a visualization app showing oil production on a US map and how it changes with the price of oil over time. We’ll use this example for the rest of the article.

The full source is available on GitHub.

OM Component life-cycle with D3

Let’s create a map-chart component, and see how it uses the OM life-cycle protocols.

[Figure: map chart]

om.core/IInitState – initializes local state. Good for storing core.async channels, D3 mutable state, etc.

(init-state [_]
  (let [width 960 height 500
        projection (-> js/d3 .-geo .albersUsa (.scale 1000) (.translate #js [(/ width 2) (/ height 2)]))]
    {:width width :height height
     :path (-> js/d3 .-geo .path (.projection projection))
     :color (-> js/d3 .-scale .linear
              (.range #js ["green" "red"]))
     :comm (chan)
     :us nil ;topojson data
     :svg nil
     }))

om.core/IWillMount – any set-up tasks and core.async loops to consume from core.async channels. core.async is very important in OM because outside of the render phase, you cannot treat cursors as values. Any user-driven event (keyboard, mouse-click, etc.) is outside of the render phase, so you should relay those events via core.async channels to core.async loops created during IWillMount.

(will-mount [_]
  (c/shared-async->local-state owner [:oil-prod-max-val]) ;subscribe to updates to shared-async-state
  (let [{:keys [comm]} (om/get-state owner)]
    (-> js/d3 (.json "/data/us-named.json"
                  (fn [error us]
                    (put! comm [:us us])))) ;callbacks are not part of the render phase, need to relay
    (go (while true
          (let [[k v] (<! comm)]
            (case k
              :us (om/update-state! owner #(assoc % :us v))
              ))))))

om.core/IRender/om.core/IRenderState – create placeholder div for SVG element

(render [_]
  (html
    [:div#map]))

om.core/IDidMount – create SVG element

(did-mount [_]
  (let [{:keys [width height]} (om/get-state owner) ;only width and height are needed here
        svg (-> js/d3 (.select "#map") (.append "svg")
              (.attr "width" width) (.attr "height" height))]
    (om/update-state! owner #(assoc % :svg svg))))

om.core/IDidUpdate – main hook for rendering and transitioning visual via D3.

(did-update [_ prev-snapshot prev-state]
  (let [{:keys [svg us path color oil-prod-max-val]} (om/get-state owner)]
    (when us
      (-> color (.domain #js [0 oil-prod-max-val]))
      (let [states (-> svg (.selectAll ".state")
                     (.data (-> js/topojson (.feature us (-> us .-objects .-states)) .-features)))]
        (-> states
            (.attr "fill" (fn [d-]
                             (let [code (-> d- .-properties .-code)
                                   prod (or (get snapshot code) 0)]
                               (color prod))))
          (.enter)
            (.append "path")
              (.attr "class" (fn [d-] (str "state " (-> d- .-properties .-code))))
              (.attr "fill" (fn [d-]
                              (let [code (-> d- .-properties .-code)
                                    prod (or (get snapshot code) 0)]
                                (color prod))))
              (.attr "d" path)
            (.append "title"))
        (-> states (.select "path title")
            (.text (fn [d-]
                           (let [code (-> d- .-properties .-code)
                                 prod (or (get snapshot code) 0)]
                             (NUMBER_FORMAT prod)))))
        ))))

We’ll also create a root component, called app, which lays out the entire OM app and is responsible for loading all data in om/IWillMount. You can check this in the full source.

Where’s the state with D3 + OM?

It’s important to have a clear understanding of the different levels where state can be kept, and what each level is appropriate for.

  • shared state – this is created in the om.core/root call under :shared, and is where a global pub/sub core.async channel (used to communicate shared async state changes) can be kept.

    (om/root app app-state {:target (. js/document (getElementById "app"))
                            :shared (let [publisher (chan)]
                                      {:publisher publisher
                                       :publication (pub publisher first)
                                       :init {:oil-prod-max-val 0
                                              :ts (js/Date. 0)}
                                       })})
    
  • app state – the data. Available to OM components as cursors.

    (def app-state
      (atom {:oil-data (sorted-map) ; {date1 {:production {"CA" 232 ...} "SpotPrice" 23} date2 {...}}
             }))
    
  • component/local state – state only relevant within the component, appropriate for storing some mutable state for D3 visuals. Initialized in om/IInitState, accessed via om/get-state, and modified via om/set-state! and om/update-state!

  • shared async state – a concept we introduced for Inter-Component Communication. It’s state which is asynchronously communicated between components and eventually consistent. Basically, a more formalized take on Publish & Notification Channels. Usage detailed below.
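
To tie two of these levels together: the :oil-prod-max-val entry in shared state is used as the upper bound of the color scale domain, and it can be derived from app state. The sketch below assumes it is the largest per-state production value across all dates, which is our reading of the name rather than something shown above; max-production and the sample data are hypothetical.

```clojure
;; Sample data in the shape sketched in the app-state comment
;; (dates shortened to strings here for readability).
(def oil-data
  (sorted-map "2014-01" {:production {"TX" 100 "ND" 30} "SpotPrice" 95}
              "2015-01" {:production {"TX" 120 "ND" 40} "SpotPrice" 47}))

(defn max-production
  "Largest per-state production value across all dates, or 0 when empty."
  [oil-data]
  (reduce max 0 (mapcat (comp vals :production) (vals oil-data))))

(max-production oil-data) ; 120
```

Computing this once when the data loads keeps the color scale stable while the user scrubs through time.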

Inter-Component Communication

Discovering hidden gems in data often requires having multiple visuals on the same page simultaneously, to see the data from different perspectives. These visuals must be correlated and dynamic in order for the user to detect complex relationships. Using shared async state as a means of inter-component communication can be very effective.

How is shared async state different from app state?

  • it’s not the data (which is what app state is appropriate for), but rather how to display the data.
  • app state can only be accessed via cursors, which have a couple of limitations:
    • must be a map or vector. Can be unnatural for passing display-related values like color, timestamp, index, etc.
    • cursors are hierarchical, meaning a component’s cursor is a subset of its parent’s cursor. This is partially mitigated via reference cursors, but I’ve personally found this approach to be less explicit about what the dependencies are.
  • following React/OM’s philosophy, we don’t care how the related components are rendered, just that we’ll end up with a consistent view of the related components. Changes in app state will only render consistent views, whereas changes in shared async state will eventually render consistent views.
    • This means that with related components A & B, component A could be rendering changes in :ts at 10 Hz, whereas the more computationally expensive component B could be rendering changes in :ts at a much lower rate of 1 Hz.
    • By relaxing the consistent-rendering constraint, we decouple the visual renderings of related components to achieve a higher (perceived) refresh-rate.

In our example, we’ll add a component, timeseries, which will show a time series of the price of oil, and function as a time-slider for map-chart. They have timestamp :ts as their shared async state.

[Figure: timeseries graph]

shared async state is implemented using core.async channels:

  1. determine the shared async state between components; this could mean time, zoom-level, etc.
  2. every component is capable of changing shared async state via the global pub/sub core.async channel
  3. each component should subscribe to and keep a copy (in local state) of the shared async state it’s interested in. Copying an update to local state will automatically trigger rerenders via IDidUpdate. A convenience function d3om.core/shared-async->local-state does this, and should be called in om/IWillMount.
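
The bookkeeping behind step 3 can be sketched without core.async. The code below is not the d3om.core/shared-async->local-state implementation; it only models what happens to a component's local state for each message, assuming messages are [key value] pairs (which matches the first topic function passed to pub). apply-shared-update is a hypothetical name.

```clojure
(defn apply-shared-update
  "Returns local-state with the update copied in when its key is one the
  component subscribed to; otherwise returns local-state unchanged."
  [local-state subscribed-keys [k v]]
  (if (contains? subscribed-keys k)
    (assoc local-state k v)
    local-state))

;; Messages as they might arrive on the shared channel, in order.
(def updates [[:ts 100] [:oil-prod-max-val 5000] [:ts 200]])

;; A component interested only in :ts ignores the other key and ends up
;; holding the latest timestamp.
(def final-state
  (reduce (fn [state msg] (apply-shared-update state #{:ts} msg))
          {:ts 0}
          updates))
;; final-state is {:ts 200}
```

In the real component, each such update would be written with om/set-state!, which is what triggers the rerender.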

In this way, visual representations of the components are eventually consistent, and you gain a whole lot of responsiveness.

Conclusion

There is definitely some amount of overhead to using D3 + OM over using just D3 to create a visualization. Without having to create multiple, correlated visualizations, using D3 alone is often the better choice. However, D3 + OM really excels when you want to build modular visual components that are reusable and easy to combine (and interact) with other components built in the same way. It’s an approach that works well under changing requirements and pays off in the long run.

It’s worth mentioning that some examples in the D3 gallery don’t easily translate to OM-land. Patterns that are natural in js (using mutable data, callbacks, etc) don’t necessarily translate to the stricter, immutable world of Clojure/OM. In some instances, the more tractable approach is to first fully understand the example, then rewrite it with a Clojure mind-set.

That’s all for now. Join us next time as we explore 3D + OM!

Appendix

Alternative Tools

As usual, we evaluated existing tools for the job – in this case, using D3 from ClojureScript. In the end, we decided to use JavaScript interop directly. Here are some libraries we evaluated:

  • strokes – uses mrhyde to “remove as much of the interop glue and clj->js->clj data marshalling from application code as possible”

    • the library has fallen behind and doesn’t work with the latest cljs. The polyfill approach means that there will always be a maintenance cost, and with a constantly changing/improving library like cljs, the cost is high.
    • seamless interop is very attractive, but this isn’t what mrhyde attempts to do, nor would it be possible. Instead, it has a blurred line approach, which makes the data structures from both (js & cljs) sides partially compatible with APIs from the other side. Ultimately, this means that you need to painstakingly know precisely what “partially” means. Since js data structures are inherently different than cljs data structures, we felt it more appropriate to explicitly separate the two sides.
  • c2 – inspired by D3 to “build declarative mappings from your data to HTML or SVG markup”.

What’s mentioned in this article but not illustrated in our example?

  • object constancy – there’s no movement in the map-chart, but if there were, transitions would be appropriate.
  • using persistent data structures to snapshot streaming, realtime data. The example loads all data on start up.

Coding convention

  • Follow D3’s indentation convention
  • Suffixing variable names with - to mark a js data structure (e.g. d-). Again, to be explicit about the differences.
  • Appending s to sequential data structures, ss to nested sequential data structures, and so on (e.g. seriess for a 2d vector). Apply this to irregular plurals, at the cost of improper grammar (e.g. datas).

Other useful links

discuss this on Hacker News


Published by

David Lin

Founder at LiquidLandscape
