The Data Artist: Painting with Data

For data artists, data is paint. And the palette of the data artist has expanded quite a bit. There’s a wide variety of data sources and types of data, including streaming data, unstructured information, SQL and NoSQL databases, HDFS, and custom data sources.

This expanded palette means the data artist can tell a richer story. But it also means that the approach to handling data on the backend has to change. If you’re dealing with a dataset of one billion records, you can’t use the same approach that worked when you were looking at a single Excel spreadsheet or SQL database.

One of those new approaches is what we call microquerying and data sharpening. Instead of waiting for one massive query to complete, the data artist gets an initial look at the data, which grows sharper as the queries return more results. Data artists also want to combine data sources into a single visualization. Think of that as mixing paints right on the canvas. We call it data fusion.
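To make the sharpening idea concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions, not a description of how any particular product implements it: the names `sharpen_mean`, `sum_and_count`, and `PARTITIONS` are hypothetical, and small in-memory lists stand in for slices of a much larger data source. Each microquery returns a partial sum and count, and the running estimate is refreshed after every one.

```python
from typing import Callable, Iterable, Iterator, List, Tuple


def sharpen_mean(
    partitions: Iterable[List[float]],
    run_microquery: Callable[[List[float]], Tuple[float, int]],
) -> Iterator[float]:
    """Yield a progressively sharper estimate of the mean.

    Each partition is answered by one small query; the estimate is
    updated as soon as that query returns, instead of waiting for
    the whole dataset to be scanned.
    """
    total = 0.0
    count = 0
    for partition in partitions:
        part_sum, part_count = run_microquery(partition)
        total += part_sum
        count += part_count
        if count:
            yield total / count


def sum_and_count(rows: List[float]) -> Tuple[float, int]:
    # Hypothetical stand-in for a small query against one slice of the data.
    return float(sum(rows)), len(rows)


# Hypothetical partitions; in practice these would be slices of a large source.
PARTITIONS = [[10, 20], [30], [40, 50, 60]]

estimates = list(sharpen_mean(PARTITIONS, sum_and_count))
for step, estimate in enumerate(estimates, start=1):
    print(f"after microquery {step}: mean is about {estimate}")
```

The early estimates can be far from the final answer, which is the trade-off: the artist sees something immediately and watches it converge as more microqueries complete.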

Today’s data artists also need a forgiving canvas. Think back for a moment to Charles Minard and his great troop movement map. And consider what it must have been like for him if he made a mistake. Paper, pen, and ink are not forgiving. Mistakes are difficult and messy to correct.

Painting with data and searching for insights is a highly iterative, trial-and-error process. Artists need an interactive canvas that makes it easy to undo and redo. For example, they should be able to filter their datasets as they work, rather than going back to the raw data and filtering it there.
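One way to picture such a forgiving canvas is a stack of filters layered over untouched raw data. The sketch below is a hypothetical illustration in Python (the `FilterCanvas` class and the sample rows are invented for this example): filters are applied at view time, so undoing one never requires going back to re-filter the source.

```python
from typing import Any, Callable, Dict, List

Row = Dict[str, Any]
Predicate = Callable[[Row], bool]


class FilterCanvas:
    """Keeps raw rows intact and applies a reversible stack of filters."""

    def __init__(self, rows: List[Row]) -> None:
        self._rows = list(rows)          # raw data, never modified
        self._applied: List[Predicate] = []
        self._undone: List[Predicate] = []

    def add_filter(self, predicate: Predicate) -> None:
        self._applied.append(predicate)
        self._undone.clear()             # a new action invalidates redo history

    def undo(self) -> None:
        if self._applied:
            self._undone.append(self._applied.pop())

    def redo(self) -> None:
        if self._undone:
            self._applied.append(self._undone.pop())

    def view(self) -> List[Row]:
        # Filters are evaluated on demand against the unchanged raw rows.
        return [r for r in self._rows if all(p(r) for p in self._applied)]


# Hypothetical sample data.
canvas = FilterCanvas([
    {"region": "east", "sales": 5},
    {"region": "west", "sales": 12},
    {"region": "east", "sales": 20},
])

canvas.add_filter(lambda r: r["region"] == "east")
canvas.add_filter(lambda r: r["sales"] > 10)
filtered = canvas.view()   # east rows with sales above 10

canvas.undo()
after_undo = canvas.view() # all east rows again

canvas.redo()
after_redo = canvas.view() # back to the doubly filtered view
```

Because the raw rows are never changed, every undo or redo is just a different view of the same data.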

This type of canvas should support collaboration with other data artists and give the artist the power to go beyond standard visualizations like pie charts and bar graphs. Again, think about Minard. He invented a way of visualizing quantitative information that no one had seen before. Zoomdata, for example, can tap into the very large ecosystem of open-source and commercial visualizations beyond what it offers out of the box.


