VizFix Critique Approach and Rubric

Through case studies, this site will research where some visualizations went astray, what it takes to correct them, and how we can avoid these mistakes.  Each case study will be captured in an exhaustive long-form blog post.  Each post will begin with a critique.  The rubric for the critique is based on the critique framework laid out in the course materials.

  1. Overview — What the visualization is about: a brief overview of the data, objective, and techniques of the visualization
  2. Explanation of data: What kinds of datasets are used in the visualization? Is it a time series, categorical, geospatial, network, or something else? How many dimensions does the visualization use? How credible is the dataset?
  3. Explanation of visualization techniques: What kinds of visualization methods are used? Does it use a histogram or a scatterplot?
  4. Effectiveness of the visualization: does the visualization achieve its objective well? Which methods do or do not work? Why? Are there better ways to visualize the same information?
  5. Integrity of the visualization: Does it distort data or make use of perceptual biases to give the wrong impression? Are there any biases that can be corrected by employing other ways to visualize the data?
  6. Design: how well/badly is it designed? Why? Is it engaging? Can there be improvements?
  7. Interesting: Is it interesting? Does it draw the attention of the reader? Is it only interesting to a specific audience, or will mass media pick up on it quickly?

Those topics are a bit broad, so I will ask the following questions (and perhaps new ones) in order to pin down specific points.

  • Are the axes properly marked?
  • Are things arranged such that they mislead?
  • How does the positioning compare to Gestalt principles?
  • Does the choice of colors cause perception issues?
  • Does the use of light make it hard to perceive colors properly?
  • If there is a gradient, does it lend itself to being perceived properly?
  • Are labels visible?
  • Are labels consistently applied?
  • Would color-blindness lead to misperceptions or difficulty understanding it?
  • Does the visualization print well in grayscale?
  • Can things be removed to simplify the graphic?
  • How many dimensions are used? How many are actually needed?
  • For histograms, are bins used properly?
  • Is the appropriate scale used? (linear vs. log)
  • Is the data properly interpolated, extrapolated, averaged, etc.?
  • If the data is high-dimensional, are the dimensions properly represented?
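The grayscale question can be made concrete. Below is a minimal sketch, not part of the course rubric, that converts a palette of hex colors to approximate gray levels using the Rec. 601 luma weights and flags pairs that would be hard to tell apart once printed in grayscale. The function names and the `min_gap` threshold of 25 are my own assumptions for illustration, not an established standard.

```python
def hex_to_rgb(color):
    """Convert '#rrggbb' to a tuple of ints in 0..255."""
    color = color.lstrip("#")
    return tuple(int(color[i:i + 2], 16) for i in (0, 2, 4))


def gray_level(color):
    """Approximate grayscale value (0..255) using Rec. 601 luma weights."""
    r, g, b = hex_to_rgb(color)
    return 0.299 * r + 0.587 * g + 0.114 * b


def grayscale_collisions(palette, min_gap=25):
    """Return pairs of colors whose gray levels differ by less than min_gap.

    min_gap is an assumed, arbitrary threshold chosen for illustration.
    """
    collisions = []
    for i, first in enumerate(palette):
        for second in palette[i + 1:]:
            if abs(gray_level(first) - gray_level(second)) < min_gap:
                collisions.append((first, second))
    return collisions


# Pure red and web "green" look very different in color,
# but land on almost the same gray level.
print(grayscale_collisions(["#ff0000", "#008000", "#0000ff"]))
```

A palette that returns an empty list here is not guaranteed to print well, but a palette that returns collisions is a good candidate for a closer look.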

Note that many of these questions cut across several of the seven general categories listed above.  This list is not meant to be exhaustive, and I will update it as new questions come up.
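As an illustration of how the histogram question in the list could be checked, here is a rough sketch that suggests a bin count using the Freedman-Diaconis rule (bin width = 2 × IQR / n^(1/3)). This is one common rule of thumb among several, and I am using it only as a reference point for a critique; the function name is hypothetical.

```python
import math
import statistics


def freedman_diaconis_bins(values):
    """Suggest a histogram bin count using the Freedman-Diaconis rule."""
    if len(values) < 2:
        return 1
    # statistics.quantiles with n=4 returns the three quartile cut points.
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    spread = max(values) - min(values)
    if iqr == 0 or spread == 0:
        return 1  # Degenerate data: fall back to a single bin.
    width = 2 * iqr / len(values) ** (1 / 3)
    return max(1, math.ceil(spread / width))


print(freedman_diaconis_bins(list(range(1, 101))))
```

If a published histogram uses far more or far fewer bins than a rule like this suggests, that is worth a sentence in the critique, even if the author had a good reason.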

Once the image is critiqued, I will create a new version.  If done properly, the systematic approach to the critique should lead to actionable items to address in the revised version of the image.  This will be an iterative process, with multiple versions shared in each blog post.  A ‘best’ improvement will be selected and compared side-by-side with the original image.  Finally, each article will end with a summary paragraph about the lessons learned from rehabbing that visualization and advice on avoiding that trap in the future.

Please feel free to comment below how you think I might be able to improve my rubric for the critiques.

Okay, let’s get to work.

References:

  1. https://github.com/yy/dviz-course/wiki/Critique-examples
  2. https://www.nature.com/nmeth/journal/v7/n11/full/nmeth1110-863.html

Featured image courtesy of Bruno Boutot.  Used under the CC BY-NC-SA 2.0 license.

Related Work on Better Data Visualizations

Am I the only one who thinks that critiquing and improving data visualizations is a great way to spend an evening? No! In fact, it’s rather an old tradition.  Edward Tufte’s The Visual Display of Quantitative Information, perhaps the preeminent work on data visualization, dedicates Chapter 4 (and later chapters) to maximizing the “data-ink” of a graphic.  Graphic creators should only use ink to add meaning.  This results in graphics that minimize the amount of ink and are relatively clear.  Everything else is “chart junk.”  Most strikingly, he proposes a much simpler version of the boxplot, a graphic most people would already describe as simple to begin with:

10 Traditional Boxplots

Those are pretty simple boxplots, aren’t they? Most people would be hard pressed to find a way to simplify them further.  Tufte did it fairly easily, as seen in the following:

10 Simplified Boxplots

Here he leverages white space to add value.  Tufte argues that since we know that the whiskers of a boxplot begin at the first and third quartiles, we can remove the box itself: the gap between the whiskers implies it, leaving only the whiskers and a mark for the median.
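To see why nothing is lost, consider the numbers a boxplot encodes. The sketch below is my own illustration, not Tufte’s code: it computes a five-number summary and then returns just the two whisker segments. It assumes the simplest whisker convention (whiskers run to the minimum and maximum, with no outlier rule) and uses Python’s inclusive quartile method; other conventions would give slightly different values.

```python
import statistics


def five_number_summary(values):
    """Return (minimum, first quartile, median, third quartile, maximum)."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return min(values), q1, median, q3, max(values)


def quartile_plot_segments(values):
    """Return the two whisker segments; the gap between them implies the box."""
    low, q1, _, q3, high = five_number_summary(values)
    return (low, q1), (q3, high)


data = [1, 2, 3, 4, 5, 6, 7, 8, 9]
print(five_number_summary(data))
print(quartile_plot_segments(data))
```

The inner ends of the two segments are exactly the first and third quartiles, so drawing the box as well spends ink on information the whiskers already carry.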

Tufte’s approach is focused on making sure that graph creators use only a minimal amount of ink, which allows the data to tell its story.  Up close, each bit of ink relates to a specific data point, while at a 40,000-foot level, we can pull out trends and bigger-picture narratives.  Everything else is “chart junk.”

Tufte is a bit extreme in that he wants only the data to tell the story.  He fails to recognize that there is a place and some value for “chart junk.”  Bateman et al. argue that some chart junk is actually useful.  We create graphs and charts because we want to share information easily, and we want that information to be easily understood.  By eliminating chart junk as Tufte suggests, we make a graphic easier to understand, but Tufte is thinking only about the current moment.  Bateman et al. focus instead on retention and recall of information, and find that chart junk can actually increase the likelihood that someone will be able to recall the chart’s information.

This leads us to the age-old adage that moderation is likely the best course of action.  While Tufte argues for a starkly minimalist view of the world, Bateman recognizes that there is some value in Tufte’s junk.

With that in mind, a modern take on improving data graphics can be seen every day on the internet.  One website, Viz WTF, highlights graphics that fall short of their objective or are clearly manipulated.  For each one, the site provides a short summary of the issues with the image.  The end result in these images is the same: the graphic does not accurately portray the data.  Falling short of the objective is a design problem, while manipulation is an integrity issue that Tufte also cautions creators to avoid.  A Google search for “bad visualizations” will yield articles and listicles from sites like Gizmodo and Business Insider.  Plenty of people have noted that data visualizations can be improved.  Those failed visualizations will be excellent source material for this blog.

VizFix is not the first to attempt to regularly rehabilitate data visualizations.  As part of #MakeoverMonday, Eva Murray of Tri My Data and Andy Kriebel of VizWiz each reimagine a troubled graphic on their own sites.  As part of the analysis, they each highlight what they think is good about the original graphic, the issues they see with it, and what they are trying to do in their reimagination.

In one recent example, they took a map of bike thefts in London:

Original Map of London Stolen Bikes

Which Murray rethought as:

Tri My Data’s Bike Thefts Recreation

While Kriebel broke it down into several different components:

Their work is spectacular, and they do it with a joy only a data enthusiast could have.  For VizFix, I aim to have that same enthusiasm.  Since I do not have the graphic-creating experience or second sense either of them possesses (both are Tableau gurus), I’ll create a rubric that brings up many of the issues that graphics often face.  Then, my recreations will seek to alleviate many of those issues.

Please share in the comments any additional thoughts about relevant work in the world of data visualization rehabilitation.

References:

  1. https://www.edwardtufte.com/tufte/books_vdqi
  2. http://hci.usask.ca/uploads/173-pap0297-bateman.pdf
  3. http://viz.wtf/
  4. https://www.google.com/search?q=bad+visualizations
  5. https://gizmodo.com/8-horrible-data-visualizations-that-make-no-sense-1228022038
  6. http://www.businessinsider.com/the-27-worst-charts-of-all-time-2013-6
  7. http://www.makeovermonday.co.uk/
  8. https://trimydata.com/
  9. http://www.vizwiz.com/
  10. https://trimydata.com/2017/09/11/mm-week37/
  11. http://www.vizwiz.com/2017/09/stolen-bikes.html
  12. https://www.tableau.com/

Featured image by Lauren Manning.  Used under Creative Commons 2.0 License.

Hello World! About VizFix

Welcome to the first post on VizFix.  I thought this would be a great way to study the nuances of data visualizations.  This will largely be done through case studies where I’ll break down what was done well and what can be improved, and then attempt to improve them.  I’ll also discuss interesting articles and thoughts about data visualization that I come across.  My hope is that these case studies lead us (you, me) to produce better data visualizations.

Better is a relative term, and the first thing I plan to do is create an evaluation rubric to be used in all of the case studies.  Stay tuned.

This is my semester project for INFO-I590 Data Visualization at Indiana University’s School of Informatics, Computing, and Engineering, taught by Professor Yong-Yeol “YY” Ahn.  I am currently a Data Science Masters candidate at IU.

Featured image “Hello World” from Wikimedia Commons.  Used under the Creative Commons Attribution-Share Alike 3.0 Unported License.