Case Study: Solar Eclipse

Remember back to the summer and the mass hysteria (well-deserved hysteria, but hysteria nonetheless) over the solar eclipse?  All sorts of fun graphics were floating around.  As part of #MakeoverMonday during the summer, Eva Murray at Tri My Data and Andy Kriebel at Viz Wiz tried their hand at making over this map:

Lots and Lots of Solar Eclipses.

This map has a lot going on, so it’s clear why Kriebel and Murray took a crack at it.  I picked this one for a case study because I really like the approaches that Murray and Kriebel took.  I didn’t try to outdo them (because that wasn’t going to happen).  I wanted to see if I could find something that could complement their efforts.

Murray created this captivating visualization focused on the duration of solar eclipses in relation to latitude:

Eva Murray did an awesome job with this solar eclipse makeover!

It’s pretty clear that as you get closer to the equator, the eclipses last longer, with the longest eclipses almost exclusively located at the equator.

I recommend you check out her write-up at Tri My Data.  At Viz Wiz, Andy Kriebel created this simplification of latitude versus eclipse duration by century.

Andy Kriebel shows where the eclipses are likely to occur.

Kriebel’s visualization shows another dimension of the data.  While the longer eclipses tend towards the equator, you are more likely to see an eclipse (of any duration) somewhere around 44 degrees north or south latitude.

Now that we have seen what can be done, let’s go back to our original visualization and critique it.  Then, I’ll take my own cut at it.

Overview

The original data visualization’s objective is to allow users to interact with eclipses across a twenty-year window.  It focuses on the paths of eclipses from 2001 through 2020.  The paths are plotted on a Mercator projection of the earth.  It is an interactive data visualization that allows the user to get details about a specific eclipse or explore an area to find out when eclipses have come, or might be coming, to the region.

Explanation of data

The eclipse data set is a subset of an extremely large eclipse data set originally pulled from NASA.  It has nearly 12,000 data points of eclipses going back 5000 years.  There are all sorts of fun features in the data.  There are some geographic features like latitude, longitude, and sun altitude.  There is a date-time feature.  Of most interest are the quantitative features that describe the characteristics of the eclipses.  These include the eclipse magnitude, duration, and gamma.

The data is highly credible as it originated from NASA data.  If we can’t trust NASA’s calculations of eclipses, we can’t trust anything.

The original visualization used almost all of the features.  It does use only a 20-year window of that data.  For my version, I want to reduce the number of features.  While the original map attempts to leverage every dimension, I think taking a step back and simplifying to focus on one or two features, like Kriebel and Murray did, would result in a potentially more impactful graphic.

Recommendation 1: Explore a different feature that was not utilized by the previous versions.

Explanation of visualization techniques

The visualization is a plot of solar eclipse swaths on a Mercator map of the world.  In fact, it’s an interactive visualization of these eclipses from 2001 until 2020.  Built on top of the Google Maps API, it allows users to zoom in and out of different regions and get an unlimited number of macroscopic and microscopic views of the paths of the eclipses.  Both annular and total eclipses are included on the map and are designated with different colors.

Overall, not a bad approach, aside from the standard issues with Mercator maps.  As we will discuss, the design itself becomes more problematic.

Effectiveness of the visualization

Interactive data visualizations are a great way for people to have a personal experience with data visualization.  When anyone looks at a map like this, more than likely the first thing they want to do is see where they live and ask “When’s the eclipse coming to me?”  By making this an interactive map, it becomes highly effective at captivating the reader.

That action depends on the assumption that the reader is not overwhelmed by the map.  My first reaction when I looked at that map was to think that there’s a lot going on.  That is the point where I would worry about the map losing people’s interest before it ever has it.

There are a number of different ways to visualize the eclipse data, particularly since there are many different features.  As I mentioned earlier, one of my objectives is to use a different feature and try to create something that complements some of the other enhanced visualizations, so no additional recommendation is needed.  It will be done.

Integrity of the visualization

The biggest issue with the Mercator projection is that it distorts the regions near the poles, inflating their size well beyond reality.  The following graphic was taken from “Elements of Map Projection with Applications to Map and Chart Construction,” written in 1921 by Charles H. Deetz and Oscar S. Adams.  We’ve known for a very long time that Mercator is not that great a map to use.

This is why we care about map projections!

The point of the original data visualization is to show where eclipses occur.  We know from our other visualizations that eclipses tend closer to the equator.  Because of that, I’m okay with the use of a Mercator projection.  It does introduce some distortion and error into our visualization near the poles.  In return, we have a map that, despite its flaws, is familiar to most people and accurately portrays most of the data, since our data tends towards the equator.  As we learned earlier, data visualizations are about trade-offs.  For my new data visualization, I may not use Mercator, but I do not fault the use of it on the original.  It made sense.
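
To put a rough number on that trade-off: on a standard spherical Mercator projection, lengths at a given latitude are stretched by the secant of the latitude, so areas are stretched by its square.  The sketch below is only an illustration of that relationship.  The function names are made up for this example and have nothing to do with the original map’s code.

```javascript
// Sketch: how much a spherical Mercator projection stretches things at a latitude.
// Function names are illustrative only, not from the original visualization.

const toRadians = (degrees) => (degrees * Math.PI) / 180;

// Linear stretch relative to the equator: sec(latitude).
function mercatorLinearScale(latitudeDeg) {
  return 1 / Math.cos(toRadians(latitudeDeg));
}

// Area inflation relative to the equator: sec^2(latitude).
function mercatorAreaInflation(latitudeDeg) {
  const k = mercatorLinearScale(latitudeDeg);
  return k * k;
}

// Print the inflation for a handful of latitudes.
for (const lat of [0, 30, 44, 60, 80]) {
  console.log(`${lat} degrees: area inflated ~${mercatorAreaInflation(lat).toFixed(2)}x`);
}
```

Near the equator the inflation is negligible, and by 60 degrees it is about fourfold, which is why the projection matters more the farther the data sits from the equator.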

Design

The eclipse map hits you like a freight train.  “Bam! Everything you wanted to know about the last two decades of eclipses all at once!”  It’s a bit overwhelming.  The paths of the eclipses are not easy to pick out since there are so many of them, and they are densely packed near the equator.

Recommendation 2: Create a simplified design that does not overwhelm.

When I look at the original visualization a little more, I think it used opacity nicely to show the regions of the total eclipse within the context of the total swath.  The map really becomes overwhelming when we can’t see the paths behind all of the labels for each of the paths.

That’s a lot of labels. Has anyone seen Northern Africa?

Really, that’s a lot of paths.  When you look at the visualization, the first things you see are the label tags, not the paths of the eclipses.

Recommendation 3: Reduce the use of overly burdensome labels and put more focus on the data itself.

Interesting

I struggle with determining if it’s interesting.  At first glance, it is a regular Mercator map, with lines and labels on it, meant for a mass audience.  Only with the use of a mouse does it get interesting to the reader.  On the whole, I would not classify this as interesting because most of the audience would be lost before they even found the interesting part.

The timing of the map should be taken into consideration.  The total eclipse of 2017 in the United States received a ton of coverage.  At that time, people were consuming all sorts of eclipse-related content.  It was inherently interesting, which allows people to overlook design and dig in to the map.  Without eclipse fever forcing the interaction, the visualization has to be more interesting now.

Recommendation 4: Make it more interesting; something that might attract a casual reader to stop and review the data visualization instead.

Solar Eclipse – Rethought

Let’s review the recommendations so far:

  1. Explore a different feature that was not utilized by the previous versions.
  2. Create a simplified design that does not overwhelm.
  3. Reduce the use of overly burdensome labels and put more focus on the data itself.
  4. Make it more interesting; something that might attract a casual reader to stop and review the data visualization instead.

The first thing I did was look at the data set in order to get a feel for the data.  One of the things that I thought was interesting was eclipse magnitude.  This is the ratio of how big the moon appears in the sky relative to the sun.  Only when the magnitude is greater than one do you have a total eclipse.  If it is less than one, you have an annular eclipse.  An annular eclipse is when a ring of sunlight remains visible around the moon as the moon passes directly in front of the sun.
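
As a minimal sketch, that rule can be written as a small function.  It assumes the data only holds total and annular eclipses, as described here, and the function name and sample values are hypothetical rather than taken from the NASA data.

```javascript
// Sketch: label an eclipse from its magnitude, following the rule in the text.
// Assumption: only total and annular eclipses are present; partial eclipses are ignored.
function classifyByMagnitude(magnitude) {
  if (!Number.isFinite(magnitude) || magnitude <= 0) {
    throw new RangeError(`Expected a positive magnitude, got ${magnitude}`);
  }
  if (magnitude > 1) return "total"; // moon appears larger than the sun
  if (magnitude < 1) return "annular"; // moon appears smaller, leaving a ring
  return "borderline"; // exactly 1: moon and sun appear the same size
}

// Hypothetical sample values, not taken from the data set.
const samples = [0.95, 1, 1.03];
console.log(samples.map((m) => `${m} -> ${classifyByMagnitude(m)}`));
```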

So, I plotted eclipse magnitude against duration, and got the following plot.

Sketch 1  — Eclipse Magnitude vs Eclipse Duration — Pretty interesting, huh?

When I saw Sketch 1, I was pretty captivated.  I haven’t seen anything like it before, and it drew me in right away.  What’s going on here exactly?  When you look at Magnitude = 1, you see the duration is just a few seconds.  That makes sense because the moon is just barely bigger than the sun.  So, as the moon continues to progress through its orbit, it will move out of view relatively quickly, ending the total eclipse.

As you move to the right, where the moon appears larger than the sun, it blocks the sun longer.  And similarly, as you move to the left, where the moon gets smaller relative to the sun, the annular ring appears for a longer period of time.

Just for fun, I plotted the eclipse duration on a logarithmic scale.

Sketch 2 – Eclipse Duration on Log Scale – Interesting, but not necessary

Sketch 2 is pretty interesting to look at, and tells the same story.  Since a logarithmic scale is not that intuitive for mass media, I decided to follow the path I started with Sketch 1.
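
For anyone curious about the difference between the two axes, here is a small sketch of a linear mapping next to a logarithmic one.  The 1 to 750 second domain and the 400-pixel axis are made-up illustration values, and the helper names are hypothetical; this is not the code behind either sketch.

```javascript
// Sketch: linear vs. logarithmic mapping of a duration (seconds) to a pixel position.
// The domain and range used below are made-up illustration values.
function linearScale(value, [d0, d1], [r0, r1]) {
  return r0 + ((value - d0) / (d1 - d0)) * (r1 - r0);
}

function logScale(value, [d0, d1], [r0, r1]) {
  if (value <= 0 || d0 <= 0 || d1 <= 0) {
    throw new RangeError("A log scale needs strictly positive values");
  }
  // Interpolate on the logarithm of the value instead of the value itself.
  const t = (Math.log(value) - Math.log(d0)) / (Math.log(d1) - Math.log(d0));
  return r0 + t * (r1 - r0);
}

const domain = [1, 750];
const range = [0, 400];
for (const seconds of [1, 10, 100, 750]) {
  console.log(
    seconds,
    linearScale(seconds, domain, range).toFixed(1),
    logScale(seconds, domain, range).toFixed(1)
  );
}
```

On the log axis each tenfold jump in duration takes up the same distance, which spreads out the short eclipses but makes the axis harder for a general audience to read.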

While Sketch 1 is accurate, simple, and precise, it’s a boring scatterplot.  We are not any better off than the original visualization.  In fact, I would say we are worse off because most people would just look right past it.  Sketch 1 would be my basis for the visualization, but it would need to be improved.

After playing around with some colors, I created this final visualization using d3 and SVG:

Sketch 3 – Eclipse Final

It’s really cool!  I’m still amazed I came up with that!  There are a few different things that I had never done before in SVG that I did here.

First, since we are talking about eclipses and dark skies, I first used a black background.  That was too stark, so I went with a much richer charcoal black.  Each eclipse is represented with a lunar gray circle, which gets larger the longer the duration of the eclipse.  The data points are 80% translucent since there are thousands of them.  I do have to admit that Murray’s data visualization inspired me to use that scaling for the circles.  Then, I added a legend on the bottom to explain the size of data points.  It’s just a little bit of text and then
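
A rough sketch of how a data point like that could be turned into an SVG circle follows.  The field names, the gray color value, the 12-pixel maximum radius, the 750-second upper bound, and the square-root sizing are all assumptions made for illustration; the source only says the circles grow with duration and are 80% translucent.

```javascript
// Sketch: turn one eclipse record into an SVG <circle> string.
// Assumptions (not from the original code): field names, color,
// the 12px maximum radius, the 750s upper bound, and square-root sizing.
const MAX_DURATION_SECONDS = 750; // illustrative upper bound

function radiusForDuration(seconds, maxRadius = 12) {
  const clamped = Math.min(Math.max(seconds, 0), MAX_DURATION_SECONDS);
  // Square root so that circle area, not radius, is proportional to duration.
  return maxRadius * Math.sqrt(clamped / MAX_DURATION_SECONDS);
}

function eclipseCircle({ x, y, durationSeconds }) {
  const r = radiusForDuration(durationSeconds).toFixed(2);
  // "80% translucent" is read here as 20% opacity.
  return `<circle cx="${x}" cy="${y}" r="${r}" fill="#c8c8c8" fill-opacity="0.2" />`;
}

console.log(eclipseCircle({ x: 120, y: 80, durationSeconds: 150 }));
```

Sizing by area rather than radius is a common choice because the eye tends to compare how much ink a circle covers; a plain linear radius would work too, at the cost of exaggerating the long eclipses.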

Next, since eclipse magnitude is not necessarily intuitive or that well understood for many people, I thought some explanatory text and graphics in the middle would help.  Not too much text, two simple images that most can conceptualize and understand.

One interesting thing that I discovered is that doing text-wrap in SVG is not that easy.  You actually need to create a JavaScript function to do it.  Since I was only using a minimal amount of text, I just manually created the word wrap by creating multiple text statements.  If I was using SVG to create a poster or something with more extensive text, I would likely go the function route.
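
For reference, the function route might look something like the sketch below.  It is a simplified illustration, not the code used for this graphic: it wraps on a character budget instead of measuring rendered text (in a browser, `getComputedTextLength()` could be used for that), and the function names and default values are made up.

```javascript
// Sketch: greedy word wrap for SVG text, emitting one <tspan> per line.
// Assumptions: a character budget stands in for real pixel measurement,
// and the input contains no characters that need XML escaping.
function wrapWords(text, maxCharsPerLine) {
  const lines = [];
  let current = "";
  for (const word of text.split(/\s+/).filter(Boolean)) {
    const candidate = current ? `${current} ${word}` : word;
    if (candidate.length > maxCharsPerLine && current) {
      lines.push(current); // line is full, start a new one with this word
      current = word;
    } else {
      current = candidate; // word still fits (or is a single over-long word)
    }
  }
  if (current) lines.push(current);
  return lines;
}

function svgTextBlock(text, { x, y, maxCharsPerLine = 30, lineHeightEm = 1.2 }) {
  const tspans = wrapWords(text, maxCharsPerLine)
    // Every line resets x; lines after the first drop down by one line height.
    .map((line, i) => `<tspan x="${x}" dy="${i === 0 ? 0 : lineHeightEm}em">${line}</tspan>`)
    .join("");
  return `<text x="${x}" y="${y}">${tspans}</text>`;
}

console.log(svgTextBlock("The moon appears larger than the sun and blocks it completely", { x: 20, y: 40 }));
```

For a handful of short labels, writing the text elements by hand is still less work than this.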

Lastly, I thought about making the title a little more interesting.  When I first saw the plot in Sketch 1, it reminded me of wings.  Then, the annular ring reminded me of a halo.  With those pieces of imagery in mind, I came up with the title for the data visualization: “On Angel’s Wings? Eclipse Magnitude Lets You See the Heavens.”  While a little metaphorical, the title is a lot more interesting than “Eclipse Magnitude vs. Eclipse Duration.”

Conclusion

How did my new version do against the recommendations?

  1. Explore a different feature that was not utilized by the previous versions.
    • Achieved! The magnitude of the eclipse was not featured, and it made for an interesting visualization.
  2. Create a simplified design that does not overwhelm.
    • Achieved! It’s a simple scatterplot, with a little bit of explanatory text and imagery.  There is a lot being explained with a minimal amount of items.
  3. Reduce the use of overly burdensome labels and put more focus on the data itself.
    • Achieved! Despite having thousands of data points, there is a minimal amount of labeling needed.
  4. Make it more interesting; something that might attract a casual reader to stop and review the data visualization instead.
    • Achieved!  The shape of the plot is unlike anything I’ve seen before, which should draw some interest right away.  The descriptive text also provides some education. The coloring is clean and consistent.

This visualization and the one in the prior case study on Baby Boomers have really challenged me to make sure that, in addition to making the graphics accurate and simple, I also make them interesting.

While my recent visualizations may have issues of their own, overall they are improving.  They are accurate and are also interesting items that people will want to read and understand.  People understanding the information and then being able to use it in their own lives is our ultimate goal with data visualization.

Thank you for your time, and please leave any comments below. I would love to hear from you.

References:

  1. http://moonblink.info/Eclipse/lists/solcat
  2. https://trimydata.com/2017/08/20/mm-week34/
  3. http://www.vizwiz.com/2017/08/solar-eclipses.html
  4. http://www.makeovermonday.co.uk/data/
  5. https://books.google.com/books/about/Elements_of_Map_Projection_with_Applicat.html?id=0QnlAAAAMAAJ&printsec=frontcover&source=kp_read_button#v=onepage&q&f=false
  6. http://geoawesomeness.com/amazing-image-1921-will-explain-essence-map-projections/