A follow-up to The Women of #IronViz

We're now five days removed from the Tableau Conference (#data17), and the topic of women in data visualization, particularly the pointed question of women competing in Tableau's #IronViz competition, is still fresh in everyone's minds.

First – I think it's important to recognize how awesome the community reception of this topic has been.  Putting together a visualization that highlights a certain subsection of our community is not without risk.  While going through the build process, I wanted to keep the visualization in the vein of highlighting the contributions of women in the community.  It wasn't meant to be selective or exclusive; instead, it was a visual display of something I was interested in understanding more about.  In the days since the conference, the conversations I've been involved in (and observed from a distance) have all remained inclusive and positive.  I've seen plenty of people searching for understanding and hunting for more data points.  I've also seen a lot of collaboration around solutions and collecting the data we all seek.  What I'm thankful I have not witnessed is blame or avoidance.  In my mind this speaks volumes about the brilliant and refined members of our community and their general openness to change, feedback, and improvement.

One thing making the rounds that I felt compelled to build on is @visualibrarian's recent blog post, which offers interview-style questions and answers around the topic.  I am a big believer in self-reflection and exploration and was drawn to her call to action (maybe it was the punny and sarcastic nature of the ask) to answer the questions she put forth.

1. Tell me about yourself. What is your professional background? When did you participate in Iron Viz?

My professional background is that of a data analyst.  Although I have a bachelor's degree in Mathematics, my first professional role was as a Pharmacy Technician entering prescriptions.  That quickly morphed into a role dedicated to reducing prescription entry errors and built on itself over and over, leading to positions in quality improvement and process engineering.  I've always been very reliant on data and data communication (in my early days, PowerPoint) to help change people and processes.  About 2 or 3 years ago I got fed up with being the business user at the mercy of traditional data management or data owners and decided to brute-force my way into the "IT" side of things.  I was drawn to doing more with data and having better access to it.  Fast-forward to the role I've held for a little over 8 months as a Data Visualization Consultant, which essentially means I spend a significant amount of my time partnering with organizations to enable them to use visual analytics, improve the platforms they are currently using, or overcome any developmental obstacles they may have.  It also means I spend a significant amount of time championing the power of data visualization and sharing "best practices" on the topic.  I often call myself a "data champion" because I seek simply to be the voice of the data sets I'm working with.  I'm there to help people understand what they're seeing.

In terms of Iron Viz – I first participated in 2016's 3rd round feeder, Mobile Iron Viz, and I've participated in every feeder round since.  That's the general plan on my end: continue to participate until I make it on stage or they tell me to stop 🙂

2. Is Tableau a part of your job/professional identity?

Yes – see answer to question #1.  It’s pretty much my main jam right now.  But I want to be very clear on this point – I consider my trade visual analytics, data visualization, and data analytics.  Tableau is to me the BEST tool to use within my trade.  By no means the only tool I use, but the most important one for my role.

3. How did you find out about Iron Viz?

When I first started getting more deeply involved in my local User Group, I found out about the competition.  Over time I became the leader of my user group and a natural advocate for the competition.  Once I became a part of the social community (via Twitter) it was easy to keep up with the ins and outs of the competition.

4. Did you have any reservations about participating in Iron Viz?

Absolutely – I still have reservations.  The first one I participated in was sort of on an off chance: I found something that I wanted to re-visualize in a very pared-down, elegant, simplistic way.  I ended up putting together the visualization in a very short period of time, and after comparing it to the other entries I felt my entry was very out of place.  I tend to shy away from putting text-heavy explanations within my visualizations, so I've felt very self-conscious that my designs don't score well on "storytelling."  It was also very hard in 2016 and the beginning of 2017.  Votes were based on Twitter.  You could literally search for your hashtag and see how many people liked your viz.  It's a very humbling and crushing experience when you don't see any tweets in your favor.

5. Talk me through your favorite submission to Iron Viz. What did you like about it? Why?

Ah – they are all my favorites for different reasons.  For each entry I've always remained committed and deeply involved in what the data represents.  Independent of social response, I have always been very proud of everything I've developed, for no other reason than the challenge of understanding a data set further and bringing a new way to visually display it.  My mobile entry was devastatingly simple – I love it to death because it is so pared down (the mobile version).  For geospatial I made custom shapes for each of the different diamond grades.  It's something I don't think anyone in the world knows I did – and for me it really brought home the lack of interest I have in diamonds as rare, coveted items.

6. What else do you remember about participating in Iron Viz?

The general anxiety around it.  For geospatial 2017 I procrastinated around the topic so much.  My parents actually came to visit me and I took time away from being with them to complete my entry.  I remember my mom consoling me because I was so adamant that I needed to participate.

Safari and Silver Screen were different experiences for me.  I immediately locked in on data sets on subjects I loved, so there was less stress.  When I did the Star Trek entry I focused on the look and feel of the design and was so stoked that the data set even existed.  Right now I am watching The Next Generation nightly, and I go back to that visualization to see how it compares to my actual perception of each episode (in terms of speaking pace and flow).

7. Which Iron Viz competitions did you participate in, and why?

Everything since 2016 feeder round 3.  I felt a personal obligation and an obligation to my community to participate.  It was also a great way for me to practice a lot of what I tell others – face your fears and greet them as an awesome challenge.  Remain enthusiastic and excited about the unknown.  It’s not always easy to practice, but it makes the results so worth it.

8. What competitions did you not participate in, and why?

Anything before mobile – and only because I (most likely) didn't know about it.  Or, more appropriately stated, I wasn't connected enough to the community to know of its existence or how to participate.

9. Do you participate in any other (non-Iron Viz) Tableau community events?

Yes – I participate in #MakeoverMonday and #WorkoutWednesday.  My goal for the end of 2017 is to have all 52 for each completed.  Admittedly I am a bit off track right now, but I plan on closing that gap soon.  I also participate in #VizForSocialGood and have participated in past User Group viz contests.  I like to collect things and am a completionist – so these are initiatives that I’ve easily gotten hooked on.  I’ve also reaped so many benefits from participation.  Not just the growth that’s occurred, but the opportunity to connect with like-minded individuals across the globe.  It’s given me the opportunity to have peers that can challenge me and to be surrounded by folks that I aspire to be more like.  It keeps me excited about getting better and knowing more about our field.  It’s a much richer and deeper environment than I have ever found working within a single organization.

10. Do you have any suggestions for improving representation in Iron Viz?

  • Make it more representative of the actual stage contest
  • Single data set
  • Everyone submits on the same day
  • People don’t tweet or reveal submissions until contest closes
  • Judges provide scoring results to individual participants
  • The opportunity to present analysis/results, the “why”
  • Blind submissions – don’t reveal participants until results are posted
  • Incentives for participation!  It would be nice to have swag or badges or a gallery of all the submissions afterward

And in case you just came here to see the visualization that’s set as the featured image, here’s the link.

Don’t be a Bridge, Instead be a Lock

Lately I’ve spent a lot of time pondering my role in the world of data.  There’s this common phrase that we as data visualization and data analytics (BI) professionals hear all the time (and that I am guilty of saying):

“I serve as the bridge between business and IT.”

Well – I'm here to say it's time to move on.  Why?  Because the bridge analogy is incomplete, and because it doesn't accurately represent the way in which we function in this critical role.  At first glance the bridge analogy seems reasonable.  A connector, something that joins two disparate things.  In a very physical way it connects two things that otherwise have an impasse between them.  The business is an island.  IT is an island.  Only a bridge can connect them.  But is this really true?

Instead of considering the two as separate entities that must be connected, what if we rethought them as bodies of water at different levels?  They touch each other; they are one.  They are the same type of thing.  The only difference is that they are at different levels, so something like a boat can't easily pass between them.  Is this not what is really happening?  "The business" and "IT" are really one large organization – not two separate, foreign entities.

This is where the role of being the Lock comes in.  A lock is the mechanism by which watercraft are raised or lowered between waterways, and to a large extent it is a better analogy for our roles in data.  We must adapt to the different levels of business and IT.  More importantly, it is our responsibility to perform that function – to get the boat (more specifically "the data") through from one canal to the other.

Even exploring what Wikipedia says about a lock – it fits better.

“Locks are used to make a river more easily navigable, or to allow a canal to cross land that is not level. ”

“Larger locks allow for a more direct route to be taken” [paraphrased]

Is this not how we function in our daily roles?  How fitting is it to say this:

“My role is to make your data more easily navigable.  My goal is to allow data to flow through on your level.  I’m here to allow a more direct route between you and your data.”

It feels right.  I’m there to help you navigate your data through both IT and business waters.  And it is my privilege and honor to facilitate this.  Let’s drop the bridge analogy and move toward a new paradigm – the world where we are locks, adjusting our levels to fit the needs of both sides.

Boost Your Professional Skills via Games

Have you ever found yourself in a situation where you were looking for opportunities to get more strategic, focus on communication skills, improve your ability to collaborate, or just stretch your capacity to think critically?  Well I have the answer for you: pick up gaming.

Let's pause for a second and provide some background: I was born the same year the NES was released in North America, so my entire childhood was littered with video games.  I speak quite often about how much video gaming has influenced my life.  I find games to be one of the best ways to unleash creativity, inhabit a universe where failure is safe, and find constant opportunities for growth and challenge.

With all that context you may think this post is about video games and how they can assist with growing out the aforementioned skills.  And that’s where I’ll add a little bit of intrigue: this post is actually dedicated to tabletop games.

For the past two years I’ve picked up an awesome hobby – tabletop gaming.  Not your traditional Monopoly or Game of Life – but games centered around strategy and cooperation.  I’ve taken to playing them with friends, family, and colleagues as a way to connect and learn.  And along the way I’ve come across a few of my favorites that serve as great growth tools.

Do I have you intrigued?  Hopefully!  Now on to a list of recommendations.  And the best part?  All but one of these can be played with 2 players.  I often play games with my husband as a way to switch off my brain from the hum of everyday life and into the deep and rich problems and mechanics that arise during game play.

First up – Jaipur

Jaipur is a strictly 2-player game that centers around trading and selling goods.  The main mechanics here are knowing when to sell, when to hold, and how to manipulate the market.  There are also camel cards in the market that, when picked up, cause new goods to appear in their place.

Why you should play: It is a great way to understand value at a particular moment in time, whether that's being first to market, waiting until you have several of a specific good to sell, or driving changes in the market by forcing your opponent's hand.  It helps unlock the necessity of anticipating next steps.  It also shows how you can have control over certain aspects (say, holding all the camels to prevent variety in the market), but how that may put you at a disadvantage when trying to sell goods.

It's a great game that is played in a maximum of 3 rounds and probably 30 minutes.  The variety and novelty of what happens makes it a fun game to repeat.

Hanabi

Hanabi is a collaborative game that plays with anywhere from 2 to 5 people.  The basic premise is that you and your friends are absentminded fireworks makers who have mixed up all the fireworks (numeric sets 1 to 5 in 5 different colors).  Similar to Indian Poker, you have a number of cards (3 or 4) facing away from you.  That is to say – you don't know your own hand, but your friends do.  Through a series of turns spent sharing information and discarding/drawing cards, everyone is trying to play cards in order from 1 to 5 for each color.  If you play a card too soon the fireworks could go off early, and there's only so much information to share before the start of the fireworks show.

This is a great game for learning about collaboration and communication.  When you're sharing information you give someone either color or numeric information about their hand.  This can be interpreted several different ways, and it's up to the entire team to communicate effectively and adjust to each other's interpretation styles.  It also forces you to make choices.  My husband and I recently played and got dealt a bunch of single high-value cards that couldn't be played until the end.  We had to concede as a team that those targets weren't realistic to go after; letting them go was the only way we could end up with a decent fireworks display.

Lost Cities

This is another exclusively two-player game.  It is also a set-building game where you're going on exploration missions to different natural wonders.  Your goal is to fill out sets in numeric order (1 to 10) by color.  There's a baseline cost to going on a mission, so you'll have to be wise about which ones you commit to.  There are also cards you can play (before the numbers) that let you double, triple, or quadruple your wager on a successful exploration.  You and your opponent take turns drawing from a pool of known cards or from the deck.  Several tactics can unfold here.  You can build into a color early, or completely change paths once you see what the other person is discarding.  It's also a juggling act to decide how much to wager and still end up making money.

Bohnanza

This one plays well with a wide range of player counts.  The key mechanic here is that you're a bean farmer with 2 fields to plant beans in.  The order in which you receive cards is crucial and can't be changed.  It's up to you to work with your fellow farmers at the bean market so you don't uproot your fields too early and ruin a good harvest.  This is a rapid-fire trading game where getting on someone's good side is critical, and you'll immediately see the downfall of holding on to cards for the "perfect deal."  But of course you have to balance your friendliness with the knowledge that if you share too many high-value beans the other farmers may win.  There's always action on the table, and you have to voice your offer quickly to remain part of the conversation.

The Grizzled

The Grizzled is a somewhat melancholy game centered around World War I.  You're on a squad trying to successfully fulfill missions before all morale is lost.  You'll do this by dodging threats and offering support to your team.  You'll even make speeches to encourage your comrades.  This game offers lots of opportunities to understand when and how to be a team player to keep morale high and everyone successful.  The theme is a bit morose, but it adds context to the intention behind each player's actions.

The Resistance

Sadly this one requires a minimum of 5 people to play, but it is totally worth it.  As the box mentions, it is a game of deduction and deception.  You'll be dealt a secret role and are either fighting for victory or sabotage.  I played this one with 8 other colleagues recently and pure awesomeness was the result.  You'll get the chance to pick teams for missions, vote on how much you trust each other, and ultimately fight for success or defeat.  You will get insight into crowd politics and how individuals handle situations of mistrust and lack of information.  My recent 9-player game devolved into using a whiteboard to help with deductions!

Next time you’re in need of beefing up your soft skills or detaching from work and want to do it in a productive and fun manner – consider tabletop gaming.  Whether you’re looking for team building exercises or safe environments to test how people work together – tabletop games offer it all.  And in particular – collaborative tabletop games.  With most games there’s always an element of putting yourself first, but you will really start to understand how individuals like to contribute to team mechanics.

#WorkoutWednesday Week 24 – Math Musings

The Workout Wednesday for week 24 is a great way to represent where a result for a particular value falls with respect to a broader collection.  I used a spine chart recently on a project where most data was centered around certain points and I wanted to show the range.  Displaying maximums, minimums, averages, quartiles, and (when appropriate) medians can help profile data very effectively.

So I started off really enjoying where this visualization was going, partly because the spine chart I made on that recent project came before I even knew the thing I had developed already had a name.  (Sad on my part, I should read more!)

My enjoyment turned into caution really quickly once I saw the data set.  There are several ratios in the data set and very few counts/sums of things.  My math brain screams trap!  Especially when we start tiptoeing into the world of what we semantically call “average of all” or “overall average” or something that somehow represents a larger collective (“everybody”).  There is a lot of open-ended interpretation that goes into this particular calculation and when you’re working with pre-computed ratios it gets really tricky really quickly.

Here’s a picture of the underlying data set:

 

Some things to notice right away – the ratios for each response are pre-computed.  The number of responses is different for each institution.  (To simplify this view, I’m on one year and one question).

So the heart of the initial question is this: if I want to compare my results to the overall results, how would I do that?  There are probably 2 distinct camps here.  Option 1: take the average of one of the ratio columns and use that to represent the "overall average."  Let's be clear on what that is: it is the average pre-computed ratio across surveys.  It is NOT the observed percentage of all individuals surveyed.  That would be option 2: the weighted average.  For the weighted average, which represents all respondents, we add up all the qualifying respondents answering 'agree' and divide by the total number of respondents.

Now we all know this concept of an average of averages vs. a weighted average can cause issues.  Specifically, we'd feel the friction immediately if entities with very few responses were commingled with entities capturing many more responses.  For example: Place A has 2 people out of 2 answering 'yes' (100%), and Place B has 5 out of 100 answering 'yes' (5%).  If we average 100% and 5% we get 52.5%.  But if we take 7 out of 102, that's 6.86% – a very different number.  (Intentionally extreme example.)

So my math brain was convinced that the “overall average” or “ratio for all” should be inclusive of the weights of each Institution.  That was fairly easy to compensate for: take each ratio and multiply it by the number of respondents to get raw counts and then add those all back up together.
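
To make the contrast concrete, here's a minimal sketch in Python (pandas, with hypothetical column names) that computes both the average of the pre-computed ratios and the weighted average rebuilt from raw counts, using the extreme example above:

```python
import pandas as pd

# Hypothetical survey results: pre-computed ratios plus respondent counts
df = pd.DataFrame({
    "institution": ["Place A", "Place B"],
    "agree_ratio": [1.00, 0.05],   # 2 of 2 agree, 5 of 100 agree
    "respondents": [2, 100],
})

# Option 1: average of the pre-computed ratios (average of averages)
avg_of_ratios = df["agree_ratio"].mean()                      # 0.525

# Option 2: weighted average -- rebuild raw counts, then divide
agree_count = (df["agree_ratio"] * df["respondents"]).sum()   # 7 respondents agree
weighted_avg = agree_count / df["respondents"].sum()          # 7 / 102 ≈ 0.0686

print(f"average of ratios: {avg_of_ratios:.1%}")   # 52.5%
print(f"weighted average:  {weighted_avg:.1%}")    # 6.9%
```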

The next sort of messy thing to deal with was finding the minimums and maximums of these values.  It seems straightforward, but when reviewing the data set and the specifications of what is being displayed, there's caution warranted around the level of aggregation and how the data is filtered.  As an example, depending on how the ratios are leveraged, you could end up finding the minimum across 3 differently weighted subjects within a subject group.  You could also find the minimum Institution + subject result at the subject level across all the subjects within a group.  Again, I think the best bet here is to tread cautiously with the ratios and get to raw counts as quickly as possible.

So what does this all mean?  To me it means tread carefully and ask clear questions about what people are trying to measure.  This is also where I will go the distance and include calculations in tool tips to help demonstrate what the values I am calculating represent.  Ratios are tricky and averaging them is even trickier.  There likely isn’t a perfect way to deal with them and it’s something we all witness consistently throughout our professional lives (how many of us have averaged a pre-computed average handle time?).

Beyond the math tangent – I want to reiterate how great a visualization I think this is.  I also want to highlight that because I went deep-end on the math, I decided to take a different approach on the development side as well.

The main difference from the development perspective?  Instead of using reference bands, I used a Gantt bar for the IQR.  I really like using the bar because it gives users an easier target to hover over.  It also reduces some of the noise of the default labeling that occurs with reference lines.  To create the Gantt bar, simply compute the IQR as a calculated field and use it as the size.  You can select one of the percentile points to be the start of the mark.
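
If it helps to see the underlying math outside of Tableau, here's a minimal sketch in Python (not Tableau calculation syntax; the scores are made up) of the two numbers that drive the mark: the 25th percentile as the Gantt bar's starting point and the IQR as its size.

```python
import numpy as np

# Hypothetical pre-aggregated scores for one subject group
scores = np.array([0.42, 0.48, 0.51, 0.55, 0.58, 0.61, 0.66, 0.73])

q1, q3 = np.percentile(scores, [25, 75])
iqr = q3 - q1      # use this as the size of the Gantt mark
gantt_start = q1   # use this as the position (start) of the mark

print(f"25th percentile (mark start): {gantt_start:.3f}")
print(f"IQR (mark size): {iqr:.3f}")
```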

March & April Combined Book Binge

Time for another recount of the content I’ve been consuming.  I missed my March post, so I figured it would be fine to do a combined effort.

First up:

The Icarus Deception by Seth Godin

In my last post I mentioned that I got a recommendation to tune in to Seth and got the opportunity to hear him firsthand on Design Matters.  Well, here’s the first full Seth book I’ve consumed and it didn’t disappoint.  If I had to describe what this book contains – I would say that it is a near manifesto for the modern artist.  The world is run by industrialists and the artist is trying to break through.

I appreciate how Seth frames the concept of an artist – he unpacks the term and invites or ENCOURAGES everyone to identify as such.  Being an artist means being emotionally invested, showing up, giving a shit.  That giving a shit, caring, connecting is ALL there is.  That you succeed in the world by connecting, by sharing your art.  These concepts and ideals resonate deeply with me.  He also explains how vulnerable and gutting it can be to live as an artist – something I’ve felt and experienced several times.

During the course of listening to this book I was on site with a client.  We got to a certain point, agreed on the direction and visualizations, then shared them with the broader team.  The broader team came in heavy with design suggestions – most notably, the green/red discussion came into play.  I welcome these challenges, and as an artist and communicator it is my responsibility to share my process, listen to feedback, and collaborate to find a solution.  That definitely occurred throughout the process, but it honestly caused me to lose my balance for a moment.

As I reflected on what happened – I was drawn to this idea that as a designer I try to have ultimate empathy for the end user.  And furthermore the amount of care given to the end user is never fully realized by the casual interactor.  A melancholy realization, but one that should not be neglected or forgotten.

Moving on to the next book:

Rework by Jason Fried & David Heinemeier Hansson

This one landed in my lap because it was available while I was perusing library books.

A quick read that talks about how to succeed in business.  It takes an extreme focus on being married to a vision and committing to it.  The authors focus on getting work done.  Sticking to a position and seeing it through.  I very much appreciated that they were PROUD of decisions they made for their products and company.  Active decisions NOT to do something can be more liberating and make someone more successful than being everything to everyone.

Last up was this guy:

Envisioning Information by Edward Tufte

A continuation of reading through all the Tufte books.  I am being lazy by saying “more of the same.”  Or “what I’ve come to expect.”  These are lazy terms, but they encapsulate what Tufte writes about: understanding visual displays of information.  Analyzing at a deep level the good, bad, and ugly of displays to get to the heart of how we can communicate through visuals.

I particularly loved some of the amazing train timetables displayed.  The concept of using lines to represent the timing of different routes was amazing to see.  And the way color is explored and leveraged is on another level.  I highly recommend this one if the thought of witnessing Tufte's strong tongue-in-cheek style firsthand sounds entertaining.  I know for me it was.

Makeover Monday Week 10 – Top 500 YouTube Game(r) Channels

We're officially 10 weeks into Makeover Monday, which is a phenomenal achievement.  This means that I've actively participated in recreating 10 different visualizations, with data varying from tourism, to Trump, to this week's YouTube gamers.

First, some commentary people may not like to read: the data set was not that great.  There's one huge reason why: one of the measures (plus a dimension) was a dependent variable derived from two independent variables also in the data set, and that dependent variable was produced by a pre-built algorithm.  So it would almost make more sense to use the resultant dependent variable to enrich other data.

I’m being very abstract right now – here’s the structure of the data set:

Let’s walk through the fields:

  • Rank – based entirely on the sort chosen at the top of the site (for this view it is by video views; not sure what those random 2 are, I just screencapped the site)
  • SB Score/Rank – some sort of ranking value applied to a user based on a proprietary algorithm that takes a few variables into consideration
  • SB Score (as a letter grade) – the letter grade expression of the SB score
  • User – the name of the gamer channel
  • Subscribers – the # of channel subscribers
  • Video Views – the # of video views

As best as I can tell from reading the methodology, SB score/rank (the number and the letter) are influenced in part by the subscribers and video views.  Which means putting these in the same view is really sort of silly.  You're kind of at a disadvantage if you scatterplot subscribers vs. video views, because the score is purportedly more accurate in terms of finding overall value/quality.

There’s also not enough information contained within the data set to amass any new insights on who is the best and why.  What you can do best with this data set is summarization, categorization, and displaying what I consider data set “vitals.”

So this is the approach that I took.  And more to that point, I wanted to make over a very specific chart style that I have seen Alberto Cairo employ a few times throughout my 6 week adventure in his MOOC.

That view: a bar chart sliced through with lines to help understand size of chunks a little bit better.  This guy:

So my energy was focused on that – which only happened after I did a few natural (in my mind) steps in summarizing the data, namely histograms:

Notice here that I've shared the axis values across all 3 charts (starting with SB grade and carrying through to its sibling charts to minimize clutter).  I think this has a decent effect, but I admit that the bars aren't equal width across each bar chart.  That's not pleasant.

My final two visualizations were to demonstrate magnitude and add more specifics in a visual manner to what was previously a giant text table.

The scatterplot helps to achieve this by displaying the 2 independent variables, with the overall "SB grade" encoded on both color and size.  Note: for size I used powers of 2: 2^9, 2^8, 2^7…2^1.  This created a decent exponential effect to break up the sizing in a consistent manner.
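
As a minimal sketch of that sizing idea (Python, with a hypothetical grade order), each successive grade simply gets half the size of the one above it:

```python
# Hypothetical letter grades, ordered best to worst
grades = ["A+", "A", "A-", "B+", "B", "B-", "C+", "D+", "D"]

# Powers of 2 from 2^9 down to 2^1 give a consistent exponential size ramp
sizes = {grade: 2 ** (9 - i) for i, grade in enumerate(grades)}
print(sizes)   # {'A+': 512, 'A': 256, ..., 'D': 2}
```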

The unit chart on the right is to help demonstrate not only the individual members, but display the elite A+ status and the terrible C+, D+, and D statuses.  The color palette used throughout is supposed to highlight these capstones – bright on the edges and random neutrals between.

This is aptly named an exploration because I firmly believe the resultant visualization was built to broadly pluck away at the different channels and get intrigued by the "details."  In a more real-world scenario I would be out hunting for additional data to tie this back to: money, endorsements, average video length, number of videos uploaded, subject matter area, type of ads utilized by the user.  All of these, appended to this basic metric aimed at measuring a user's "influence," would lead down the path of a true analysis.

The Flow of Human Migration

Today I decided to take a bit of a detour while working on a potential project for #VizForSocialGood.  I was focused on a data set provided by UNICEF that showed the number of migrants from different areas/regions/countries to destination regions/countries.  I’m pretty sure it is the direct companion to a chord diagram that UNICEF published as part of their Uprooted report.

As I was working through the data, I wanted to take it and start at the same place.  Focus on migration globally and then narrow the focus in on children affected by migration.

Needless to say – I got sidetracked.  I started by wanting to make paths on maps showing the movement of migrants.  I haven't really done this very often, so I figured this would be a great data set to play with.  Once I set that up, it quickly morphed into something else.

I wasn't satisfied with the density of the data; the clarity of how it was displayed wasn't there for me.  So I decided to take a more abstract approach to the same concept.  As if by fate, I had received Chart Chooser cards in the mail earlier, and Josh and I were reviewing them.  We were having a conversation about the various uses of each chart and brainstorming how they could be incorporated into our next Tableau user group (I really do eat, drink, and breathe this stuff).

Anyway – one of the charts we were talking about was the sankey diagram.  So it was already on my mind and I’d seen it accomplished multiple times in Tableau.  It was time to dive in and see how this abstraction would apply to the geospatial.

I started with Chris Love’s basic tutorial of how to set up a sankey.  It’s a really straightforward read that explains all the concepts required to make this work.  Here’s the quick how-to in my paraphrased words.

  1. Duplicate your data via a Union and identify the original and the copy (which is great because I had already done this for the pathing).  As I understand it from Chris's write-up, this lets us 'stretch out' the data, so to speak.
  2. Once the data is stretched out, it's filled in by manipulating the binning feature in Tableau.  My interpretation is that the bins 'kind of' act like dimensions (labeled by individual integers).  This becomes useful for creating the individual points that eventually turn into the line (curve).
  3. Next there are ranking functions made to determine the starting and end points of the curves.
  4. Then the curve is built using a mathematical function called a sigmoid function.  This is basically an asymptotic function that goes from -1 to 1 and has a middle area with a slope of ~1 (see the sketch after this list).
  5. After the curve is developed, the points are plotted.  This is where the ranking is set up to determine the leftmost and rightmost points.  Chris’s original specifications had the ranking straightforward for each of the dimensions.  My final viz is a riff on this.
  6. The last steps are to switch the chart to a line chart and then build out the width (size) of the line based on the measure you used in the ranking (percent of total) calculation.
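
Here's a minimal sketch of the math behind steps 4 through 6, written in Python rather than Tableau calculations (the positions, point count, and percent-of-total value are made-up assumptions).  It shows how a sigmoid-style curve interpolates between a source position and a destination position, with the flow's percent of total driving the line width:

```python
import numpy as np
import matplotlib.pyplot as plt

def sankey_curve(y_start, y_end, n_points=49):
    """Interpolate between two vertical positions using tanh, a sigmoid-style
    curve that runs from -1 to 1 with a slope of ~1 in the middle."""
    t = np.linspace(-6, 6, n_points)   # the 'densified' bin axis
    s = (np.tanh(t) + 1) / 2           # rescaled to run from 0 to 1
    return t, y_start + (y_end - y_start) * s

# Hypothetical flow: origin sits at y = 0.8, destination at y = 0.2,
# and this flow is 15% of all migrants, which drives the line width
t, y = sankey_curve(0.8, 0.2)
plt.plot(t, y, linewidth=0.15 * 40)    # scale percent of total to a line width
plt.show()
```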

So I did all those steps and ended up with exactly what was described – a sankey diagram.  A brilliant one too, I could quickly switch the origin dimension to different levels (major area, region, country) and do similar work on the destination side.  This is what ultimately led me to the final viz I made.

So while adjusting the table calculations, I came to one view that I really enjoyed.  The ranking pretty much “broke” for the initial starting point (everything was at 1), but the destination was right.  What this did for the viz was take everything from a single point and then create roots outward.  Initial setup had this going from left to right – but it was quite obvious that it looked like tree roots.  So I flipped it all.

I’ll admit – this is mostly a fun data shaping/vizzing exercise.  You can definitely gain insights through the way it is deployed (take a look at Latin America & Caribbean).

After the creation of the curvy onion shape, it was a "what to add next" free-for-all.  I had wrestled with the names of the destination countries to try to get something reasonable, but couldn't figure out how to display them in proximity to the lines.  No matter – the idea of a word cloud seemed kind of interesting.  You'd get the same concept of the different chord sizes carried through again and see a ton of data on where people are migrating.  This also led to some natural interactivity: clicking on a country code shows its corresponding chords above.

Finally, to add more visual context, I included a simple breakdown of each major region's origins and destinations to tell the story a bit further.  The story point for me: most migrants move within their own region, except for Latin America/Caribbean.

How do you add value through data analytics?

I recently read this article that really ignited a lot of thoughts that often swirl around in my mind.  If you were to ask me what my drive is, it’s making data-informed, data-driven decisions.  My mechanism for this is through data visualization.  More broadly than that, it is communicating complex ideas in a visual manner.  Often when you take an idea and paint it into a picture people can connect more deeply to it and it becomes the catalyst for change.

All that being said – I've encountered a sobering problem.  Those on the more "analytical" side of the industry sometimes fail to see the value in the communication aspect of data analytics.  They've become mired in the idea that knowing statistical programming languages, database theory, and structured query language are the most important aspects of the process.  While I don't discount the significance of these tools (and the ability to utilize them correctly), I can't be completely on board with that view.

We've all sat in a meeting that is born out of one idea: how do we get better?  We don't get better by writing the most clever and efficient SQL query.  We get better by talking through and really understanding what it IS we're trying to measure.  When we say X, what do we mean?  How do we define X?  Defining X is the hard part; pulling it out of the database is not as difficult.  If you get really good at definitions, it becomes intuitive when you start trying to incorporate them into your business initiatives.

As we continue to evolve in the business world, I highly encourage those from both ends of the spectrum to try to meet somewhere in the middle.  We have an unbelievable number of technical tools at our disposal, yet quite often you step into a business that is still trying to figure out HOW to measure the most basic of metrics.  Let's stop and consider how this happened and work on achieving excellence and improvement through the marriage of business and technical acumen – with artistry and creativity thrown in for good measure.