Boost Your Professional Skills via Games

Have you ever found yourself looking for ways to become more strategic, sharpen your communication skills, improve your ability to collaborate, or just stretch your capacity to think critically?  Well, I have an answer for you: pick up gaming.

Let’s pause for a second and provide some background: I was born the same year the NES was released in North America, so my entire childhood was littered with video games.  I speak quite often about how much video gaming has influenced my life.  I find games to be one of the best ways to unleash creativity, inhabit a universe where failure is safe, and find constant opportunity for growth and challenge.

With all that context you may think this post is about video games and how they can help develop the aforementioned skills.  And that’s where I’ll add a little bit of intrigue: this post is actually dedicated to tabletop games.

Over the past two years I’ve picked up an awesome hobby: tabletop gaming.  Not your traditional Monopoly or Game of Life, but games centered around strategy and cooperation.  I’ve taken to playing them with friends, family, and colleagues as a way to connect and learn.  And along the way I’ve come across a few favorites that serve as great growth tools.

Do I have you intrigued?  Hopefully!  Now on to a list of recommendations.  And the best part?  All but one of these can be played with 2 players.  I often play games with my husband as a way to switch my brain off from the hum of everyday life and into the deep, rich problems and mechanics that arise during gameplay.

First up – Jaipur

Jaipur is a 2-player-only game that centers around trading and selling goods.  The main mechanics are knowing when to sell, when to hold, and how to manipulate the market.  There are also camel cards that, once taken and later returned, cause new goods to appear in the market.

Why you should play: it’s a great way to understand value at a particular moment in time, from being first to market, to waiting until you have several of a specific good to sell, to driving changes in the market by forcing your opponent’s hand.  It teaches you to anticipate next steps.  It also shows how you can have control over certain aspects (say, hoarding all the camels to prevent variety in the market), but how that may put you at a disadvantage when trying to sell goods.

It’s a great game: a full match runs at most 3 rounds and probably 30 minutes, and the variety and novelty of what happens makes it fun to repeat.

Hanabi

Hanabi is a collaborative game that plays anywhere from 2 to 5 people.  The basic premise is that you and your friends are absentminded fireworks makers who have mixed up all the fireworks (numeric sets 1 to 5 in 5 different colors).  Similar to Indian Poker, you hold a number of cards (3 or 4) facing away from you.  That is to say, you don’t know your own hand, but your friends do.  Through a series of sharing information and discarding/drawing cards, everyone tries to play cards in order from 1 to 5 for each color.  If you play a card too soon the fireworks could go off early, and there’s only so much information to share before the start of the fireworks show.

This is a great game for learning about collaboration and communication.  When you share information, you give someone either color or numeric information about their hand.  This can be interpreted several different ways, and it’s up to the entire team to communicate effectively and adjust to each other’s interpretation styles.  It also forces you to make choices.  My husband and I recently played and got dealt a bunch of single high-value cards that couldn’t be played until the end.  We had to concede as a team that those targets weren’t realistic to go after, and conceding them was the only way we could end up with a decent fireworks display.

Lost Cities

This is another exclusively two-player game, and another set-building game, where you’re going on exploration missions to different natural wonders.  Your goal is to fill out sets in numeric order (1 to 10) by color.  There’s a baseline cost to going on a mission, so you’ll have to be wise about setting off on one.  There are also cards you can play (before the numbers) that let you double, triple, or quadruple your wager on the exploration succeeding.  You and your opponent take turns drawing from a pool of known cards or from the deck.  Several tactics can unfold here: you can build into a color early, or completely change paths once you see what the other person is discarding.  It’s also a juggling act to decide how much to wager and still end up making money.

Bohnanza

This one plays well with a wide range of player counts.  The key mechanic is that you’re a bean farmer with 2 fields to plant beans in.  The order in which you receive cards is crucial and can’t be changed.  It’s up to you to work with your fellow farmers at the bean market so you don’t uproot your fields too early and ruin a good harvest.  This is a rapid-fire trading game where getting on someone’s good side is critical, and you’ll immediately see the downside of holding on to cards for the “perfect deal.”  But of course you have to balance your friendliness with the knowledge that if you share too many high-value beans, the other farmers may win.  There’s always action on the table, and you have to voice your offer quickly to remain part of the conversation.

The Grizzled

The Grizzled is a somewhat melancholy game centered around World War I.  You’re on a squad trying to successfully fulfill missions before all morale is lost.  You’ll do this by avoiding piling up too many threats and by offering support to your team.  You’ll even make speeches to encourage your comrades.  This game offers lots of opportunities to understand when and how to be a team player to keep morale high and everyone successful.  The theme is a bit morose, but it adds context to the intention behind each player’s actions.

The Resistance

Sadly this requires a minimum of 5 people to play, but it is totally worth it.  As the box says, it is a game of deduction and deception.  You’ll be dealt a secret role and are either fighting for victory or for sabotage.  I played this one with 8 other colleagues recently and pure awesomeness was the result.  You’ll get the chance to pick teams for missions, vote on how much you trust each other, and ultimately fight for success or defeat.  You’ll gain insight into crowd politics and how individuals handle situations of mistrust and limited information.  My recent 9-player game devolved into using a whiteboard to help with deductions!

Next time you’re in need of beefing up your soft skills, or of detaching from work in a productive and fun manner, consider tabletop gaming.  Whether you’re looking for team-building exercises or safe environments to test how people work together, tabletop games offer it all, and collaborative tabletop games in particular.  With most games there’s always an element of putting yourself first, but you will really start to understand how individuals like to contribute to a team’s dynamics.

The Importance of Certification

I’ve long been a huge advocate of certification in technical fields.  I think it is a great way to actively communicate and demonstrate the skill level you have in a particular area.  Even further, in my mind it represents a foundation you can build on with others.

I personally started my Tableau certification journey last year (more like 18 to 20 months ago).  I was becoming much more heavily involved in my local Tableau user group and felt that I needed a way to benchmark or assess my skills.  I knew I had much more to contribute to my local community and I thought that going through the certification journey and sharing that with the TUG would be beneficial.

So I started by getting ready for the Desktop Qualified Associate exam.  I independently went through all the existing documentation and searched for knowledge nuggets that would set me on the right path.  I took the approach of packaging everything I was learning into a format that could be digested by a larger audience.  I walked through all of the practice exam questions and presented an analysis of the different certification levels to the user group at least 3 times.

I passed the Desktop Qualified Associate around April or May of 2016.  It was a great achievement, and I was able to add definition to what that exam means.  Having the Desktop Qualified Associate certification means that you are technically proficient in the features and functions of Tableau Desktop.  It means that you can answer thoughtful questions using built-in features and that you have a deep understanding of best practices and efficient ways to get to results.  If I were to equate it to a different skill, I would say it means you know how and when to use the different tools in a toolbox.  What it doesn’t mean: that you are a masterful architect or that you can build a stunningly beautiful home.

To get to the next level of mastery and understanding, you’ll need the Certified Professional.  If you take a look at the specific components that are tested, you’ll quickly realize that advanced technical skill is weighted less than storytelling or analysis.  The purpose of the Desktop Certified Professional is to demonstrate that you have a deep understanding of data visualization, of using data visualization to tell a story, and of how level two (or three or four) analytics are necessary to start answering deeper and more important (read: higher impact) questions.

For my preparation, the exam prep guide was only the beginning.  It helps with the structural components: 1) knowing how many questions there will be, 2) estimating the time available to spend on each question, 3) examples of the analytical/presentation depth required to demonstrate proficiency, and 4) the variety of question types.

Probably the most intriguing questions for me are those where you have to assess a particular visualization, give and justify a critique (specifically how it relates to a described objective), and then provide an alternative solution (also verbally justifying its importance).  This skill is much different from knowing how to hammer a nail into a post.  It is defending why you chose to put a porch on the northeast corner of a home.  It’s a lot about feel.

I had such an awesome time taking the exam.  There are a lot of real-world constraints that require you to distill each question down to its most important components.  It’s interesting because for most items there isn’t a single right answer.  There are definitely lots of wrong answers, but right is a spectrum that depends somewhat on your ability to communicate the completeness of your point of view.

I’ve had the title of Tableau Desktop Certified Professional for just over a year now, so I can tell you with a decent amount of retrospective thought what it has done for me.  Just as I am able to describe the test and the purpose it served in this blog post, I can do the same thing in all of my interactions.  It keeps me humble, knowing that the PURPOSE behind a visual display is more important than fancy widgets or cool tricks.  That to a large extent my role is to work through the semantics of a situation and get to the root of it: the root of the question or questions, the heart of the concern, the why behind the visualization.  And also the artistry (yes, I use this word) behind what it takes to get there.  We have all felt the difference between a perfectly acceptable visualization and the right visualization; the end user experiences something different.  I firmly believe that deeper understanding can be achieved by spending that extra thoughtfulness and being willing to iterate.

So let’s now fast forward to the other certification path, the more recent one: Tableau Server.  What’s interesting is that because my strengths were built out on the visualization end, I hadn’t put myself in a position to understand the deeper technical components of Tableau Server.  I have always understood and had great depth of knowledge in site administration: acknowledging and abiding by best practices for sharing, permissions, and managing content.  But the part I had not spent time on is creating a sustainable platform where that vision is continuously executed.

So to overcome that minor blind spot – I went on a mission to learn more, to shine light on the unknown.  You’ve seen that play out here on my blog – going on a self directed adventure to deploy a Server on Azure.  Nobody told me to do that – I was internally compelled.  (I should also mention I was honored to have a friend go on the journey with me!)

I’m probably rehashing at this point, but anytime you grow knowledge in a particular area (especially a technical one) it gives you breadth and depth of vocabulary to connect with other individuals.  You find that communication barriers that were preventing the success of a project diminish because you now speak the same language.  As I write this I can hear Seth Godin saying that the more knowledge someone has in a particular subject area, the more ABSTRACT their language around it becomes.  Which means it is extremely difficult to communicate with that SME unless both parties put in significant effort to bridge the gap.

So that’s what the Tableau Server qualification has done for me.  It’s the first step on a journey; I imagine the next level, Server Certified Professional, is the act of execution.  It’s less knowledge and verbiage, more tactical.  There’s also likely more ambiguity: not a single right answer, rather a spectrum of right where you communicate your why.

As I wind down this post – I must shout to you “go get certified.”  Ignore the naysayers.  It’s easy to not do something, but you know what is hard?  Doing something.  Being tested.  And why is that?  Because you can fail.  Get over failure – push through that mindset.  The alternative is much more pleasant and unlocks all the potential the universe has to offer.

 

Azure + Tableau Server = Flex

I’m affectionately calling this post Azure + Tableau Server = Flex for two reasons.  First, are you a Desktop user who has always wanted to extend your skills into Tableau as a platform?  Or perhaps you’re someone who is just inherently curious and gains confidence by learning and doing (I fall into this camp)?  Well then, this is the blog post for you.

Let me back up a bit.  I am very fortunate to spend a majority of my working time (and a fair amount of my free time!) advocating for visual analytics and developing data visualizations to support the value it brings.  That means doing things like speaking with end users, developing requirements, partnering with application and database owners/administrators, identifying and documenting important metrics, and finally (admittedly one of the more enjoyable components) partnering with the end user on the build-out and functionality of what they’re looking for.  It’s a very iterative process, with a fair amount of communication and problem solving sprinkled into pure development time – a lucky job.  The context here is this: as soon as you start enabling people to harness the power of data visualization and visual analytics, the immediate next conversation becomes: how can I share this with the world (or ‘my organization’)?  Aha!  We’ve just stepped into the world of Tableau Server.

Tableau Server and Tableau Online bring the capability to share the visualizations you’re making with everyone around you.  They do exactly what you want them to do: share interactive and data-rich displays via a URL.  Just the thought of it gets me misty-eyed.  But, as with any excellent technology tool, it comes with the responsibility of implementation, maintenance, security, cost, and ultimately a lot of planning.  And this is where the desktop developer can hit a wall in taking things to the next level.  When you’re working with IT folks or someone who may have done something like this in the past, you’ll be hit with a wall of questions that runs the entire length of every potential ‘trap’ or ‘gotcha’ moment you’re likely to experience with a sharing platform.  And more than that, you’re tasked with knowing the answers immediately.  Just when you thought getting people to use terms like tooltip, boxplot, and dot plot was exciting, they start using words like performance, permissions, and cluster.

So what do you do?  You start reading through administration guides, beefing up your knowledge of the platform, and most likely extending your initial publisher’s perspective of Tableau Server into the world of the server administrator or site administrator.  But along the way you may get this feeling – I certainly have – I know how to talk about it, but I’ve never touched it.  This is all theoretical: I’ve built out an imaginary instance in my mind a million times, but I’ve never clicked the buttons.  It’s the same as talking through the process of baking and decorating a wedding cake versus actually doing it.  And really, if we think about it: you’d be much more likely to trust someone who can say “yeah, I’ve baked wedding cakes and served them” over someone who says “I’ve read every article and recipe and how-to in the world on baking wedding cakes.”

Finally we’re getting to the point, and this is where Azure comes into play.  Instead of stopping your imaginary implementation process because you don’t have the hardware, authority, or money to test out an implementation and actually UNBOX the server, use Azure and finish it out.  Build the box.

What is Azure?  It’s Microsoft’s extremely deep and rich platform for a wide variety of services in the cloud.  Why should you care?  It gives you the ability to deploy a Tableau Server test environment through a website, and they even give you money to get started.  Now I’ll say this right away: Azure isn’t the only option.  There’s also Amazon’s AWS.  I have accounts with both and have used them both; they are both rich and deep, and I don’t have a preference.  For the sake of this post, Azure was attractive because you get free credits and it’s the tool I used for my last sandbox adventure.

So it’s really easy to get started with Azure.  You can head over to their website and sign up for a trial.  At the time of writing they were offering a 30-day free trial and $200 in credits, which is more than enough to get started and build your box.  (BTW: nobody has told me to say these things or offered me money for this; I am writing about it purely out of personal interest.)

Now once you get started there are roughly 2 paths you can take.  The first is to search the marketplace for Tableau Server.  When you do, there are literally step-by-step configuration settings that take you from basic setup all the way to deployment.  It’s an easy path to a running Server, but outside the scope of where I’m taking this.  Instead we’re going to take the less defined path.

Why not use the marketplace process?  I think the less defined path offers the true start-to-finish experience: hardware sizing through software installation and configuration.  By building the machine from scratch (albeit a virtual machine) you mimic the entire process more closely than a wizard would.  You have fewer guard rails, more opportunity for exploration, and the responsibility of getting to the finish line correctly is completely in your hands.

So here’s how I started: I created a new resource, a Windows Server 2012 R2 Datacenter box.  To do that, you go through the marketplace again and choose that as the box type.  It’s a near-identical process to the marketplace Tableau Server setup: make a box, size the box, add optional features, and go.  To bring it closer to home, go through the exercise of comparing Tableau’s minimum requirements to its recommended requirements.  For a single-node box you’ll need to figure out the number of CPUs (cores), the amount of RAM (memory), and the disk space you want.  When I did this originally I tried to start cheap: I looked through the billing costs of the different machines on Azure and started at the minimum.  In retrospect I would say go with something more powerful.  You always have the option to resize/re-class the hardware, but starting with a decent amount of power will prevent a slow install experience and degraded initial Server performance.
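As a side note, if you’d rather script the box than click through the portal, roughly the same setup can be done with the Azure CLI.  This is only a sketch: the resource group, VM name, admin username, and size below are placeholders I made up, and the size should be checked against Tableau’s minimum vs. recommended specs.

```
az group create --name tableau-sandbox --location eastus

az vm create --resource-group tableau-sandbox --name tableau-server-vm \
    --image Win2012R2Datacenter --size Standard_DS3_v2 \
    --admin-username tabserveradmin --admin-password '<a strong password>'
```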

Once you create the resource, you literally click a button to boot up the box and get started.  It took probably 15 to 20 minutes for my box to be built initially – more than I was expecting.

Everything done up to this point is to get to a place where you have your own Tableau Server that you can do whatever you want with.  You can set up the type of security, configure different components – essentially get down to the nitty-gritty of what it feels like to be a server administrator.

Your virtual machine should have access to the internet, so the next step is to go here and download the software.  Here’s a somewhat-pro tip: consider downloading a previous version of the Server software so that you can later upgrade and test out what that feels like.  Consider the difference between major and minor releases and the nuance of what each upgrade process will be.  For this adventure I started with 10.0.11 and ended up upgrading to 10.3.1.

The process of the actual install is on the level of “stupid easy.”  But, you probably wouldn’t feel comfortable saying “stupid easy” unless you’ve actually done it.  There are a few click through windows with clear instructions, but for the most part it installs start to finish without much input from the end user.

You get to this window here once you’ve finished the install process.

This is literally the next step and shows the depth to which you can administer the platform from within the server (from a menu/GUI perspective).  Basic things can be tweaked and set up: the type of authentication, SMTP (email) for alerts and subscriptions, and the all-important Run As User account.  Reading through Tableau Server: Everybody’s Install Guide is the best way to get to this point, especially because of something I alluded to earlier: the majority of the work is really in planning the implementation, not the unboxing or the build.

Hopefully by this point the amount of confidence gained in going through this process is going to have you feeling invincible.  You can take your superhero complex to the next level by doing the following tasks:

Start and stop the Server via tabadmin.  This is a great exercise because you’re using the command-line utility to interact with the Server.  If you’re not someone who spends a lot of time doing these kinds of tasks it can feel weird, but going through the act of starting and stopping the server will make you feel much more confident.  My personal experience was also interesting here: I like tabadmin better than the basic utilities because you know exactly what’s going on.  Here’s the difference between the visual status indicator and what you get from tabadmin.

When you right-click and ask for server status, it takes some time for the status window to appear.  When you do the same thing in tabadmin, it’s easier to tell that the machine is ‘thinking.’
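For reference, the relevant commands are run from the Tableau Server bin directory in a command prompt and look like this:

```
tabadmin stop
tabadmin start
tabadmin status -v
```

stop and start do exactly what they say, and status -v reports the state of each individual Server process rather than a single rolled-up indicator.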

Go to the Status section and see what it looks like.  Especially if you’re a power user from the front end (publisher, maybe even site administrator) – seeing the full details of what is in Tableau Server is exciting.

There are some good details in the Settings area as well.  This is where you can add another site if you want.

Once you’ve gotten this far in the process – the future is yours.  You can start to publish workbooks and tinker with settings.  The possibilities are really limitless and you will be working toward understanding and feeling what it means to go through each step.  And of course the best part of it all: if you ruin the box, just destroy it and start over!  You’ve officially detached yourself from the chains of responsibility and are freely developing in a sandbox.  It is your chance to get comfortable and do whatever you want.

I’d even encourage you to interact with the API and see what you can do with your site.  Even if you use some assisted API process (think Alteryx’s Output to Tableau Server tool), you’ll find yourself getting much more savvy at speaking Server and that much closer to owning a deployment in a professional environment.
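If you want a concrete first step with the REST API, here’s a minimal sketch of signing in using Python and the requests library.  The server address, credentials, and API version are placeholders; check the REST API reference for the version that matches your install.

```python
import requests

SERVER = "http://my-azure-vm"   # placeholder: your VM's address or public DNS name
API_VERSION = "2.6"             # placeholder: match this to your Server version

# Sign in to the default site; the response contains an auth token that
# later REST calls pass back in the X-Tableau-Auth header.
payload = """
<tsRequest>
  <credentials name="admin" password="secret">
    <site contentUrl=""/>
  </credentials>
</tsRequest>
"""

resp = requests.post(
    "{0}/api/{1}/auth/signin".format(SERVER, API_VERSION),
    data=payload,
    headers={"Content-Type": "application/xml"},
)
resp.raise_for_status()
print(resp.text)  # look for the token attribute in the <credentials> element
```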

February Book Binge

Another month has passed, so it’s time to recount what I’ve been reading.

Admittedly it was kind of a busy month for me, so I decided to mix up some of my book habits with podcasts.  To reflect that – I’ve decided to share a mixture of both.

 

First up is Rhinoceros Success by Scott Alexander

This is a short read designed to ignite fire and passion in whoever reads it. It walks through how a big burly rhino would approach everyday life, and how you, as a rhino, should follow suit.

I read this one while I was transitioning between jobs and found it to be a great source of humor during the process. It helps to articulate out ‘why’ you may be doing certain things and puts it in the context of what a rhino would do. This got me through some rough patches of uncertainty.

The next book was Made to Stick by the Heath brothers

This was another recommendation and one that I enjoyed. I will caveat it by saying that this book is really long; I struggled to get through a chapter at a time (~300 pages and only 7 chapters). It is chock-full of stories to help the reader understand the model required to make ideas stick.

I read this one because oftentimes a big part of my job is communicating a yet-to-be-seen vision, and also trying to get people to buy in to a new way of thinking. Neither is easy, and both can be met with resistance. The tools the Heath brothers offer are simple and straightforward, and I think they extend even further into writing or public speaking. How do you communicate a compelling idea that will resonate with your audience?

I’ve got their 2 other books and will be reading one of them in March.

Lastly – I wanted to spend a little bit of time sharing a podcast that I’ve come to enjoy. It is Design Matters with Debbie Millman.

This was shared with me by someone on Twitter. I found myself commuting much more than average this month (as part of the job change) and was looking for media to consume during the variable-length (30 to 60 minute) commute. This podcast fits that time slot richly. What’s awesome is that the first episode I listened to had Seth Godin on it (reading one of his books now), so it was a great dual-purpose item: I could hear Seth and preview whether I should read one of his many books, and also get a dose of Debbie.

The beauty of this podcast for me is that Debbie spends a lot of time exploring the personality and history of modern artists/designers. She does this by amassing research on each individual and then having a very long sit-down to discuss findings. Often times this involves analyzing individual perspectives and recounting significant past events. I always find it illuminating how these people view the world and how they’ve “arrived” at their current place in life.

That wraps up my content diet for the month – and I’m off to listen to Seth.

Book Binge – January Edition

It’s time for another edition of book binge, a random category of blog posts I devised (now only on its second iteration) where I walk through the different books I’ve read and purchased this month.

First – a personal breakthrough!  I have always been an avid reader, but had admittedly become lazy in recent years.  Instead of reading at least one book a month, I was going on small reading sprees of 2 or 3 books every four or five months.  After the success of my December reads, I figured I would keep things going and try to substitute books for other entertainment whenever possible.

Here are a few books I read in January:

The Functional Art by Alberto Cairo

I picked this one up because it is quintessential to the world of data journalism and data visualization.  I also thought it would be great to get into the head of one of the instructors of a MOOC I’m taking.  Plus who can resist the draw of the slope chart on the cover?

I loved this one.  I like Alberto’s writing style: it is rooted in logic, and his use of text spacing and bold for emphasis is heavy on impact.  I appreciate that he says data visualization has to first be functional, but reminds us that it has to be seen to matter.  It’s also interesting to read the interviews/profiles of journalists at the end of the book.  This is an excellent way for me to shift my perspective and paradigm: I come from the analysis/mathematical side of things, while these folks are there to communicate stories with data.  A great read that is broken up in such a way that it is easy to digest.

The next book was The Visual Display of Quantitative Information by Edward Tufte

Obviously a classic read for anyone in the data visualization world, by the “father” of modern information graphics.  I must admit I picked up all 4 of Tufte’s books in December and couldn’t get my brain wrapped around them.  I was flipping through the pages to get a sense of how the information was laid out and felt a little intimidated.  That intimidation was all in my head: once I began reading, the flow of information made perfect sense.

I appreciate Tufte’s voice and axiom type approach to information graphics.  Yes – there are times when it is snarky and absurd, but it is also full of purpose.  He walks through information graphics history, spotlighting many of the greats and lamenting the lack of recent progression (or more of a recession) in the art.

I have two favorites in this one: how he communicates small multiples and sparklines.  The verbiage used to describe the impact (and amount of information) small multiples can convey is poetic (and I don’t really like poetry).  His work developing and demonstrating sparklines is truly illuminating.  There were times when I had dreams of putting together some of the high “data-ink,” low “chartjunk” visuals he describes.  And his epilogue makes me forgive all the snarkiness.  The first in a series that I am ecstatic to continue reading.

The last book I’ll highlight this month was a short read – a Christmas present from a friend.

Together is Better by Simon Sinek

I’m very familiar with Simon, mostly because of his famous TED talk on starting with why; I’ve read his book on the subject as well. So I was delighted to be handed this tiny gem.  Written in a hybrid format of children’s book and inspirational quote book, this is a good one to flip through if you’re in need of a quiet moment.  Simon is a self-professed optimist, as he notes at the end, and that’s definitely how I left the book feeling.

It aims at sparking the inner fire we all have.  The most powerful moment: Simon saying that you don’t have to invent a new idea and then follow it.  It is perfectly acceptable to commit to someone else’s vision and follow them.  It frees you completely from the world of “special,” new, and different that entrepreneurial and ambitious types (myself included) get hung up on.  You don’t have to come up with an original idea; just find something that resonates deeply with you and latch on.  That is just as powerful as being a visionary.

The other part of this book devotes a significant amount of space to short takes on leadership.  A friendly reminder of what leadership looks like: leadership is not management.

I’ve got more books on the way and will be back in a month with three new reads to share.

The Flow of Human Migration

Today I decided to take a bit of a detour while working on a potential project for #VizForSocialGood.  I was focused on a data set provided by UNICEF that showed the number of migrants from different areas/regions/countries to destination regions/countries.  I’m pretty sure it is the direct companion to a chord diagram that UNICEF published as part of their Uprooted report.

As I was working through the data, I wanted to start at the same place: focus on migration globally and then narrow in on children affected by migration.

Needless to say – I got sidetracked.  I started by wanting to make paths on maps showing the movement of migrants.  I haven’t really done this very often, so I figured this would be a great data set to play with.  Once I set that up, it quickly devolved into something else.

I wasn’t satisfied with the density of the data; the clarity of how it was displayed just wasn’t there for me.  So I decided to take a more abstract approach to the same concept.  As if by fate, I had received Chart Chooser cards in the mail earlier, and Josh and I were reviewing them.  We were having a conversation about the various uses of each chart and brainstorming how they could be incorporated into our next Tableau user group (I really do eat, drink, and breathe this stuff).

Anyway – one of the charts we talked about was the sankey diagram.  It was already on my mind, and I’d seen it accomplished multiple times in Tableau.  It was time to dive in and see how this abstraction would apply to the geospatial view.

I started with Chris Love’s basic tutorial of how to set up a sankey.  It’s a really straightforward read that explains all the concepts required to make this work.  Here’s the quick how-to in my paraphrased words.

  1. Duplicate your data via a union and identify the original and the copy (which is great because I had already done this for the pathing).  As I understand it from Chris’s write-up, this lets us ‘stretch out’ the data, so to speak.
  2. Once the data is stretched out, it’s filled in by manipulating Tableau’s binning feature.  My interpretation is that the bins ‘kind of’ act like dimensions (labeled by individual integers).  This becomes useful for creating the individual points that eventually turn into the line (curve).
  3. Next, ranking functions are made to determine the starting and ending points of the curves.
  4. Finally, the curve is built using a mathematical function called a sigmoid function.  This is basically an asymptotic function that goes from -1 to 1 and has a middle section with a slope of ~1 (a quick sketch of the idea follows this list).
  5. After the curve is developed, the points are plotted.  This is where the ranking determines the leftmost and rightmost points.  Chris’s original specification had straightforward ranking for each of the dimensions; my final viz is a riff on this.
  6. The last steps are to switch the chart to a line chart and then build out the width (size) of the line based on the measure used in the ranking (percent of total) calculation.
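To make the sigmoid step a little more concrete, here’s a small Python sketch of the generic curve (assuming numpy is available).  The sampling range and the start/end positions are illustrative only; in the actual Tableau build these come from the bins and the ranking calculations rather than from Python.

```python
import numpy as np

def sigmoid(x):
    # Generic S-curve: flat at both ends, roughly linear through the middle
    return 1 / (1 + np.exp(-x))

# Sample the curve at evenly spaced points; the bin-generated rows play this
# role in the Tableau version.
t = np.linspace(-6, 6, 49)

# Illustrative leftmost/rightmost positions for one origin-destination ribbon,
# standing in for what the ranking calculations produce.
left, right = 0.2, 0.8
path_y = left + (right - left) * sigmoid(t)
```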

So I did all those steps and ended up with exactly what was described: a sankey diagram.  A brilliant one, too – I could quickly switch the origin dimension between different levels (major area, region, country) and do similar work on the destination side.  This is what ultimately led me to the final viz I made.

So while adjusting the table calculations, I came to one view that I really enjoyed.  The ranking pretty much “broke” for the initial starting point (everything was at 1), but the destination side was right.  What this did for the viz was take everything from a single point and then create roots outward.  The initial setup had this going from left to right, but it was quite obvious that it looked like tree roots, so I flipped it all.

I’ll admit – this is mostly a fun data shaping/vizzing exercise.  You can definitely gain insights through the way it is deployed (take a look at Latin America & Caribbean).

After the creation of the curvy onion shape, it was a “what to add next” free-for-all.  I had wrestled with the names of the destination countries to try to get something reasonable, but couldn’t figure out how to display them in proximity to the lines.  No matter – the idea of a word cloud seemed kind of interesting.  You’d get the same concept of the different chord sizes carried through again and see a ton of data on where people are migrating.  This also led to some natural interactivity: clicking on a country code shows its corresponding chords above.

Finally, to add more visual context, I added a simple breakdown of the major regions’ origins to destinations, to tell the story a bit further.  The story point for me: most migrants move within their own region, except for Latin America/Caribbean.

And so it begins – Adventures in Python

Tableau 10.2 is on the horizon and with it come several new features – one of particular interest to me is the new Python integration.  Here’s the Beta program beauty shot:

Essentially what this means is that more advanced programming languages aimed at doing more sophisticated analysis will become an easy-to-use extension of Tableau.  As you can see from the picture, it works similarly to the R integration: the end user wraps native Python code in the SCRIPT_STR() function to pass it through and get output back.
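To give a flavor of what that looks like, here’s a hedged sketch of a Tableau calculated field handing work off to Python via SCRIPT_STR(); the [Sales] field and the threshold are made up for illustration, and the exact behavior depends on how the external (TabPy) service is configured:

```
SCRIPT_STR(
    "return ['high' if v > 100 else 'low' for v in _arg1]",
    SUM([Sales])
)
```

The Python string executes on the external service, and _arg1 is how that code refers to the first Tableau expression passed in.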

I have to admit that I’m pretty excited by this.  For me I see this propelling some data science concepts more into the mainstream and making it much easier to communicate and understand the purpose behind them.

In preparation I wanted to spend some time setting up a Linux Virtual Machine to start getting a ‘feel’ for Python.

(Detour) My computer science backstory: my intro to programming was C++ and Java.  They both came easily to me.  Later on I tried to take a mathematics class based in UNIX that was probably the precursor to some of the modern languages we’re seeing, but I couldn’t get on board with the “terminal” level of entry.  Very off-putting coming from a world where you have a better feedback loop in terms of what you’re coding.  Since that time (~9 years ago) I haven’t had the opportunity to encounter or use these types of languages.  In my professional world everything is built on SQL.

Anyway, back to the heart of the matter: getting a box set up for Python.  I’m a very independent person and like to take the knowledge I’ve gathered over time and troubleshoot my way to results.  The process of failing and learning on the spot with minimal guidance helps me solidify my knowledge.

Here are the steps I went through – mind you, I have a PC and I am intentionally running Windows 7.  (This is a big reason why I made a Linux VM.)

  1. Download and install VirtualBox by Oracle
  2. Download x86 ISO of Ubuntu
  3. Build out Linux VM
  4. Install Ubuntu

These first four steps are pretty straightforward in my mind.  Typical Windows installer for VirtualBox.  Getting the image is very easy as is the build (just pick a few settings).

Next came the Python part.  I figured I’d have to install something on my Ubuntu machine, but I was pleasantly surprised to learn that Ubuntu already comes with Python 2.7 and 3.5.  A step I don’t have to do, yay!
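A quick sanity check from the terminal confirms what’s already there:

```
python --version     # the 2.7.x interpreter
python3 --version    # the 3.5.x interpreter
```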

Now came the part where I hit my first real challenge.  I had this idea of getting to the point where I could go through the steps of the sentiment analysis outlined by Brit Cava on the Tableau blog.  I’d reviewed the code and could follow the logic decently well, and I think it is a very extensible starting point.

So based on the blog post I knew there would be some Python modules I’d need.  Googling led me to believe that installing Anaconda would be the best path forward, since it contains several of the most popular Python modules; installing it would eliminate the need to add modules individually.

I downloaded the file just fine, but the instructions on “installing” were less than stellar.  Here are the instructions:

Directions on installing Anaconda on Linux

So as someone who takes instructions very literally (and again, doesn’t know UNIX very well), I was unfortunately greeted with a nasty error message lacking any help.  Feelings from years ago came creeping back quickly.  Alas, I Googled my way through this (and had a pretty good inkling that it simply couldn’t ‘find’ the file).

What the instructions said to run (notice I had already dropped the _64, since my VM isn’t 64-bit).

 

Alas – that was all that was needed to get the file to install!
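For anyone retracing this, the working incantation was along these lines; the version number is a placeholder, the file name ends in -x86.sh because my VM is 32-bit, and the path is wherever your browser saved the installer:

```
cd ~/Downloads
bash Anaconda2-<version>-Linux-x86.sh
```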

So installing Anaconda turned out to be pretty easy after getting the right command into the prompt.  Then came the fun part: trying to do the sentiment analysis.  I knew from reading that Anaconda came with the three modules mentioned (pandas, nltk, and time), so I felt like this was going to be pretty easy to test out, coding directly from the terminal.

Well – I hit my second major challenge: the lexicon required to do the sentiment analysis wasn’t included.  So I had no way of actually doing the sentiment analysis and was left to figure it out on my own.  This part was actually not that bad; Python gave me a good prompt to fix it, essentially to call the nltk downloader and get the lexicon.  The nltk downloader has a cute little GUI to find the right lexicon (vader), and I got it installed pretty quickly.
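The fix boils down to one call to the downloader from a Python prompt; either form works, with the no-argument version opening the little GUI mentioned above:

```python
import nltk

nltk.download()                 # opens the GUI downloader to browse corpora and lexicons
nltk.download('vader_lexicon')  # or grab the VADER lexicon directly
```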

Finally – I was confident that I could input the code and come up with some results.  And this is where I hit my last obstacle, probably the most frustrating of the night.  When pasting in the code (in raw form from the blog post) I kept running into errors.  The message wasn’t very helpful, and I started cutting out lines of code that I didn’t need.

What’s the deal with line 5?

Eventually I figured out the problem: there were weird spaces in the raw code snippet.  After some additional googling (this time by my husband), he kindly said, “apparently spaces matter, according to this forum.”  No big deal – lesson learned!  (Python treats indentation as part of the syntax, so stray leading spaces from a copy-paste will throw errors.)

Yes! Success!

So what did I get at the end of the day?  A wonderful CSV output of sentiment scores for all the words in the original data set.

Looking good, there’s words and scores!
Back to my comfort zone – a CSV
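For anyone who wants to reproduce something similar without digging up the original post, here’s a minimal sketch of the idea.  The file names and the ‘word’ column are made up, and Brit Cava’s actual script differs in its details:

```python
import pandas as pd
from nltk.sentiment.vader import SentimentIntensityAnalyzer

# Hypothetical input: a CSV with one word (or phrase) per row in a 'word' column
words = pd.read_csv('words.csv')

analyzer = SentimentIntensityAnalyzer()

# VADER's compound score runs from -1 (most negative) to +1 (most positive)
words['sentiment'] = words['word'].apply(
    lambda w: analyzer.polarity_scores(w)['compound']
)

words.to_csv('word_sentiment.csv', index=False)
```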

Now for the final step: validate that my results aligned with expectations.  And they did – yay!

0.3182 = 0.3182

Next steps: viz the data (obviously).  I’m also hoping to extend this to additional sentiment analysis, maybe even something from Twitter.  Oh, and I ended up running (you guessed it, already installed) a Jupyter notebook to get over the pain of typing directly in the terminal.
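Since Anaconda ships with Jupyter, escaping the terminal is a single command on the VM, which opens the notebook interface in a browser:

```
jupyter notebook
```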