Boost Your Professional Skills via Games

Have you ever found yourself in a situation where you were looking for opportunities to get more strategic, focus on communication skills, improve your ability to collaborate, or just stretch your capacity to think critically?  Well, I have the answer for you: pick up gaming.

Let’s pause for a second and provide some background: I was born the same year the NES was released in North America – so my entire childhood was littered with video games.  I speak quite often about how much video gaming has influenced my life.  I find games to be one of the best ways to unleash creativity: a universe where failure is safe and there is always an opportunity for growth and challenge.

With all that context you may think this post is about video games and how they can assist with growing the aforementioned skills.  And that’s where I’ll add a little bit of intrigue: this post is actually dedicated to tabletop games.

For the past two years I’ve picked up an awesome hobby – tabletop gaming.  Not your traditional Monopoly or Game of Life – but games centered around strategy and cooperation.  I’ve taken to playing them with friends, family, and colleagues as a way to connect and learn.  And along the way I’ve come across a few of my favorites that serve as great growth tools.

Do I have you intrigued?  Hopefully!  Now on to a list of recommendations.  And the best part?  All but one of these can be played with 2 players.  I often play games with my husband as a way to switch my brain away from the hum of everyday life and into the deep and rich problems and mechanics that arise during gameplay.

First up – Jaipur

Jaipur is a 2-player-only game that centers around trading and selling goods.  The main mechanics here are knowing when to sell, when to hold, and how to manipulate the market.  There are also camel cards that, when returned to the market, cause new goods to appear.

Why you should play: It is a great way to understand value at a particular moment in time.  From being first to market, to waiting until you have several of a specific good to sell, to driving changes in the market by forcing your opponent’s hand.  It teaches you to anticipate next steps.  It shows how you can have control over certain aspects (say, hoarding all the camels to limit variety in the market), but how that may put you at a disadvantage when trying to sell goods.

It’s a great game that is played in a max of 3 rounds and probably 30 minutes.  The variety and novelty of what happens makes this a fun game to repeat.

Hanabi

Hanabi is a collaborative game that plays anywhere from 2 to 5 people.  The basic premise is that you and your friends are absentminded fireworks makers who have mixed up all the fireworks (numeric sets 1 to 5 in 5 different colors).  Similar to Indian Poker, you hold a number of cards (4 or 5) facing away from you.  That is to say – you don’t know your hand, but your friends do.  Through a series of sharing information and discarding/drawing cards, everyone is trying to put down cards in order from 1 to 5 for particular colors.  If you play a card too soon the fireworks could go off early, and there’s only so much information to share before the start of the fireworks show.

This is a great game to learn about collaboration and communication.  When you share information you give either color or numeric information to someone about their hand.  This can be interpreted several different ways, and it’s up to the entire team to communicate effectively and adjust to each player’s interpretation style.  It also forces you to make choices.  My husband and I recently played and got dealt a bunch of single high-value cards that couldn’t be played until the end.  We had to concede as a team that those targets weren’t realistic to go after – letting them go was the only way we could end up with a decent fireworks display.

Lost Cities

This is another exclusively two-player game.  This is also a set-building game where you’re going on exploration missions to different natural wonders.  Your goal is to fill out sets in numeric order (1 to 10) by color.  There’s a baseline cost to going on a mission, so you’ll have to be wise about which missions you pursue.  There are also cards you can play (before the numbers) that let you double, triple, or quadruple your wager on successfully completing the exploration.  You and your opponent take turns drawing from a pool of known cards or from a deck.  Several tactics can unfold here.  You can build into a color early, or completely change paths once you see what the other person is discarding.  It’s also a juggling act to decide how much to wager and still end up making money.

Bohnanza

This one plays well with a wide range of player counts.  The key mechanic here is that you’re a bean farmer with 2 fields to plant beans.  The order in which you receive cards is crucial and can’t be changed.  It’s up to you to work together with your fellow farmers at the bean market so you don’t uproot your fields too early and ruin a good harvest.  This is a rapid-fire trading game where getting on someone’s good side is critical, and you’ll immediately see the downfall of holding on to cards for the “perfect deal.”  But of course you have to balance out your friendliness with the knowledge that if you share too many high-value beans the other farmers may win.  There’s always action on the table and you have to voice your offer quickly to remain part of the conversation.

The Grizzled

The Grizzled is a somewhat melancholy game centered around World War I.  You’re on a squad trying to successfully fulfill missions before all morale is lost.  You’ll do this by dodging threats and offering support to your team.  You’ll even make speeches to encourage your comrades.  This game offers lots of opportunities to understand when and how to be a team player to keep morale high and everyone successful.  The theme is a bit morose, but it adds context to the intention behind each player’s actions.

The Resistance

Sadly this one requires a minimum of 5 people to play, but it is totally worth it.  As the box mentions, it is a game of deduction and deception.  You’ll be dealt a secret role and are either fighting for victory or sabotage.  I played this one with 8 other colleagues recently and pure awesomeness was the result.  You’ll get the chance to pick teams for missions, vote on how much you trust each other, and ultimately fight for success or defeat.  You will get insight into crowd politics and how individuals handle situations of mistrust and lack of information.  My recent 9-player game devolved into using a whiteboard to help with deductions!

Next time you’re in need of beefing up your soft skills or detaching from work and want to do it in a productive and fun manner – consider tabletop gaming.  Whether you’re looking for team building exercises or safe environments to test how people work together – tabletop games offer it all.  And in particular – collaborative tabletop games.  With most games there’s always an element of putting yourself first, but you will really start to understand how individuals like to contribute to team mechanics.

#IronViz – Let’s Go on a Pokémon Safari!

It’s that time again – Iron Viz!  The second round of Iron Viz entered my world via an email with a very enticing “Iron Viz goes on Safari!” theme.  My mind immediately got stuck on one thing: Pokémon Safari Zone.

Growing up I was a huge gamer, and Pokémon was (and still is) one of my favorites.  I even have a cat named after a Pokémon – Starly (find her in the viz!).  So I knew if I was going to participate, Pokémon Safari was the only way to go.

I spent a lot of time thinking about how I might want to bring this to life.  Did I want to do a virtual safari of all the pocket monsters?  Did I want to focus on the journey of Ash Ketchum through the Safari Zone?  Did I want to focus on the video games?

After all the thoughts swirled through my mind – I settled on the idea of doing a long-form re-creation of Ash Ketchum’s adventure through the Safari Zone in the anime.  I sat down and googled to figure out the episode number and go watch it.  But to my surprise, the episode had been banned.  It hasn’t aired on much TV, and the reason it is banned makes it very unattractive and unfriendly for an Iron Viz long form.  I was gutted and had to set off on a different path.

The investment in the Safari Zone episode got me looking through the general details of the Safari Zone in the games.  And that’s what ended up being my hook.  I tend to think in a very structured format, and because there were 4 regions that HAD Safari Zones (or what I’d consider to be the general spirit of one), it was easy for me to compare them against each other.

Beyond that, I knew I wanted to keep the spirit of the styling similar to the games.  My goal for the viz is to give the end user an understanding of the types of Pokémon in each game: to show some basic details about each pocket monster, but to have users almost feel like they’re on the Safari.

There’s also this feeling I wanted to capture – for anyone who has played Pokémon you may know it.  It’s the shake of the tall grass.  It is the tug of the Fishing Pole.  It’s the screen transition.  In a nutshell: what Pokémon did I just encounter?  There is a lot of magic in that moment of tall grass shake and transition to ‘battle’ or ‘encounter’ screen.

My hope is that I captured that well with the treemaps.  You are walking through each individual area and encountering Pokémon.  The seasoned Safari-goer will be more interested in knowing WHERE they should go and understanding WHAT they can find there – hence the corresponding visuals surrounding the treemaps.

The last component of this visualization was the Hover interactivity.  I hope it translates well because I wanted the interactivity to be very fluid.  It isn’t a click and uncover – that’s too active.  I wanted this to be a very passive and openly interactive visualization where the user would unearth more through exploring and not have to click.

#WorkoutWednesday Week 24 – Math Musings

The Workout Wednesday for week 24 is a great way to represent where a result for a particular value falls with respect to a broader collection.  I’ve used a spine chart recently on a project where most data was centered around certain points and I wanted to show the range.  Plotting maximums, minimums, averages, quartiles, and (when appropriate) medians can help to profile data very effectively.

So I started off really enjoying where this visualization was going.  Partly because when I made that spine chart on a recent project, I didn’t yet know the thing I’d developed already had a name.  (Sad on my part, I should read more!)

My enjoyment turned into caution really quickly once I saw the data set.  There are several ratios in the data set and very few counts/sums of things.  My math brain screams trap!  Especially when we start tiptoeing into the world of what we semantically call the “average of all” or “overall average” – something that somehow represents a larger collective (“everybody”).  There is a lot of open-ended interpretation that goes into this particular calculation, and when you’re working with pre-computed ratios it gets really tricky really quickly.

Here’s a picture of the underlying data set:

 

Some things to notice right away – the ratios for each response are pre-computed.  The number of responses is different for each institution.  (To simplify this view, I’m on one year and one question).

So the heart of the initial question is this: if I want to compare my results to the overall results, how would I do that?  There are probably 2 distinct camps here.  Option 1: take the average of one of the ratio columns and use that to represent the “overall average.”  Let’s be clear on what that is: it is the average pre-computed ratio across surveys.  It is NOT the observed percentage of all individuals surveyed.  That would be option 2: the weighted average.  For the weighted average – to calculate a representation of all respondents – we could add up all the qualifying respondents answering ‘agree’ and divide by the total respondents.

Now we all know this concept of an average of averages vs. a weighted average can cause issues.  Specifically, we’d feel the friction immediately if entities with very few responses were commingled with entities capturing many responses.  EX: Place A: 2 people out of 2 answered ‘yes’ (100%); Place B: 5 out of 100 answered ‘yes’ (5%).  If we average 100% and 5% we’ll get 52.5%.  But if we take 7 out of 102, that’s 6.86% – a way different number.  (Intentionally extreme example.)

So my math brain was convinced that the “overall average” or “ratio for all” should be inclusive of the weights of each Institution.  That was fairly easy to compensate for: take each ratio and multiply it by the number of respondents to get raw counts and then add those all back up together.
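To make the difference concrete, here’s a minimal sketch of both camps in Python, using the made-up Place A/Place B numbers from above:

```python
# Two ways to compute an "overall average" from pre-computed ratios.
respondents = [2, 100]   # responses per institution (hypothetical Places A and B)
ratios = [1.00, 0.05]    # pre-computed 'agree' ratios per institution

# Camp 1: average of the pre-computed ratios
avg_of_ratios = sum(ratios) / len(ratios)                    # 0.525

# Camp 2: weighted average -- back into raw counts, then re-divide
agree_counts = [r * n for r, n in zip(ratios, respondents)]  # [2.0, 5.0]
weighted_avg = sum(agree_counts) / sum(respondents)          # 7 / 102 ~= 0.0686

print(f"average of ratios: {avg_of_ratios:.1%}")  # 52.5%
print(f"weighted average:  {weighted_avg:.2%}")   # 6.86%
```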

The next sort of messy thing to deal with was finding the minimums and maximums of these values.  It seems straightforward, but when reviewing the data set and the specifications of what is being displayed, caution is warranted with regard to the level of aggregation and how the data is filtered.  As an example, depending on how the ratios are leveraged, you could end up finding the minimum of 3 differently weighted subjects within a subject group.  You could also find the minimum Institution + subject result at the subject level across all the subjects within a group.  Again, I think the best bet here is to tread cautiously over the ratios and get into raw counts as quickly as possible.

So what does this all mean?  To me it means tread carefully and ask clear questions about what people are trying to measure.  This is also where I will go the distance and include calculations in tool tips to help demonstrate what the values I am calculating represent.  Ratios are tricky and averaging them is even trickier.  There likely isn’t a perfect way to deal with them and it’s something we all witness consistently throughout our professional lives (how many of us have averaged a pre-computed average handle time?).

Beyond the math tangent – I want to reiterate how great a visualization I think this is.  I also want to highlight that because I went deep-end math on it, I decided to go a different direction on the development end.

The main difference from the development perspective?  Instead of using reference bands, I used a Gantt bar as the IQR.  I really like using the bar because it gives users an easier target to hover over.  It also reduces some of the noise of the default labeling that occurs with reference lines.  To create the Gantt bar – simply compute the IQR as a calculated field and use it as the size.  You can select one of the percentile points to be the start of the mark.
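For anyone who wants the raw numbers behind that mark, here’s a rough sketch in Python (standing in for the Tableau calculated fields, with made-up values): the mark starts at the 25th percentile and its size spans the IQR.

```python
# Sketch of the two values a Gantt-style IQR mark needs: a start and a size.
import numpy as np

scores = np.array([0.42, 0.48, 0.52, 0.55, 0.61, 0.66, 0.70])  # hypothetical ratios

p25 = np.percentile(scores, 25)  # start of the Gantt mark
p75 = np.percentile(scores, 75)
iqr = p75 - p25                  # size of the Gantt mark

print(f"mark start: {p25:.3f}, mark size (IQR): {iqr:.3f}")
```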

March & April Combined Book Binge

Time for another recount of the content I’ve been consuming.  I missed my March post, so I figured it would be fine to do a combined effort.

First up:

The Icarus Deception by Seth Godin

In my last post I mentioned that I got a recommendation to tune in to Seth and got the opportunity to hear him firsthand on Design Matters.  Well, here’s the first full Seth book I’ve consumed and it didn’t disappoint.  If I had to describe what this book contains – I would say that it is a near manifesto for the modern artist.  The world is run by industrialists and the artist is trying to break through.

I appreciate how Seth frames the concept of an artist – he unpacks the term and invites or ENCOURAGES everyone to identify as such.  Being an artist means being emotionally invested, showing up, giving a shit.  That giving a shit, caring, connecting is ALL there is.  That you succeed in the world by connecting, by sharing your art.  These concepts and ideals resonate deeply with me.  He also explains how vulnerable and gutting it can be to live as an artist – something I’ve felt and experienced several times.

During the course of listening to this book I was on site with a client.  We got to a certain point, agreed on the direction and visualizations, then shared them with the broader team.  The broader team came heavy with design suggestions – most notably, the green/red discussion came into play.  I welcome these challenges, and as an artist and communicator it is my responsibility to share my process, listen to feedback, and collaborate to find a solution.  That definitely occurred throughout the process, but it honestly caused me to lose my balance for a moment.

As I reflected on what happened – I was drawn to this idea that as a designer I try to have ultimate empathy for the end user.  And furthermore the amount of care given to the end user is never fully realized by the casual interactor.  A melancholy realization, but one that should not be neglected or forgotten.

Moving on to the next book:

Rework by Jason Fried & David Heinemeier Hansson

This one landed in my lap because it was available while I was perusing library books.

A quick read that talks about how to succeed in business.  It takes an extreme focus on being married to a vision and committing to it.  The authors focus on getting work done.  Sticking to a position and seeing it through.  I very much appreciated that they were PROUD of decisions they made for their products and company.  Active decisions NOT to do something can be more liberating and make someone more successful than being everything to everyone.

Last up was this guy:

Envisioning Information by Edward Tufte

A continuation of reading through all the Tufte books.  I am being lazy by saying “more of the same.”  Or “what I’ve come to expect.”  These are lazy terms, but they encapsulate what Tufte writes about: understanding visual displays of information.  Analyzing at a deep level the good, bad, and ugly of displays to get to the heart of how we can communicate through visuals.

I particularly loved some of the amazing train timetables displayed.  The concept of using lines to represent the timing of different routes was amazing to see.  And the way color is explored and leveraged is on another level.  I highly recommend this one if witnessing Tufte’s strong tongue-in-cheek style sounds entertaining.  I know for me it was.

February Book Binge

Another month has passed, so it’s time to recount what I’ve been reading.

Admittedly it was kind of a busy month for me, so I decided to mix up some of my book habits with podcasts.  To reflect that – I’ve decided to share a mixture of both.

 

First up is Rhinoceros Success by Scott Alexander

This is a short read designed to ignite fire and passion in whoever reads it. It walks through how a big burly rhino would approach everyday life, and how you as a rhino should follow suit.

I read this one while I was transitioning between jobs and found it to be a great source of humor during the process. It helps to articulate the ‘why’ behind certain things you may be doing and puts it in the context of what a rhino would do. This got me through some rough patches of uncertainty.

The next book was Made to Stick by the Heath brothers

This was another recommendation and one that I enjoyed. I will caveat and say that this book is really long. I struggled to get through a chapter at a time (~300 pages and only 7 chapters). It is chock-full of stories to help the reader understand the model required to make ideas stick.

I read this one because oftentimes a big part of my job is communicating a yet-to-be-seen vision. It is also about trying to get people to buy in to a new type of thinking. These aren’t easy tasks and can be met with resistance. The tools that the Heath brothers offer are simple and straightforward. I think they even extend further to writing or public speaking. How do you communicate a compelling idea that will resonate with your audience?

I’ve got their 2 other books and will be reading one of them in March.

Lastly – I wanted to spend a little bit of time sharing a podcast that I’ve come to enjoy. It is Design Matters with Debbie Millman.

This was shared with me by someone on Twitter. I found myself commuting much more than average this month (as part of the job change) and I was looking for media to consume during the variable-length (30 to 60 minute) commute. This podcast fits that time slot so richly. What’s awesome is the first podcast I listened to had Seth Godin on it (reading one of his books now) – so it was a great dual-purpose item. I could hear Seth and preview whether I should read one of his many books, and also get a dose of Debbie.

The beauty of this podcast for me is that Debbie spends a lot of time exploring the personality and history of modern artists/designers. She does this by amassing research on each individual and then having a very long sit-down to discuss her findings. Oftentimes this involves analyzing individual perspectives and recounting significant past events. I always find it illuminating how these people view the world and how they’ve “arrived” at their current place in life.

That wraps up my content diet for the month – and I’m off to listen to Seth.

Makeover Monday Week 10 – Top 500 YouTube Game(r) Channels

We’re officially 10 weeks into Makeover Monday, which is a phenomenal achievement.  This means that I’ve actively participated in recreating 10 different visualizations with data varying from tourism, to Trump, to this week’s YouTube gamers.

First, some commentary people may not like to read: the data set was not that great.  There’s one huge reason why: one of the measures (plus a dimension) was a dependent variable of two independent variables in the same set.  And that dependent variable was produced by a pre-built algorithm.  So it would almost make sense to use the resultant dependent variable to enrich other data.

I’m being very abstract right now – here’s the structure of the data set:

Let’s walk through the fields:

  • Rank – this is a component based entirely on the sort chosen at the top of the site (for this view it is by video views; not sure what those random 2 are, I just screencapped the site)
  • SB Score/Rank – this is some sort of ranking value applied to a user based on a proprietary algorithm that takes a few variables into consideration
  • SB Score (as a letter grade) – the letter grade expression of the SB score
  • User – the name of the gamer channel
  • Subscribers – the # of channel subscribers
  • Video Views – the # of video views

As best as I can tell from reading the methodology – SB score/rank (the number and the letter) are influenced in part by the subscribers and video views.  Which means putting these in the same view is really sort of silly.  You’re kind of at a disadvantage if you scatterplot subscribers vs. video views, because the score is purportedly more accurate in terms of finding overall value/quality.

There’s also not enough information contained within the data set to amass any new insights on who is the best and why.  What you can do best with this data set is summarization, categorization, and displaying what I consider data set “vitals.”

So this is the approach that I took.  And more to that point, I wanted to make over a very specific chart style that I have seen Alberto Cairo employ a few times throughout my 6-week adventure in his MOOC.

That view: a bar chart sliced through with lines to help understand size of chunks a little bit better.  This guy:

So my energy was focused on that – which only happened after I did a few natural (in my mind) steps in summarizing the data, namely histograms:

Notice here that I’ve aligned the axis values across all 3 charts (starting with SB grade and carrying through to its sibling charts) to minimize clutter.  I think this has a decent effect, but I admit that the bars aren’t equal width across each bar chart.  That’s not pleasant.

My final two visualizations were to demonstrate magnitude and add more specifics in a visual manner to what was previously a giant text table.

The scatterplot helps to achieve this by displaying the 2 independent variables, with the overall “SB grade” encoded on both color and size.  Note: for size I used powers of 2: 2^9, 2^8, 2^7…2^1.  This gave a decent exponential effect to break up the sizing in a consistent manner.
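In code form, the size encoding amounts to something like this (the exact grade ordering is my assumption – I’m just illustrating the halving):

```python
# Powers-of-2 size encoding: each successive grade gets half the mark size.
grades = ["A+", "A", "A-", "B+", "B", "B-", "C+", "D+", "D"]  # assumed ordering
sizes = {grade: 2 ** (9 - i) for i, grade in enumerate(grades)}
print(sizes)  # {'A+': 512, 'A': 256, ..., 'D': 2}
```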

The unit chart on the right is to help demonstrate not only the individual members, but display the elite A+ status and the terrible C+, D+, and D statuses.  The color palette used throughout is supposed to highlight these capstones – bright on the edges and random neutrals between.

This is aptly named an exploration because I firmly believe the resultant visualization was built to broadly pluck away at the different channels and get intrigued by the “details.”  In a more real-world scenario I would be out hunting for additional data to tie this back to – money, endorsements, average video length, number of videos uploaded, subject matter area, type of ads utilized by the user.  All of these, appended to this basic metric aimed at measuring a user’s “influence,” would lead down the path of a true analysis.

The Flow of Human Migration

Today I decided to take a bit of a detour while working on a potential project for #VizForSocialGood.  I was focused on a data set provided by UNICEF that showed the number of migrants from different areas/regions/countries to destination regions/countries.  I’m pretty sure it is the direct companion to a chord diagram that UNICEF published as part of their Uprooted report.

As I was working through the data, I wanted to take it and start at the same place: focus on migration globally and then narrow in on children affected by migration.

Needless to say – I got sidetracked.  I started by wanting to make paths on maps showing the movement of migrants.  I haven’t really done this very often, so I figured this would be a great data set to play with.  Once I set that up, it quickly devolved into something else.

I wasn’t satisfied with the density of the data, and the clarity of how it was displayed wasn’t there for me.  So I decided to take a more abstract approach to the same concept.  As if by fate, I had received Chart Chooser cards in the mail earlier, and Josh and I were reviewing them.  We were having a conversation about the various uses of each chart and brainstorming how they could be incorporated into our next Tableau user group (I really do eat, drink, and breathe this stuff).

Anyway – one of the charts we were talking about was the sankey diagram.  So it was already on my mind and I’d seen it accomplished multiple times in Tableau.  It was time to dive in and see how this abstraction would apply to the geospatial.

I started with Chris Love’s basic tutorial of how to set up a sankey.  It’s a really straightforward read that explains all the concepts required to make this work.  Here’s the quick how-to in my paraphrased words.

  1. Duplicate your data via a Union and identify the original and the copy.  (Which is great because I had already done this for the pathing.)  As I understand it from Chris’s write-up, this lets us ‘stretch out’ the data, so to speak.
  2. Once the data is stretched out, it’s filled in by manipulating the binning feature in Tableau.  My interpretation would be that the bins ‘kind of’ act like dimensions (labeled out by individual integers).  This becomes useful in creating individual points that eventually turn into the line (curve).
  3. Next there are ranking functions made to determine the starting and end points of the curves.
  4. Finally the curve is built using a mathematical function called a sigmoid function.  This is basically an asymptotic function that goes from -1 to 1 and has a middle area with a slope of ~1 (see the sketch after this list).
  5. After the curve is developed, the points are plotted.  This is where the ranking is set up to determine the leftmost and rightmost points.  Chris’s original specifications had the ranking straightforward for each of the dimensions.  My final viz is a riff on this.
  6. The last steps are to switch the chart to a line chart and then build out the width (size) of the line based on the measure you used in the ranking (percent of total) calculation.
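Here’s a minimal sketch of that curve logic in Python – NumPy standing in for Tableau’s bins and table calcs, with hypothetical end points (the rescaling to the -1…1 range is my reading of the description above):

```python
# Sigmoid curve for a sankey chord: densified points along an S-curve
# connecting a left-side rank position to a right-side rank position.
import numpy as np

t = np.linspace(-6, 6, 49)          # the densified 'bin' points along the curve
sigmoid = 2 / (1 + np.exp(-t)) - 1  # rescaled logistic: asymptotes at -1 and 1

left_y, right_y = 0.8, 0.2          # hypothetical rank positions of the two ends
curve_y = left_y + (sigmoid + 1) / 2 * (right_y - left_y)

print(curve_y[0], curve_y[-1])      # starts near 0.8, ends near 0.2
```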

So I did all those steps and ended up with exactly what was described – a sankey diagram.  A brilliant one, too: I could quickly switch the origin dimension to different levels (major area, region, country) and do similar work on the destination side.  This is what ultimately led me to the final viz I made.

So while adjusting the table calculations, I came to one view that I really enjoyed.  The ranking pretty much “broke” for the initial starting point (everything was at 1), but the destination was right.  What this did for the viz was take everything from a single point and then create roots outward.  Initial setup had this going from left to right – but it was quite obvious that it looked like tree roots.  So I flipped it all.

I’ll admit – this is mostly a fun data shaping/vizzing exercise.  You can definitely gain insights through the way it is deployed (take a look at Latin America & Caribbean).

After the creation of the curvy onion shape, it was a “what to add next” free-for-all.  I had wrestled with the names of the destination countries to try and get something reasonable, but couldn’t figure out how to display them in proximity to the lines.  No matter – the idea of a word cloud seemed kind of interesting.  You’d get the same concept of the different chord sizes passed on again and see a ton of data on where people are migrating.  This also led to some natural interactivity: clicking on a country code to see its corresponding chords above.

Finally, to add more visual context, I included a simple breakdown of the major regions’ origins to destinations – to tell the story a bit further.  The story point for me: most migrants move within their same region, except for Latin America/Caribbean.

And so it begins – Adventures in Python

Tableau 10.2 is on the horizon and with it comes several new features – one that is of particular interest to me is their new Python integration.  Here’s the Beta program beauty shot:

Essentially what this means is that more advanced programming languages aimed at doing more sophisticated analysis will become an easy-to-use extension of Tableau.  As you can see from the picture, it’ll work similarly to the R integration, with the end user using the SCRIPT_STR() function to pass native Python code through and return output.
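For a feel of what that might look like, here’s a sketch of the kind of Python you’d pass through.  The argument naming (_arg1) and the wrapping calculated field – something like SCRIPT_STR("…", AVG([Score])) – are my assumptions based on how the R integration works, not confirmed 10.2 syntax:

```python
# Hypothetical body of a SCRIPT_STR() pass-through: Tableau hands the
# aggregated field over as a list (_arg1) and expects a same-length
# list of strings back.
def label_scores(_arg1):
    return ['high' if s > 0.5 else 'low' for s in _arg1]

print(label_scores([0.9, 0.2, 0.7]))  # ['high', 'low', 'high']
```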

I have to admit that I’m pretty excited by this.  For me I see this propelling some data science concepts more into the mainstream and making it much easier to communicate and understand the purpose behind them.

In preparation I wanted to spend some time setting up a Linux Virtual Machine to start getting a ‘feel’ for Python.

(Detour) My computer science backstory: my intro to programming was C++ and Java.  They both came easily to me.  I later tried to take a mathematics class based in UNIX that was probably a precursor to some of the modern languages we’re seeing, but I couldn’t get on board with the “terminal”-level entry.  Very off-putting coming from a world where you have a better feedback loop for what you’re coding.  Since that time (~9 years ago) I haven’t had the opportunity to encounter or use these types of languages.  In my professional world everything is built on SQL.

Anyway, back to the main heart – getting a box set up for Python.  I’m a very independent person and like to take the knowledge I’ve learned over time and troubleshoot my way to results.  The process of failing and learning on the spot with minimal guidance helps me solidify my knowledge.

Here are the steps I went through – mind you, I have a PC and I am intentionally running Windows 7.  (This is a big reason why I made a Linux VM.)

  1. Download and install VirtualBox by Oracle
  2. Download x86 ISO of Ubuntu
  3. Build out Linux VM
  4. Install Ubuntu

These first four steps are pretty straightforward in my mind.  Typical Windows installer for VirtualBox.  Getting the image is very easy as is the build (just pick a few settings).

Next came the Python part.  I figured I’d have to install something on my Ubuntu machine, but I was pleasantly surprised to learn that Ubuntu already comes with Python 2.7 and 3.5.  A step I don’t have to do, yay!

Now came the part where I hit my first real challenge.  I had this idea of getting to a point where I could go through steps of doing sentiment analysis outlined by Brit Cava on the Tableau blog.  I’d reviewed the code and could follow the logic decently well.  And I think this is a very extensible starting point.

So based on the blog post I knew there would be some Python modules I’d need.  Googling led me to believe that installing Anaconda would be the best path forward: it contains several of the most popular Python modules, so installing it would eliminate the need to add modules in individually.

I downloaded the file just fine, but the instructions on “installing” were less than stellar.  Here were the instructions:

Directions on installing Anaconda on Linux

So as someone who takes instructions very literally (and again – doesn’t know UNIX very well) I was unfortunately greeted with a nasty error message lacking any help.  Feelings from years ago were creeping in quickly.  Still, I Googled my way through it (and had a pretty good inkling that the shell just couldn’t ‘find’ the file).

What they said (also notice I dropped the _64, since mine isn’t 64-bit).

 

At last – that was all that was needed to get the file to install!

So installing Anaconda turned out to be pretty easy – after getting the right command in the prompt.  Then came the fun part: trying to do sentiment analysis.  I knew from reading that Anaconda came with the three modules mentioned: pandas, nltk, and time.  So I felt like this was going to be pretty easy to test out – coding directly from the terminal.

Well – I hit my second major challenge.  The lexicon required to do the sentiment analysis wasn’t included, so I had no way of actually doing the sentiment analysis and was left to figure it out on my own.  This part was actually not that bad; Python gave me a good prompt to fix – essentially to call the nltk downloader and get the lexicon.  And the nltk downloader has a cute little GUI to find the right lexicon (vader).  I got this installed pretty quickly.
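For reference, the fix boils down to a couple of lines – a minimal sketch assuming nltk’s VADER analyzer (the same one the blog post uses):

```python
# One-time lexicon fetch; nltk.download() without arguments opens the GUI.
import nltk
nltk.download('vader_lexicon')

# Sanity check that scoring works once the lexicon is in place.
from nltk.sentiment.vader import SentimentIntensityAnalyzer
sia = SentimentIntensityAnalyzer()
print(sia.polarity_scores('yay'))  # includes a 'compound' score between -1 and 1
```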

Finally – I was confident that I could input the code and come up with some results.  And this is where I hit my last obstacle and probably the most frustrating of the night.  When pasting in the code (raw form from blog post) I kept running into errors.  The message wasn’t very helpful and I started cutting out lines of code that I didn’t need.

What’s the deal with line 5?

Eventually I figured out the problem – there were weird spaces in the raw code snippet.  After some additional googling (this time by my husband), he kindly reported that “apparently spaces matter according to this forum.”  No big deal – lesson learned!

Yes! Success!

So what did I get at the end of the day?  A wonderful CSV output of sentiment scores for all the words in the original data set.

Looking good, there’s words and scores!
Back to my comfort zone – a CSV

Now for the final step – validating that my results aligned with expectations.  And they did – yay!

0.3182 = 0.3182

Next steps: viz the data (obviously).  I’m hoping to extend this to an additional sentiment analysis, maybe even something from Twitter.  Oh, and I also ended up running (you guessed it – already installed) a Jupyter notebook to get over the pain of typing directly in the terminal.

Synergy through Action

This has been an amazing week for me.  On the personal side of things my ship is sailing in the right direction.  It’s amazing what the new year can do to clarify values and vision.

Getting to the specifics of why I’m calling this post “Synergy through Action”: that’s the best way for me to describe how my participation in this week’s Tableau and data visualization community offerings has influenced me.

It all actually started on Saturday.  I woke up and spent the morning working on a VizforSocialGood project, specifically a map to represent the multiple locations connected to the February 2017 Women in Data Science conference.  I’d been called out on Twitter (thanks Chloe) and felt compelled to participate.  The kick of passion I received after submitting my viz propelled me into the right mind space to tackle 2 papers toward my MBA.

Things continued to hold steady on Sunday where I took on the #MakeoverMonday task of Donald Trump’s tweets.  I have to imagine that the joy from accomplishment was the huge motivator here.  Otherwise I can easily imagine myself hitting a wall.  Or perhaps it gets easier as time goes on?  Who knows, but I finished that viz feeling really great about where the week was headed.

Monday – Alberto Cairo and Heather Krause’s MOOC was finally open!  Thankfully I had the day off to soak it all in.  This kept my brain churning.  And by Wednesday I was ready for a workout!

So now that I’ve described my week – what’s the synergy through action part?  Well, I took all the thoughts from the social good project, Workout Wednesday, and the sage wisdom from the MOOC this week to hit on something much closer to home.

I wound up creating a visualization in the vein of the #WorkoutWednesday redo offered up.  What’s it of?  Graduation rates of specific demographics for every county in Arizona for the past 10ish years.  Stylized into small multiples using a smattering of slick tricks I was required to use to complete the workout.

Here’s the viz – although admittedly it is designed more as a static view (not quite an infographic).

 

And to sum it all up: this could be the start of yet another spectacular thing – bringing my passion to the local community that I live in, but on a more widespread level (in the words of Dan Murray, user groups are for “Tableau zealots”).

#DataResolutions – More than a hashtag

This gem of a blog post appeared on Tableau Public and within my Twitter feed earlier this week, asking what my #DataResolutions are.  Here was my lofty response:

 


Sound like a ton of goals and like I’m setting myself up for failure?  Think again.  At the heart of most of my work with data visualization are 2 concepts: growth and community.  I’ve had the amazing opportunity to co-lead and grow the Phoenix Tableau user group over the past 5+ months.  And one thing I’ve learned along the way: to be a good leader you have to show up.  Regardless of skill level, technical background, or formal education, we’re all bound together by our passion for data visualization and data analytics.

To ensure that I communicate my passion, I feel that it’s critical to demonstrate it.  It grows me as a person and stretches me outside of my comfort zone to an extreme.  And it opens up opportunities and doors for me to grow in ways I didn’t know existed.  A great example of this is enrolling in Alberto Cairo and Heather Krause’s MOOC, Data Exploration and Storytelling: Finding Stories in Data with Exploratory Analysis and Visualization.  I see drama and storytelling as a development area for me personally.  Quite often I get so wrapped up in the development of data stories that the final product is a single component being used as my own visual aid.  I’d like to learn how to communicate the entire process within a visualization and guide a reader through it.  I also want to be surrounded by 4k peers who have their own passions and opinions.

Moving on to collaborations.  There are 2 collaborations I mentioned above: one surrounding data+women and the other data mashup.  My intention behind developing these out is to once again grow out of my comfort zone.  Data Mashup is also a great way for me to enforce accountability to Makeover Monday and to develop my visualization interpretation skills.  The data+women project is still in an incubation phase, but my goal there is to spread some social good.  In our very cerebral world, sometimes it takes a jolt from someone new to serve as fuel for validation and action.  I’m hoping to create some of this magic and get some of the goodness of it from others.

More to come, but one thing is for sure: I can’t fail if I don’t write down what I want to achieve.  The same is true for achievement: unless it’s written down, how can I measure it?