And so it begins – Adventures in Python

Tableau 10.2 is on the horizon, and with it come several new features – one of particular interest to me is the new Python integration.  Here’s the Beta program beauty shot:

Essentially this means that more advanced programming languages aimed at more sophisticated analysis will become an easy-to-use extension of Tableau.  As you can see from the picture, it’ll work similarly to the R integration, with the end user using the SCRIPT_STR() function to pass native Python code through and return the output.
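As a sketch of what this might look like – assuming the final syntax mirrors the R integration and the Beta screenshots, and with a made-up field name – a calculated field could read:

```
SCRIPT_STR(
    "return ['High' if v >= 0.5 else 'Low' for v in _arg1]",
    SUM([Sentiment Score])
)
```

The `_arg1` placeholder is how the R integration refers to the first field passed in, so I’d expect the Python side to work the same way; treat this as a guess until 10.2 actually ships.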

I have to admit that I’m pretty excited by this.  I see this propelling some data science concepts further into the mainstream and making it much easier to communicate and understand the purpose behind them.

In preparation I wanted to spend some time setting up a Linux Virtual Machine to start getting a ‘feel’ for Python.

(Detour) My computer science backstory: my intro to programming was C++ and Java.  They both came easily to me.  Later on I tried to take a mathematics class based in UNIX that was probably the precursor to some of the modern languages we’re seeing, but I couldn’t get on board with the terminal-level entry.  Very off-putting coming from a world where you have a better feedback loop in terms of what you’re coding.  Since that time (~9 years ago) I haven’t had the opportunity to encounter or use these types of languages.  In my professional world everything is built on SQL.

Anyway, back to the main heart – getting a box set up for Python.  I’m a very independent person and like to take the knowledge I’ve learned over time and troubleshoot my way to results.  The process of failing and learning on the spot with minimal guidance helps me solidify my knowledge.

Here are the steps I went through – mind you, I have a PC and I am intentionally running Windows 7.  (This is a big reason why I made a Linux VM.)

  1. Download and install VirtualBox by Oracle
  2. Download x86 ISO of Ubuntu
  3. Build out Linux VM
  4. Install Ubuntu

These first four steps are pretty straightforward in my mind.  Typical Windows installer for VirtualBox.  Getting the image is very easy, as is the build (just pick a few settings).

Next came the Python part.  I figured I’d have to install something on my Ubuntu machine, but I was pleasantly surprised to learn that Ubuntu already comes with Python 2.7 and 3.5.  A step I don’t have to do, yay!
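If you want to double-check what ships with your own Ubuntu image, the terminal makes it quick (exact versions will vary by release):

```shell
# Check which Python interpreters are already on the machine
python3 --version
python --version 2>/dev/null || echo "no Python 2 interpreter found"
```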

Now came the part where I hit my first real challenge.  I had this idea of getting to a point where I could go through steps of doing sentiment analysis outlined by Brit Cava on the Tableau blog.  I’d reviewed the code and could follow the logic decently well.  And I think this is a very extensible starting point.

So based on the blog post I knew there would be some Python modules I’d need.  Googling led me to believe that installing Anaconda would be the best path forward: it bundles several of the most popular Python modules, so installing it would eliminate the need to add modules individually.

I downloaded the file just fine, but the instructions on “installing” were less than stellar.  Here are the instructions:

Directions on installing Anaconda on Linux

So as someone who takes instructions very literally (and again – doesn’t know UNIX very well) I was unfortunately greeted with a nasty error message lacking any help.  Feelings from years ago were creeping in quickly.  Alas, I Googled my way through this (and had a pretty good inkling that it just couldn’t ‘find’ the file).

What they said (notice I already dropped the _64, since mine isn’t 64-bit).


Alas – all that was needed to get the file to install!
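For anyone hitting the same wall, the fix boiled down to pointing bash at the actual file path.  A sketch of the safe way to do it – the filename here is an example; use the exact name of whatever you downloaded:

```shell
# The installer lives wherever the browser saved it -- usually ~/Downloads.
# Filename below is an assumption; match your actual download.
INSTALLER="$HOME/Downloads/Anaconda2-4.2.0-Linux-x86.sh"

if [ -f "$INSTALLER" ]; then
    bash "$INSTALLER"
else
    echo "installer not found at $INSTALLER -- check the path and filename"
fi
```

Checking that the file exists first would have saved me the “nasty error message” entirely.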

So installing Anaconda turned out to be pretty easy once I got the right command in the prompt.  Then came the fun part: trying to do sentiment analysis.  I knew from my reading that Anaconda came with the three modules mentioned: pandas, nltk, and time.  So I felt like this was going to be pretty easy to test out – coding directly from the terminal.

Well – I hit my second major challenge.  The lexicon required to do the sentiment analysis wasn’t included.  So I had no way of actually running the sentiment analysis and was left to figure it out on my own.  This part was actually not that bad; Python gave me a helpful prompt to fix it – essentially to call the nltk downloader and get the lexicon.  And the nltk downloader has a cute little GUI for finding the right lexicon (vader).  I got this installed pretty quickly.

Finally – I was confident that I could input the code and come up with some results.  And this is where I hit my last obstacle and probably the most frustrating of the night.  When pasting in the code (raw form from blog post) I kept running into errors.  The message wasn’t very helpful and I started cutting out lines of code that I didn’t need.

What’s the deal with line 5?

Eventually I figured out the problem – there were weird spaces in the raw code snippet.  After some additional googling (this time by my husband), he kindly reported that “apparently spaces matter according to this forum.”  No big deal – lesson learned!
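For the curious: the usual culprit in pasted code is a non-breaking space (U+00A0) that looks identical to a regular space but is a SyntaxError to Python.  A stdlib-only sketch of scrubbing a pasted snippet (my own helper, not from the original post):

```python
def scrub(snippet: str) -> str:
    """Replace invisible lookalike whitespace with plain ASCII spaces."""
    for bad in ('\u00a0', '\u2007', '\u202f'):  # common non-breaking spaces
        snippet = snippet.replace(bad, ' ')
    return snippet

pasted = "scores\u00a0= []"   # looks fine on screen, but won't compile
print(scrub(pasted))          # scores = []
```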

Yes! Success!

So what did I get at the end of the day?  A wonderful CSV output of sentiment scores for all the words in the original data set.
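The output step itself is tiny.  A stdlib-only sketch of writing word-level scores out to CSV – the words, scores, and filename here are made up for illustration:

```python
import csv

# Hypothetical word/compound-score pairs standing in for the real output
scores = [("wonderful", 0.5719), ("terrible", -0.4767), ("data", 0.0)]

with open("word_sentiment.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["word", "compound_score"])
    writer.writerows(scores)
```

(The original post used pandas, whose `DataFrame.to_csv` collapses all of this into one line.)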

Looking good, there’s words and scores!
Back to my comfort zone – a CSV

Now for the final step – validating that my results aligned with expectations.  And they did – yay!

0.3182 = 0.3182

Next steps: viz the data (obviously).  And I’m hoping to extend this to additional sentiment analysis, maybe even something from Twitter.  Oh, and I also ended up running a Jupyter notebook (you guessed it – already installed) to get over the pain of typing directly in the terminal.

Synergy through Action

This has been an amazing week for me.  On the personal side of things my ship is sailing in the right direction.  It’s amazing what the new year can do to clarify values and vision.

Getting to the specifics of why I’m calling this post “Synergy through Action.”  That’s the best way for me to describe how my participation in this week’s Tableau and data visualization community offerings has influenced me.

It all actually started on Saturday.  I woke up and spent the morning working on a VizforSocialGood project, specifically a map to represent the multiple locations connected to the February 2017 Women in Data Science conference.  I’d been called out on Twitter (thanks Chloe) and felt compelled to participate.  The kick of passion I received after submitting my viz propelled me into the right mind space to tackle 2 papers toward my MBA.

Things continued to hold steady on Sunday where I took on the #MakeoverMonday task of Donald Trump’s tweets.  I have to imagine that the joy from accomplishment was the huge motivator here.  Otherwise I can easily imagine myself hitting a wall.  Or perhaps it gets easier as time goes on?  Who knows, but I finished that viz feeling really great about where the week was headed.

Monday – Alberto Cairo and Heather Krause’s MOOC was finally open!  Thankfully I had the day off to soak it all in.  This kept my brain churning.  And by Wednesday I was ready for a workout!

So now that I’ve described my week – what’s the synergy in action part?  Well I took all the thoughts from the social good project, workout Wednesday, and the sage wisdom from the MOOC this week to hit on something much closer to home.

I wound up creating a visualization in the vein of the #WorkoutWednesday redo offered up.  What’s it of?  Graduation rates of specific demographics for every county in Arizona over the past 10ish years.  Stylized into small multiples using a smattering of slick tricks I was required to use to complete the workout.

Here’s the viz – although admittedly it is designed more as a static view (not quite an infographic).


And to sum it all up: this could be the start of yet another spectacular thing.  Bringing my passion to the local community that I live in – but more on a widespread level (in the words of Dan Murray, user groups are for “Tableau zealots”).

#DataResolutions – More than a hashtag

This gem of a blog post appeared on Tableau Public and within my twitter feed earlier this week asking what my #DataResolutions are.  Here was my lofty response:


Sound like a ton of goals and setting myself up for failure?  Think again.  At the heart of most of my work with data visualization are 2 concepts: growth and community.  I’ve had the amazing opportunity to co-lead and grow the Phoenix Tableau user group over the past 5+ months.  And one thing I’ve learned along the way: to be a good leader you have to show up.  Regardless of skill level, technical background, formal education, we’re all bound together by our passion for data visualization and data analytics.

To ensure that I communicate my passion, I feel that it’s critical to demonstrate it.  It grows me as a person and stretches me outside of my comfort zone to an extreme.  And it opens up opportunities and doors for me to grow in ways I didn’t know existed.  A great example of this is enrolling in Alberto Cairo and Heather Krause’s MOOC Data Exploration and Storytelling: Finding Stories in Data with Exploratory Analysis and Visualization.  I see drama and storytelling as a development area for me personally.  Quite often I get so wrapped up in the development of data stories that the final product is a single component being used as my own visual aid.  I’d like to learn how to communicate the entire process within a visualization and guide a reader through it.  I also want to be surrounded by 4k peers who have their own passions and opinions.

Moving on to collaborations.  There are 2 collaborations I mentioned above: one surrounding data+women and the other data mashup.  My intention behind developing these out is once again to grow out of my comfort zone.  Data Mashup is also a great way for me to enforce accountability to Makeover Monday and to develop my visualization interpretation skills.  The data+women project is still in an incubation phase, but my goal there is to spread some social good.  In our very cerebral world, sometimes it takes a jolt from someone new to serve as fuel for validation and action.  I’m hoping to create some of this magic and get some of the goodness of it from others.

More to come, but one thing is for sure: I can’t fail if I don’t write down what I want to achieve.  The same is true for achievement: unless it’s written down, how can I measure it?

Book Binge – December Edition

I typically spend the end of my year self reflecting on how things have gone – both the good and the bad.  Usually that leads me to this thoughtful place of “I need more books.”  For some reason to me books are instant inspiration and a great alternative to binge streaming.  They remind me of the people I want to be, the challenges I want to battle and conquer, and seamlessly entangle themselves into whatever it is I am currently experiencing.

Here are 3 of my binges this month:

First up: You are a Badass: How to Stop Doubting Your Greatness and Start Living Your Life by Jen Sincero

This is a really great read.  Despite the title being a little melodramatic (I don’t really believe that I’m not already a super badass, or that my greatness isn’t already infiltrating the world), Jen writes in a style that is very easy to understand.  She breaks down several “self help” concepts in an analytical fashion that reveals itself through words that actually make sense.  There’s a fair amount of brash language as well, something I appreciate in writing.

Backstory on this purchase:  I actually bought a copy of this book for me and 2 fellow data warriors.  I wanted it to serve as a reminder that we are badasses and can persevere in a world where we’re sometimes misunderstood.

To contradict all the positivity I learned from Jen Sincero, I then purchased this guy: The Subtle Art of Not Giving a F*ck by Mark Manson.  (Maybe there’s a theme here: I like books with profanity on the cover?)

Despite the title, it isn’t about how to be indifferent to everything in the world – definitely not a guide on detaching from everything going on.  Instead it’s a book designed to help you prioritize the important things, see suffering as a growth opportunity, and figure out what suffering you’re willing to take on repeatedly.  I’m still working my way through this one, but I appreciate some of the basic principles that we all need to hear.  Namely that the human condition IS to be in a constant state of solving problems – suffering, fixing, improving, overcoming.  That there is no finish line, and when you reach your goal you don’t get confetti and prizes (maybe you do), but instead a whole slew of new problems to battle.

Last book of the month is more data related.  It’s good old Tableau Your Data by Dan Murray and the InterWorks team.

I was inspired to buy this after I met Dan (way back in March of 2016).  I’ve had the book for several months, but wanted to give it a shout out for being my friend.  I’ve had some sticky challenges regarding Tableau Server this month and the language, organized layout, and approach to deployment have been the reinforcement (read as: validation) I’ve needed at times in an otherwise turbulent sea.

More realistically – I try to buy at least 1 book a month.  So I’m hoping to break in some good 2017 habits of doing small recaps on what I’ve read and the imprint new (or revisited) reads leave behind.

How do you add value through data analytics?

I recently read this article that really ignited a lot of thoughts that often swirl around in my mind.  If you were to ask me what my drive is, it’s making data-informed, data-driven decisions.  My mechanism for this is through data visualization.  More broadly than that, it is communicating complex ideas in a visual manner.  Often when you take an idea and paint it into a picture people can connect more deeply to it and it becomes the catalyst for change.

All that being said – I’ve encountered a sobering problem.  Those on the more “analytical” side of the industry sometimes fail to see the value in the communication aspect of data analytics.  They’ve become mired down by the concept that knowing statistical programming languages, database theory, and structured query language are the most important aspects of the process.  While I don’t discount the significance of these tools (and the ability to utilize them correctly), I can’t be completely on board with it.

We’ve all sat in a meeting that is born out of one idea: how do we get better?  We don’t get better by writing the most clever and efficient SQL query.  We get better by talking through and really understanding what it IS we’re trying to measure.  When we say X, what do we mean?  How do we define X?  Defining X is the hard part – pulling it out of the database, not so difficult.  If you get really good at definitions, it becomes intuitive when you start trying to incorporate them into your business initiatives.

As we continue to evolve in the business world, I highly encourage those from both ends of the spectrum to try to meet somewhere in the middle.  We have an unbelievable number of technical tools at our disposal, yet quite often you step into a business that is still trying to figure out HOW to measure the most basic of metrics.  Let’s stop and consider how this happened and work on achieving excellence and improvement through the marriage of business and technical acumen – with artistry and creativity thrown in for good measure.

#data16 Day 3

Admittedly I’m jumping from day 1 to day 3.  I hit a micro wall on Tuesday.  But now that I’ve pushed through to Wednesday – it is time to focus on the amazing.

First up – paradigm shift.  I had a very novel vision of expectations and how to “get the most” out of the conference.  This involved the idea of attending several hands-on sessions and maximizing my time soaking in how others solve data problems.  The ‘why’ behind the initial decision: I have a particular love for seeing how other people pull apart problems.  I was once asked what my passion was by a colleague – I said that I loved understanding the universe.  Pulling apart anything and everything, understanding it, cataloging it, figuring out how it fits into existence.  So faced with the opportunity to see how others tackle things was something I had to do.

So what was the paradigm shift?  The conference isn’t just for seeing people solve problems.  It’s about seeing people communicate their passion.  And this happens in a million different ways.  This morning it happened with Star Trek and making data fun and serious.  Later it was 300+ slides of humor secretly injected with sage wisdom.  The word that comes to my mind is intensity.  I think really what I started seeking out was intensity.  And there’s no shortage.

My takeaway: Focus more on the passion and intensity from others.  Soaking this in becomes fuel.  Fuel for improvement, potential, and endless possibilities.  I can always go back and learn the intricate, well documented solutions.  I can’t recreate magic.

Second item – commitment.  Commitment is accountability, following through, sticking it out, dedication.  Commitment is daunting.  Commitment is a conscious choice.  I made a commitment to myself to be present, to engage with others.  Following through has been difficult (and very imperfect), but it has been unbelievably rewarding.  Thinking back to my day 1 thoughts – I fall back to community.  Committing to this community has been one of the best decisions I’ve made.

My takeaway: Human connections matter and are second to none.  Human connections make all the gravy of data visualization, playing with data, and problem solving possible.  (Also when you commit to dancing unafraid at a silent disco, you end up having an amazing day.)

Final item – Try everything that piques interest.  (This one I will keep short because it’s late.)  If you sense something is special, RUN TOWARD IT.  Special is just that: special.  Unique, one-of-a-kind, infrequent.  I think the moments I’ve had while here will turn into what shapes the next year of my life adventures.

My love note for Wednesday – in the books.

#data16 Day 1

What better way to commemorate my first day at #data16 than sharing the highs, lows – what has met expectations and what I didn’t expect.

The community – Probably the one thing I couldn’t anticipate coming into #data16 was how the virtual community (mainly via Twitter) compared to reality.  Like internet dating, you never really know how things are going to be until you meet someone in real life.  Not that I am shocked, but everyone that I’ve met from the blogosphere/twitterverse has been even more amazing than I imagined.  From sitting next to an Iron Viz contestant and forming a friendship on the plane, to getting a ride to my hotel, to meeting up with friends in a crazy food truck alley, to someone shouting my name in the Expo Hall – it’s been a wave of positive energy.

One unexpected component was the local Phoenix community.  It’s been awesome to see familiar faces from Phoenix wandering around Austin soaking in every moment.  I wanted to come to Austin and feel surrounded by familiar and that is definitely something that’s been accomplished.

The venues – When I was 18, I redecorated my childhood bedroom to be more “adult.”  Part of the process was finding the perfect desk for my space.  I somehow stumbled onto an Ikea webpage (mind you, I grew up in a small-ish city in Indiana).  Not knowing too much, I convinced my mom to road trip to Chicago to go to Ikea and buy my perfect desk.  What I expected at the time was to walk into a normal size furniture store.  I couldn’t fathom or anticipate the sheer size the store turned out to be.  That’s been my experience in Austin so far.  Overwhelmingly massive in size with everything being on a grand unexpected scale.  Not bad, just unexpected.  The registration desk had 50+ smiling faces greeting me.

Logistics – I’m still early in the game, so I will have to elaborate after a full day of conference.  So far I’ve been extremely impressed.  I was intimidated by being south of campus.  How would I get around, would I be able to be “in it?”  This has been a non-issue.  Details on transportation have been very transparent and well organized.  There’s been food at every turn, plenty to sustain even the weirdest of diets.

The weather – This has been my only letdown!  I can tell it has been rainy off and on, so it is super humid.  For someone used to the dry Arizona air, it’s a little different feeling the moisture in the air.  I’m sure my skin is thankful!  But tonight I’m left running the A/C to compensate for the moisture.  A huge change from swimming in Phoenix on Sunday to heavy humidity on Monday.

First up for my very full day Tuesday is hopefully a meetup for Healthcare followed up by the morning keynote (I really need to eat some breakfast!).  After that – we’ll see.  I originally anticipated spending the majority of my time in Jedi hands-on sessions.  I love seeing how people solve data problems and figuring out things I can take back, tweak and tinker with.  After today, I’m wondering if I should reevaluate.  The one thing I won’t be able to recreate after this experience are the people, so anytime there’s a schedule clash – for me I am prioritizing networking above all else.

#data16 day one in the books!