The Importance of Certification

I’ve long been a huge advocate of certification in technical fields.  I think it is a great way to actively communicate and demonstrate the skill level you have in a particular area.  Even more, in my mind it represents a foundation you can build on with others.

I personally started my Tableau certification journey last year (more like 18 to 20 months ago).  I was becoming much more heavily involved in my local Tableau user group and felt that I needed a way to benchmark or assess my skills.  I knew I had much more to contribute to my local community and I thought that going through the certification journey and sharing that with the TUG would be beneficial.

So I started by getting ready for the Desktop Qualified Associate exam.  Independently I went through all the existing documentation and searched for knowledge nuggets that would set me on the right path.  I took the stance of turning everything I learned into a format that could be digested by a larger audience.  I walked through all of the practice exam questions and presented an analysis of the different certification levels to the user group at least three times.

I passed the Desktop Qualified Associate around April or May of 2016.  It was a great achievement – and I was able to add definition to what that exam means.  Having the Desktop Qualified Associate certification means that you are technically proficient in the features and functions of Tableau Desktop.  It means that you can answer thoughtful questions using built-in features and that you have a depth of understanding of best practices and efficient ways to get to results.  If I were to equate it to a different skill – I would say that it means you know how and when to use different tools in a toolbox.  What it doesn’t mean: that you are a masterful architect or that you can build a stunningly beautiful home.

Getting to the next level of mastery and understanding means earning the Certified Professional.  If you take a look at the specific components that are tested you’ll quickly realize that advanced technical skill is weighted less than storytelling or analysis.  The purpose of the Desktop Certified Professional is to demonstrate that you have a deep understanding of data visualization, of using data visualization to tell a story, and of how level two (or three or four) analytics or analysis are necessary to start answering deeper and more important (read as: higher impact) questions.

For my preparation here – the exam prep guide was only the beginning.  It assists with the structural components: 1) knowing how many questions there will be, 2) estimating the time available to spend on each question, 3) examples of the analytical/presentation depth required to demonstrate proficiency, and 4) the variety of question types.

Probably the most intriguing questions for me are those where you have to assess a particular visualization, give and justify a critique (and specifically how it relates to a described objective) and then provide an alternative solution (also justifying verbally the importance).  This skill is much different than knowing how to hammer a nail into a post.  It is defending why you chose to put a porch on the northeast corner of a home.  It’s a lot about feel.

I had such an awesome time taking the exam.  There are a lot of real world constraints that required me to distill down the most important components of each question.  It’s interesting because for most items there isn’t a single right answer.  There are definitely lots of wrong answers, but right is a spectrum that is somewhat dependent on your ability to communicate out the completeness of your point of view.

I’ve had the title of Tableau Desktop Certified Professional for just over a year now – so I can tell you with a decent amount of retrospective thought what it has done for me.  Just as I am able to describe the test and purpose it served in this blog post – I can do the same thing in all of my interactions.  It keeps me humble, knowing that the PURPOSE behind a visual display is more important than fancy widgets or cool tricks.  That to a large extent my role is to work through the semantics of a situation and get to the root of it.  The root of the question or questions, the heart of concern, the why behind the visualization.  And also the artistry (yes I use this word) behind what it takes to get there.  We have all felt the difference between a perfectly acceptable visualization and the right visualization.  The end user experiences something different.  I firmly believe that deeper understanding can be achieved by spending that extra thoughtfulness and approaching the work iteratively.

So let’s now fast forward to the other certification path – the more recent one: Tableau Server.  What’s interesting is that because my strengths have been built out on the visualization end, I haven’t planted myself in an opportunity to understand the deeper technical components of Tableau Server.  I have always understood and had great depth of knowledge in Site Administration.  That is to say acknowledging and abiding by best practices for sharing, permissions, and managing content.  But – the part that I had not spent time on is creating a sustainable platform to have the vision continuously executed.

So to overcome that minor blind spot – I went on a mission to learn more, to shine light on the unknown.  You’ve seen that play out here on my blog – going on a self directed adventure to deploy a Server on Azure.  Nobody told me to do that – I was internally compelled.  (I should also mention I was honored to have a friend go on the journey with me!)

I’m probably rehashing at this point – but anytime you grow knowledge in a particular area (more specifically technical) it gives you such breadth and depth of vocabulary to be able to connect to other individuals.  You find that communication barriers that were preventing the success of a project are diminished because you now speak the same language.  As I write this I can hear Seth Godin saying that the more knowledge someone has in a particular subject area the more ABSTRACT their language is around it.  Which means that it is extremely difficult to communicate with that SME unless significant effort is taken on the part of both parties to bridge the gap.

So that’s what Tableau Server qualification has done for me.  It’s the first step on a journey – and I imagine the next level, Server Certified Professional, is about the act of execution.  It’s less knowledge and verbiage and more tactical.  Also there’s likely more ambiguity – not a single right answer, rather a spectrum of right where you communicate your why.

As I wind down this post – I must shout to you “go get certified.”  Ignore the naysayers.  It’s easy to not do something, but you know what is hard?  Doing something.  Being tested.  And why is that?  Because you can fail.  Get over failure – push through that mindset.  The alternative is much more pleasant and unlocks all the potential the universe has to offer.

 

Star Trek The Next Generation: Every Episode (#IronViz 3)

It’s that time again – Iron Viz feeder contest!  The third and final round for a chance to battle at conference in a chef coat is upon us.  This round the focus was on anything ‘Silver Screen.’

With a limitless topic I was certain that I would find myself in a creative rut that would likely result in submitting something at the end of the submission time period (August 13th).  So I am as shocked as anyone else that I have a fully formed submission way before deadline.

So what’s the topic and what got me unstuck?  Star Trek of course!  The backstory here is amazing – I went to a belated wedding shower for a few friends and they mentioned to me that they were going to the annual Star Trek convention.  And more specifically there was a special celebration occurring – the 30th anniversary of Star Trek: The Next Generation.  Not even up for debate – it just IS the best incarnation of the Star Trek universe.

So I decided to take a moment to do some research on finding TNG data.  It didn’t take me long to unearth this fantastic data set on GitHub that includes each episode’s script parsed out by character.

Really inspired by the thought of seeing each word of each episode visualized – I set forth on my mission.  As I got started there was one component that was mission critical: the bold, moody colors present throughout the world of Star Trek.  They are fantastic – especially paired with a black background.  And working with individual scripts meant that I could use color to accentuate different characters – much like their uniforms do in the episodes.

The next component that I wanted to evoke (again – design focused here) was the electronics and computer interfaces.  I particularly like the rounded edges and strong geometric shapes that are on the computer screens across all iterations of Star Trek.  So that describes most of the design – the choice of colors and how some of the visualizations were set up.

Now on to the next important component here: analysis.  When you see this visualization you may find yourself realizing that I don’t draw any conclusions.  For this collection of visualizations I am playing the role of curator.  I am developing a visual world for you to interact with, to go deep and wide in your understanding.  I am not attempting to summarize data for you or force conclusions upon you.  I am inviting you to come into the world of Star Trek, unearth who speaks during each episode, find out what that character is saying.  I want there to be an unending number of takeaways and perceptions generated from this.

And the last part you need to understand is the storytelling.  This entire visualization has an untold number of stories in it by virtue of it being a visualization of the entire series.  If you want a meta-story to tell it’s simply this: Star Trek The Next Generation is such a deep and rich world that you should go get lost.  And while you’re on the path of getting lost do me a favor: retain some leadership tidbits from Picard and sprinkle in some logical takeaways from Data.

 

Azure + Tableau Server = Flex

I’m affectionately calling this post Azure + Tableau Server = Flex for two reasons.  First – are you a desktop user that has always wanted to extend your skills in Tableau as a platform?  Or perhaps you’re someone who is just inherently curious and gains confidence by learning and doing (I fall into this camp).  Well then this is the blog post for you.

Let me back up a bit.  I am very fortunate to spend a majority of my working time (and an amount of my free time!) advocating for visual analytics and developing data visualizations to support the value it brings.  That means doing things like speaking with end users, developing requirements, partnering with application and database owners/administrators, identifying and documenting important metrics, and finally (admittedly one of the more enjoyable components) partnering with the end user on the build out and functionality of what they’re looking for.  It’s a very iterative process to get to results, with a fair amount of communication and problem solving sprinkled in with pure development time – a lucky job.  The context here is this: as soon as you start enabling people to harness the power of data visualization and visual analytics the immediate next conversation becomes: how can I share this with the world (or ‘my organization’)?  Aha!  We’ve just stepped into the world of Tableau Server.

Tableau Server or Tableau Online bring the capability to share the visualizations that you’re making with everyone around you.  It does exactly what you want it to do: share interactive, data-rich displays via a URL.  Just the thought of it gets me misty-eyed.  But, as with any excellent technology tool, it comes with the responsibility of implementation, maintenance, security, cost, and ultimately a lot of planning.  And this is where the desktop developer can hit a wall in taking things to that next level.  When you’re working with IT folks or someone who may have done something like this in the past, you’ll be hit with a question wall that runs the entire length of every potential ‘trap’ or ‘gotcha’ moment you’re likely to experience with a sharing platform.  And more than that – you’re tasked with knowing the answers immediately.  Just when you thought getting people to adopt terms like tooltip, boxplot, and dot plot was exciting, they start using words like performance, permissions, and cluster.

So what do you do?  You start reading through administration guides, beefing up your knowledge on the platform, and most likely extending your initial publisher perspective of Tableau Server to the world of server administrator or site administrator.  But along the way you may get this feeling – I certainly have – I know how to talk about it, but I’ve never touched it.  This is all theoretical – I’ve built out an imaginary instance in my mind a million times, but I’ve never clicked the buttons.  It’s the same as talking through the process of baking and decorating a wedding cake versus actually doing it.  And really if we think about it: you’d be much more likely to trust someone who can say “yeah I’ve baked wedding cakes and served them” as opposed to someone who says “I’ve read every article and recipe and how-to in the world on baking wedding cakes.”

Finally we’re getting to the point and this is where Azure comes into play.  Instead of stopping your imaginary implementation process because you don’t have hardware or authority or money to test out an implementation and actually UNBOX the server – instead use Azure and finish it out.  Build the box.

What is Azure?  It’s Microsoft’s extremely deep and rich platform for a wide variety of services in the cloud.  Why should you care?  It gives you the ability to deploy a Tableau Server test environment through a website, oh, and they give you money to get started.  Now I’ll say this right away: Azure isn’t the only one.  There’s also Amazon’s AWS.  I have accounts with both – I’ve used them both.  They are both rich and deep.  I don’t have a preference.  For the sake of this post – Azure was attractive because you get free credits and it’s the tool I used for my last sandbox adventure.

So it’s really easy to get started with Azure.  You can head over to their website and sign up for a trial.  At the time of writing they were offering a 30-day free trial and $200 in credits.  This combination is more than enough resources to be able to get started and building your box.  (BTW: nobody has told me to say these things or offered me money for this – I am writing about this because of my own personal interest).

Now once you get started there are sort of 2 paths you can take.  The first one would be to search the marketplace for Tableau Server.  When you do that there are literally step-by-step configuration settings to get to deployment.  You start at the beginning with basic configuration settings and then get all the way to the end.  It’s an easy path to get to the Server, but outside of the scope of where I’m taking this.  Instead we’re going to take the less defined path.

Why not use the marketplace process?  Well I think the less defined path offers the true experience of start to finish.  Hardware sizing through to software installation and configuration.  By building the machine from scratch (albeit a virtual machine) you mimic the entire process more closely than using a wizard.  You have fewer guard rails, more opportunity for exploration, and the responsibility of getting to the finish line correctly is completely within your hands.

So here’s how I started: I made a new resource, a Windows Server 2012 R2 Datacenter box.  To do that, you go through the marketplace again and choose that as a box type.  It’s probably a near-identical process to the marketplace Tableau Server setup.  Make a box, size the box, add optional features, and go.  To bring it closer to home, go through the exercise of comparing Tableau’s minimum requirements vs. recommended requirements.  For a single-node box you’ll need to figure out the number of CPUs (cores), the amount of RAM (memory), and the disk space you’ll want.  When I did this originally I tried to start cheap.  I looked through the billing costs of the different machines on Azure and started at the minimum.  In retrospect I would say go with something heavier powered.  You’ll always have the option to resize/re-class the hardware – but starting off with a decent amount of power will prevent a slow install experience and degraded initial Server performance.

Once you create the resource, you literally click a button to boot up the box and get started.  It took probably 15 to 20 minutes for my box to initially be built – more than I was expecting.

Everything done up to this point is to get to a place where you have your own Tableau Server that you can do whatever you want with.  You can set up the type of security, configure different components – essentially get down to the nitty gritty of what it would feel like to be a server administrator.

Your virtual machine should have access to the internet, so the next steps are to go here and download the Tableau Server software.  Here’s a somewhat pro tip: consider downloading a previous version of the server software so that you can upgrade and test out what that feels like.  Consider the difference between major and minor releases and the nuance of what the upgrade process will be.  For this adventure I started with 10.0.11 and ended up upgrading to 10.3.1.

The process of the actual install is on the level of “stupid easy.”  But, you probably wouldn’t feel comfortable saying “stupid easy” unless you’ve actually done it.  There are a few click through windows with clear instructions, but for the most part it installs start to finish without much input from the end user.

You get to this window here once you’ve finished the install process.

This is literally the next step and shows the depths to which you can administer the platform from within the server (from a menu/GUI perspective).  Basic things can be tweaked and setup – the type of authentication, SMTP (email) for alerts and subscriptions, and the all important Run As User account.  Reading through the Tableau Server: Everybody’s Install Guide is the best approach to get to this point.  Especially because of something I alluded to earlier: the majority of this is really in the planning of implementation, not the unboxing or build.

Hopefully by this point the amount of confidence gained in going through this process is going to have you feeling invincible.  You can take your superhero complex to the next level by doing the following tasks:

Start and Stop the Server via Tabadmin.  This is a great exercise because you’re using the command line utility to interact with the Server.  If you’re not someone who spends a lot of time doing these kinds of tasks it can feel weird.  Going through the act of starting and stopping the server will make you feel much more confident.  My personal experience was also interesting here: I like Tabadmin better than interacting with the basic utilities.  You know exactly what’s going on with Tabadmin.  Here’s the difference between the visual status indicator and what you get from Tabadmin.

When you right-click and ask for server status, it takes some time to display the status window.  When you’re doing the same thing in Tabadmin, it’s easier to tell that the machine is ‘thinking.’
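
If you want to script those checks rather than typing them by hand, here’s a minimal sketch of driving tabadmin from Python.  The install path is an assumption for a 10.3 box on the VM – adjust it to wherever your bin directory actually lives.

```python
import subprocess

# Assumed path for a Tableau Server 10.3 install on the VM - adjust for your version/drive
TABADMIN = r"C:\Program Files\Tableau\Tableau Server\10.3\bin\tabadmin.bat"

def tabadmin(*args):
    """Run a tabadmin command and print whatever it reports back."""
    result = subprocess.run([TABADMIN, *args], capture_output=True, text=True)
    print(result.stdout or result.stderr)

tabadmin("status", "-v")  # verbose status of each Tableau Server process
tabadmin("stop")          # stop the server
tabadmin("start")         # start it back up
```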

Go to the Status section and see what it looks like.  Especially if you’re a power user from the front end (publisher, maybe even site administrator) – seeing the full details of what is in Tableau Server is exciting.

There are some good details in the Settings area as well.  This is where you can add another site if you want.

Once you’ve gotten this far in the process – the future is yours.  You can start to publish workbooks and tinker with settings.  The possibilities are really limitless and you will be working toward understanding and feeling what it means to go through each step.  And of course the best part of it all: if you ruin the box, just destroy it and start over!  You’ve officially detached yourself from the chains of responsibility and are freely developing in a sandbox.  It is your chance to get comfortable and do whatever you want.

I’d even encourage you to interact with the API.  See what you can do with your site.  Even if you use some assisted API process (think Alteryx Output to Tableau Server tool) – you’ll find yourself getting much more savvy at speaking Server and that much closer to owning a deployment in a professional environment.
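
If you’d rather skip the assisted tools and hit the REST API directly, a minimal sketch using the tableauserverclient Python package looks something like this – the server URL, credentials, and site are placeholders for your own sandbox box.

```python
import tableauserverclient as TSC

# Placeholders - point these at your sandbox VM and a user you created on it
tableau_auth = TSC.TableauAuth('admin_user', 'admin_password', site_id='')
server = TSC.Server('http://my-azure-vm', use_server_version=True)

# Sign in, then list every workbook on the default site
with server.auth.sign_in(tableau_auth):
    all_workbooks, pagination = server.workbooks.get()
    for workbook in all_workbooks:
        print(workbook.name, workbook.project_name)
```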

27 Weeks of #WorkoutWednesday

27 weeks into 2017 means 27 weeks of #WorkoutWednesday.  So it is time to do some reminiscing on the experience and provide some commentary on the profound effect it has had on me.

At the end of 2016 something was abundantly clear to me: I wasn’t as fluid as I could be and I didn’t fully understand the limitless possibilities that Tableau as a tool had.  I have this very abstract concept of how working with Tableau should be for me as an individual: the canvas and tools for my artwork.  And to be honest, that sounds kind of silly coming from someone like me – but I believe it.  When I talk to people I tell them that I’m a data communicator.  I like to see what is in data and share it with the world.  More specifically: I like to share an organized view of several data points so that the end consumer can go exploring and see the beauty of the individual points.

Getting back to the point: this means that I should be capable of wielding the full depth of Tableau.  I wanted to have the ability to orchestrate anything.  I felt it was necessary.  I wanted the feeling of flow to extend.  I didn’t want my creativity to be limited by my lack of practice or time on the clock.

So I set out a few goals for myself for 2017 – 2 that I’ll share, and 1 that this post is about.  The 2 related to my professional development are pretty similar: participate in every #MakeoverMonday and participate in every #WorkoutWednesday.  Show up, do the work, share the results with the global community, and see what happens.

So here we are: 27 weeks into the year.  What has participation done for me?  It’s not enough to say that my skills have grown exponentially.  My confidence and ability to connect with individuals has also grown tremendously.  One thing I did this year in addition to participating was to facilitate going through the results every month at the Phoenix Tableau User Group.  A critical component from my perspective: communicating out the “why” behind the build along with the “how.”  This served two main goals: I would be forced to do the work consistently (selfish), and the Phoenix Tableau community would benefit and grow from the knowledge share.

Now that the foundation (context) has been set – I’d like to go through each individual workout and share its impact on me.

Week 1: Comparing Year over Year Purchasing Frequencies
I remember this one vividly because it’s the first.  There were two things in this particular Workout that I’d never done.  The first was to use the day of the year in a visualization.  The next was to have dynamic marker points based on a parameter.  One thing that was interesting about this was that I had a sense of how to do the running total calculation because of a Tableau blog post on the “top table calculations.”  Going through this workout was humbling.  It took me a significant amount of time, more time than I thought it should.  It was also the beginning of what is now a regular ritual.  I know that I spent a lot of time verbalizing my problem solving process and trying to get to a solution.  And I also remember the sweet satisfaction of solving and posting it.  I was hooked after the first one.

Week 2: Showing Nothing When ‘All’ is Selected
I was really thankful for this week.  There were several things that I already knew how to do, mostly with the custom shapes and how to not show something when ‘All’ was selected.  What I didn’t know how to do well was deal with dashboard titling when ‘All’ was selected.  My past attempts usually landed me in the world of ATTR aka * land.  So going through this challenge really helped me stop and process something that I previously stumbled over.  I got an amount of confidence out of this week because it took less time than the first.

Week 3: The State of U.S. Jobs
Ah I loved this one.  Small multiples are fascinating to me.  And Andy’s blog post gave me the freedom to end up with lots of sheets – he mentioned that it wasn’t a trellis chart and I was immediately relieved.  There was a lot of formatting in this one – some really interesting tricks on how to display things that I learned.  And one that I continue to take with me is this: change row or column borders to thick white to add some padding.  I know when I downloaded Andy’s solution he had 50 sheets; I had 10.  This workout ignited something in me and I made a similar visualization regarding high school graduation rates in Arizona.

Week 4: Common Baseline Charts & NFL QBs
I really liked this Workout from a visual perspective.  I like showing massive amounts of data and then giving someone control over what is the most prominent.  This was also the second visualization that showed me how you can use running totals and baselines to show differences between categories.  This type of visualization is now something I often develop at work.

Week 5: The Distribution and Mean of NFL Quarterbacks
The math nerd inside of me loved this one.  I used to be a huge geek for box plots and I always think showing distributions of things in a visual format is very easy to interpret.  I get this mental image of looking down on a histogram, and the fact that this one had the median as opposed to the mean got me really jazzed.  I also remember feeling super cool because I successfully flipped the axis labels for the year to the top using a random tip à la Jonathan Drummey.  I also like this one because I had to download fonts from Google Fonts – a resource I didn’t even know was out there.

Week 6: UK Population Predictions – Trellis Butterfly Chart
The appearance of the word trellis had me cringing.  Looking at the visualization had me intrigued.  There was a LOT of depth in this one.  Knowing there was a comparison to a national average, knowing that there were multiple dual-axis charts, PLUS baking in the trellis component had me concerned.  You know what ended up being the worst part for me on this one?  The year labels and the tooltips.  Each LOD in that tooltip was a validation point I had to go through to determine if my calculation was accurate.  This workout made me appreciate reversing axes.

Week 7: Dynamic Trellis Chart
It finally happened.  I couldn’t fake a trellis chart anymore and hard code different row & column locations – I had to use the capabilities of Tableau to achieve it.  More than that, there was some very sophisticated labeling that I just couldn’t get right for the life of me.  This is the only one that I gave up on.  I couldn’t figure it out and I was a little too prideful to download Andy’s workbook and USE his calculations.  I definitely downloaded and digested the process, but I didn’t feel it was authentic to me to finish the exercise – I was beaten this week.

Week 8: Marimekko Makeover
I thought this one was going to be a cakewalk because I had briefly thumbed through an article about Tableau 10 and the ability to make these types of charts.  I was wrong.  The way the data was structured made it more complex.  I shared this one at the Phoenix Tableau User Group and the whole time I was concerned that the “table calculation magic” may not be repeatable.  We made that Marimekko chart.

Week 9: World Series Game 7: Pitch-By-Pitch
I love this visualization.  I love how granular it is.  I love how abstract it is.  I love that there is color and shape encoding and even negative and positive positioning.  I also really like using characters within text to denote what is being seen in a visualization – all clever things that I do now.  As I look back I remember the one sore spot for me that I decided not to correct for: the open “reached base” shape.  I didn’t put white in the middle.  Looking back I should have – I was being lazy.  I knew how to do it and that it was the right thing to do to get it “to spec.”  But the lazier side of me won out and let it go.

Week 10: Exploring UK House Prices
This one I knew I would need help on.  I’d never made a hexbin map and I didn’t know where to start.  What’s surprising is that it’s not overly complicated.  I didn’t realize that there were built in hexbin functions.  I thought there was some deep crazy skill going on anytime I saw these.  Walking through this exercise made me change my tune.  This was also an important growth week for me.  I started getting more comfortable with the idea that it wasn’t “cheating” to use community made resources as help and guidance.  Instead I was using them for their rightful purpose.

Week 11: Full Year Calendar with Month Labels
This one has another interesting story.  I completed it last weekend (the 26th week) as opposed to the 11th week.  So how did that happen?  Well I remember starting it and getting stuck.  I couldn’t figure out 2 things to begin with: how to get the dates in order (which sounds really lame) and how to deal with the month labels.  This was also right around the time when I changed jobs and was trying to finish my MBA.  I think the challenge this one presented exhausted me from a mental perspective.  Week 11 was the start of my workout break (check out my tracker to see the details).  Once I completed it though, I was very pleased with the results.  I made a conscious decision to go a different path with the month labels and embed them into each month’s calendar.  I really like that I’m now comfortable going off spec and not feeling like I’m not living up to the challenge.

Week 12: Highlight a Treemap
I’ll admit it, this one was simple for me to do.  When I came back after my mental break and did this one, I laughed at why I hadn’t done it sooner.  I appreciate the simplicity of this one in development and the impact it has on making the end-user’s experience so much more pleasant.

Week 13: Benford’s Law
Another straightforward week for me from the development perspective.  When I completed this I started to realize that I know a lot.  I know how Andy typically develops and what his tricks are.  I know how to take something displayed and translate it into Tableau.  This is a workout I completed on 6/3/17.  Six months after embarking on the #MakeoverMonday and #WorkoutWednesday challenge.  The immersion was paying off in spades.

Week 14: UK Exports Pareto
I didn’t complete this one on time, but relatively close to its original time period.  I ended up sharing this one at the Phoenix Tableau User Group.  For the first job I ever had “analyzing data,” I was asked during the interview to build a Pareto chart in Excel.  I memorized how to do it because I couldn’t describe the technical mechanics.  That was more than 3 years ago and feels like an eternity.  Today Pareto charts are still some of the most engaging and useful visuals that I use when trying to assess a problem.

Week 15: How many times has a team been top of the Premier League?
Okay – this was just a community challenge with no hidden agenda.  One designed from my perspective to test and share the difference between table calculations and level of detail expressions.  I remember completing this and realizing that life before LODs must have been terrible.  And that there are some extremely talented problem solvers and thinkers out there who can develop solutions using the tools they have.

Week 16: Should I Buy Tableau Shares?
I remember this one vividly because it mirrored something I was doing at work.  It was a different take on a visualization I was trying to get people to accept.  I appreciated seeing window calculations for statistical values being present and giving users input flexibility.

Week 17: Product Spread Treemap (Part 1)
Intentionally named Part 1 – this one made me recognize the funny mechanics that Tableau has.  They’re really obvious when you make a treemap.  Just test it out and you’ll see that the order of pills on the Marks card determines how the visual will be generated.  It also taught me an important lesson: sometimes I overcomplicate things.  Before the build I had imagined the colored text as separate worksheets; going through the build I was humbled to realize it could be one.

Week 18: Appearing and Disappearing Vizzes
This one also made an appearance at the Phoenix Tableau User Group.  And to be perfectly honest, Emma’s topics are usually much more practical for the group.  I took this one as an opportunity to explore between tiled and floating layouts.  When I demoed this to the TUG everyone loved it.  I know several users who took this back to their professional lives.  Thank you Emma.

Week 19: Product Spread Treemap (Part 2)
The agony of this one.  Andy mentioned it was going to be tough and it was.  I had a sense that there was trickery involved because of the automagic nature of treemaps seen in Week 17.  The spoiler on this one: the boxes are different sizes.  This one also made an appearance within our user group at our Saturday Viz Club.  4 members got together and collaborated on trying to build it out and downloading Andy’s solution.

Week 20: Comparing Regions
Perhaps a more appropriate name would be: building out bar + line charts all in one view with the bars next to each other.  This one was damning to me.  It took me a long time to parse out the ‘aha’ factor and put it into action.

Week 21: NCAA Final Score-by-Score
This was another great challenge: do everything in a single worksheet.  The biggest challenge here was the data structure.  I think if I had taken time to restructure the data set it would have been easier to develop – but being who I am, I took it on as part of the challenge.  I realized when I finished this one that I did it a different way than Andy because I had dots everywhere and no attribute stars (*).  I kind of feel like it makes mine more complete.

Week 22: Wine Tasting is Harder Than it Looks
Guess what – this was also presented at our user group!  What’s great about this is the challenges that the community faced as we built it together.  When asked before the build, most had never thought to make a visualization of this type.  When participating in the build the color legends were a huge curve ball.  Even the most tenured individuals didn’t think to make the color legend an actual sheet.  I also had a colleague tell me that he didn’t realize you could drag headers in the view to change their order – he thought that was life changing.

Week 23: National Parks Have Never Been More Popular
Simply a stunning visualization to recreate.  A bump chart, vivid use of color and text color matching line color.  I love this visualization.  I shared this on LinkedIn and got reactions from so many people.  I know it is something that has been imprinted on many many people.

Week 24: Visualising the National Student Survey with Spine Charts
I wrote a blog post on this one, so there’s a lot there.  This one is still pretty fresh in my mind.  The biggest things regarding this one relate to the mathematics under the hood – the way numbers can do funny things.  How at the end of this exercise I opened my version, Emma’s version, and Andy’s version and we all had different numbers for the same question response.  And we could all equally defend the reasoning and logic behind how the number was derived.

Week 25: The Value of Top 3 & Top 5 Contributors
This taught me so much about table calculations.  I use them in basic ways on a daily basis – this workout takes them to another level.  I had never thought to use a table calculation to limit the number of members within a dimension from being displayed.  Once I did it – it made perfect sense.  The order of Tableau filters immediately came to my mind.  I am still in awe of the depth and thoughtfulness here.

Week 26: UK General Election 2017 Results
Another dynamic Trellis chart – no no no.  I do not like these!  I like the presentation and layout, the slope charts, the way they look like ladders.  I like the reference lines.  I don’t like dynamic trellis.  I am not convinced that the approach to dynamic trellis can be let loose in the wild – it needs some supervision.  Comparing mine to the original I noticed how easy it was for data points to be indexed into wrong blocks.

Week 27: The Quadrant Chart
As if by fate this week’s workout resonated deeply with a visualization from my history.  More than a year ago I made a quadrant chart regarding wage gaps.  I really like that Andy took the time to color the tool tips to add effect.  It demonstrates something that I now know to be true: duplicating and iterating off of a sheet or a calculated field is something you should be doing often.  Copy and paste is your friend.  Duplicate is music to my ears.

Cheers to 27 weeks – I’m on board for the rest of the year.  As I alluded to, I made a progress tracker on my Tableau Public (and also on this site) to keep myself accountable.  While I can’t guarantee it will be done in the same week, I can say with a true heart my intention is to complete the year at 100%.

If you haven’t started the adventure of the workouts, or if you’ve done a few – I strongly encourage you to take a Saturday afternoon and go through the exercises.  Don’t look at them and lazily say “oh I could totally do that.”  DO THE WORK.  It will help you grow tremendously, unearth skill gaps, and unlock your creativity.  Thank you Andy & Emma.

#WorkoutWednesday Week 24 – Math Musings

The Workout Wednesday for week 24 is a great way to represent where a result for a particular value falls with respect to a broader collection.  I’ve used a spine chart recently on a project where most data was centered around certain points and I wanted to show the range.  Propagating maximums, minimums, averages, quartiles, and (when appropriate) medians can help to profile data very effectively.

So I started off really enjoying where this visualization was going.  Also because the spine chart I made on a recent project was before I even knew the thing I developed had already been named.  (Sad on my part, I should read more!)

My enjoyment turned into caution really quickly once I saw the data set.  There are several ratios in the data set and very few counts/sums of things.  My math brain screams trap!  Especially when we start tiptoeing into the world of what we semantically call “average of all” or “overall average” or something that somehow represents a larger collective (“everybody”).  There is a lot of open-ended interpretation that goes into this particular calculation and when you’re working with pre-computed ratios it gets really tricky really quickly.

Here’s a picture of the underlying data set:

 

Some things to notice right away – the ratios for each response are pre-computed.  The number of responses is different for each institution.  (To simplify this view, I’m on one year and one question).

So the heart of the initial question is this: if I want to compare my results to the overall results, how would I do that?  Now there are probably 2 distinct camps here.  1: take the average of one of the columns and use that to represent the “overall average”.  Let’s be clear on what that is: it is the average pre-computed ratio of a survey.  It is NOT the observed percentage of all individuals surveyed.  That would be option 2: the weighted average.  For the weighted average or to calculate a representation of all respondents we could add up all the qualifying respondents answering ‘agree’ and divide it by the total respondents.

Now we all know this concept of average of an average vs. weighted average can cause issues.  Specifically we’d feel the friction immediately if there were several entities with very few responses commingled with entities capturing far more responses.  EX: Place A: 2 people out of 2 answered ‘yes’ (100%) and Place B: 5 out of 100 answered ‘yes’ (5%).  If we average 100% and 5% we’ll get 52.5%.  But if we take 7 out of 102, that’s 6.86% – a way different number.  (Intentionally extreme example.)

So my math brain was convinced that the “overall average” or “ratio for all” should be inclusive of the weights of each Institution.  That was fairly easy to compensate for: take each ratio and multiply it by the number of respondents to get raw counts and then add those all back up together.
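
To make the difference concrete, here’s a tiny Python sketch that reproduces the Place A / Place B example above and shows both versions of “overall”:

```python
# Place A: 2 of 2 answered 'yes' (100%); Place B: 5 of 100 answered 'yes' (5%)
places = [
    {"name": "A", "yes": 2, "respondents": 2},
    {"name": "B", "yes": 5, "respondents": 100},
]

# Camp 1: average the pre-computed ratios
ratios = [p["yes"] / p["respondents"] for p in places]
avg_of_ratios = sum(ratios) / len(ratios)

# Camp 2: weight by respondents - back into raw counts, then divide
weighted = sum(p["yes"] for p in places) / sum(p["respondents"] for p in places)

print(f"average of ratios: {avg_of_ratios:.2%}")  # 52.50%
print(f"weighted average:  {weighted:.2%}")       # 6.86%
```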

The next sort of messy thing to deal with was finding the minimums and maximums of these values.  It seems straightforward, but when reviewing the data set and the specifications of what is being displayed, there’s caution to exercise with regard to the level of aggregation and how the data is filtered.  As an example, depending on how the ratios are leveraged, you could end up finding the minimum of 3 differently weighted subjects within a subject group.  You could also probably find the minimum Institution + subject result at the subject level across all the subjects within a group.  Again I think the best bet here is to tread cautiously over the ratios and get into raw counts as quickly as possible.

So what does this all mean?  To me it means tread carefully and ask clear questions about what people are trying to measure.  This is also where I will go the distance and include calculations in tool tips to help demonstrate what the values I am calculating represent.  Ratios are tricky and averaging them is even trickier.  There likely isn’t a perfect way to deal with them and it’s something we all witness consistently throughout our professional lives (how many of us have averaged a pre-computed average handle time?).

Beyond the math tangent – I want to reiterate how great a visualization I think this is.  I also want to highlight that because I went off the deep end on the math, I decided to take the development in a different direction too.

The main difference from the development perspective?  Instead of using reference bands, I used a Gantt bar as the IQR.  I really like using the bar because it gives users an easier target to hover over.  It also reduces some of the noise of the default labeling that occurs with reference lines.  To create the Gantt bar – simply compute the IQR as a calculated field and use it as the size.  You can select one of the percentile points to be the start of the mark.
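
For reference, the arithmetic behind that Gantt mark is just the spread between the 25th and 75th percentiles.  Here’s a quick sketch (with made-up scores) of the numbers Tableau would be computing:

```python
import numpy as np

# Made-up institution scores for a single question - purely illustrative
scores = np.array([0.62, 0.68, 0.71, 0.75, 0.78, 0.81, 0.84, 0.90])

q1, q3 = np.percentile(scores, [25, 75])
iqr = q3 - q1

# The Gantt mark starts at the 25th percentile and its size is the IQR
print(f"bar start (25th percentile): {q1:.3f}")
print(f"bar size (IQR):              {iqr:.3f}")
```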

#MakeoverMonday Week 25 | Maricopa County Ozone Readings

We had another giant data set this week – 202 million records of EPA Ozone readings across the United States.  The giant data set is generously hosted by Exasol.  I encourage you to register here to gain access to the data.

The heart of the data is pretty straightforward – PPM readings across several sites around the nation for the past 25+ years.  As I went through and browsed the data set, it’s easy to see that there are multiple readings per site per day.  Here’s the basic data model:

Parameter Name only has Ozone, Units of Measure only has Parts per million.  There is one little tweak to this data set – the Datum field.  Now this wasn’t a familiar term for me, so I described the domain to see what it had.

I know exactly what one of these 4 things means (beyond Unknown) – that’s WGS84.  I was literally at the Alteryx Inspire conference two weeks ago and in a Spatial Analytics session where people were talking about different standards for coordinate systems on Earth.  The facilitators mentioned that WGS84 was a main standard.  For fun I decided to plot the number of records for each Datum per year to see how the Lat/Lon have potentially changed in measurement over time.  Since 2012 it seems like WGS84 has dominated as the preferred standard.

So armed with that knowledge I sort of kept it in my back pocket of something I may need to be mindful of if I enter the world of mapping.

Beyond that, I had to start my focus on preparing something for Tableau Public.  202 million records unfortunately won’t sit on Public and I have to extract the data.  Naturally I did what every human would do and zeroed in on my city: Phoenix Metropolitan area aka Maricopa County.

So going through the data set there are multiple sites that are taking measurements.  And more than that, these sites are taking measurements multiple times per day.  I really wanted to express that somehow in my final visualization.  Here’s all the site averages plotted each day for the past 30 years – thanks Exasol!

So this is averaged per day per site – and you can see how much variation there is.  Some are reporting very low numbers, even zeros.  Some are very high.

If I take off the site ID, here’s what I get for the daily averages:

Notice the Y-axis – much less dramatic.  Now the EPA has the AQI measurements and it doesn’t even get into the “bad” range until 0.071 PPM (Unhealthy for Sensitive Groups).  So there’s less of a story to some extent when we take the averages.  This COULD be because of the sites in Maricopa county (maybe there are low or faulty numbers dragging down the average) or it could be because when you do the average you’re getting better precision of truth.

I’m going down this path because at this point I decided to make a decision: I wanted to look at the maximum daily measurement.  Given that these are instantaneous measurements, I felt that knowing the maximum measurement in a given day would also provide insight and value into how Ozone levels are faring.  And more specifically, knowing my region a little bit – the measurement sites could be outside of well populated areas and may naturally have lower occurring measurements.

So that was step one for me: move to the world of MAX.  This let me leverage all the site data and get going.  (Also originally I wanted to jitter and display all the sites because I thought that would be interesting – I distilled the data down further because I wasn’t getting what I wanted in terms of presentation in the end result).

Okay – next up was plotting the data.  I wanted to do a single page very dense data display that had all the years and the months and allowed for easy comparisons.  I had thought a cycle plot may be appropriate, but after trying a few combinations I didn’t see anything special about day of the week additions and noticed that the measurement really is about time of year (the month).  Secondary comparison being each year.

Now that I’ve covered that part – next up was how to plot.  Again, this originally started out its life as dots that were going to be color encoded using the AQI scale with PPM on the Y-axis.  And I almost published it that way.  But to be honest with you, I don’t know if the minutiae of the PPM really matter that much.  I think the AQI category defined on top of the measurement is easier for an end user to understand.  Hence my final development fork: turn the categorical result into a unit measure (1, 2, 3, 4, etc.) that represents the height of a bar chart.  And that’s where I got really inspired.  I made “Good” -1 and “Moderate” 0.  That way anything positive on the Y-axis is a bad day.  To me this allows you to see the streaks of bad days throughout the time periods.
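
Here’s a small sketch of that mapping in Python – the category names are the standard EPA AQI labels, and the offsets simply mirror the “Good = -1, Moderate = 0” choice described above:

```python
# Unit heights for each AQI category: anything positive on the Y-axis is a bad day
AQI_HEIGHT = {
    "Good": -1,
    "Moderate": 0,
    "Unhealthy for Sensitive Groups": 1,
    "Unhealthy": 2,
    "Very Unhealthy": 3,
    "Hazardous": 4,
}

def bar_height(category: str) -> int:
    """Translate a day's AQI category into the unit measure used for bar height."""
    return AQI_HEIGHT[category]

print(bar_height("Good"), bar_height("Moderate"), bar_height("Unhealthy"))  # -1 0 2
```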

Close up of 2015 – I love this.  Look at those moderates just continuing the axis.  Look how clear the not so good to very bad is.  This resonates with me.

Okay – so the final steps here were going to be to have a map of all the measurements at each site (again the max for each site based on the user clicking a day).  It was actually quite cute showing Phoenix up close.  And then I was going to have national readings (max for each site upon clicking a day) as a comparison.  This would have been super awesome – here’s the picture:

So good.  And perhaps I could have kept this, but knowing I have to go to Tableau Public – it just isn’t going to handle the national data well.  So I sat on this for an evening and while I was driving to work I decided to do a marginal chart that showed the breakdown of number of days of each type.  The “why” was because it looks like things are getting better – more attention needs to be drawn to that!

So last steps ended up being to add on the marginal bar charts and then go one step further to isolate the “bad days” per year and have them be the final distilled metric at the far far right.  My thought process: scan each year, get an idea of performance, see it aggregated to the bar chart, then see the bad as a single number.  For sheer visual pleasure I decided to distill the “bad” further into one more chart.  I had a stacked bar chart to start, but didn’t like it.  I figured for the sake of artistry I could get away with the area chart and I really like the effect it brings.  You can see that the “very bad” days have become less prominent in recent years.

So that pretty much sums up the development process.  Here’s the full viz again and a comparison to the original output for Maricopa County, which echoes the sentiment of my maximums – Ozone measurements are going down.

 

 

#MakeoverMonday Week 24 – The Watercolours of Tate

First – I apologize.  I did a lot of web editing this week that has led to a series of system fails.  The first was spelling the hashtag wrong.  Next I decided to re-upload the workbook and ruin the bit link.  What will be the next fail?

Anyway – to rectify the series of fails I decided that the best thing to do would be to create a blog post.  Blog posts merit new tweets and new links!

So week 24’s data was the Tate Collection, which upon click through of this link indicates it is a decent approximation of artwork housed at Tate.

Looking at the underlying data set, here’s the columns we get:

And the records:

So I started off decently excited about the fact that there were 2 URLs to leverage in the data set: one with just a thumbnail image and the other a full link to the asset.  However, the Tate website can’t be accessed via HTTPS, so it doesn’t work for on-dashboard URLs on Tableau Public.  I guess Tableau wants us to be secure – and I respect that!

So my first idea of going the route of all float with an image in the background was out.

Now my next idea was to limit the data set.  I had originally thought to do the “Castles of Tate” – check out the number of titles:

A solid number: 2,791 works of art.  A great foundation for the underneath.  Except of course for what we knew to be true of the data: Turner.

Sigh – this bummed me out.  Apparently only Turner really likes to label works of art with “Castle.”  Same was true for River and Mountain.  Fortunately I was able to easily see that using the URL actions on Tableau Desktop (again can’t do that on Public because of security reasons):

Here is a classic Turner castle:

Now yes, it is artwork – but doesn’t necessarily evoke what I was looking to unearth in the Tate collection.

So I went another path, focusing on the medium.  There was a decent collection of watercolour (intentional European spelling).  And within that a few additional artist representations beyond our good friend Turner.

So this informed the rest of the visualization.  Lucky for me there was a decent amount of distribution date wise, both from a creation and acquisition standpoint.  This allowed me to do some really pretty things with binned time buckets.  And inspired by the Tate logo: I took a very abstract approach to the visualization this week.  The output is intentionally meant for data discovery.  I am not deriving insights for you, I’m building a view for you to explore.

One of my most favorite elements is the small multiples bubble chart.  This is not intended to aid in cognition, this is intended to be artwork of artwork.  I think that pretty much describes the entire visualization if I’m being honest.  Something that could stand alone as a picture perhaps or be drilled deep to the depths of going to each piece’s website and finding out more.

Some oddities with color I explored this week included: using an index and placing that on the color shelf with a diverging color palette (that’s what is coloring the bubble charts).  And also using modulo on the individual asset names to spark some fun visual encoding.  Better than all one color, I felt breaking up the values in a programmatic way would be fun and different.

Perhaps my most favorite of this is the top section with the bubble charts and bar charts below with the binned year ranges between.  Pure data art blots.

Here’s the full visualization on Tableau Public – I promise not to tinker further with the URLs.

#WorkoutWednesday Week 23 – American National Parks

I’m now back in full force from an amazing analytics experience at the Alteryx Inspire conference in Las Vegas.  The week was packed with learning, inspiration, and community – things I adore and am honored to be a part of.  Despite the awesome nature of the event, I have to admit I’m happy to be home and keeping up with my workout routine.

So here goes the “how” of this week’s Workout Wednesday week 23.  Specifications and backstory can be found on Andy’s blog here.

Here’s a picture of my final product and my general assessment of what would be required for approach:

Things you can see from the static image that will be required –

  • Y axis grid lines are on specific demarcations with ordinal indicators
  • X-axis also has specific years marked
  • Colors are for specific parks
  • Bump chart of parks is fairly straight forward, will require index() calculation
  • Labels are only on colored lines – tricky

Now here’s the animated version showing how interactivity works

  • Highlight box has specific actions
    • When ‘none’ is selected, defaults to static image
    • When park of specific color is selected, only that park has different coloration and it is labeled
    • When park of unspecified color is selected, only that park has different coloration (black) and it is labeled

Getting started is the easy part here – building the bump chart.  Based on the data set and instructions it’s important to recognize that this is limited to parks of type ‘National Historical Park’ and ‘National Park.’  Here’s the basic bump chart setup:

and the custom sort for the table calculation:

Describing this is pretty straightforward – index (rank) each park by the descending sum of recreation visitors every year.  Once you’ve got that set up, flipping the Y-axis to reversed will get you to the basic layout you’re trying to achieve.
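
If the table calculation feels abstract, here’s the same ranking idea sketched in Python with pandas on a toy table – the column names and values are made up, but the logic mirrors ranking each park within each year by descending visitors:

```python
import pandas as pd

# Toy data - three parks over two years, illustrative numbers only
df = pd.DataFrame({
    "Year": [2014, 2014, 2014, 2015, 2015, 2015],
    "Park": ["Park A", "Park B", "Park C", "Park A", "Park B", "Park C"],
    "Visitors": [120, 80, 100, 90, 130, 110],
})

# Rank each park within its year by descending visitors (the bump chart Y position)
df["Rank"] = (
    df.groupby("Year")["Visitors"]
      .rank(ascending=False, method="first")
      .astype(int)
)

print(df.sort_values(["Year", "Rank"]))
```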

Now – the grid lines and the y-axis header.  Perhaps I’ve been at this game too long, but anytime I notice custom grid lines I immediately think of reference lines.  Adding constant reference lines gives ultimate flexibility in what they’re labelled with and how they’re displayed.  So each of the rank grid lines are reference lines.  You can add the ‘Rank’ header to the axis by creating an ad-hoc calculation of a text string called ‘Rank.’  A quick note on this: if you add dimensions and measures to your sheet be prepared to double check and modify your table calculations.  Sometimes dimensions get incorporated when it wasn’t intended.

Now on to the most challenging part of this visualization: the coloration and labels.  I’ll start by saying there are probably several ways to complete this task and this represents my approach (not necessarily the most efficient one):

First up: making colors for specific parks called out:

(probably should have just used the Grouping functionality, but I’m a fast typer)

Then making a parameter to allow for highlighting:

(you’ll notice here that I had the right subset of parks, this is because I made the Park Type a data source filter and later an extract filter – thus removing them from the domain)

Once the parameter is made, build in functionality for that:

And then I set a calculation to dynamically flip between the two calculations depending on what the parameter was set to.

Looking back on this: I didn’t need the third calculation – it’s exactly the same functionality as the second one.  In fact, as I write this, I tested it using only the second calculation and it functions just fine.  I think the over-build speaks to my thought process, and the consolidated decision logic is sketched in code after the list below:

  1. First let’s isolate and color the specific parks
  2. Let’s make all the others a certain color
  3. Adding in the parameter functionality, I need the colors to be there if it is set to ‘(None)’
  4. Otherwise I need it to be black
  5. And just for kicks, let’s ensure that when the parameter is set to ‘(None)’ that I really want it to be the colors I’ve specified in the first calc
  6. Otherwise I want the functionality to follow calc 2
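Putting that thought process together into a single calculation, a consolidated sketch might look like this (assuming a string parameter called [Select Park] with a ‘(None)’ option; the field names are hypothetical rather than exactly what’s in the workbook):

    // Highlight Color – drives the Color shelf
    IF [Select Park] = '(None)' THEN [Park Color]
    ELSEIF [Park Name] = [Select Park] THEN
        // the selected park keeps its assigned color, or reads as black
        // when it isn't one of the specifically colored parks
        IIF([Park Color] = 'Other', 'Selected (black)', [Park Color])
    ELSE 'Other'
    END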

Here’s the last bit of logic to get the labels on the lines.  Essentially I know we’re going to want to label the end point, and because of how labeling works I’m going to have to let all labels be visible and then determine which marks actually have values for the label.  PS: I’m really happy to use that match color functionality on this viz.

And the label setting:
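A rough sketch of that label calculation, reusing the hypothetical [Highlight Color] field from above (the MINs let the table calculation LAST() sit alongside row-level fields; compute the calc along the year axis):

    // Line-end label – only the last mark of a colored or selected line
    // returns a park name; everything else labels as null
    IF LAST() = 0 AND MIN([Highlight Color]) <> 'Other'
    THEN MIN([Park Name])
    END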

That wraps up the build for this week’s workout, with the last steps being to add extra detail to the tooltip and to stylize.  A great workout that demonstrates the power of interactive visualization and the always compelling bump chart.

Interact with the full visualization here on my Tableau Public.

#MakeoverMonday Week 22 – Internet Usage by Country

This week’s data set captures the number of internet users per 100 people by country, spanning several years.  The original data set and accompanying visualization start as an interactive map with the ability to animate through the changing values year by year.  Additionally, the interactor can click into a country to see percentage changes or comparative changes across multiple countries.

Channeling my inner Hans Rosling – I was drawn to play through the animation of the change by year, starting with 1960.  What sort of narrative could I see play out?

Perhaps it was the developer inside of me, but I couldn’t get over the color legend.  For the first 30 years (1960 to 1989) there are only a few data points, all signifying zero.  Why?  Does this mean that those few countries actually measured this value in those years, or is it just bad data?  Moving past the first 30 years, my mind started trying to resolve the rest of the usage changes.  However – here again my mind was hurt by the coloration.  The color legend shifts from year to year.  There’s always a green, greenish yellow, yellow, orange, and red.  How am I to ascertain growth or change when I must constantly refer to the legend?  Sure, there’s something to be said for comparing country to country, but it loses alignment once you start paginating through the years.

Moving past my general take on the visualization – there were certain things I picked up on and wanted to carry forward into my makeover.  The first was the value out of 100 people.  Because the color legend’s range was increasing year to year, the overall number of users was increasing.  Similarly, when comparing the countries, the coloration changed, meaning the ranks were changing.

I’ll tell you – my mind was originally drawn to the idea of 3 slope charts sitting next to each other: one representing the first 5 years, the next 5 years, and so on, with each country as a line.  Well, that wasn’t really possible because the data has 1990 and 2000 as the first pair of years – so I went down the path of the first 10 years.  It doesn’t tell me much other than something somewhat obvious: internet usage exploded from 1990 to 2000.

Here’s how the full set would have maybe played out:

This is perhaps a bit more interesting, but my mind doesn’t like the 10 year gap between 1990 and 2000, five year gaps from 2000 to 2010, and then annual measurements from 2010 to 2015 (that I didn’t include on this chart).  More to the point, it seems to me that 2000 may be a better starting measurement point.  And it created the inflection point of my narrative.

Looking at this chart – I went ahead and decided my narrative would be to understand not only how much more internet usage there is per country, but also to demonstrate how certain countries have grown throughout the time periods.  I limited the data set to the top 50 in 2015 to eliminate some of the data noise (there were 196 members in the country domain; when I cut it to 100 there were still some 0s in 2000).

To help demonstrate that usage was just overall more prolific, I developed a consistent dimension to bucket the number of users into color blocks.  So as you read it, the color goes from light gray to blue depending on the value.  The point being that as we get nearer in time, there’s more dark blue and no light gray.
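A sketch of what that kind of dimension could look like (the field name and the bin edges are illustrative, not necessarily the exact cuts used):

    // Usage bucket – consistent color blocks across all years
    IF [Users per 100] < 20 THEN 'Under 20'
    ELSEIF [Users per 100] < 40 THEN '20 to 39'
    ELSEIF [Users per 100] < 60 THEN '40 to 59'
    ELSEIF [Users per 100] < 80 THEN '60 to 79'
    ELSE '80 or more'
    END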

And then I went the route of a bump chart to show how the ranks have changed.  Norway had been at the top of the charts; now it’s Iceland.  When you hover over the lines you can see what happened.  And in some cases it makes sense – when a country is already dominating usage, increasing can only go so far.

But there are some amazing stories that can unfold in this data set: check out Andorra.  It went from #33 all the way up to #3.

You can take this visualization and step back into different years and benchmark each country on how prolific internet usage was during the time.  And do direct peer comparatives to boot.

This one deserves time spent focused on the interactivity of the visualization.  That’s part of the reason why it is so dense at first glance.  I’m intentionally trying to get the end user to see 3 things up front: overall internet usage in 2000 (by size and color encoding) and the starting rank of countries, the overall global increase in internet usage (demonstrated by coloration change over the spans), and then who the current usage leader is.

Take some time to play with the visualization here.

Workout Wednesday Week 21 – Part 1 (My approach to existing structure)

This week’s Workout Wednesday had us taking NCAA data and developing a single chart that showed the cumulative progression of a basketball game.  More specifically, a line chart where the X-axis is a countdown of time and the Y-axis is the current score.  There’s some additional detail in the form of the size of each dot representing 1, 2, or 3 points.  (see cover photo)

Here’s what the underlying data set looks like:

Comparing the data structure to the image of what needs to be produced, my brain started to hurt.  Some things I noticed right away:

  • Teams are in separate columns
  • Score is consolidated into one column and only displayed when it changes
  • Time amount is in 20 minute increments and resets each half
  • Flavor text (detail) is in separate columns (the team columns)
  • Event ID restarts each half, seriously.

My mind doesn’t like that there’s a team dimension that’s not in the same column.  It doesn’t like the restarting time either.  It really doesn’t like the way the score is done.  These aren’t numbers I can aggregate together; they are raw outputs in a string format.

Nonetheless, my goal for the Workout was to take what I had in that structure and see if I could make the viz.  What I don’t know is this: did Andy do it the same way?

My approach:

First I needed to get the X-axis working.  I’ve done a good bit of work with time, so I knew a few things needed to happen.  The first part was to convert what was in MM:SS to seconds.  I did this to change the data to a continuous axis that I could later format back into MM:SS.  Here’s the calculation:

I cheated and didn’t write my calculated field for longevity.  I saw that there was a dropped digit in the data and compensated by breaking it up into two parts.  Probably a more holistic way to do this would be to say if it is of length 4 then pad the string with a 0 and then go about the same process.  Here are the described results showing the domain:

Validation check: the time goes from 0 to 20 minutes (0 to 20*60 seconds aka 1200 seconds).  We’re good.
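As an aside, a more forgiving version of that conversion could sidestep the length check entirely by splitting on the colon – a sketch, with [Time] standing in for the raw MM:SS string field:

    // Seconds remaining in the half, whether the raw string
    // reads '09:45' or '9:45'
    INT(SPLIT([Time], ':', 1)) * 60 + INT(SPLIT([Time], ':', 2))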

Next I needed to format that time into an MM:SS continuous format.  I took that calculation from Jonathan Drummey.  I’ve used this more than once, so my google search is appropriately ‘Jonathan Drummey time formatting.’  The resultant time ‘measure’ was almost there, but I wasn’t taking into consideration the +20 minutes for the first half and that the time axis was the full game duration.  So here are the two calculations that I made (first the +20 minutes, then the formatting):
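Roughly, those two calculations amount to something like the following sketch.  [Half] and [Seconds Remaining] are stand-in field names, and the second calc is one common way to get an MM:SS display on a continuous axis rather than a faithful copy of Jonathan’s formula (set the pill’s custom date format to mm:ss):

    // Game Seconds Remaining – first-half events get the extra 20 minutes
    IF [Half] = 1 THEN [Seconds Remaining] + 1200
    ELSE [Seconds Remaining]
    END

    // Game Clock – turn the seconds into a datetime so the axis stays
    // continuous and can carry a custom date format of mm:ss
    DATEADD('second', [Game Seconds Remaining], #1899-12-31 00:00:00#)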

At this point I felt like I was kind of getting somewhere – almost to the point of making the line chart, but I needed to break apart the teams.  For that bit I leveraged the fact that the individual team fields only have details in them when that team scores.  Here’s the calc:
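A sketch of that calc, with [UNC] and [Opponent] standing in for the two per-team detail columns:

    // Team – one dimension from the two per-team columns, which only
    // have detail on the rows belonging to that team's events
    IF NOT ISNULL([UNC]) THEN 'UNC'
    ELSEIF NOT ISNULL([Opponent]) THEN 'Opponent'
    END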

I still don’t have a lot going on – at best I have a dot plot where I can draw out the event ID and start plotting the individual points.

Getting the score was relatively easy.  I also did this in a way that’s custom to the data set, with 3 calculations – find the left score, find the right score, then tag the scores to the teams.
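Those three calcs could be sketched roughly like this, assuming the combined score column reads like ‘52-48’ with the left team’s total first (field names are stand-ins):

    // LeftScore – the running total before the dash
    INT(SPLIT([Score], '-', 1))

    // RightScore – the running total after the dash
    INT(SPLIT([Score], '-', 2))

    // Team Score – attach the right running total to the scoring team
    IF [Team] = 'UNC' THEN [LeftScore] ELSE [RightScore] END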

Throwing that on rows, here’s the viz:

All the events are out of order and this is really difficult to understand.  To get closer to the view I did a few things all at once:

  • Reverse the time axis
  • Add Sum of the Team Score to the path
  • Put a combined half + event field on detail (since event restarts per half)

Also – I tried Event & Half separately and my lines weren’t connected (they broke at half time), so creating a derived combined field proved useful for connecting the line.

Here’s that viz:

It’s looking really good.  Next steps are to get the dots to represent the ball sizes.

One of my last calculations:
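The screenshot tells the real story here; one plausible sketch derives the point value from the play-by-play text, though the field name and the exact phrasing it matches on are assumptions about the data:

    // Ball Size – 1, 2, or 3 points per scoring event
    IF CONTAINS([Event Detail], 'Three Point') THEN 3
    ELSEIF CONTAINS([Event Detail], 'Free Throw') THEN 1
    ELSEIF NOT ISNULL([Score]) THEN 2
    END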

That got dropped on Size on a duplicated and synchronized “Team Score.”  Getting the pesky null to not display in the legend was a simple right click and ‘Hide.’  I also had to sort the Ball Size members to align with the perceived sizing.  And the line size was made super skinny.

Now some cool things happened because of how I did this:  I could leverage the right and left scores for tooltips.  I could also leverage them in the title for the overall scores, e.g. UNC = {MAX([LeftScore])}.

Probably the last component was counting the number of baskets (within the scope of making it a single returned value in a title per the specs of the ask).  Those were repeated LODs:
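Sketching one of those repeated LODs – this one counts three-pointers for a single team, reusing the hypothetical fields from the earlier sketches so it returns a single value that can live in a title:

    // Count of UNC three-pointers, returned as a single value for the title
    { FIXED : SUM(IF [Team] = 'UNC' AND [Ball Size] = 3 THEN 1 ELSE 0 END) }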

And thankfully the final component – the oversized scores on the last marks – could be accomplished with the ‘Always Show’ label option.

Now I profess this may not be the most efficient way to develop the result – heck, here’s what my final sheet looks like:

All that being said: I definitely accomplished the task.

In Part 2 of this series, I’ll be dissecting how Andy approached it.  We obviously did something different because it seems like he may have used the Attribute function (I saw some * in tooltips).  My final viz has all data points and no asterisks (e.g., 22:03 remaining for UNC).  Looking at that part, mine has each individual point and the score at each instantaneous spot; his drops the score.  Could it be that he tiptoed around the data structure in a very different way?

I encourage you to download the workbook and review what I did via Tableau Public.