Tag: tableau

  • Without Water an Iron Viz feeder

    Without Water an Iron Viz feeder

    Jump directly to the viz

    At the time of writing it is 100°F outside my window in Arizona and climbing.  It’s also August and we’re right in the middle of feeder round 3 for Tableau Public’s Iron Viz contest.  Appropriately, the theme for this round is water.  So it’s only fitting that my entry mashes the two together: Without Water, 2 decades of drought & damage in Arizona.

    The Genesis of the Idea

    I’ll start by saying that water is a very tricky topic.  Because it’s so commonplace, finding data and a narrative direction is challenging.  It’s necessary for sustaining life, so it seems to want a story tied directly to humankind – something closely related to water quality, water availability, or loss of water – essentially something that impacts humans.  And because it’s so vital, there are several organizations and resources doing fantastic things to demonstrate the points above.  UNICEF tracks drinking water and sanitation, Our World in Data has a lengthy section devoted to the topic, there’s the Flint Water Study, and the Deepwater Horizon oil spill.

    This realization about the plethora of amazing resources associated with water led me to the conclusion that I would have to get personal and share a story not broadly known.  And what could be more personal than the place I’ve called home for 14 years of my life: Arizona?

    Arizona is a very interesting state: it’s home to the Grand Canyon, several mountain ranges, and of course a significant portion of the Sonoran desert.  This means that in October it can be snowing in the mountains of Flagstaff and a stifling 90°F two hours south in Phoenix.  And, despite the desert, it needs water – particularly in the large uninhabited sections of the mountains covered with forests.  Getting to the punchline: since my time in Arizona began, the state has been in a long, sustained drought.  A drought that’s caused massive wildfires, extreme summer heat, and a conversation thread that never steers far from the weather.

    Getting Started

    A quick Google search led me to my first major resource: NOAA has a very easy-to-use data portal for climate data which includes precipitation, various drought indices, and temperatures – all by month, state, and division.  This served as the initial data set, joined with climate division shapefiles maintained by NCEI.  Here’s the first chart I made showing the divisions by their drought index.  This uses the long-term Palmer Drought Severity Index, and any positive values (non-drought) are zeroed out to focus attention on deficit.
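    That zeroing step is trivial to reproduce; here’s a minimal pandas sketch, assuming a column named pdsi (the file and column names are my own illustration, not from the original workflow):

```python
import pandas as pd

# Hypothetical extract of the NOAA data: one row per climate division per month,
# with a 'pdsi' column holding the Palmer Drought Severity Index.
drought = pd.read_csv("noaa_drought.csv")

# Zero out positive (non-drought) index values so only the deficit remains.
drought["pdsi_deficit"] = drought["pdsi"].clip(upper=0)
```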

    My next major find was around wildfire data from the Federal Fire Occurrence website.  Knowing that fire is closely associated with drought, it seemed a natural progression to include.  Here’s an early iteration of total acres destroyed by year:

    It’s clear that after 2002 a new normal was established.  Every few years massive fires were taking place.

    After combining these two data sets, the story developed further – it was a time-bound story of the last 20 years.

    Telling the Story

    I headed down the path of breaking out the most relevant drought headlines by year with the idea of creating 20 micro visualizations.  Several more data sources were added (including dust storms, heat related deaths, and water supply/demand).  An early iteration had them in a 4 x 5 grid:

    As the elements started to come together, it was time to share and seek feedback.  Luke Stanke was the first to see it and gave me the idea of changing from a static grid to a scrolling mobile story.  And that’s where things began to lock into place.  Several iterations later, and with input from previous Iron Viz winner Curtis Harris, the collection of visualizations became more precisely tuned to the story.  White space became more defined and charts were sharpened.

    My final pass of feedback included outsourcing to Arizona friends (including Josh Jackson) to ask whether it evoked the story we’re all experiencing, and that’s what led to the ultimate change in titles from years to pseudo-headlines.

    Wrapping Up

    My one last lingering question: mobile only, or include a desktop version too?  The deciding factor was to create a version optimized for reaching the largest end audience – thus, mobile only.

    WITHOUT WATER

    And that all leads to the final product: a mobile-only narrative data story highlighting the many facets of drought and its consequences for the state of Arizona.  Click on the image to view the interactive version on Tableau Public.

    click to view on Tableau Public


  • Building an Interactive Visual Resume using Tableau

    Building an Interactive Visual Resume using Tableau

    click to interact on Tableau Public

    In today’s connected professional world it’s important to differentiate yourself.  When it comes to the visual analytics space, a great way to do that is an interactive resume.  Building out a resume in Tableau and posting it on Tableau Public allows prospective employers to get firsthand insight into your skills and style – it also provides an opportunity for you to share your professional experience in a public format.

    Making an interactive resume in Tableau is relatively simple – what turns out to be more complex is how you decide to organize your design.  With so many skills, achievements, and facts competing for attention, it’s important for you to decide what’s most important.  How do you want your resume to be received?

    In making my own resume, my focus was on my professional proficiency across larger analytics domains, strength in specific analytics skills, and experience in different industries.  I limited each of these components to my personal top 5, so that it is clear to the audience which areas hold the most interest for me (and which I’m most skilled in).

    I also wanted to spend a significant amount of real estate highlighting my community participation.  After plotting a gantt chart of my education and work experience, I realized that the last two years are jam-packed with speaking engagements and activities that would be dwarfed on a traditional timeline.  To compensate, I decided to explode the last two years into their own timeline in the bottom dot plot.  This allowed for color encoding of significant milestones and additional detail on each event.

    The other two components of the resume serve important purposes as well.  I’ve chosen to demonstrate experience in terms of years (a traditional metric for expertise) with the highest level of certification or professional attainment denoted along each bar.  And finally, there’s a traditional timeline of my education and work experience.  The “where” of my work experience is less important than the “what,” so significant effort went into adding role responsibilities and accomplishments.

    Once you’ve decided how you want to draw attention to your resume, it’s time to build out the right data structure to support it.  To build out a gantt chart of different professional roles, a simple table with the type of record, name of the role, start date, end date, company, a flag for whether it’s the current role, and a few sentences of detail should suffice.
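    For illustration, a few records in that structure might look like the sketch below (the field names and values here are hypothetical, not taken from my actual resume data):

```python
# One row per role or degree; this shape feeds the gantt chart directly.
resume_rows = [
    {"type": "Work", "role": "Analytics Consultant", "start": "2016-05-01",
     "end": None, "company": "Example Co.", "current": True,
     "detail": "Led dashboard development and analytics enablement."},
    {"type": "Education", "role": "B.S. Mathematics", "start": "2004-08-01",
     "end": "2008-05-01", "company": "State University", "current": False,
     "detail": "Focus on applied statistics."},
]
```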

    This table structure also works well for the years of experience and community involvement sections.

    You may also want to make a separate table for the different skills or proficiencies that you want to highlight.  I chose to make a rigidly structured table with dimensions for the rank of each result, ensuring I wouldn’t have to sort the data over each category (passion, expertise, industry) once I was in Tableau.

    Here’s the table:

    That’s it for data structure, leaving style (including chart choices) as the last piece of the puzzle.  Remember, this is going to be a representation of you in the digital domain – how do you want to be portrayed?  I am known for my clean, minimalist style, so I chose to keep the design in this voice.  Typical of my style, I purposely bubble up the most important information and display it in a visual format with supporting detail (often text) in the tooltip.  Each word and label is chosen with great care.  It’s not by mistake that the audience sees the name of my education (and not the institution) and the labels of each proficiency.  In a world where impressions must happen instantaneously, it’s critical to know which things should have a lasting impact.

    I also chose colors in a very specific manner: the bright teal is my default highlight color, drawing the eye to certain areas.  However, I’ve also chosen to use a much darker gray (near black) as an opposite highlight in the bottom section.  My goal with the dark “major milestones” is to entice the audience to interact and find out what “major” means.

    The final product, from my perspective, represents a polished, intentional design, where the data-ink ratio has been maximized and the heart of my professional ambitions and goals is most prominent.

    Now that you’ve got the tools – go forth and build a resume.  I’m curious to know what choices you will make to focus attention and how you’ll present yourself from a styling perspective.  Will it be colorful and less serious?  Will you focus on your employment history or your skills?  Much like any other visualization, whatever choices you make, ensure they are intentional.

  • Blending Visualizations of Different Sizes

    Blending Visualizations of Different Sizes

    One of my favorite visualizations is the sparkline – I’ve always appreciated how Edward Tufte describes them: “data-intense, design-simple, word-sized graphics.”  Meaning the chart gets right to the point: conveying a high amount of information without sacrificing real estate.  I’ve found this approach works really well when trying to convey different levels of information (detail and summary) or perhaps different metrics around a common topic.

    I recently built out a Report Card for Human Resources that aims to do just that.  Use a cohort of visualizations to communicate an overall subject area and then repeat the concept to combine 4 subject/metric areas.  Take a look at the final dashboard below.

    click to view on Tableau Public

    The dashboard covers one broad topic – Human Resources.  Within it there are 4 sub-topics: number of employees, key demographics, salary information, and tenure.  As your eyes scan through the dashboard, they likely stop at the large call-outs in each box.  You’ve got your at-a-glance metrics that start to bring awareness to the topic.

    But the magic of this dashboard lies in the collection of charts surrounding the call outs.  Context has been added to surround each metric.  Let’s go through each quadrant and unpack the business questions we may have.

    1. How many active employees do we have?
    2. How many new employees have we been hiring?
    3. How many employees are in each department?
    4. What’s the employee to leadership ratio?

    The first visualization (1) is likely the one a member of management would want.  It’s the soundbite and tidbit of information they’re looking for.  But once that question is asked and answered, the rest of the charts become important to knowing the health of that number.  If it’s a growing company, the conversation could unfold into detail found in chart 2 – “okay we’re at 1500 employees, what’s our hiring trend?”  The same concept could be repeated for the other charts – with chart 4 being useful for where there might be opportunity for restructuring, adding management, or checking up on employee satisfaction.

    The next quadrant focuses specifically on employee demographics.  And its inclusion right after employee count is intentional.  It’s more contextual information building from the initial headcount number.

    1. Do we have gender equity?
    2. What is the gender distribution?
    3. How does the inclusion of education level affect our gender distribution?

    Again, we’re getting the first question answered quickly (1) – do we have gender equity?  Nope – we don’t.  So just how far off are we?  That’s answered just to the right (2).  The second chart is still a bit summarized; we can see the percentages for each gender, but it’s so rolled up that we’d be hard-pressed to figure out how or where the opportunity for improvement might be.  This is where the final chart (3) helps to fill in gaps.  With this particular organization, there could be knowledge that there’s gender disparity based on levels of education.  We don’t get the answers to all the questions we have, but we are starting to narrow down focus immensely.  We could go investigate a potentially obvious conclusion and try to substantiate it (this company hires more men without any college experience).

    The next quadrant introduces salary – a topic everyone cares about.

    1. What’s the average salary of one of our employees?
    2. Are we promoting our employees?  (A potential influence on #1)
    3. What’s the true distribution of salaries within our organization?

    The design pattern is obvious at this point – convey the most important single number quickly, and then dive into context, drivers, and supporting detail.  I personally like the inclusion of the histogram with a boxplot, a simple way to apply statistics to an easily skewed metric.  Even in comparing the average number to the visual median, we can see that there are some top-heavy salaries contributing to the number.  And what’s even more interesting about the inclusion of the histogram is the frequency of salaries around the $25k mark.  I would take away from this section the knowledge of $78k, but also the visual spread of how we arrive at that number.  The inclusion of (2) here serves mostly as a form of context.  Here it could be that the organization has an initiative to promote internally, which goes hand-in-hand with salary changes.

    And finally our last section – focused closely on retention.

    1. What’s our average employee tenure?
    2. How much attrition/turnover do we have monthly?
    3. How much seniority is there in our staff?

    After this final quadrant, we’ve got a snapshot of what a typical employee looks like at this organization.  We know their likely salary, how long they’ve been with the company, some ideas on where they’re staffed, and a guess at gender.  We can also start to fill in some gaps around employee satisfaction – seems like there was some high turnover during the summer months.

    And let’s not forget – this dashboard can come to life even more with the inclusion of a few action filters.  We’ve laid the groundwork for how we want to measure the health of our team; now it’s time to use these to drive deeper and more meaningful questions and analysis.

    I hope this helps to demonstrate how the inclusion of visualizations of varying sizes can be combined to tell a very rich and contextual data story – perfect for understanding a large subject area with contextual indicators and answers to trailing questions included.

  • The Shape of Shakespeare’s Sonnets | #IronViz Books & Literature

    The Shape of Shakespeare’s Sonnets | #IronViz Books & Literature

    Jump directly to the viz

    If it’s springtime, that can only mean it’s time to begin the feeder rounds for Tableau’s Iron Viz contest.  The kick-off global theme for the first feeder is books & literature, a massive topic with lots of room for interpretation.  So without further delay, I’m excited to share my submission: The Shape of Shakespeare’s Sonnets.

    The genesis of the idea

    The concept came after a rocky start and an abandoned initial idea.  That first idea was to approach the theme with a meta-analysis of the overall topic (‘books’) and avoid focusing on a single book.  I found a wonderful list of NYT non-fiction best sellers, but was uninspired after spending a significant amount of time consuming and prepping the data.  So I switched mid-stream: I kept the parameters of a meta-analysis, but changed to a body of literature that a meta-analysis could be performed on.  I landed on Shakespeare’s Sonnets for several reasons:

    • Rigid structure – great for identifying patterns
    • 154 divides evenly for small multiples (11×14 grid)
    • Concepts of rhyme and sentiment could easily be analyzed
    • More passionate subject: themes of love, death, wanting, beauty, time
    • Open source text, should be easy to find
    • Focus on my strengths: data density, abstract design, minimalism
    Getting Started

    I wasn’t disappointed with my Google search; it took me about 5 minutes to locate a fantastic CSV containing all of the Sonnets (and more) in a nice relational format.  There were some criteria necessary for the data set to be usable – namely, each line of a sonnet needed to be a record.  After that point, I knew I could explode and reshape the data as necessary to get to a final analysis.

    Prepping & Analyzing the Data

    The strong structure of the sonnets meant that counting things like number of characters and number of words would yield interesting results.  And that was the first data preparation moment.  Using Alteryx, I expanded each line into columns for individual words.  Those were then transposed back into rows and affixed to the original data set.  Why?  This would allow for quick character counting in Tableau, repeated dimensions (like line and sonnet number), and dimensions for the word number in each line.
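    If you don’t have Alteryx, the same line-to-word reshaping can be sketched in pandas (the column names here are assumptions about the source file, not the actual schema):

```python
import pandas as pd

# Assumed input: one record per sonnet line, with 'sonnet', 'line_num', and 'line_text' columns.
lines = pd.read_csv("poem_lines.csv")

# Split each line into words and explode back to rows (one record per word),
# keeping the repeated line/sonnet dimensions and adding a word-position index.
words = lines.assign(word=lines["line_text"].str.split()).explode("word")
words["word_num"] = words.groupby(["sonnet", "line_num"]).cumcount() + 1
words["char_count"] = words["word"].str.len()
```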

    I also extracted out all the unique words, counted their frequency, and exported them to a CSV for sentiment analysis.  Sentiment analysis is a way to score words/phrases/text to determine the intention/sentiment/attitude of the words.  For the sake of this analysis, I chose to go with a negative/positive scoring system.  Using Python and the nltk package, each word’s score was processed (with VADER).  VADER is optimized for social media, but I found the results fit well with the words within the sonnets.

    The same process was completed for each sonnet line to get a more aggregated, overall sentiment score.  Again, Alteryx was the key to extracting the data in the format I needed to run it through a quick Python script.

    Here’s the entire Alteryx workflow for the project:

    The major components
    • Start with original data set (poem_lines.csv)
      • filter to Sonnets
      • Text to column for line rows
      • Isolate words, aggregate and export to new CSV (sonnetwords.csv)
      • Isolate lines, export to new CSV (sonnetlines)
      • Join swordscore to transformed data set
      • Join slinescore to transformed data set
      • Export as XLSX for Tableau consumption (sonnets2.xlsx)
    Python snippet
    make sure you download the nltk lexicons after importing; thanks to Brit Cava for code inspiration
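    The snippet itself was shared as an image; a minimal reconstruction of that style of VADER scoring (file and column names are assumptions) looks something like this:

```python
import nltk
import pandas as pd
from nltk.sentiment.vader import SentimentIntensityAnalyzer

# Download the VADER lexicon after importing nltk (the step called out above).
nltk.download("vader_lexicon")

sia = SentimentIntensityAnalyzer()

# Assumed input: the unique words exported from Alteryx (sonnetwords.csv).
words = pd.read_csv("sonnetwords.csv")

# Score each word; 'compound' is a normalized negative-to-positive score.
words["sentiment"] = words["word"].apply(lambda w: sia.polarity_scores(w)["compound"])
words.to_csv("swordscore.csv", index=False)
```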

    The Python code is heavily inspired by a blog post from Brit Cava in December 2016.  Blog posts like hers are critically important; they help enable others within the community to do deeper analysis and build new skills.

    Bringing it all together

    Part of my vision was to provoke patterns, have a highly dense data display, and use an 11×14 grid.  My first iteration actually started with mini bar charts for the number of characters in each word.  The visual this produced was what ultimately led to the path of including word sentiment.

    height = word length, bars are in word order

    This eventually changed to circles, which led to the progression of adding a bar to represent the word count of each individual line.  The size of the words at this point became somewhat of a disruption on the micro-scale, so word sentiment was distilled down into 3 colors: negative, neutral, or positive.  The sentiment of the entire line instead has a gradient spectrum (same color endpoints for negative/positive).  The sentiment score for each word was reserved for a viz in tooltip – which provides inspiration for the name of the project.

    Sonnet 72, line 2

    Each component is easy to see and repeated in macro format at the bottom – it also gives the end user an easy way to read each Sonnet from start to finish.

    designed to show the progression of abstraction

    And there you have it – a grand scale visualization showing the sentiment behind all 154 of Shakespeare’s Sonnets.  Spend some time reciting poetry, exploring the patterns, and finding the meaning behind this famous body of literature.

    Closing words: thank you to Luke Stanke for being a constant source of motivation, feedback, and friendship.  And to Josh Jackson for helping me battle through the creative process.

    The Shape of Shakespeare’s Sonnets

    click to interact at Tableau Public


  • Dying Out, Bee Colony Loss in US | #MakeoverMonday Week 18

    Dying Out, Bee Colony Loss in US | #MakeoverMonday Week 18

    Week 18 of Makeover Monday tackles the issue of the declining bee population in the United States.  Data was provided by Bee Informed and the re-visualization is in conjunction with Viz for Social Good.  Unfamiliar with a few of those terms?  Check out their websites to learn what Makeover Monday and Viz for Social Good are all about.

    The original visualization is a filled map showing the annual percentage of bee colony loss for the United States.  Each state (and DC) is filled with a gradient color from blue (low loss) to orange (high loss).  The accompanying data set for the makeover included historical data back to 2010/11.

    Original visualization | Bee Informed

    Looking at the data, my goal was to capitalize on some of the same concepts presented in the original visualization, but add more analytical value by including the dimension of time.  The key thing I was aiming to understand: there’s annual colony loss, but how “bad” is that loss?  The critical “compared to what” question.

    My Requirements
    • Keep the map theme – good way to demonstrate data
    • Add in time dimension
    • Keep color as an indicator of performance (good/bad indicator) – clarify how color was used
    • Provide more context for audience
    • Switch to tile map for skill building
    • Key question: where are bees struggling to survive
    • Secondary question: which states (if any) have improved

    Building out the tile map and beginning to add the time series was pretty simple.  I downloaded the hexmap template provided by Matt Chambers.  I did a bit of tweaking to the file to change where Washington D.C. was located.  The original file has it off to the side; I decided to place it in line with the continental US to clean up the final look.

    Well documented throughout the Tableau community, the next step was to take the two data sources (bees + map) and blend them together.  Part of that process includes setting up the relationship between the two data sources and then adding them both to a single view:

    setting up the relationship between data sources
    visual cues – MM18 extract is primary data source, hexmap secondary

    To change to a line chart and start down the path of showing a metric (in our case annual bee colony loss) over time – a few minor tweaks:

    • Column/Row become discrete (why: so we can have continuous axes inside of our rows & columns)
    • Add on continuous fields for time & metric

    This to me was a big improvement over the original visualization (because of the addition of time).  But it still needs a bit of work to clearly explain where good and bad are.  This brought me back to a concept I worked on during Week 17 – using the background of a chart as an indicator of performance.

    forest land consumption

    In week 17 I looked at the annual consumption of carbon, forest land, and crop land by the top 10 world economies compared to the global footprint.  Background color indicates whether the country’s footprint is above/below the current global metric.  I particularly appreciate this view because you get the benefit of the aggregate and immediate feedback with the nice detail of trend.

    This led me down the path of ranking each of the states (plus DC) to determine which state had experienced the most colony loss between the years of the data (2010/11 and 2016/17).  You’d get a sense of where the biggest issues were and where hope is sprouting.

    To accomplish this I ended up using Alteryx to create a rank.  The big driver behind creating a rank pre-visualization was to replicate the same rank number across the years.  The background color for the final visualization is made by creating constant-value bar charts for each year.  So having a constant number for each state, based on a calculation of 2010 vs. 2016, would be much easier to develop with.

    notice the bar chart marks card; Record ID is the rank


    Here’s my final Alteryx workflow.  Essentially I took the primary data set, split it up into 2010 and 2016, joined it back, calculated the difference between them, corrected for a few missing data points, sorted them from greatest decline in bee colony loss to smallest, applied a rank, joined back all the data, and then exported it as a .hyper file.

    definitely a quick & dirty workflow

    This workflow, developed in less than 10 minutes, eliminated the need for me to do at least one table calculation and brought me closer to my overall vision quickly and painlessly.
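    For anyone without Alteryx, here’s a rough pandas sketch of the same ranking logic (the file and column names are assumptions, and the missing-data patching mentioned above is omitted):

```python
import pandas as pd

# Assumed columns: 'state', 'year', 'colony_loss_pct'.
bees = pd.read_csv("bee_colony_loss.csv")

# Split into the first and last seasons, join back on state, and compute the change.
start = bees[bees["year"] == "2010/11"][["state", "colony_loss_pct"]]
end = bees[bees["year"] == "2016/17"][["state", "colony_loss_pct"]]
change = start.merge(end, on="state", suffixes=("_2010", "_2016"))
change["diff"] = change["colony_loss_pct_2016"] - change["colony_loss_pct_2010"]

# Rank states from greatest decline in colony loss to smallest, then join that rank
# back onto every row so the same number repeats across years.
change["rank"] = change["diff"].rank(method="first").astype(int)
bees = bees.merge(change[["state", "rank"]], on="state", how="left")
```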

    The final touches were adding a little descriptive text to eliminate the need for a color legend and give a first-time reader areas to focus on, and picking the right color palette and title.  Color always leads my design – so I settled on the gold early on, but it took a few iterations to evoke the feeling of “dying out” from the color range.

    tones of brown to keep theme of loss, gold indicates more hope

    And here’s the final visualization again, with a link to the interactive version on Tableau Public.

    click to interact on Tableau Public
  • Workout Wednesday Week 17: Step, Jump, or Linear?

    Workout Wednesday Week 17: Step, Jump, or Linear?

    What better way to celebrate the release of step lines and jump lines in Tableau Desktop than with a workout aimed at doing them the hard way?

    click to view on Tableau Public

    Using alternative line charts can be a great way to have more meaningful visual displays of not-so-continuous information.  Or continuous information where it may not be best to display the shortest distance between two points in a linear way (traditional line charts).

    Step line and jump line charts are most useful for something with few fluctuations in value, an expected value, or something that isn’t consistently measured.

    The workout this week is very straightforward – explore the different types of line charts (step lines, jump lines, and linear/normal lines).  Don’t use the new built-in features of 2018.1 (beta or release, depending on when you’re reading) found by clicking on the Path shelf.  Instead, use other functions or features to create the charts.

    The tricky part of this week’s workout will be the step lines.  Pay special attention to the stop and start of the lines and where the tooltips display information.  You are not allowed to duplicate the data or create a “path ID” field.  Everything you do should be accomplished using a single connection to Superstore and no funny business.

    There’s a tiny additional element: creating the ability to flip through the chart types.

    Requirements:

    • Dashboard size 1000 x 800
    • Displaying sales by month for each Category
    • Create a button that flips through each chart type
    • Match step line chart exactly, including tooltip, start/stop of lines, colors, labels
    • Match jump line chart exactly, including axes, labels, tooltips
    • Match normal line chart exactly, including axes, labels, tooltips

    This week uses the superstore dataset.  You can get it here at data.world

    After you finish your workout, share on Twitter using the hashtag #WorkoutWednesday and tag @AnnUJackson, @LukeStanke, and @RodyZakovich.  (Tag @VizWizBI too – he would REALLY love to see your work!)

    Also, don’t forget to track your progress using this Workout Wednesday form.

  • Workout Wednesday 14 | Guest Post | Frequency Matrix

    Workout Wednesday 14 | Guest Post | Frequency Matrix

    Earlier in the month Luke Stanke asked if I would write a guest post and workout.  As someone who completed all 52 workouts in 2017, the answer was obviously YES!

    This week I thought I’d take heavy influence from a neat little chart made to accompany Makeover Monday (w36y2017) – the Frequency Matrix.

    I call it a Frequency Matrix; you can call it what you will – the intention is this: use color to represent the frequency (intensity) of two things.  So for this week you’ll be creating a Frequency Matrix showing the number of orders within pairs of sub-categories.

    click to view on Tableau Public

    Primary question of the visualization: Which sub-categories are often ordered together?
    Secondary question of the visualization: How much on average is spent per order for the sub-categories?
    Tertiary question: Which sub-category combination causes the most average spend per order?
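    Outside of Tableau, the core metric – distinct orders containing both sub-categories – could be sketched in pandas like this (purely to illustrate the counting, not the intended Tableau solution; the file name is an assumption, the column names follow the standard Superstore fields):

```python
import pandas as pd

# Keep one row per (order, sub-category) combination.
orders = pd.read_csv("superstore.csv")[["Order ID", "Sub-Category"]].drop_duplicates()

# Self-join on Order ID to get every pair of sub-categories appearing on the same order,
# then count distinct orders per pair.
pairs = orders.merge(orders, on="Order ID", suffixes=("_a", "_b"))
matrix = (
    pairs.groupby(["Sub-Category_a", "Sub-Category_b"])["Order ID"]
    .nunique()
    .unstack(fill_value=0)
)
```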

    Requirements
    • Use sub-categories
    • Dashboard size is 1000 x 900; tiled; 1 sheet
    • Distinctly count the number of orders that have purchases from both sub-categories
    • Sort the categories from highest to lowest frequency
    • White out when the sub-category matches and include the number of orders
    • Calculate the average sales per order for each sub-category
    • Identify in the tooltip the highest average spend per sub-category (see Phones & Tables)
    • If it’s the highest average spend for both sub-categories, identify with a dot in the square
    • Match formatting & tooltips – special emphasis on tooltip verbiage

    This week uses the superstore dataset.  You can get it here at data.world

    After you finish your workout, share on Twitter using the hashtag #WorkoutWednesday and tag @AnnUJackson, @LukeStanke, and @RodyZakovich.  (Tag @VizWizBI too – he would REALLY love to see your work!)

    Also, don’t forget to track your progress using this Workout Wednesday form.

    Hints & Detail
    • You may not want to use the WDC
    • Purple is from hue circle
    • You’ll be using both LODs and Table Calculations
    • I won’t be offended if you change the order of the sub-category labels in the tooltips
    • Dot is ●
    • Have fun!
  • Who Gets an Olympic Medal | #MakeoverMonday Week 7

    Who Gets an Olympic Medal | #MakeoverMonday Week 7

    At the time of writing the 2018 Winter Olympic Games are in full force, so it seems only natural that the #MakeoverMonday topic for Week 7 of this year is record-level results of Winter Games medal wins.

    I have to say that I was particularly excited to dive into this data set.  Here’s what a few rows of data look like:

    I always find with this level of data there are so many interesting things that can be done that it gets really hard to focus.  The trouble is that all of the rows are interesting, so as a creator I’m immediately drawn to organizing “all the data” and want to put it ALL on display.  And that’s where the first 20 minutes of my development were headed.

    I’d started with a concept of showing all the medals and, more specifically, showing the addition of new sports over time.  As I was building, the result was quite clearly going to be a giant poster-form viz.  Not what I was going for.

    To move past that my mind shifted to female sports at the Winter Olympics.  And if you look through the data set you’ll see there are some interesting points.  Specifically that it took about 60 years for women to get to a similar number of events/medals as men.  (yellow = men, purple = women, gray = mixed)

    I spent some time stuck on this – thinking through how I could segment by different sports, extract out some of the noise of the different years, and come up with a slope chart.  Ultimately I found myself disappointed with all of these pursuits – so my thoughts shifted.

    So I switched gears and stumbled on this chart:

    Which as you look through it is REALLY interesting.  I had just watched the Opening Ceremonies and knew there were 91 delegations (countries) represented in 2018.  To know that in 2014 the number was probably similar, yet only 26 reached a podium seemed to be a sticking point in my mind.

    So – that led to a quick adventure over to http://www.olympic.org to add context to the number of countries represented at the games over the years.  They actually have really nice summary pages for each set of games that made gathering data simple.  Here’s a snapshot of 1980 – Lake Placid:

    Using the ribbon of information at the bottom I went about collecting and enriching the data set.  Because what was missing from our original #MakeoverMonday data was the NULLs.

    Sufficiently enriched, I was able to come up with a calculation for the percentage of delegations medalling at each set of games.  Of course I suspected that this would not be close to 100%, if only by virtue of knowing that we’ve got 91 delegations in 2018.  Here’s the chart:

    So – now the story is unfolding, but I wanted to take it a few steps further.  My main beef: I want to also see how many additional delegations are bringing athletes to the games.  Specifically, at the first data point I’d think that it was a small number of countries because the games were new – essentially the opportunity for medalling would perhaps be greater.  Hence settling on what ended up being my final submission for the week:

    click to view on Tableau Public

    What are you looking at?  Medals are clearly parsed out into Gold, Silver, and Bronze.  Each bar represents a Winter Games.  The width of the bar is the number of countries/delegations; the height of the bar is the percentage of countries who medalled in that respective color.  I concede that eliminating the dimensionality of medals may have made for a more consolidated view, but I selfishly wanted to use the different colors.

    Here’s the non-medalled version:

    Less abstracted, more analytically presented:

    Ultimately, for the sake of the exercise, I went with continuous bar sizing representing the number of delegations at each Winter Games.  My “why” is that this isn’t often seen, and within the confines of this visualization it would be a great usage.  Explaining it aloud should facilitate easy cognition: wider bars mean more countries participating (reinforced by our general knowledge of the games), and the height of the bars can cleanly represent the percentage of those getting medals.  Plus – per usual – the tooltip divulges all this in well-articulated detail.  (And bars allow for the chronology of time.)

    I’m quite pleased with this one.  Maybe because I am the designer, but I was delighted with the final representation both from a visual perspective and an analytical presentation perspective.  There is a certain amount of salience in having the bars get larger over time (and repeating that 3 times) and the colors of the medals being represented within a single worksheet.

  • #Workout Wednesday – Fiscal Years + Running Sums

    #Workout Wednesday – Fiscal Years + Running Sums

    As a big advocate of #WorkoutWednesday I am excited to see that it is continuing on in 2018.  I champion the initiative because it offers people a constructive way to problem solve, learn, and grow using Tableau.

    I was listening to this lecture yesterday and there was a great snippet: “context is required to spark curiosity.”  We see this over and over again in our domain – everyone wants to solve problems, but unless there is a presented problem it can be hard to channel energy and constructively approach something.

    One last pre-context paragraph before getting into the weeds of the build.  I enjoy the exercise of explaining how something works.  It helps cement in my mind the concepts and techniques used during the build.  Being able to explain the “why” and “how” is crucial.

    Let’s get started.

    High-level problem statement or requirement: build out a dashboard that shows the running total of sales and allows the user to dynamically change the start of a fiscal year.  The date axis should begin at the first month of the fiscal year.  Here’s the embedded tweet for the challenge:


    Okay – so the challenge is set.  Now on to the build.  I like to start by tackling as many of the small problems I immediately know the answer to as I can.  Once I get a framework of things to build from, I can then work through making adjustments to get to the end goal.

    • numeric parameter 1 to 12 for the fiscal year start
    • calculation to push a date to the right fiscal year
    • dimension for the fiscal year
    • transformation of date to the correct year
    • running sum of sales

    I’ll spare you the parameter build and go directly into the calculations that deal with the fiscal year components.

    The logic is as follows – if the month of the order date is less than the fiscal year start, then subtract one from the year; otherwise it’s the year as-is.  I can immediately use this as a dimension on color to break apart the data.
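    Sketched outside of Tableau (in Python, purely to illustrate the conditional; the original is a calculated field built on the parameter), the same logic reads:

```python
from datetime import date

def fiscal_year(order_date: date, fy_start_month: int) -> int:
    """Assign an order date to a fiscal year that starts in fy_start_month (1-12).

    If the order month falls before the fiscal year start, the date belongs to the
    fiscal year that began in the previous calendar year.
    """
    return order_date.year - 1 if order_date.month < fy_start_month else order_date.year
```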

    The next step would be to use that newly defined year with other elements of the original date to complete the “fiscal year” transformation.  Based on the tooltips – the year for each order date should be the fiscal year.

    Now that the foundation is in place, the harder part is building a continuous axis, and particularly a continuous axis of time.  Dates span from 2013 to 2017 (depending on how you’ve got your FY set up), so if we plotted all the native dates we’d expect the axis to go from 2013 to 2017.  But that’s not really what we want.  We want a timeline that spans a single year (or more appropriately 365 days).

    So my first step was to build out a dummy date that had the SAME year for all the dates.  The dates are already broken out with the FY, so as long as the year isn’t shown on the continuous axis, it will display correctly.  Here’s my first pass at the date calculation:

    That gets me the ability to produce this chart – which is SO CLOSE!

    The part that isn’t working: my continuous axis of time always starts at 1/1.  And that makes sense, because all the dates are for the same year and there’s always 1/1 data in there.  The takeaway: ordering the dates is what I need to figure out.

    The workaround?  Well, the time needs to span more than one year, and specifically the start of the axis should be “earlier” in time.  To achieve this, instead of hard-coding a single year (2000 in my case), I changed the dummy year to be dependent on the parameter.  Here’s the result:

    Basically, offset everything that’s less than the chosen FY start to the next year (put it at the end).  (Remember that we’re already slicing apart the data into the correct FY using the previous calculations.)  The combination of these two calculations then changes the chart to this, which is now mere formatting steps away from completion.
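    Again as a rough Python sketch of that dummy-date idea (the real version is a Tableau date calculation driven by the parameter):

```python
from datetime import date

def plot_date(order_date: date, fy_start_month: int, base_year: int = 2000) -> date:
    """Collapse every order date onto a dummy year for the continuous axis.

    Months before the fiscal year start get pushed to base_year + 1 so they land at
    the end of the axis; the axis then begins at the fiscal year start month.
    """
    year = base_year + 1 if order_date.month < fy_start_month else base_year
    return date(year, order_date.month, 1)
```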

    The full workbook is available to download here.

  • The Remaining 25 Weeks of #WorkoutWednesday

    The Remaining 25 Weeks of #WorkoutWednesday

    Back in July I wrote the first half of this blog post – it was about the first 27 weeks of #WorkoutWednesday.  The important parts to remember (if the read is too long) are that I made a commitment to follow through and complete every #MakeoverMonday and #WorkoutWednesday in 2017.  The reason was pretty straightforward – I wanted a constructive way to challenge myself and tangible, realistic goals.

    Now that we’re 3 days into 2018 – it’s the perfect opportunity to go through the same process of sharing the impact each workout (the remaining 25) has had on me.

    Week 28 | Insights & Annotations
    The focus of this workout was adding context/insights/summarization to existing visualizations – something that is often asked of those creating dashboards or presenting data.  I enjoyed this workout tremendously because it was a great example of using a feature within Tableau in a way I hadn’t thought about.  The premise is pretty simple – allow users the ability to input insights/findings and customize the summary output.  A very clever use that is a great way to provide a dashboard to someone who has little time or is skeptical of self-service analytics.

    Week 29 | Who sits where at the Data School?
    I hated this workout – because of donut charts.  Donut charts in Tableau are the level-one “look how cool I am” or “I think I’m pretty awesome with the tool” things that people make.  Yes – they are cuter than pie charts (which I don’t mind) – but I strongly dislike how these are implemented in Tableau.  Pushing aside my dissatisfaction with donuts – I veered off requirements for this dashboard.  In particular, I deviated from the original build results because of how seat favorites were computed – more specifically, I ended up with more “No Favorite” than the original.  The great point about this workout: the sophistication required to calculate several of the numbers shown.

    Week 30 | Loads of LODs
    As described, this workout had a few LODs.  The most distinct thing I remember about this build relates to the region filter.  You need to decide early on how you’re going to implement this.  I believe Emma used a parameter whereas I used a filter.  The choice made here will have consequences on the visualizations and is a great way to start understanding the order of operations within Tableau.

    Week 31 | The Timing of Baby Making
    Ah – a favorite visualization of mine, the step chart/plot.  The two major gotcha moments here are: implementing the step chart and the dropdowns/filters to set highlighting.  This one is solvable in multiple ways and I went the Joe Mako route of unioning my data set.  During the build this seemed like the easier solution to me, but it does have implications later on.  I believe it’s worth the effort to go this route – you will learn a lot about how unioning your data on itself can be useful.

    Week 32 | Continuous Dates are Tricky
    This is classic Emma style – particularly down to the requirement of a dynamic title that updates based on newer data availability.  The premise itself is pretty straightforward – how can you plot a continuous month line, but have years broken out?  By definition that concept should break your brain a bit, because continuous means it’s the entire date, so how are you plotting different years on a continuous axis?  And hence the challenge of the workout!

    Week 33 | How Have Home Prices Changed?
    Andy mentioned that this visualization was inspired by Curtis Harris.  And as I write this – it is probably my top visualization in terms of design from the 2017 #WorkoutWednesday collection.  Something about the strategic use of capitalization in conjunction with the color choices resonated with me and has left a lasting impact on my viz style.  This is a stunningly beautiful example of making Tableau a very self-service analytics tool, having dense data, but still being deceptively simple and clean from a design perspective.  Plus you’re practicing dynamic titles again – which I find to be a requirement for most serious builds.

    Week 34 | Disney’s Domination
    This workout was my first waffle chart.  I’d successfully avoided the waffle (as a general rule I don’t like high-carbohydrate visualizations) up to this point.  More than the waffle, though, was the requirement to data blend.  I’m not a big data blending fan because it is very easy for things to go sideways.  However – the icky feeling you get from data blending is exactly why this is a great exercise to work through.  And also because I believe I did this entire visualization (up to a point) using LODs and then had to switch to table calculations.  I learned how to turn an LOD into a table calculation (probably the reverse practice for more tenured Tableau folks).

    Week 35 | Average Latitude of Solar Eclipses by Century
    This is another visualization with design that I find very pleasing – particularly the use of reference lines.  I strongly remember learning so much about the Path shelf this week.  Specifically how to use path to your advantage.  You don’t often see people create something that would otherwise be a line chart, but instead has vertical bars/gantts/lines to focus the eye vertically and then across.  A great exercise and thought-starter on additional visualizations to make.

    Week 36 | Which UK Airport Should You Fly From?
    This workout is the perfect hands-on exercise for continuous sizing on bar charts, offered up as a new feature in the past year.  Beyond knowing that you CAN do something, knowing HOW to build that something is the key (at least for me) to being able to iterate and ideate.  This one is more complex than it seems at first glance.

    Week 37 | Killings of Blacks by Whites Are Far More Likely to Be Ruled ‘Justifiable’
    A viz I don’t want to remember – this one is 100% about formatting.  It took me a considerable chunk of time to complete.  And probably more maddening – this viz needs to be built in one session, otherwise you’ll forget all the intricate details required to make it look “just so.”  I was cursing my PC the entire build – and worse than that, I think I restarted it at a certain point because things weren’t lining up how I wanted.  Only complete this if you enjoy being tormented.  The upside?  Going through this workout will make you intimately aware of all the gaps and limitations Tableau has as it relates to designing.  Also – this was done before changing padding was a feature.  Thanks guys.

    Week 38 | (It Takes) All Sorts
    This is another “looks simple” but has “some tricks” workout.  I remember someone at our user group asking about this over the summer and whether I knew how to build it.  I didn’t have an answer readily available within 30 seconds, so I knew there was more going on.  I highly encourage this build because it demonstrates how sorting works and how multiple sorts interact with each other.  Also – I think whatever sorting I ended up with was some sort of mathematical manipulation on my part.

    Week 39 | Are the contributions of top sellers increasing throughout the year?
    Another trellis chart!   More than the trellis – check out what is being analyzed.  This isn’t superficial or first-pass reading of data.  This is second and third level thought on finding deeper insights and answers to questions within a data set.  So naturally it requires more layers of calculations to resolve.  And of course – the “just so” placement of the far right label.  This is a perfect example of taking a question and turning it into a visualization that shows the answer.

    Week 40 | All Sorts Part 2
    As advertised and named – the second half of Emma’s sorting workout.  This may actually have been the dashboard where I did some mathematical magic to require the first position to be first and retain any additional sorting.  Also – the devil is in the details.  When you change sort order, notice that the bottom visualization always changes to be the subcategory chosen.  Sounds easy, but takes some thought to implement.

    Week 41 | State to City Drill Down
    As I look back on my tracker – I realize that I did 38 through 41 in the same day.  And naturally I approach backlog in the fashion of oldest gets built first.  So this was the 4th on a particular day – but I championed this guy hardcore.  I will say it again – YOU NEED TO DO THIS WORKOUT.  The concepts on execution here are next level.  I know it sounds kind of trivial – but it will help unlock your mind to the possibilities of using filters and the art of the possible.  Plus this is a question that ALWAYS gets asked.  “Can I click a state and have it automatically change the view to cities.”  This does that.  Also – this build took me 30 minutes tops.

    Week 42 | Market Basket Analysis
    I won’t forget this workout anytime soon because it required the legacy JET connector and thinking about how data gets joined back on itself.  This type of analysis is something people often want to have done – so knowing the steps on creation using an Excel data source (or other sources for that matter) makes this guy worth the build.  Follow Emma’s advice closely.

    Week 43 | The Seasonality of Superstore
    A great viz once again demonstrating how powerful parameters can be – how you can use them in multiple places – and also things you can do to make visualizations more user/reader friendly.  You’re definitely using a table calculation somewhere in here – and you definitely will get angry when trying to recreate the smoothing (particularly dealing with endpoints of the chosen time range).

    Week 44 | Customer Cohorts
    When dealing with cohort analysis you’re very likely to encounter LODs – that’s par for the course for this workout.  But again – Emma is so clever at taking something that seems straightforward and challenging you to implement it.  If you look closely you’ll have to dynamically change the bottom visualization based on where a user clicks.  I remember spending the majority of my time on the dynamic title.


    Week 45 | Stock Portfolio
    This is one sheet.  Just remember that – everything is on one sheet.  And more than that – think about how this is implemented from a numerical perspective – there’s some serious normalization going on to make things show up in context to one another.  If you’re not a math lover – this will be a great way to play with numbers and have them bend to your advantage.  Also – I remember being annoyed because one of the stocks had a maximum value greater than the recorded max (which is its own measure).

    Week 46 | Top N Customers
    Think of this as a different way of implementing sets.  It has a lot of similar functionality between IN/OUT and showing members of a set.  And also there are some key takeaways in terms of aggregating dimensions.  Not super flashy on design, but very useful in terms of implementation.

    Week 47 | Fun with Formatting
    Another visualization where you’re required to do everything in a single sheet.  This will put all that table calculation sweat to action.  I really enjoyed this one.  There is something very satisfying about ranking/indexing things multiple ways in one view.  Also, it uses the Caption, guys.

    Week 48 | Treemap Drilldown
    Same concept as week 41, but executed as a treemap.  I think I even opened up week 41 to use as influence on where to go.  Same concepts are repeated, but in a different format.  The automagic of this one doesn’t get old – also carefully look at how things are sorted.

    Week 49 | Position of Letter Occurrences in Baby Names
    When you say out loud what you’re trying to do – particularly “find the nth occurrence” of a specific letter (we can generalize as substring) in a specific string – it sounds really, really hard.  But guess what – there’s a built-in function (FINDNTH)!  The fact that it’s built in made this visualization super straightforward to complete.  You should build this to introduce yourself to a function you’ve probably never used before.

    Week 50 | Rocket ship Chart
    I very much enjoy this type of chart from an analytical perspective.  It’s a great way to normalize things that are bound to time.  You see immediate inferred rank and results.  Emma put in some requirements to ensure that as data changed this chart would stay accurate.

    Week 51 | State by State Profit Ratio
    If you want several of the lessons Andy built into multiple workouts all in one place – this workout is for you.  It’s got so many classic Kriebel “gotcha” moments in it.  As I was building this I really felt like it was a specially designed final to test what I’ve learned.  Also this is probably my first tilemap (unless we made one in another workout).  I don’t use them often – so it’s a great refresher on how to implement.  And also – you get to use a join calculation.

    Week 52 | UK’s Favourite Christmas Chocolates
    When I was building this one, someone asked me why I was making it – specifically, where was the challenge?  I explained that it was all in one sheet as opposed to 4 different sheets.  A natural next question occurred, which was “why would you want to do it in one sheet?”  I thought that was a very interesting question, and one that I explained by saying that, for me personally, knowing multiple ways to do things is important.  And more specifically, as I know to be true of these types of builds – if you can do it in one sheet, it shows a level of mastery in making Tableau do exactly what you want (which is LOTS of things).

    And that wraps up 2017 beautifully.  Comparing the retrospective of this half of the year vs. the first half – there are stark differences from my perspective.  I can honestly say that each build got easier as time went on.  Once I got to the last few challenges – I was timing completion to be about 20 minutes.  Contrast that with the first few weeks where I spent hours (over multiple sessions) making my way through each build.

    Beyond building out my portfolio, having concrete examples of specific types of analysis, and fulfilling my own goals – #WorkoutWednesday has given me such depth of knowledge in Tableau that it’s ridiculous.  You know how people say things like “that’s just funny Tableau behavior” – well I can (for the most part) now verbally articulate what and why that funny behavior is.  And more than that – I know how to maximize the behavior which is really working as designed and use it to my own advantage.

    The last part of this blog series is going to be a ranking of each workout – aimed at helping those who are interested in completing the challenges approach them without getting too discouraged or burnt out on some of the builds that are still (to this day) hard.  Be on the lookout.