Tag: arizona

  • Without Water an Iron Viz feeder

    Jump directly to the viz

    At the time of writing it is 100°F outside my window in Arizona and climbing.  It’s also August, and we’re right in the middle of feeder round 3 of Tableau Public’s Iron Viz contest.  Appropriately, the theme for this round is water.  So it’s only fitting that my entry would mash up the two: Without Water, 2 decades of drought & damage in Arizona.

    The Genesis of the Idea

    I’ll start by saying that water is a very tricky topic.  Its very commonness makes searching for data and a narrative direction challenging.  Because it’s necessary for sustaining life, the story seems to want to be tied directly to humankind – something about water quality, water availability, or loss of water – essentially something that impacts humans.  And because it’s so vital, several organizations and resources are already doing fantastic work demonstrating those points.  Unicef tracks drinking water and sanitation, Our World in Data has a lengthy section devoted to the topic, there’s the Flint Water Study, and there’s the Deepwater Horizon oil spill.

    This realization around the plethora of amazing resources associated with water led me to the conclusion that I would have to get personal and share a story not broadly known.  So what could be more personal than the place I’ve called home for 14 years of my life: Arizona.

    Arizona is a very interesting state: it’s home to the Grand Canyon, several mountain ranges, and of course a significant portion of the Sonoran desert.  This means that in October it can be snowing in the mountains of Flagstaff while a stifling 90°F two hours south in Phoenix.  And, despite the desert, the state needs water – particularly in the large uninhabited stretches of forested mountains.  Getting to the punchline: for my entire time in Arizona, the state has been in a long, sustained drought.  A drought that’s caused massive wildfires, extreme summer heat, and a conversation thread that never steers far from the weather.

    Getting Started

    A quick Google search led me to my first major resource: NOAA has an easy-to-use portal for climate data, including precipitation, various drought indices, and temperatures – all by month, state, and climate division.  This served as the initial data set, joined to the climate division shapefiles maintained by NCEI.  Here’s the first chart I made, showing the divisions by their drought index.  It uses the long-term Palmer Drought Severity Index, with any positive (non-drought) values zeroed out to focus attention on the deficit.
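    As a rough illustration of that zeroing step, here’s a minimal Python sketch.  The values and names are made up for illustration, not pulled from the actual NOAA extract:

    ```python
    # Hypothetical monthly PDSI readings for one climate division
    # (negative = drought, positive = wet).
    pdsi_readings = [-2.4, 1.1, 0.3, -4.7, -1.0, 2.6]

    # Zero out non-drought (positive) values so the chart
    # focuses attention on the deficit only.
    drought_only = [min(value, 0.0) for value in pdsi_readings]

    print(drought_only)  # [-2.4, 0.0, 0.0, -4.7, -1.0, 0.0]
    ```

    In Tableau the same effect can be had with a calculated field along the lines of `MIN([PDSI], 0)` applied per row.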

    My next major find was around wildfire data from the Federal Fire Occurrence website.  Knowing that fire is closely associated with drought, it seemed a natural progression to include.  Here’s an early iteration of total acres destroyed by year:

    It’s clear that after 2002 a new normal was established.  Every few years massive fires were taking place.

    And after combining these two data sets, the story developed further: it was a time-bound story of the last 20 years.

    Telling the Story

    I headed down the path of breaking out the most relevant drought headlines by year with the idea of creating 20 micro visualizations.  Several more data sources were added (including dust storms, heat related deaths, and water supply/demand).  An early iteration had them in a 4 x 5 grid:

    As the elements started to come together, it was time to share and seek feedback.  Luke Stanke was the first to see it and gave me the idea of changing from a static grid to a scrolling mobile story.  That’s where things began to lock into place.  Several iterations later, and with input from previous Iron Viz winner Curtis Harris, the collection of visualizations became more precisely tuned to the story.  White space became more defined and charts were sharpened.

    My final pass of feedback went to Arizona friends (including Josh Jackson), whom I asked whether it evoked the story we’re all experiencing.  That conversation led to the ultimate change in titles, from years to pseudo-headlines.

    Wrapping Up

    My one last lingering question: mobile only, or include a desktop version too?  The deciding factor was to optimize for reaching the largest possible audience – thus, mobile only.

    WITHOUT WATER

    And all of that leads to the final product: a mobile-only narrative data story highlighting the many facets of drought and its consequences for the state of Arizona.  Click on the image to view the interactive version on Tableau Public.

  • #MakeoverMonday Week 25 | Maricopa County Ozone Readings

    We had another giant data set this week: 202 million records of EPA ozone readings across the United States.  The data set is generously hosted by Exasol, and I encourage you to register here to gain access to it.

    The heart of the data is pretty straightforward: PPM readings across several sites around the nation for the past 25+ years.  Browsing the data set, it’s easy to see that there are multiple readings per site per day.  Here’s the basic data model:

    Parameter Name only has Ozone, and Units of Measure only has Parts per million.  There is one little tweak to this data set: the Datum field.  This wasn’t a familiar term for me, so I described the domain to see what it had.

    I know exactly what one of these 4 values means (beyond Unknown): WGS84.  I was at the Alteryx Inspire conference two weeks ago, in a spatial analytics session where people were talking about different standards for coordinate systems on Earth, and the facilitators mentioned that WGS84 is a main standard.  For fun I decided to plot the number of records for each Datum per year to see how the lat/lon measurements may have changed over time.  Since 2012, WGS84 has dominated as the preferred standard.

    So, armed with that knowledge, I kept it in my back pocket as something to be mindful of if I entered the world of mapping.

    Beyond that, I had to start focusing on preparing something for Tableau Public.  Unfortunately, 202 million records won’t sit on Public, so I had to extract a subset of the data.  Naturally, I did what every human would do and zeroed in on my own region: the Phoenix metropolitan area, aka Maricopa County.

    Going through the data set, there are multiple sites taking measurements – and more than that, these sites are taking measurements multiple times per day.  I really wanted to express that somehow in my final visualization.  Here are all the site averages plotted for each day of the past 30 years – thanks Exasol!

    So this is averaged per day per site – and you can see how much variation there is.  Some are reporting very low numbers, even zeros.  Some are very high.

    If I take off the site ID, here’s what I get for the daily averages:

    Notice the Y-axis – much less dramatic.  The EPA has AQI measurements, and ozone doesn’t even reach the “bad” range until 0.071 PPM (Unhealthy for Sensitive Groups).  So there’s less of a story, to some extent, when we take the averages.  This could be because of the sites in Maricopa County (maybe low or faulty numbers are dragging down the average), or it could be that averaging gets you closer to the truth.

    I’m going down this path because at this point I made a decision: I wanted to look at the maximum daily measurement.  Given that these are instantaneous measurements, I felt that knowing the maximum in a given day would also provide insight into how ozone levels are faring.  More specifically, knowing my region a little: the measurement sites could sit outside well-populated areas and may naturally record lower readings.

    So that was step one for me: move to the world of MAX.  This let me leverage all the site data and get going.  (Originally I also wanted to jitter and display all the sites because I thought that would be interesting, but I distilled the data down further because I wasn’t getting the presentation I wanted in the end result.)
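    The collapse from many site readings per day to a single daily maximum can be sketched in plain Python (dates, site IDs, and PPM values here are hypothetical, not from the EPA extract):

    ```python
    from collections import defaultdict

    # Hypothetical (date, site_id, ppm) readings, several per day.
    readings = [
        ("2015-06-01", "site_a", 0.045),
        ("2015-06-01", "site_b", 0.078),
        ("2015-06-02", "site_a", 0.052),
        ("2015-06-02", "site_b", 0.049),
    ]

    # Keep one value per day: the maximum across all sites and readings.
    daily_max = defaultdict(float)
    for date, _site, ppm in readings:
        daily_max[date] = max(daily_max[date], ppm)

    print(dict(daily_max))  # {'2015-06-01': 0.078, '2015-06-02': 0.052}
    ```

    In the actual workflow the same aggregation happens in the database or in Tableau (MAX of the measurement at the day level), but the idea is identical.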

    Okay – next up was plotting the data.  I wanted a single-page, very dense data display that had all the years and months and allowed for easy comparisons.  I had thought a cycle plot might be appropriate, but after trying a few combinations I didn’t see anything special in the day-of-week additions and noticed that the measurement is really about time of year (the month), with each year as the secondary comparison.

    Now that I’ve covered that part, next up was how to plot.  This originally started life as dots color-encoded with the AQI scale and PPM on the Y-axis, and I almost published it that way.  But to be honest, I don’t know if the minutiae of the PPM matter that much; I think the AQI category defined on top of the measurement is easier for an end user to understand.  Hence my final development fork: turn the categorical result into a unit measure (1, 2, 3, 4, etc.) to represent the height of a bar chart.  And that’s where I got really inspired.  I made “Good” -1 and “Moderate” 0, so that anything positive on the Y-axis is a bad day.  This lets you see the streaks of bad throughout the time periods.
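    That category-to-unit mapping can be sketched as follows.  The category names follow the EPA AQI scale; the -1, 0, 1, 2, 3, 4 assignments mirror the scheme described above, and the sample days are illustrative:

    ```python
    # Map each day's AQI category to a signed unit height for the bar chart:
    # "Good" sits below the axis, "Moderate" rides along it, and anything
    # positive reads as a bad day.
    CATEGORY_HEIGHT = {
        "Good": -1,
        "Moderate": 0,
        "Unhealthy for Sensitive Groups": 1,
        "Unhealthy": 2,
        "Very Unhealthy": 3,
        "Hazardous": 4,
    }

    days = ["Good", "Moderate", "Unhealthy for Sensitive Groups", "Unhealthy"]
    heights = [CATEGORY_HEIGHT[d] for d in days]
    print(heights)  # [-1, 0, 1, 2]
    ```

    In Tableau this would be a calculated field (a CASE statement on the AQI category) used as the bar height, which is what makes streaks of bad days jump out above the axis.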

    Close up of 2015 – I love this.  Look at those moderates just continuing the axis.  Look how clearly the not-so-good to very-bad stands out.  This resonates with me.

    Okay – the final steps were going to be a map of all the measurements at each site (again, the max for each site, based on the user clicking a day).  It was actually quite cute showing Phoenix close up.  And then I was going to add national readings (again, the max for each site upon clicking a day) as a comparison.  This would have been super awesome – here’s the picture:

    So good.  And perhaps I could have kept this, but knowing I have to go to Tableau Public, it just isn’t going to handle the national data well.  So I sat on it for an evening, and while driving to work I decided to do a marginal chart showing the breakdown of the number of days of each type.  The “why”: it looks like things are getting better, and more attention needs to be drawn to that!

    So the last steps ended up being to add the marginal bar charts and then go one step further: isolate the “bad days” per year and have them be the final distilled metric at the far right.  My thought process: scan each year, get an idea of performance, see it aggregated in the bar chart, then see the bad as a single number.  For sheer visual pleasure I decided to distill the “bad” further into one more chart.  I started with a stacked bar chart but didn’t like it; I figured for the sake of artistry I could get away with an area chart, and I really like the effect it brings.  You can see that the “very bad” days have become less prominent in recent years.

    So that pretty much sums up the development process.  Here’s the full viz again, along with a comparison to the original output for Maricopa County, which echoes the sentiment of my maximums: ozone measurements are going down.

  • Synergy through Action

    This has been an amazing week for me.  On the personal side of things my ship is sailing in the right direction.  It’s amazing what the new year can do to clarify values and vision.

    Getting to the specifics of why I’m calling this post “Synergy through Action”: that’s the best way for me to describe how my participation in this week’s Tableau and data visualization community offerings has influenced me.

    It all actually started on Saturday.  I woke up and spent the morning working on a VizforSocialGood project, specifically a map to represent the multiple locations connected to the February 2017 Women in Data Science conference.  I’d been called out on Twitter (thanks Chloe) and felt compelled to participate.  The kick of passion I received after submitting my viz propelled me into the right mind space to tackle 2 papers toward my MBA.

    Things continued to hold steady on Sunday, when I took on the #MakeoverMonday task of Donald Trump’s tweets.  I have to imagine that the joy of accomplishment was the huge motivator here; otherwise I can easily imagine myself hitting a wall.  Or perhaps it gets easier as time goes on?  Who knows, but I finished that viz feeling really great about where the week was headed.

    Monday – Alberto Cairo and Heather Krause’s MOOC was finally open!  Thankfully I had the day off to soak it all in.  This kept my brain churning.  And by Wednesday I was ready for a workout!

    So now that I’ve described my week, what’s the “synergy through action” part?  Well, I took all the thoughts from the social good project, #WorkoutWednesday, and the sage wisdom from the MOOC to hit on something much closer to home.

    I wound up creating a visualization in the vein of the #WorkoutWednesday redo offered up.  What’s it of?  Graduation rates of specific demographics for every county in Arizona for the past 10ish years, stylized into small multiples using a smattering of the slick tricks I was required to use to complete the workout.

    Here’s the viz – although admittedly it is designed more as a static view (not quite an infographic).

    And to sum it all up: this could be the start of yet another spectacular thing – bringing my passion to the local community I live in, but on a more widespread level (in the words of Dan Murray, user groups are for “Tableau zealots”).