How I Taught Myself Data Visualization (Tableau, Datawrapper, and Dogs)

Which districts in Vienna have more dogs

I’ve always liked a good graph. When I used Twitter, Graph Crimes used to be one of my favorite accounts to follow. Graph Crimes reposted badly constructed charts like the one below. Why is it a bar chart?!

top 10 happiest countries in 2023 bad graph

Or this one. While the percentage has indeed fallen, the vertical axis of this graph starts at 58 instead of zero, exaggerating the appearance of the relative decline.

percentage of Americans that identify as Christians bad graph
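
To put a number on that exaggeration, here is a minimal sketch. The two values are hypothetical (they are not the figures from the chart above), and `drawn_height_ratio` is a made-up helper name. It compares how tall the last value is drawn relative to the first when the axis starts at 0 versus at 58.

```python
def drawn_height_ratio(first, last, axis_start=0.0):
    """Ratio of the last value's drawn height to the first value's drawn height."""
    if axis_start >= min(first, last):
        raise ValueError("axis_start must be below both values")
    return (last - axis_start) / (first - axis_start)


# Hypothetical values for illustration only, not taken from the chart.
first, last = 78.0, 64.0

honest = drawn_height_ratio(first, last)            # axis starts at 0
truncated = drawn_height_ratio(first, last, 58.0)   # axis starts at 58

print(f"Axis from 0:  last value is drawn {honest:.0%} as tall as the first")
print(f"Axis from 58: last value is drawn {truncated:.0%} as tall as the first")
```

With these made-up numbers, the same decline looks far steeper once the axis is cut off at 58, even though the data has not changed.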

Graphs = storytelling on steroids. They can be tweaked to send messages that don’t reflect reality accurately.

“A picture is worth a thousand words” is true and one of the reasons why I love data visualizations. I like analyzing things (sometimes to a fault) and I enjoy connecting the dots between things. I’m not as interested in learning powerful Excel formulas or creating visually impressive data visualizations.

Whatever you want to start learning, first identify which aspects you are and aren’t interested in.

What do I even need to learn?

I had an idea of what I wanted to create, but I wasn’t sure where to get started, so I searched online for “data visualization for beginners.” I realized that I needed to learn about:

  1. different types of data visualizations – Common data viz types include pie charts, line charts, scatter plots, donut charts, Venn diagrams, and histograms.
  2. how to choose the right data visualization – For example, if I wanted to show change over time and growth year-over-year, a donut chart wouldn’t work.
  3. data gathering – Where would I find data? Was the data available or would I have to find my own data? How much time and effort would data gathering and cleanup take? What data formats worked?
  4. tools used for data viz

Learning by doing: Tableau and different types of data viz

I learn by doing, so to build the knowledge I lacked, I tried to find ways to apply what I was learning and see what worked and what didn’t.

When searching for data viz tools, Tableau came up. Tableau is a leading analytics platform with a free version that allows users to explore, create, and share data visualizations online.

Tableau provided sample data sets and a mini-course on learning foundational Tableau concepts and terminology while building an interactive dashboard for a fictitious company. I downloaded the Superstore Sales sample data, which contained information about products, sales, and profits that we could use to identify key areas of improvement within the company.

I learned how to:

  • import the data and relate multiple tables to create a single data source
  • navigate the Workspace Area
  • create and format maps (for example, profit per country)
  • create area charts (for example, sales by category)
  • create text table reports on Key Performance Indicators
  • build an interactive dashboard that’s shareable and easy to navigate

While the course included 8 how-to videos that lasted less than an hour in total, replicating the teacher’s actions and actually learning to do the tasks above took way longer. The teaching style wasn’t my favorite either (quite dry/monotonous), but maybe that’s on me for expecting something else from training for an enterprise tool.

My first data viz: Great British Bake Off in Tableau

To create my first data viz without instructions, I used Tableau’s available data sets, specifically The Great British Bake Off data set which included 13 years of data on different episodes, contestants, and themes.

I still had a long way to go, though. I tried a few different things in Tableau to get to a single interesting graph.

For example, I wanted to see if a cooking theme (like Caramel or Vegan) resulted in better evaluations from the judges, but realized higher evaluations (as calculated by the number of “Star baker” evaluations given in every season) coincided with the most common themes (Cakes, Bread).

Check out my first data viz: handshakes given by judge Paul Hollywood to GBBO contestants per year (Tableau) and the process (Instagram video).

Handshakes given by judge Paul Hollywood to contestants per year

Learning by doing: Gathering data and choosing the right type of data visualization

For the Great British Bake Off, I had the data set and went looking for interesting connections between the data. How would the process look the other way around, for example, if I wanted to analyze the popularity of cold brew, one of my year-round favorite drinks? Where would I find the data and how would I “add” it to Tableau?

I’m particularly interested in topics related to 🎴☕️daily life☕️🎴 so I decided to try and get answers to everyday questions. My list of potential data visualizations included:

  1. Chair preferences in different countries
  2. “Trad wife” popularity
  3. Hymns – oldest/youngest, shortest/longest, updates, authors
  4. Getting your nails done – comparing Austria, Albania, and the US
  5. Number of times “mom” is mentioned in rap songs
  6. How “in” is cold brew? Does that change with the seasons or do people want it year-round like me?
  7. Is everyone actually talking about frogs or is my social media feed reflecting my interests, not general sentiment?

Lisa Charlotte Muth suggests:

Start with the data, not with the topic

“It’s unlikely you’ll get to work with data about that one exact thing you’ve always wanted to see visualized…because the data probably just doesn’t exist. Surprisingly many data sets, you will find, don’t exist. Or aren’t free for you to use. Or are just very well hidden on the internet. So instead of starting with a chosen topic, it’s way easier to look at what data sources exist and use them.”

Are there actually more tourists in Vienna during the holidays?

Lisa was right, as I would soon learn when I tried to visualize chair preferences in different countries. I googled it and searched for it on Kaggle, a large data community that offers thousands of free data sets, but couldn’t find related data sets. Statista, the statistics portal, was the closest I could get, but their market data isn’t free.

I changed my questions to “Are there actually more tourists around during this time of year?” and “Which districts in Vienna have more dogs?”

For the first question, I relied on Google Trends. For the second one, I found a large data set from the city of Vienna that I could format to get my answer.

  • Vienna is lovely during the holidays. I started with an assumption: More people visit Vienna during the holidays.
  • Google Trends shows how search interest in a term changes over time, worldwide or by region. I searched for “vienna” worldwide over a 5-year timeframe.
  • I noticed that interest peaks around October.
  • I wasn’t sure how long it takes the average person to plan/prepare for the holidays so I googled it. 6 weeks aka October.
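
The "which month peaks?" check in the steps above can also be done in code instead of by eye. This is a minimal sketch under loud assumptions: the `sample` values are made up (they are not real Google Trends numbers), the `(month, interest)` layout is a hypothetical simplification of an exported series, and `peak_calendar_month` is an illustrative name.

```python
from collections import defaultdict
from statistics import mean

# Made-up (year-month, interest) pairs on a 0-100 scale.
# These are NOT real Google Trends values; they only show the mechanics.
sample = [
    ("2021-08", 60), ("2021-09", 72), ("2021-10", 90), ("2021-11", 81),
    ("2022-08", 58), ("2022-09", 75), ("2022-10", 100), ("2022-11", 79),
]


def peak_calendar_month(rows):
    """Average interest per calendar month across years and return the top month."""
    by_month = defaultdict(list)
    for year_month, interest in rows:
        month = int(year_month.split("-")[1])
        by_month[month].append(interest)
    averages = {month: mean(values) for month, values in by_month.items()}
    return max(averages, key=averages.get), averages


peak, averages = peak_calendar_month(sample)
print(f"Peak calendar month: {peak}")
```

Averaging across years, rather than picking the single highest point, keeps one unusual year from deciding the answer.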

So, my assumption was correct: interest in Vienna peaks around holiday planning time.

Interest in "Vienna" peaks around holiday planning time 🎄

Which districts in Vienna have more dogs?

The dogs per district data set was more complicated. I downloaded the CSV file which looked like this.

The website included these attribute descriptions:

NUTS | NUTS2 region (federal state)
DISTRICT_CODE | Municipal district code (schema: 9BBZZ, BB=district number, ZZ=00)
SUB_DISTRICT_CODE | Census district code according to the city of Vienna (schema: 9BBZZ, 9=Vienna identifier, BB=district number, ZZ=census district number, ZZ=99 if census district identifier is missing)
REF_YEAR | Reference year
REF_DATE | Reference date
DOG_VALUE | Number of dogs (absolute)
DOG_DENSITY | Number of dogs per 1,000 inhabitants

From CSV to data visualisation

  • Data cleaning – the process of fixing incorrect, incomplete, or duplicate data in a data set. I removed the duplicate date columns and the extra location codes. All I needed was the year and a single location identifier. Vienna has 23 districts, so I converted the 9BBZZ district codes to plain district numbers from 1 to 23.
surely there aren’t only 1.8 dogs in the whole district
  • Tool choice – Remaining tool-agnostic is generally a good thing #DobbyIsAFreeElf. Learning multiple tools when you’re just starting to learn about a new topic is not recommended, but I really wanted to try another tool besides Tableau. Enter: Datawrapper. Datawrapper doesn’t dominate the data viz tools market by any means, but it dominates my heart <3 It’s so intuitive and easy to use.
  • Getting things wrong – I was working directly in Datawrapper, the data viz tool, before figuring out what I wanted to visualize. I was getting frustrated that I couldn’t manipulate the data the way I wanted to until I remembered ✨ the power of pen and paper ✨. No pen or paper on me though (shameful, I know) so I opened my Notes app and quickly “drew” what I wanted the data to tell me.
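
The data cleaning step above can be sketched in a few lines. This is not the exact process used for the charts, only one way to do it. The column names come from the attribute descriptions; the sample rows, the comma delimiter, the `REF_DATE` format, and the helper names `district_number` and `clean` are all assumptions made for illustration.

```python
import csv
import io

# Hypothetical sample shaped like the attribute list above.
# The values, the comma delimiter, and the REF_DATE format are made up.
SAMPLE_CSV = """NUTS,DISTRICT_CODE,SUB_DISTRICT_CODE,REF_YEAR,REF_DATE,DOG_VALUE,DOG_DENSITY
AT13,90100,90100,2002,20021231,500,30.0
AT13,90200,90200,2002,20021231,2500,28.0
AT13,92300,92300,2002,20021231,3000,35.0
"""


def district_number(code):
    """Turn a 9BBZZ code such as '92300' into the district number 23."""
    code = str(code).strip()
    if len(code) != 5 or not code.isdigit() or code[0] != "9":
        raise ValueError(f"unexpected district code: {code!r}")
    number = int(code[1:3])  # the BB part of 9BBZZ
    if not 1 <= number <= 23:
        raise ValueError(f"district number out of range: {number}")
    return number


def clean(csv_text):
    """Keep only the year, one location identifier, and the two dog columns."""
    rows = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        rows.append({
            "year": int(row["REF_YEAR"]),
            "district": district_number(row["DISTRICT_CODE"]),
            "dogs": int(row["DOG_VALUE"]),
            "dogs_per_1000": float(row["DOG_DENSITY"]),
        })
    return rows


cleaned = clean(SAMPLE_CSV)
for row in cleaned:
    print(row)
```

Raising an error on an unexpected code, instead of silently guessing, makes odd rows visible early, before they end up in a chart.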

I initially wanted an everything graph/map: showing the number of dogs per district in any given year, the number of dogs per 1,000 inhabitants in any given year, and the year-to-year growth or decline of the number of dogs per district. Whether that’s doable or not, I still don’t know, but I opened my Notes app and outlined the 2 main things I wanted to see:

  • how the number of dogs in Vienna has changed since 2002
  • how many dogs there are in every district in Vienna
line chart showing how the overall number of dogs in Vienna has changed since 2002
a map showing the number of dogs in every district in Vienna
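
For the first of those two questions, the city-wide trend, the per-district rows have to be summed per year before charting. A minimal sketch, assuming cleaned rows of the form `(year, district, dogs)`; the numbers below are hypothetical, and `totals_by_year` and `year_over_year` are illustrative names, not functions from any tool mentioned here.

```python
from collections import defaultdict

# Hypothetical (year, district, dogs) rows; not the real Vienna figures.
rows = [
    (2002, 1, 500), (2002, 2, 2500),
    (2003, 1, 520), (2003, 2, 2630),
    (2004, 1, 510), (2004, 2, 2490),
]


def totals_by_year(rows):
    """Sum the dog counts of all districts for each year."""
    totals = defaultdict(int)
    for year, _district, dogs in rows:
        totals[year] += dogs
    return dict(sorted(totals.items()))


def year_over_year(totals):
    """Percentage change of each year's total against the previous year."""
    years = sorted(totals)
    return {
        year: (totals[year] - totals[prev]) / totals[prev] * 100
        for prev, year in zip(years, years[1:])
    }


totals = totals_by_year(rows)
changes = year_over_year(totals)
for year, change in changes.items():
    print(f"{year}: {totals[year]} dogs ({change:+.1f}% vs. previous year)")
```

The yearly totals feed a line chart directly, and the percentage changes make an unusual year stand out as a number rather than just a bump in the line.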

By the end, I had played around with the columns quite a bit, formatting the same data set in 7 different ways to get to the data I wanted 😅

…but I did it and enjoyed the ride.

  • In the year-to-year graph, I noticed a spike in 2012 and tried to figure out why.
  • For my second query, number of dogs in every district, I realized that my graph with the absolute number of dogs (see below: Woof woof) would be misleading since the biggest districts would have the most dogs. I switched to the number of dogs per 1,000 inhabitants, and you’ll see that the Hietzing and Liesing numbers were misrepresented in the initial Woof woof graph.
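
Why the switch from absolute counts to dogs per 1,000 inhabitants matters is easiest to see with two districts of very different sizes. A minimal sketch with hypothetical numbers (the district names and figures are made up, and `dogs_per_1000` is an illustrative helper):

```python
def dogs_per_1000(dogs, inhabitants):
    """Number of dogs per 1,000 inhabitants."""
    if inhabitants <= 0:
        raise ValueError("inhabitants must be positive")
    return dogs * 1000 / inhabitants


# Hypothetical districts; not real Vienna data.
districts = {
    "Large district": {"dogs": 6000, "inhabitants": 200_000},
    "Small district": {"dogs": 3000, "inhabitants": 50_000},
}

# Rank once by absolute count and once by density.
by_count = sorted(districts, key=lambda name: districts[name]["dogs"], reverse=True)
by_density = sorted(
    districts, key=lambda name: dogs_per_1000(**districts[name]), reverse=True
)

print("Most dogs overall:       ", by_count[0])
print("Most dogs per 1,000 ppl: ", by_density[0])
```

In this made-up case the large district wins on absolute numbers, but the small district has twice as many dogs per 1,000 inhabitants, so the two rankings point in opposite directions.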

*~`* Childlike wonder *~`* and continuous learning

There’s so much more to learn about data viz, but teaching yourself new things feels good. Teaching yourself new things for no other reason than finding them interesting? Inner child happiness up by 1000 points.

The sense of wonder is a muscle that must be cultivated like any other.

That, and patience: learning, getting things wrong, trying again.

HBR has a good article on this: Make Learning a Part of Your Daily Routine. The writers, Helen Tupper and Sarah Ellis, share how the founder of LinkedIn assesses founders of potential investments. He looks for individuals with an “infinite learning curve”: someone who is constantly learning, and quickly. As the CEO of Microsoft put it, “The learn-it-all will always do better than the know-it-all.”

The article includes different continuous learning methods such as experimenting and asking propelling questions.

“Propelling questions reset our status quo and encourage us to explore different ways of doing things. They often start with: How might we? How could I? What would happen if? These questions are designed to prevent our existing knowledge from limiting our ability to imagine new possibilities.

  1. Imagine it’s 2030. What three significant changes have happened in your industry?
  2. How might you divide your role between you and a robot?
  3. Which of your strengths would be most useful if your organization doubled in size?
  4. How could you transfer your talents if your industry disappeared overnight?
  5. If you were rebuilding this business tomorrow, what would you do differently?”

The examples are career-focused, but all the tips in the article can expand past that. For example, you might ask “Imagine it’s 2030. What three significant changes have happened in our friendship?”

Some thoughts on autodidacticism

Recently, I attended a casual philosophy discussion. There were about 12 of us and 3 Albanians, which is a strange record. During the break between topics (“Does free will exist?” and “Is monogamy or polygamy better?”), we discussed our jobs. One of the participants asked how I got started in UX and whether I had a computer science degree. I said I was self-taught (he found it surprising) and talked about the processes and tools I used, majoring in psychology, and general luck. “Luck is a frequent companion of a firmly fixed focus.” Luck is also often underestimated. More on this: Anita briefly explains redefining luck.

The terms self-taught and self-made have never really resonated with me. I meet the textbook definition of self-taught, but the term is misleading. While I didn’t get internships or pay for boot camps, I benefitted immensely from others who have shared generously. Be it passionate individuals or companies investing in dedicated roles for clear, thorough educational how-to guides, they’ve taught me what to do and what not to do despite us not having a 1:1 relationship and them not even knowing of my existence.

Even though I’m a writer and thus know very well the joy of having my work appreciated and recognized and shared and commented on, I rarely comment online. There are hundreds of articles and books and videos out there that have taught me, healed me, challenged me, made me laugh or cry, and literally changed my brain. Alas, no proof of it anywhere… but here and in my heart. Please keep sharing.

“Everyone needs help from everyone.”
