Bias in Data Analysis #musedata #musetech #data #bias

This is the second in a series of posts about confronting bias. These #longreads use narrative to help bring up bias in an accessible manner.

As Director of the Art Museum of New South Overthere, you are constantly being asked to make decisions based on data. Sure, you had your last math class in 10th grade, and then avoided math through your PhD. But, you are a specialist, I mean in lantern slides, but still, you got this, right?

Read these short scenarios and suss out where bias might come in.

Scenario 1

You are dying to know how much people like lantern slides. You write out a survey for your staff to deploy. You want to be direct with your visitors so that you don’t waste their time. So, you ask the people in the Joe Bright Memorial Lantern Slide Gallery (and broom closet).

  • “What do you like about lantern slides?”
  • “Is there anything that you don’t like about lantern slides?”
  • “What would you like to see with the lantern slide display?”

You are happy to find out that there is nothing that they don’t like about lantern slides. They also love your lantern slide display as it is. The only thing they want is more information, which is what you thought. They just wish they could know more about the slides! How wonderful to know what you do for a living is so relevant for people.

Explanation

In this case, you have several problems. First, your survey questions are constructed with a particular slant towards lantern slides. This is a situation of interview bias. It’s as they say in legal shows: you are leading the witness. When you construct survey questions, you don’t want to tip the participants off to the “right” answer. People have an inherent need to please, and so they will answer in a way that seems correct.

Additionally, there is a selection bias at work here. You went to the gallery where you hope to make changes. On one hand, you are being proactive. On the other, you are skewing your data, because the people already standing in the lantern slide gallery are the ones most likely to enjoy lantern slides. You have a sampling error at play. A better study would interview not just current visitors to the lantern slide gallery but also those in the museum who are not currently going to the lantern slide gallery. In other words, you want to hear from both visitors and potential visitors to draw a complete picture of the situation.
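To see how much the choice of location can skew the answer, here is a minimal Python sketch. Every number in it is invented for illustration (20 people surveyed inside the lantern slide gallery, 180 elsewhere in the museum), and `share_who_like` is a hypothetical helper, not part of any survey tool.

```python
# Hypothetical visitors, each recorded as (was_in_lantern_gallery, likes_lantern_slides).
# All counts are made up for illustration.
visitors = (
    [(True, True)] * 18        # in the gallery, likes the slides
    + [(True, False)] * 2      # in the gallery, does not
    + [(False, True)] * 36     # elsewhere in the museum, likes the slides
    + [(False, False)] * 144   # elsewhere in the museum, does not
)


def share_who_like(sample):
    """Return the fraction of a sample that says it likes lantern slides."""
    return sum(1 for _, likes in sample if likes) / len(sample)


# The survey as it was run: only people already in the gallery.
gallery_only = [v for v in visitors if v[0]]

print(f"Gallery-only sample: {share_who_like(gallery_only):.0%}")
print(f"All museum visitors: {share_who_like(visitors):.0%}")
```

With these made-up numbers, the gallery-only sample reports 90 percent approval, while the museum as a whole sits at 27 percent. The survey questions are the same. Only who got asked has changed.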

______________________________________________________

Scenario 2

Your first meeting this morning was with the board. This nice old lady, Sweetie Monroe, heiress to the great Marshmallow Mills fortune, was hoping you could explain why you don’t have any students in the galleries. Now, you have never seen Sweetie upright before 1:00 pm. But, you also know that the school scheduling staff member, Peaches LaPew, is busy every morning. Last week, she tried to get you to do a kindergarten tour, because you had more students than staff.

You’re not going to be able to show Ms. Monroe children in the flesh (you aren’t a miracle worker), so you ask your staff to do a little comparative analysis. Your head of Marketing/ Audience Research/ Programming & Security, Joe Exhaustino, has emailed you a super long report. Does he understand how busy you are? You don’t have time to go through this like you were in school. Luckily, it has a clear summary.  You have plenty of kids coming. Fabulous.

Explanation

In this case, I have bad news for you. You didn’t look too closely at your data. You used the data like a yes-man. This is an example of choice-supportive bias. If you looked more closely at the data, you would notice that only 4 percent of visitors are 18 and under. Most of those are school groups. Now, I don’t know what your measure of success is, but 4 percent of total visitors seems extremely low.

______________________________________________________

Scenario 3

Before you can get to your chance at regaling Ms. Monroe about your fabulous student tours, you find yourself stunned by numbers. You are sitting at a breakfast meeting with the directors of all the local organizations. The head of the Community Development Corporation is sharing a graphic. Apparently, the average percentage of family visitors at museums is 10 percent. Eek. You get nervous. So, you turn to the guy next to you, the Director of Coffee Cups & Porcupine Baskets. He smiles and then says, “Oh, yes, that’s just average. We are at 18%.” You leave the meeting despondent, and shoot off a quick email to your grant office/gardener to get money right now for school tours.

Explanation

In this case, the data was combined inappropriately, though you wouldn’t necessarily know it. You didn’t have all the information when you sat in that meeting and looked at the graph. The number crunchers decided more numbers were better. But, in doing so, they didn’t use like categories. In this case, the number crunchers didn’t separate out school groups. Only two of the four museums do school tours. This means that in those museums children are coming in throughout the week without adults. They will have higher numbers of children than those who don’t do school tours. The lesson is that you need to be thoughtful in combining data. There are other challenges with combining data. If you aren’t careful, when you aggregate data, you can accidentally contradict what the original data said. This is called Simpson’s paradox.
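Here is a small Python sketch of what “like categories” means in practice. The four museums and their percentages are hypothetical, chosen only so that the blended average lands on the 10 percent from the breakfast meeting. `average_pct` is an illustrative helper, and the figures are not real attendance data.

```python
from statistics import mean

# Hypothetical figures: share of visitors who are children, per museum.
# Two museums run school tours and two do not (all values invented).
museums = [
    {"name": "Museum A", "school_tours": True, "pct_children": 18},
    {"name": "Museum B", "school_tours": True, "pct_children": 16},
    {"name": "Museum C", "school_tours": False, "pct_children": 3},
    {"name": "Museum D", "school_tours": False, "pct_children": 3},
]


def average_pct(rows):
    """Average the children's share across the given museums."""
    return mean([row["pct_children"] for row in rows])


# The number from the graphic: everything blended together.
blended = average_pct(museums)

# The same data, split into like categories before averaging.
with_tours = average_pct([m for m in museums if m["school_tours"]])
without_tours = average_pct([m for m in museums if not m["school_tours"]])

print(f"All four museums:      {blended}%")
print(f"With school tours:     {with_tours}%")
print(f"Without school tours:  {without_tours}%")
```

In this invented data set, the blended figure is 10 percent, but no museum is actually near 10 percent. The museums with school tours average 17 percent and the ones without average 3 percent. The fair comparison is against the museums in your own category, not against the blend.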

______________________________________________________

Scenario 4

You are hoping to buy more media ads. You call on Joe Exhaustino again, but this time with your demographic numbers. He gives you the classic bell curve with mostly middle-aged people attending the new exhibition. No surprises. So, you will just use digital ads to get more young people. Finally, an easy decision. After the ads go out, you take a turn in your Lantern Slide gallery. Something odd is up. The gallery is full of really old men. You go back to Joe and ask him if his numbers are right. He shows you his numbers. He had used a good-sized sample. He crunched the numbers and he ended up with a graph that didn’t look right. So, Joe removed the outliers, the old people.

Explanation

Joe has been doing numbers for years. And, he assumed that the attendance numbers should conform to a bell curve. This is called the Non-Normality bias. But, another bias was in play here. Instead of investigating the outliers, he disregarded those numbers. Joe did better on his second crack at it. Along with the quantitative data, he looked at surveys. Turns out the lantern slide gallery had become a mecca for the over-95 set. Practically everyone in the state in that age demographic comes to the museum to check out those sweet slides, particularly on “free coffee Friday”.
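If Joe wanted a way to spot the unusual numbers without throwing them away, one option is to flag them for follow-up. The sketch below is only an assumption about how that could look: the ages are invented, and the 1.5 × IQR fence is one common rule of thumb for flagging outliers, not necessarily what Joe used.

```python
from statistics import quantiles

# Hypothetical visitor ages from a sample (invented for illustration).
ages = [29, 33, 36, 38, 40, 41, 43, 44, 45, 46,
        47, 48, 50, 51, 53, 55, 58, 96, 97, 99]

# Quartiles of the sample; the interquartile range (IQR) is the spread
# of the middle half of the data.
q1, _median, q3 = quantiles(ages, n=4)
iqr = q3 - q1

# Rule-of-thumb fences: values beyond 1.5 * IQR from the quartiles
# are flagged as unusual.
low_fence = q1 - 1.5 * iqr
high_fence = q3 + 1.5 * iqr

# Flag the unusual values instead of deleting them.
flagged = [age for age in ages if age < low_fence or age > high_fence]

print(f"Sample size (nothing removed): {len(ages)}")
print(f"Ages to investigate: {flagged}")
```

The point of the design is that `ages` is never modified. The flagged values become a to-do list (go look at the surveys, go walk the gallery) rather than something to sweep out of the graph.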

5 Big Ideas from #GoogleIO For Museums to Note #IO17 #MuseTech

Google I/O, that glistening moment when developers galore descend upon the San Francisco Bay Area to hear prognostications, occurred mid-May. The keynote speech offered some insight into Google’s vision for the next decade. Admittedly, Google I/O 2017 is an exercise in marketing synergy and willing suspension of disbelief. The keynote had the feel of equal parts TED Talk, Home Shopping Network, Dad Jokes, and Nickelodeon’s “You Can’t Do That on Television”, with a soupçon of Svengali. If you look past the hokey jokes and the corporate name drops, there were some useful harbingers of our possible future.

And, why look to Google I/O for futurecasting? As Sundar Pichai said, Google “uses technical insights to solve problems at scale for deep engagement.” If you can’t imagine the scale of Google, think of it this way. 1.2 billion images are uploaded to Google Photos every day. With about 35,000 museums in the nation, all of the collections in the country could be uploaded in a week or so.

Google is masterful at understanding first world problems and addressing them. Much of the undertone for Google I/O was that they were helping cure the stress of the era (despite the fact those stresses grew from tech like Google). With the power of scale, they are poised to continue to make civilization-wide changes. (Think I am being hyperbolic? Reflect on the diffusion of the phrase “Google It”.)

Overall, Google I/O was all about artificial intelligence, where machines perform actions that had originally needed human thinking. If the mobile period was about touch, AI is about sight and voice/sound. AI is becoming more human in its meaning making, particularly in its seamless understanding of visual and textual data. The shift to AI will bring major social changes. Think about the changes that occurred with mobile. When a new platform is introduced, people’s modes of interacting change until those practices become naturalized.

So, what practices will become natural for our future visitors?

  1. Computers will be able to read images and text: Google Lens will make reading images increasingly sophisticated. For example, your phone will be able to “read” signs, turning the pictures into text. In other words, images, not just text, will be understood and acted on.  What does this mean for museums? The answer is two-fold. First, museums will have ever more robust tools to read images.  Second, it means visitors will expect handheld technology to make sense of the world seamlessly. They will not want keyboards, QR codes, or any barrier in the way of knowledge acquisition.
  2. You will talk to computers and they will talk back: Google Home now has a 4.9% error rate for misunderstanding spoken words; this is down from 8% two years ago. Soon, a variety of tools will respond to a voice command. What does this mean for museums? Again, bye bye keyboard commands. If you want to find the fiercest dino, you will expect to ask a technology tool and then expect that tool to respond in audio correctly. (Unless you are in an art museum. In that case, it will tell you that sadly, they got no dinos).
  3. Google Maps will go granular: Visual Positioning Service (VPS) helps move AR forward. This tool was described in terms of shopping, where you will be able to know where anything is on a mapped shelf in a warehouse. What does this mean for museums? There are several possibilities here. First, virtual collection record-keeping might change collections management and collection access. Next, think about gallery wayfinding. VPS will be able to calculate people’s position to a few centimeters. Instead of using text to point people to a certain basket amongst 100 baskets, your tool will be able to point people to the exact basket that defines the genre.
  4. The artificial world will feel pretty real: Overall, computing experiences will feel more like the real world. Already, many students are enjoying AR. Google Expeditions is a classroom app where students can experience coral reefs. What does this mean for museums? Firstly, our visitors of the future will be raised with AR as part of their regular experiences. This can mean that they will expect this of museums, though if we implement AR, they will have high expectations. Alternatively, they might choose museums for their authenticity. I suspect the future of museums will hold both really good AR and AR-absent experiences.
  5. Promiscuous content will be the norm: YouTube is already a culture that fosters creative, iterative, and interrogative content. And other tools like Google Seurat are making it easier to render in 3D for VR. In other words, AR/VR will become ever easier to implement even for citizen-technologists. Everyone will be doing it. What does this mean for museums? This to me is the most exciting point. The tools that were once in the hands of only a few are ever more quickly becoming available to many. Visitors of the future will not only be living in a milieu suffused with Artificial Intelligence but also be creators of such content. In other words, imagine your crowdsourced Instagram beta project crossed with robots, Pokémon, or Jurassic Park. Alright, kidding. My point is that you won’t be able to imagine a specific outcome of these techs for our field. However, we should expect that technology will become ever more seamlessly human in its behavior; our visitors will expect our tools to follow suit.

This is what struck me about Google I/O. What about you? What struck you? And, where will that tech take our field in the future?