Radiologist Jeremy M. Wolfe on search strategies in our everyday life, amnesiac search, and false alarms at medical examinations
What is the most effective way to find an object among similar ones? Are computers much of a help in screening exams? Jeremy M. Wolfe, Professor of Ophthalmology and Radiology at Harvard Medical School, explains how we make decisions during the search process.
When is it time to quit a visual search? If you’re searching for something like milk in the refrigerator, the answer is pretty clear – when you find the milk, it’s time to quit. That’s easy. Suppose, on the other hand, the milk is not there. How long are you going to search in the refrigerator, before you decide that the milk really isn’t there and you will need to go to the store to look for the milk? Or suppose that you’re looking through the refrigerator for the food that you need to throw out, because it’s gone bad. In the case of the milk there was just one thing that you were looking for. In the case of bad food, you don’t know how many things there are. So when are you sure that you have found enough of the bad stuff that it’s time to quit?
This is the problem of quitting visual search, and it turns out not to be as trivial as it might sound. So let’s go back to the milk that’s not there. Maybe you would say, “OK, the way to find out that the milk is not there is to pay attention to every object in the refrigerator. And when I have looked at or paid attention to every object and found that none of them is the milk, I will quit.” This is what would be called a serial exhaustive search.
One idea – an idea that was very popular in the 1980s and, in fact, was part of the version of a visual search model I wrote called “Guided Search” back in the late 1980s – was that you would mark every item in the scene that you were looking at. Not really with a marker pen or something, but you would go through and remember, “Oh, I looked at this, and I looked at this, and I looked at this, and now I don’t have to look anymore, because I have a mark on every object that might possibly be milk.”
We know that’s not true. One of the reasons we know is that Todd Horowitz and I did an experiment back in the 1990s in which subjects looked for the letter “T” among distractor items that were letter “L”s. So you have to imagine a screen full of a random collection of “L”s, and there might be one “T” there. It would make sense to say, “OK, if there’s not a ‘T’ there, I will quit after I’ve sort of crossed out and marked every ‘L’ on the screen.”
But what we did was randomly shuffle the positions of the “L”s every one hundred milliseconds, that is, every tenth of a second. So, ten times a second all the items are changing position, completely unpredictably. You have no possibility at all of actually marking the items that you have rejected, and you might think this would make it a horrible search task. It sort of looks like a snowstorm of “L”s on the screen. But it’s really not difficult at all. You look around, you pay attention from item to item sort of randomly, and then you find the “T”.
Then it turns out, quite amazingly, that you are no worse at finding that “T” when the items keep jumping around than when all the items are sitting still.
You’re no better at finding them if they are all sitting still than you are if they are all being repositioned every hundred milliseconds. So it can’t be that what you were doing is looking at and marking each item as you go along, you must be doing something else.
What Todd Horowitz and I proposed was “amnesiac search”: we claimed that people were really just looking at random. You may remember this from your statistics course, where a whole lot of worked examples ask you to imagine looking for a black ball in a bowl full of white balls. One sensible way to find that black ball would be to pull out a ball, look at it, put it aside if it’s white, pull out another ball, put it aside if it’s white – that’s called “sampling without replacement”.
That’s what we’ve always assumed was going on in visual search. But what really seems to be going on is what they would call in your statistics class “sampling with replacement”. You pick the ball, you look at it and if it’s a white ball, you throw it back into the bowl. And even though this doesn’t sound like a really good idea, that seems to be what people are doing in visual search. You pick one item, if it’s not the item that you were looking for, you sort of throw it back into the bowl and you might end up paying attention to it again.
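To make the difference between the two sampling schemes concrete, here is a minimal simulation sketch. It is not the experiment described above; the function names, the display size of ten items, and the number of trials are all assumptions chosen for illustration. It simply counts how many items you inspect before hitting the one target.

```python
import random


def samples_until_found(n_items, with_replacement, rng):
    """Count how many items are inspected before the single target is found."""
    target = 0  # assumption: item 0 is the target, all others are distractors
    if with_replacement:
        # "Amnesiac" scheme: every pick is drawn from the full set again,
        # so a rejected distractor can be inspected more than once.
        count = 0
        while True:
            count += 1
            if rng.randrange(n_items) == target:
                return count
    # Scheme with perfect memory: inspect the items in a random order,
    # never returning to one that was already rejected.
    order = list(range(n_items))
    rng.shuffle(order)
    return order.index(target) + 1


def mean_samples(n_items, with_replacement, trials=20000, seed=0):
    """Average the number of inspections over many simulated searches."""
    rng = random.Random(seed)
    total = sum(
        samples_until_found(n_items, with_replacement, rng)
        for _ in range(trials)
    )
    return total / trials


if __name__ == "__main__":
    print("without replacement:", mean_samples(10, with_replacement=False))
    print("with replacement:   ", mean_samples(10, with_replacement=True))
```

With N items and one target, sampling without replacement needs (N + 1) / 2 inspections on average, while sampling with replacement needs N, so the simulated averages for ten items should come out near 5.5 and 10.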
Now, we claimed that there was simply no memory at all for where you had searched. A fair number of other researchers have come along since then and said, “Wait, wait, wait, that’s a little too strong.” There really does seem to be a little bit of memory for where you have looked. And here’s why: you could imagine that you are looking for that “T” among “L”s, but one item, maybe one of those “L”s, is a dramatically different colour. That will attract your attention. If you somehow couldn’t remember that at all, all you would do is keep going back to that same one, over and over. You would perseverate, which is a neuropsychological symptom in some cases of brain damage, but it’s not something that you do. You don’t get stuck on one salient item that’s just sitting there in the scene. So you must be able to remember some things.
That kind of a model works pretty well. It works particularly well if you are getting feedback about your search, so that you know when the search is successful and when the search is not successful. So, if you discover that you have missed the target in a search, the milk was really there, but you told your friends that there was no milk, and then somebody else looked and said there was milk, next time you know you should look for a little longer. In that kind of situation it works pretty well, as a model to say that you develop a theory about how much time you should search.
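As a rough sketch of what “learning how long to search” from feedback could look like, consider the toy update rule below. It is an illustration, not the model referred to above: the function name, the step sizes, and the rule of shortening the search a little after every trial without a known miss are all assumptions.

```python
def update_quit_time(quit_time, missed_target,
                     step_up=0.5, step_down=0.1, minimum=0.5):
    """Return the next quitting time (in seconds) after one trial of feedback.

    Hypothetical rule: a known miss lengthens the next search by step_up;
    any other trial shortens it by step_down, never going below minimum.
    """
    if missed_target:
        return quit_time + step_up
    return max(minimum, quit_time - step_down)


def run_trials(start_quit_time, feedback):
    """Apply the update rule over a sequence of miss / no-miss feedback."""
    quit_time = start_quit_time
    history = [quit_time]
    for missed in feedback:
        quit_time = update_quit_time(quit_time, missed)
        history.append(quit_time)
    return history


if __name__ == "__main__":
    # One miss followed by four trials with no known miss.
    print(run_trials(2.0, [True, False, False, False, False]))
```

Under this assumed rule, being told about a miss makes you look longer next time, and a long run of trials without any reported miss lets the quitting time drift back down.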
Now, there are applied examples in the world that become much more difficult. Think about the guy at the airport who is looking for threat objects in your carry-on luggage. Basically, he is never finding anything, right? His job is almost entirely to know when to quit. Because you don’t want to get on an airplane if there are a lot of targets going through that checkpoint. So how does he know when to quit?
One of the things that we have found – and we published work on this in about 2005 – is that when you’re searching for something that’s very rare, your understanding of when to quit becomes somewhat corrupted. You end up deciding to quit sooner and sooner, and sooner, and you miss too many targets. This is sort of sensible out in the real world. If I am looking for unicorns on a city street, I shouldn’t search for very long, because they are just not going to be there. But if my job is to find something that’s very rare, then I don’t want to quit too soon, I really want to look for the right amount of time. But, on the other hand, if I look for too long, then the line at the airport stretches all the way down the terminal, and people start yelling at you, and they miss their planes, and they are not happy at all.
So there’s a tricky balance between searching long enough and not searching for too long.
It’s very difficult to get people to set that balance correctly. So one of the things that people do is try to get the computer to help out. The computer doesn’t care how rare the target is, the computer would be happy to search all day and all night. The problem is that the computer can’t do it perfectly, so the computer must work with a human to do this. Humans and computers do not work together perfectly on this.
As one way to think about why this is imperfect, let’s think not about the airport, where they don’t really use computer detection too much, but about searching for breast cancer. If you are in North America and you look at a thousand cases in breast cancer screening, there will be about three cancers on average; that is, about 0.3% of the cases will have cancer. A computer can find almost all of these cancers.
The problem is that the computer will also produce what’s known as a false alarm, or a false positive error, about 10% of the time. It will find about 90% of the cancers and raise a false alarm about 10% of the time. That sounds pretty good, except when you’re looking for something very rare. So let’s suppose we’ve got those 1,000 cases with three cancers, and let’s suppose that the computer finds all three of them. But if it’s false-alarming 10% of the time, it will make about a hundred false alarms.
So it is now delivering to the radiologists a clue that is correct only about 3% of the time. Ask yourself, “How much do you like advice if it’s only right 3% of the time?” If your friend tells you, “You should try this restaurant, they’ve got excellent food,” and is right three times out of a hundred, you’re not going to listen to your friend very much.
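The arithmetic behind that 3% figure can be written out as a short sketch. The function and parameter names here are made up for illustration; the numbers are the ones from the example (1,000 cases, 0.3% prevalence, a 10% false alarm rate, and the computer catching all three cancers).

```python
def flagged_case_breakdown(n_cases, prevalence, hit_rate, false_alarm_rate):
    """Split the computer's alarms into true and false ones.

    Returns (true_alarms, false_alarms, fraction_of_alarms_that_are_correct).
    """
    cancers = n_cases * prevalence
    healthy = n_cases - cancers
    true_alarms = cancers * hit_rate            # cancers the computer flags
    false_alarms = healthy * false_alarm_rate   # healthy cases it flags anyway
    correct_fraction = true_alarms / (true_alarms + false_alarms)
    return true_alarms, false_alarms, correct_fraction


if __name__ == "__main__":
    true_alarms, false_alarms, correct = flagged_case_breakdown(
        n_cases=1000, prevalence=0.003, hit_rate=1.0, false_alarm_rate=0.10
    )
    print(f"true alarms:  {true_alarms:.1f}")
    print(f"false alarms: {false_alarms:.1f}")
    print(f"share of alarms that are real cancers: {correct:.1%}")
```

With these inputs there are 3 true alarms against roughly 100 false ones, so just under 3% of the computer's flags point at a real cancer. Setting `hit_rate` to 0.9 instead of 1.0 lowers that share slightly further.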
That’s the problem that we currently face: getting humans to search successfully for rare targets in cooperation with the computer. We don’t really know how to overcome that problem, but maybe you will think of the correct answer. It is a ripe area for further research.