Wednesday, February 14, 2007

One Million Answers

I was thinking about the game of Twenty Questions recently. In particular, I became curious about just how many different subjects could be distinguished by asking twenty yes-or-no questions. As it turns out, answering this question touches on a number of interesting matters. I will explore those later, but, first, let us address the basic question: how many different items can one identify?

Consider first what you can find out with just one question. The best case is to ask a question that distinguishes between two items. It is not possible to do any better: a question can be posed such that yes matches one subject and no matches a second subject, but there is no third response to match a third subject. Of course, a poor question won't tell you anything at all; this doesn't affect the maximum number of subjects possible, so we will ignore it and assume the best case.

So far, we have not said those two subjects are. They could be anything: people, animals, plants, rocks, cars, fungi. They could also be sets. Take each item to be a set containing two other items. Now, for whichever set that we determine with the first question, a second question allows us to distinguish between the members of that set.

It should be clear how we proceed. We take the second item to again be a set with two items, each of which we take to be a set with two items, and so on. In order to identify a specific item, we must stop this progression with the twentieth question. Putting it all together, we have nested sets constructed such that each yes-or-no question peels away a layer until we are left with a unique item. Each question doubles the number of items that can be identified, so that one question distinguishes two possibilities, two questions distinguish four possibilities, three questions distinguish eight possibilities, and so on. In general, n questions allows us to distinguish 2n possibilities. For n=20 questions, that gives 220=1,048,576 possibilities—roughly one million possibilities.

Although it made a nice post title, the number of possible solutions to a game of twenty questions may not actually be one million. There is a traditional question of “animal, vegetable, or mineral?” at the beginning of the game. It is not clear to me whether that is supposed to be the first question, or if it is extra information provided. Either way, there are three possible response. Additionally, identifying the subject may or many not also count as a question, with a necessary positive answer. Given that, there may be as many as 3,145,728 possibilities, or as few as 786,432 possibilities.

Whatever the precise total, it is, to put it mildly, a big number. To put it in perspective, the Merriam-Webster unabridged dictionary has fewer than half a million entries. In fact, a sizable dictionary could be used as a solution key: pick the middle page of the dictionary and ask whether the object of interest occurs (or, better, would occur under the assumption it is in the dictionary) before the first word on the page, pick the middle page of the identified half of the dictionary and ask a similar question, and so on, continually halving the possible results. Strictly speaking, each question in this approach does not halve the number of words under consideration, but it is close enough to be an effective strategy. More generally, any effective strategy aims to halve the number of remaining possibilities, which leads naturally to beginning with general questions and moving to successively more specific matters.

The above questioning strategy is an example of a divide and conquer algorithm, where a problem is recursively broken down into smaller subproblems until a direct solution is trivial. More specifically, the strategy is a binary search procedure, where the space of possible solutions is continually halved in size until an answer is found. Such algorithms are important and effective ways to solve many problems of diverse character; Twenty Questions is an amusing parlor game, but also a model for efficient problem solving.

No comments: