Looking down from my apartment balcony, I can see several air conditioner units attached to my building. The question is: which one operates for my apartment?
To narrow it down, I would note if the AC is on or off in my apartment and then see which AC units are on outside by looking at the fan blades. Over several checks, I can eliminate ones whose status do not match the apartment AC status, until there is just one left. This process could vary in length by quite a bit, however. It could be accomplished in one observation if all but one of the AC units are off and the apartment AC is on. On the other hand, if all the AC units are off while the apartment AC is off, no new information is gained. What is the average number of checks it would take to determine which AC unit is yours?
Probabilities
The probability of any one step happening is pretty simple. The probability that the apartment AC is on, and that you can narrow the units down from 3 possibilities to 1 is
where is the probability that an AC unit is on. The first term is the probability that the apartment AC is on. The corresponding AC unit will be on given that the apartment AC is on so that doesn’t affect the calculation. The second term is the probability that the other two units are off.
A weighted average of all the possible paths from 3 eligible AC units to 1, would give the average number of checks. However, the number of possible combinations quickly grows unwieldy. In the case of only 3 ACUs to start, there are 2 paths of length 1, 8 paths of length 2, 24 paths of length 3, etc.
Initial Simulation Case
A better way to investigate this is by simulating a bunch of runs and then taking averages. I did 10,000 runs where and , which is the starting number of AC units. The average number of checks before the AC unit is found was 4.17.
Plotting the distribution another way, a Cumulative Distribution Function shows that about 75% of the runs have 5 or fewer checks. Some runs have a lot of checks where no AC units get ruled out.
One other way to look at it is by calculating a confidence in which AC unit is the one. That is the inverse of the number of AC units still on the table. For instance, if there are 3 candidates, the confidence is 33%. Over time, the confidence increases as you narrow the field.
The curve is very similar to the CDF curve, even though the axes are different. This is the average over all the runs, which is why the curve smoothly varies.
Varying u
If we decrease , that opens up more opportunity to differentiate between AC units. When AC units are on 80% of the time, most are likely to be on at the same time. At 50%, the statuses are more mixed and you can eliminate roughly 50% of the AC units each check. I ran the simulation for values of from 0.05 to 0.95, with . The resulting curve shows the same phenomenon over a wider range. The shortest runs occur at 50% and the process takes longer when there is more agreement on either side.
The confidence curves for the higher values take longer to reach certainty.
Varying n
Increasing to and holding , the mean checks increases to 8.95. The distribution is more symmetric, with a long tail.
The CDF curve shows a change in concavity because early on, it is unlikely for so many AC units to get eliminated at once. It takes a few checks to get the candidates down to a manageable level and then the correct AC unit will be discovered from there.
I ran the simulation for values of from 2 to 30, with . As expected, the mean checks to find the AC unit goes up the more options you have to narrow down.
The confidence curves for higher are lower, agreeing with the mean checks curve. All runs converge to a solution by about 20 checks.
Checking Only When Off
The assumption is that you check the AC status at random times and that the apartment AC will be on % of the time. When the apartment AC has a high utilization rate, then the other AC units will also likely be on, which makes it hard to rule out AC units. If you wait to check when the apartment AC is off, the other AC units will still be likely on, and you can knock out more units. For the same setup as the initial case, the mean checks drops from 4.17 to 1.48!
We’ve seen how many checks you can expect to make to identify your air conditioner unit as a function of how often the AC is on and the number of starting possibilities. We also found a simple strategy to make the process easier. Will that come in handy? Maybe not, but sometimes it’s more interesting to think about a task and learn something from it, than to just do it and miss out on new intellectual territory.