Saturday, 11 December 2010

Crowd Estimates

The last Chance News carried a short article about the challenges of estimating the number of people in a crowd. I haven't seen much about crowd estimates in the statistical literature, but it looks like by far the most accepted way of estimating a crowd is from aerial photos. Using estimates of the density (persons per unit of area) together with the area the crowd occupies, one can get to an estimate of the size of the crowd.
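To make the density idea concrete, here is a minimal sketch. It assumes the photo has been split into zones, each with a measured area and an eyeballed density; the zone figures and the function name are made up for illustration only.

```python
def crowd_from_density(zones):
    """Sum area * density over zones.

    zones: iterable of (area_m2, persons_per_m2) pairs.
    """
    return sum(area * density for area, density in zones)


# Hypothetical zones read off an aerial photo: (area in m^2, persons per m^2).
zones = [
    (2000.0, 2.0),  # packed area near the stage
    (5000.0, 1.0),  # moderately dense area
    (3000.0, 0.5),  # sparse edges
]

density_estimate = crowd_from_density(zones)
print(f"Estimated crowd in the photo: {density_estimate:.0f}")
```

Splitting the area into zones matters because density is rarely uniform; a single average density applied to the whole area can be far off.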

One problem with an aerial photo is that it captures a single moment, so it only lets us estimate the crowd at that specific moment. If you arrived and left before the photo was taken, you are not in the picture. I can see that this is okay for gatherings like Obama's swearing-in, or any sort of event that is built around a single moment, like a speech. But for events that run over a period of time, like the Carnival in Brazil or the Caribana in Toronto, you need more than photos. Even photos taken at different points in time can be problematic, because you don't know how many people appear in, say, two of the photos.

Some time ago I got involved in a discussion about how to estimate the number of attendees at the Caribana Festival. This is a good challenge, one that I am not sure can be met with good accuracy. We thought about some ways to get this number.

The most precise way, I think we all agreed, would be to conduct two surveys. The first would be at the event, to estimate the percentage of participants who were from Toronto. This by itself is not a simple thing in terms of sampling design, given that the Caribana runs over 2 or 3 days in different locations. But let's not worry about the sampling here (maybe in another post). The second survey would be among Torontonians, to estimate the percentage of people living in the city who went to the event. Because we have good estimates of the Toronto population, this second survey gives us an estimate of the number of Torontonians who went to the event. From the first survey we know that they are X percent of the total, so dividing the number of Torontonian attendees by that share gives an estimate of the total attendance.
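The arithmetic behind the two-survey idea is short enough to sketch. All the numbers below are hypothetical placeholders, not real Toronto or Caribana figures, and the function name is mine.

```python
def attendance_from_two_surveys(city_population, share_of_city_attending,
                                share_of_crowd_from_city):
    """Total attendance = (locals who attended) / (locals' share of the crowd)."""
    if not 0 < share_of_crowd_from_city <= 1:
        raise ValueError("share_of_crowd_from_city must be in (0, 1]")
    local_attendees = city_population * share_of_city_attending
    return local_attendees / share_of_crowd_from_city


# Hypothetical inputs, for illustration only.
city_population = 2_500_000        # assumed known from census-type estimates
share_of_city_attending = 0.10     # from the survey among city residents
share_of_crowd_from_city = 0.40    # from the survey conducted at the event

two_survey_estimate = attendance_from_two_surveys(
    city_population, share_of_city_attending, share_of_crowd_from_city
)
print(f"Estimated total attendance: {two_survey_estimate:,.0f}")
```

Note that both shares are estimates with their own sampling error, so the error in the final number compounds; the smaller the locals' share of the crowd, the more the division amplifies it.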

Another way, we thought, would be to spread some sort of marker through the crowd. For example, we could distribute fliers to the crowd. This should be done randomly. Then, also with a random sample, we interview people in the crowd and ask them whether or not they received the flier. Knowing the percentage of the crowd that received a flier and the total number of fliers distributed, we can divide the second by the first to get an estimate of the crowd.
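This is essentially the mark-and-recapture logic used for wildlife counts. A minimal sketch follows, assuming each flier goes to a different person and that fliers and interviews are both spread randomly; the counts are invented for illustration.

```python
def crowd_from_fliers(fliers_distributed, interviewed, interviewed_with_flier):
    """Estimate crowd size as fliers / (share of interviewees who got a flier)."""
    if interviewed_with_flier == 0:
        raise ValueError("no interviewee received a flier; estimate is undefined")
    share_with_flier = interviewed_with_flier / interviewed
    return fliers_distributed / share_with_flier


# Hypothetical field numbers.
fliers_distributed = 5000
interviewed = 400
interviewed_with_flier = 50

flier_estimate = crowd_from_fliers(
    fliers_distributed, interviewed, interviewed_with_flier
)
print(f"Estimated crowd: {flier_estimate:,.0f}")
```

If some people take more than one flier, or throw it away and forget about it, the share with a flier is understated and the crowd is overestimated.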

A related way would be to measure the consumption of something. We thought about cans of Coke. If we could get a good estimate of the number of cans of Coke sold in the area, and if we could conduct a survey to estimate the average number of cans bought per person in the crowd (the percentage who bought a can is enough only if nobody buys more than one), we could then work out the crowd size by dividing the cans sold by that average.
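As a rough sketch of the consumption approach: the survey responses below are made up, and each value stands for the number of cans one interviewed person reports having bought.

```python
def crowd_from_consumption(total_units_sold, units_per_respondent):
    """Estimate crowd size as total units sold / mean units per person."""
    mean_units = sum(units_per_respondent) / len(units_per_respondent)
    if mean_units == 0:
        raise ValueError("nobody in the sample bought anything")
    return total_units_sold / mean_units


# Hypothetical data: cans sold in the area, and cans bought by 10 interviewees.
cans_sold = 30_000
cans_per_respondent = [0, 1, 0, 2, 1, 0, 0, 1, 1, 0]  # mean is 0.6

consumption_estimate = crowd_from_consumption(cans_sold, cans_per_respondent)
print(f"Estimated crowd: {consumption_estimate:,.0f}")
```

Using the mean number of cans per person, zeros included, handles people who buy several cans; a real survey would of course need far more than ten respondents.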

Another way would be to use external information about garbage. It is possible to find estimates of the average amount of garbage produced by an individual in a crowd. Then, if you work with the garbage collectors to weigh the garbage from the event, you could get to an estimate of the crowd. A problem with an external estimate of something like the amount of garbage per person is that it could vary a lot depending on the event, the temperature, the weather, the place, the availability of food and drinks and so on. So I am not sure this method would work very well. Maybe some procedure could be put in place to follow a sample of attendees (or maybe just interview them) to estimate the amount of garbage generated per person.
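To see how sensitive the garbage method is to the per-person figure, here is a small sketch that computes the estimate under a range of plausible rates. The total weight and the rates are invented numbers, not published figures.

```python
def crowd_from_garbage(total_garbage_kg, kg_per_person):
    """Estimate crowd size as total garbage weight / garbage per person."""
    if kg_per_person <= 0:
        raise ValueError("kg_per_person must be positive")
    return total_garbage_kg / kg_per_person


# Hypothetical weight reported by the garbage collectors.
total_garbage_kg = 12_000.0

# Hypothetical low, middle and high per-person rates (kg per attendee).
rates = [0.3, 0.4, 0.6]

garbage_estimates = {
    rate: crowd_from_garbage(total_garbage_kg, rate) for rate in rates
}
for rate, estimate in garbage_estimates.items():
    print(f"{rate:.1f} kg/person -> about {estimate:,.0f} people")
```

With these made-up inputs, doubling the per-person rate halves the crowd estimate, which is why an external rate borrowed from a different kind of event can be so misleading.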

Finally, aerial photos could be combined with a survey on the ground. Photos could be taken at different points in time, and a survey would ask participants when they arrived at the event and when they intend to leave. Of course the time one leaves the event will not be very accurately estimated, but hopefully the errors would cancel out and on average we would be able to estimate the percentage of the crowd that is in two consecutive photos, so that we can account for that overlap in the estimation.
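One way the overlap correction could work is sketched below. It assumes that nobody leaves and comes back, and that anyone who comes and goes between two photos is simply missed; the photo counts and overlap shares are hypothetical. Each overlap share is the fraction of people in a photo who, according to the survey, had already arrived by the time of the previous photo.

```python
def unique_attendees(photo_counts, overlap_with_previous):
    """Count everyone in the first photo, then only the newcomers in later ones.

    photo_counts: crowd size estimated from each photo, in time order.
    overlap_with_previous: for photos 2..n, the share of that photo's crowd
        that was already present at the previous photo.
    """
    if len(overlap_with_previous) != len(photo_counts) - 1:
        raise ValueError("need one overlap share per consecutive pair of photos")
    total = photo_counts[0]
    for count, overlap in zip(photo_counts[1:], overlap_with_previous):
        total += count * (1 - overlap)  # newcomers since the previous photo
    return total


# Hypothetical counts from three photos and survey-based overlap shares.
photo_counts = [10_000, 12_000, 8_000]
overlap_with_previous = [0.5, 0.75]

photo_survey_estimate = unique_attendees(photo_counts, overlap_with_previous)
print(f"Estimated distinct attendees: {photo_survey_estimate:,.0f}")
```

Simply adding the three counts would give 30,000 here, so the overlap shares do most of the work, and they rest on self-reported arrival times.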

I think crowd estimation is an area that is not much developed for events like these, which run over a long period of time. There is room for creative and yet technically sound approaches.
