There is an old saying in the world of computer programming: GIGO, which stands for “Garbage In, Garbage Out”. This applies in many other areas of life, and certainly to the use of statistics. If you start with wrong information you will end with wrong conclusions. Simple! The essential first step in the process of producing statistics is often called “data capture”, which simply means getting hold of the basic information. But how often do we ask about the way that step was carried out when we look at statistics?
Think about the times you provide that basic information for someone, such as when you fill in a form or answer a questionnaire, whether on paper or online, or even when you are stopped in the street by someone doing market research:
- How clear are the questions? (There is an art, or science, in producing questionnaires, but it is not always practised.)
- How much time or effort do you put into your answers?
- Do you often find yourself mentally tossing a coin to decide whether you were “totally satisfied” or “quite satisfied” with a product, or whether you do something “often” or “sometimes”?
- What do you do when asked a question you would rather not answer? Perhaps about your income or health. You might even want to keep your opinions on certain things private if you are unsure how anonymous or confidential your answers will be, especially when the questions are being asked on behalf of your employer. Do you ever give in to the temptation to give a safe, bland answer or leave that one blank?
- What assumptions do you think the statisticians will make about the blanks? Or the “don’t knows”?
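To see why the treatment of blanks matters, here is a minimal sketch in Python. The tallies are invented purely for illustration, and the function name `satisfied_share` and its three assumption labels are my own hypothetical choices, not a standard method. It simply shows how the same answers can produce different headline figures depending on what is assumed about the people who left the question blank.

```python
# Hypothetical survey tallies, invented purely for illustration.
responses = {"satisfied": 450, "not satisfied": 300, "blank": 250}


def satisfied_share(tallies, treat_blanks_as):
    """Return the share of 'satisfied' answers under one assumption about blanks.

    treat_blanks_as must be "dropped", "satisfied" or "not satisfied".
    """
    satisfied = tallies["satisfied"]
    total = tallies["satisfied"] + tallies["not satisfied"]

    if treat_blanks_as == "satisfied":
        # Assume everyone who left it blank was actually satisfied.
        satisfied += tallies["blank"]
        total += tallies["blank"]
    elif treat_blanks_as == "not satisfied":
        # Assume everyone who left it blank was actually not satisfied.
        total += tallies["blank"]
    elif treat_blanks_as != "dropped":
        raise ValueError(f"unknown assumption: {treat_blanks_as}")

    return satisfied / total


for assumption in ("dropped", "satisfied", "not satisfied"):
    share = satisfied_share(responses, assumption)
    print(f"blanks treated as {assumption}: {share:.0%} satisfied")
```

With these made-up numbers the "satisfied" figure comes out at 60%, 70% or 45% depending on the assumption, which is quite a spread from one set of answers.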
Sometimes researchers do not have to rely on the answers you or I give. They can get the information they need by counting the number of people using a facility, or the number of cars using a road, or the number of badgers living in a wood. Can we rely on these sorts of things being measured more accurately?
The front-line research is not always carried out by the scientists or others studying the phenomena in question. They often use students, volunteers, or casual employees to do a lot of it for them. This raises some more questions for us:
- How committed are these data collectors to the project?
- How much training do they get to ensure consistency?
- Do they have to use initiative to interpret what they see? For example, do they include people who came and went without really using the facility? Is a motorcycle with a sidecar a vehicle? Is that the same badger or another?
I am not saying we should not use statistics – we have to – but I am saying that unless we have satisfactory answers to the sort of questions above, we should take the conclusions with a pinch of salt. Or more than a pinch perhaps.
P.S. I have just heard someone making some very bold claims about the opinions of the majority of people in the UK based on a survey of 1000 people. How typical were they? Out of over 60 million – yes, million – Britons!
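For the arithmetic behind a survey of 1000, here is a rough sketch using the standard textbook formula for the margin of error of a proportion. It assumes something the claim may not have earned: that the 1000 people were a simple random sample. The function name `margin_of_error` and its parameters are my own illustrative choices, and the population of 60 million is the round figure mentioned above.

```python
import math


def margin_of_error(sample_size, population_size=None, proportion=0.5, z=1.96):
    """Approximate 95% margin of error for a proportion.

    Assumes a simple random sample. proportion=0.5 is the worst case,
    and z=1.96 corresponds to roughly 95% confidence.
    """
    standard_error = math.sqrt(proportion * (1 - proportion) / sample_size)
    if population_size is not None:
        # Finite population correction: barely matters when the sample
        # is a tiny fraction of the population.
        standard_error *= math.sqrt(
            (population_size - sample_size) / (population_size - 1)
        )
    return z * standard_error


without_population = margin_of_error(1000)
with_population = margin_of_error(1000, population_size=60_000_000)

print(f"Ignoring population size: about {without_population:.1%}")
print(f"Population of 60 million: about {with_population:.1%}")
```

Both figures come out at about 3 percentage points either way, so under that assumption the size of the population hardly changes the answer. The formula says nothing, though, about whether the 1000 really were chosen at random, or whether they answered carefully and honestly, which brings us back to how typical they were.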