Saturday, October 8, 2011

A flaw of frequentist statistics


Consider the following story from Wikipedia's article on the Likelihood Principle:

'An engineer draws a random sample of electron tubes and measures their voltage. The measurements range from 75 to 99 volts. A statistician computes the sample mean and a confidence interval for the true mean. Later the statistician discovers that the voltmeter reads only as far as 100, so the population appears to be 'censored'. This necessitates a new analysis, if the statistician is orthodox. However, the engineer says he has another meter reading to 1000 volts, which he would have used if any voltage had been over 100. This is a relief to the statistician, because it means the population was effectively uncensored after all. But, the next day the engineer informs the statistician that this second meter was not working at the time of the measuring. The statistician ascertains that the engineer would not have held up the measurements until the meter was fixed, and informs him that new measurements are required. The engineer is astounded. "Next you'll be asking about my oscilloscope".'

The statistician probably used a hypothesis test, where the whole distribution of possible outcomes matters. If the theoretical distribution changes (even at values that were never observed), he gets a different answer. That is, reasoning about events "as extreme or more extreme" than the observed one depends on the distribution of those more extreme events too.
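As a rough illustration of this dependence (the numbers below are made up and not from the story), the following sketch simulates the sampling distribution of the mean under a null hypothesis, once without and once with censoring at 100 volts; the same observed sample mean gets two different p-values, even though no reading ever reached 100.

```python
# Hypothetical illustration: the p-value for the same observed data changes
# when the assumed sampling distribution is censored at 100 volts, even though
# no observation ever reached 100.  All numbers here are assumptions.
import numpy as np

rng = np.random.default_rng(0)
n = 10                    # assumed sample size
observed_mean = 86.0      # assumed observed sample mean (all readings < 100)
mu0, sigma = 90.0, 8.0    # null-hypothesis mean and assumed known spread

def p_value(censored, reps=200_000):
    samples = rng.normal(mu0, sigma, size=(reps, n))
    if censored:
        samples = np.minimum(samples, 100.0)  # the meter cannot read above 100
    means = samples.mean(axis=1)
    # two-sided p-value: how often a simulated mean lies at least as far from
    # mu0 as the observed one
    return np.mean(np.abs(means - mu0) >= abs(observed_mean - mu0))

print("p-value, uncensored model:", p_value(censored=False))
print("p-value, censored model:  ", p_value(censored=True))
```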

So besides the inability of frequentist statistics to make actual predictions about the next observation (it only gives confidence intervals or tests), we also have this paradox.

There are two other schools of statistics left: Bayesian and fiducial. They actually give prediction intervals for the next observation -- which is usually what we need -- and they are also asymptotically correct.
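For instance, under the standard non-informative prior for a normal sample, the Bayesian posterior predictive distribution of the next observation is a Student-t centred at the sample mean, and the resulting interval coincides with the classical (and fiducial) prediction interval. A minimal sketch with made-up voltages:

```python
# A 95% prediction interval for the next observation of a normal sample,
# using the posterior predictive under the standard non-informative prior
# (which coincides with the classical prediction interval):
#   X_next ~ xbar + s * sqrt(1 + 1/n) * t_{n-1}
import numpy as np
from scipy import stats

x = np.array([75.0, 82.0, 88.0, 91.0, 95.0, 99.0])   # hypothetical voltages
n, xbar, s = len(x), x.mean(), x.std(ddof=1)

t_crit = stats.t.ppf(0.975, df=n - 1)
half_width = t_crit * s * np.sqrt(1 + 1 / n)
print(f"95% prediction interval for the next reading: "
      f"({xbar - half_width:.1f}, {xbar + half_width:.1f})")
```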

Philosophy

Some Principles 


1. After reading [Kass and Wasserman, The Selection of Prior Distributions by Formal Rules], I have finally accepted the first principle, something I had struggled with so far:


There is an initial phase of data collection when the researcher is ignorant. 


Sad but true. Imagine a researcher living inside a computer simulation you started. He observes a single normal draw in his world and is asked about the standard deviation based on that one sample. Clearly, the truth of his statements will be a matter of luck.


2. The second principle is that inference should be non-informative, preferably with accurate or at least pessimistic (conservative) frequentist coverage; a small coverage check follows this list.


3. The third principle is that inference should be invariant over parametrization. (Bayesian inference on the p parameter of Bernoulli trials is a good example: it does matter whether the supposedly non-informative, i.e. uniform, prior is placed on p or on p squared -- see the second sketch below.)
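To make principle 2 concrete, here is a small sketch (with an assumed Bernoulli setup) that checks the frequentist coverage of the 95% equal-tailed credible interval obtained from the non-informative Jeffreys Beta(1/2, 1/2) prior; the empirical coverage should come out close to the nominal 95%.

```python
# Coverage check under assumed settings: how often does the 95% equal-tailed
# credible interval from the Jeffreys Beta(1/2, 1/2) prior cover the true
# Bernoulli parameter?
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
true_p, n, reps = 0.3, 20, 20_000          # assumed truth and sample size

k = rng.binomial(n, true_p, size=reps)     # simulated success counts
lo = stats.beta.ppf(0.025, k + 0.5, n - k + 0.5)   # posterior 2.5% quantile
hi = stats.beta.ppf(0.975, k + 0.5, n - k + 0.5)   # posterior 97.5% quantile
print("empirical coverage:", np.mean((lo <= true_p) & (true_p <= hi)))
```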
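And a sketch of the non-invariance mentioned in principle 3 (again with assumed data): a uniform prior on p and a uniform prior on phi = p squared both look non-informative, yet they lead to different posteriors for p from the same Bernoulli data.

```python
# Assumed data: k successes in n Bernoulli trials.  A flat prior on p and a
# flat prior on phi = p**2 are both "uniform", yet they give different
# posteriors for p.
import numpy as np

n, k = 10, 3
p = np.linspace(1e-6, 1 - 1e-6, 10_000)          # grid over p
likelihood = p**k * (1 - p) ** (n - k)

# Route 1: uniform prior on p
post1 = likelihood * 1.0
post1 /= post1.sum()

# Route 2: uniform prior on phi = p**2, expressed as a density in p.
# If phi ~ Uniform(0, 1), the implied density of p is d(phi)/dp = 2p.
post2 = likelihood * 2 * p
post2 /= post2.sum()

print("posterior mean of p, uniform prior on p:   ", (p * post1).sum())
print("posterior mean of p, uniform prior on p**2:", (p * post2).sum())
```

With 3 successes in 10 trials, the two posterior means come out around 4/12 and 5/13 respectively, so the answer depends on which parametrization we happened to call uniform.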


Notes


Principle 1: This necessitates subjectivist thinking in these situations, i.e. Bayesian or minimax reasoning.




In several cases it is wise to make probabilistic statements using minimax, i.e. to assume that the world follows the distribution that is worst with respect to the future actions you will take based on your model. In other words, one should act as if nature chooses the probabilistic model of the world that minimizes the utility you can achieve by acting optimally under your model.
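A toy sketch of this game (an entirely made-up, data-free setup): we must guess a Bernoulli parameter under squared-error loss, nature then picks the parameter value that is worst for our guess, and the guess whose worst case is least bad turns out to be 1/2.

```python
# Toy minimax game: guess a Bernoulli parameter p under squared-error loss
# before seeing any data; "nature" then picks the p that is worst for the
# guess.  The minimax guess is the one whose worst-case loss is smallest.
import numpy as np

p_grid = np.linspace(0.0, 1.0, 101)       # models nature may choose
guesses = np.linspace(0.0, 1.0, 101)      # actions available to us

loss = (guesses[:, None] - p_grid[None, :]) ** 2   # loss[i, j] = (a_i - p_j)**2
worst_case = loss.max(axis=1)                      # nature's best reply to each guess
minimax_guess = guesses[worst_case.argmin()]

print("minimax guess:", minimax_guess)             # 0.5
print("its worst-case loss:", worst_case.min())    # 0.25
```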



Principle 3: Bernardo's reference prior and the Jeffreys prior serve as parametrization-invariant priors for Bayesian inference, but there are many others to choose from.
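As a quick numerical check of that invariance (with the same kind of assumed Bernoulli data as above, 3 successes in 10 trials): the Jeffreys prior yields the same posterior mean for p whether we parametrize the model by p or by phi = p squared.

```python
# Invariance check: the Jeffreys prior gives the same posterior mean of p
# whether we work in the p parametrization or in phi = p**2.
import numpy as np

n, k = 10, 3
grid = np.linspace(1e-6, 1 - 1e-6, 200_000)

# Route 1: Jeffreys prior in p, proportional to p**-0.5 * (1 - p)**-0.5
p = grid
post_p = p**(k - 0.5) * (1 - p)**(n - k - 0.5)
mean_1 = (p * post_p).sum() / post_p.sum()

# Route 2: Jeffreys prior computed directly in phi = p**2,
# proportional to phi**-0.75 * (1 - sqrt(phi))**-0.5
phi = grid
post_phi = phi**(k / 2 - 0.75) * (1 - np.sqrt(phi))**(n - k - 0.5)
mean_2 = (np.sqrt(phi) * post_phi).sum() / post_phi.sum()

print("E[p | data], working in p:        ", mean_1)
print("E[p | data], working in phi = p^2:", mean_2)   # agrees up to grid error
```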


Note that there is no invariance in how to build a model: whether X or X cubed is assumed to have a normal distribution is at the discretion of the model designer. Even if choosing the model is out of our scope, subjectivity creeps into the process of inference one more time.


Conclusion


So far it appears that to do inference in general, one must accept ignorance in some situations, subjectivity, and pessimism (1), aim for accurate frequentist coverage (2), and require invariance over parametrization (3).


Frequentist thinking fulfills principles 2 and 3 but fails to handle 1, hence there is room for other inference methods, such as Bayesian ones.

Thursday, October 6, 2011

Schools of statistical inference

So why write about statistics when there is already so much about it out there? That is exactly the problem... Consider frequentist statistics alone: it is quite a mess today. The scientific literature mixes schools from the previous century, such as fixed-level testing (the rejection/acceptance framework of Neyman and Pearson), reporting actual p-values (Fisher), and parameter estimation such as maximum likelihood or the use of confidence intervals. So how should one infer based on the data? What is more, what about the Bayesians with their subjectivity? What about non-informative priors? And the third school, fiducial inference?

Probably too complicated for the 'everyday' scientist, myself included. So let's see what is what and try to find the right one, possibly without diving into all those formulas of the macho math gurus.