mathNEWS

The Pseudo-Expert on Statistics

Part 1: The Schools of Thought

As I look back at my textbook from STAT 330 (and remember the pain and suffering that went along with it), I notice that the range of topics and philosophies was quite diverse. Indeed, the courses I was required to take involving the dreaded four letter prefix provided me with a wealth of information that must be seen to be believed. But what fascinated me the most were the three different schools of thought that statistics encompasses. Being a statistician myself, I found that it was my duty to enlighten others who, because of STAT 230 and STAT 231, never found the joy and euphoria that one encounters when taking a statistics course with a number higher than 329.

Statistics, as you know, is the study of analyzing raw data and, using a varied array of distributions (both discrete and continuous), drawing valid conclusions about the said data points. However, how these data points are analyzed has caused great debate among statisticians, who oft stay up until the wee hours of the morn just to agree on what variables to use. It is these debates that bring out the three philosophies and how they differ.

The Frequentist School

One of the statistical philosophies that exist is the frequentist school of thought. This school is the most prevalent of the three since the frequentists were the pioneers of statistics. The frequentists will look at a set of data points that are independent and identically distributed from some distribution and use either a UMVUE or an MLE. Really, it depends on what the statistician wants to do that day; if he (or she) wants to use a UMVUE, then so be it. (By the way, a UMVUE is a Uniformly Minimum Variance Unbiased Estimator and an MLE is a Maximum Likelihood Estimator, but I digress... )
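For the curious, here is a minimal sketch of how the two estimators can disagree on the same data. It assumes the data points come from a normal distribution with unknown mean, in which case the MLE of the variance divides by n while the usual unbiased estimator (which is the UMVUE under that assumption) divides by n - 1. The function name, seed and sample are made up for illustration.

```python
import random
import statistics


def variance_estimates(sample):
    """Return (MLE, unbiased) variance estimates for a sample assumed normal."""
    mle = statistics.pvariance(sample)      # sum of squared deviations / n
    unbiased = statistics.variance(sample)  # sum of squared deviations / (n - 1)
    return mle, unbiased


# Hypothetical data: 20 i.i.d. draws from a normal distribution.
rng = random.Random(330)
sample = [rng.gauss(10.0, 2.0) for _ in range(20)]

mle, unbiased = variance_estimates(sample)
print("MLE of variance:     ", mle)
print("Unbiased estimate:   ", unbiased)
```

The two numbers differ only by the factor n / (n - 1), so the choice matters less and less as the sample grows.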

So, our friendly frequentist has chosen the method of estimation that is best for the data points. But now what? Well, the statistician will create graphs, models and yes, the infamous confidence interval. The frequentist interpretation of the 100(alpha)% confidence interval is that if we took many samples from the same distribution and computed an interval for our estimated parameters from each sample, then 100(alpha)% of these intervals would include the true value of the parameters of interest. Of course, it should be noted that alpha must be less than 1 but greater than 0.
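The repeated-sampling interpretation can be checked by simulation. The sketch below is only an illustration under simplifying assumptions: normal data with a known standard deviation, so that a Gaussian table (here `NormalDist().inv_cdf`) gives the interval. The function names, the true mean and the seed are all hypothetical.

```python
import random
from statistics import NormalDist, mean


def z_interval(sample, sigma, level):
    """Interval for the mean of normal data with known sigma, at the given level."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    half_width = z * sigma / len(sample) ** 0.5
    centre = mean(sample)
    return centre - half_width, centre + half_width


def coverage(true_mu, sigma, n, level, trials, seed=0):
    """Fraction of simulated intervals that contain the true mean."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mu, sigma) for _ in range(n)]
        lower, upper = z_interval(sample, sigma, level)
        hits += lower <= true_mu <= upper
    return hits / trials


# With level 0.95, roughly 95% of the intervals should cover the true mean.
rate = coverage(true_mu=5.0, sigma=2.0, n=25, level=0.95, trials=2000, seed=1)
print("Observed coverage:", rate)
```

Note that the statement is about the procedure across many samples, not about any single interval.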

Interestingly enough, the frequentist philosophy is taught to many students, especially to second year students who have a vague idea of statistics. It is a shame, in reality, because once students swear never to take a statistics course ever again, they slam the doors to the other philosophies that, in my opinion, are quite interesting.

The Bayesian School

A lesser known but equally beloved philosophy is the Bayesian School of thought. Named after some guy named (get ready for this) Bayes, the Bayesians believe in using the same distributions and the same methods of finding models as their frequentist brethren. However, there are some differences that have caused heated debate. For one thing, Bayesians estimate parameters using a posterior distribution, which is built from a prior density. For those not acquainted with prior densities, these are distributions placed on the parameters themselves before the data are seen, often based on prior data sets; combining the prior with the new data gives the posterior. The posterior helps Bayesians produce the Bayes risk and the ever popular Bayes estimator. As for the interpretation of the 100(alpha)% interval, Bayesians are not afraid to state that, with probability alpha, the parameters that are being estimated will lie between the lower and upper bound. Frequentists disagree vehemently with Bayesians on this point and will be very quick to demonstrate this.
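As a hedged illustration of the Bayesian recipe, the sketch below uses a textbook conjugate pair that is not from this article: a Beta prior on a success probability, updated with binomial data. The posterior is again a Beta distribution, its mean is the Bayes estimator under squared-error loss, and the interval is approximated by drawing from the posterior. The prior, the data (7 successes, 3 failures) and all function names are invented for the example.

```python
import random


def beta_binomial_posterior(prior_a, prior_b, successes, failures):
    """Update a Beta(prior_a, prior_b) prior with binomial data."""
    return prior_a + successes, prior_b + failures


def posterior_mean(a, b):
    """Mean of Beta(a, b): the Bayes estimator under squared-error loss."""
    return a / (a + b)


def credible_interval(a, b, level=0.95, draws=20000, seed=0):
    """Monte Carlo approximation of an equal-tailed interval for Beta(a, b)."""
    rng = random.Random(seed)
    samples = sorted(rng.betavariate(a, b) for _ in range(draws))
    tail = (1 - level) / 2
    lower = samples[int(tail * draws)]
    upper = samples[int((1 - tail) * draws) - 1]
    return lower, upper


# Hypothetical example: a flat Beta(1, 1) prior, then 7 successes and 3 failures.
a, b = beta_binomial_posterior(1, 1, successes=7, failures=3)
est = posterior_mean(a, b)
lo, hi = credible_interval(a, b)
print("Posterior: Beta(%d, %d)" % (a, b))
print("Bayes estimate:", est)
print("Approximate 95% interval:", (lo, hi))
```

Here the interval really is a probability statement about the parameter, given the prior and the data, which is exactly the point the frequentists object to.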

Often, Mathies will not be exposed to Bayesian philosophy until they reach their third year of statistics. But once a student is exposed to Bayesian thought, the result is highly rewarding.

The Samplist School

Finally, there is a school of thought known as the Sample Theorists (or the Samplists). Unlike the frequentist and the Bayesian, the Samplist does not believe in using distributions to estimate parameters. In fact, the Samplist's main tools in analyzing data points are the mean, the standard deviation and the confidence interval. The mean and standard deviation are easily calculated, while the confidence interval needs only a Student t or a Gaussian table to determine its range.
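The Samplist's toolkit fits in a few lines. This is a rough sketch that uses the Gaussian table (via `NormalDist().inv_cdf`) for the interval; for a sample this small a Student t table would give a somewhat wider range. The survey data and the function name are made up.

```python
from statistics import NormalDist, mean, stdev


def summarize(sample, level=0.95):
    """Return the mean, standard deviation and a normal-approximation interval."""
    n = len(sample)
    m = mean(sample)
    s = stdev(sample)  # sample standard deviation (divides by n - 1)
    z = NormalDist().inv_cdf(0.5 + level / 2)
    half_width = z * s / n ** 0.5
    return m, s, (m - half_width, m + half_width)


# Hypothetical survey responses.
responses = [12, 15, 11, 14, 13, 16, 12, 15, 14, 13]
m, s, (lower, upper) = summarize(responses)
print("mean:", m)
print("standard deviation:", s)
print("approximate 95% interval:", (lower, upper))
```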

The Samplist, however, is more concerned about bias and variability than calculations. Samplists oft ask questions even about how the sample was achieved. Such questions include:

Indeed, this school of thought is ridiculed, ostracized and generally hated by Bayesians and frequentists. This philosophy has nonetheless found a home in the venerable institution that is StatsCan. Here, StatsCan has embraced the Samplists and has used their talents to create the best statistical establishment in the world (No, really... It is! Would the pseudo-expert lie to you?). Unfortunately, there is only one course that teaches the philosophy of the Samplist, and that is good old STAT 332. I just love that course!

So there, in a nutshell, are the three schools of statistics. All three schools are fascinating not only in their similarities but also in their differences. I personally encourage all Math students to take at least one third year STAT course, for not only will you be illuminated in everything you wanted to know about the normal and gamma distributions, but you will also be enlightened about frequentist, Bayesian and Samplist thought. Having been taught by a Samplist, a Bayesian and plenty of frequentists, I am at peace with the tumultuous road that is statistics.


John "The Pseudo-Expert" Swan



© 1997 mathNEWS