The following is based largely on a flowchart designed by Dr. Robert Gerwein, formerly of Bates College in Maine. If you want to open Dr. Gerwein's flowchart, you can retrieve it from the Resources area of the courseroom or you can find it at Dr. Gerwein's Web site.
Objectives
This is not a primer on statistics. In Track 3, your grasp of basic statistics should be growing, and, if you would prefer a refresher, you can open your favorite statistics textbook and brush up!
In fact, Dr. Gerwein's website is a good, basic primer (remember, though, that your dissertation will probably require more sophistication).
At the end of this document you will be able to use your research question and the flowchart to identify the kind of analysis methods you need, and to decide which descriptive and inferential statistics are appropriate.
While we need to spend a little time on the topic here, the following is not going to teach statistics. This document takes for granted that you know the concepts and terms used in statistics, such as distributions, parameter assumptions, significance levels, type-I and type-II errors, and so on. We focus on actually designing a plan for your data analysis, which will be statistical in nature. To build an analysis plan, we start with the research question, as you have heard so often before!
Let's begin.
You will always compute central tendency (that is, the mean, the median, and the mode), the range, and the standard deviation, and sometimes the variance (which is the standard deviation squared) if you need it for the inferential statistics.
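If you would like to see these descriptive statistics computed, here is a minimal sketch using Python's standard statistics module. The scores are hypothetical values invented for illustration, and the sample (n - 1) formulas are assumed for the standard deviation and variance; your own statistics package may label or default these differently.

```python
import statistics

# Hypothetical sample of scores, invented purely for illustration.
scores = [12, 15, 15, 18, 20, 22, 24]

# Central tendency
mean = statistics.mean(scores)
median = statistics.median(scores)
mode = statistics.mode(scores)

# Spread
value_range = max(scores) - min(scores)
sd = statistics.stdev(scores)           # sample standard deviation (n - 1)
variance = statistics.variance(scores)  # sample variance, equal to sd squared

print("mean:", mean, "median:", median, "mode:", mode)
print("range:", value_range, "SD:", round(sd, 2), "variance:", round(variance, 2))
```

Notice that the variance is simply the standard deviation squared, so you only need to report it when an inferential statistic calls for it.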
When choosing your central tendency statistics, consider two things:
Take a look at the flowchart. The very first question you should ask is whether your data are continuous or categorical.
How do you find out? That is the obvious next question, and the answer is to look at the variables in your research question: not the sample variables, but the independent and dependent variables (or the predictor and outcome variables).
Continuous data are data that express a range of values. Examples would include such things as:
Categorical data are data that occur in discrete chunks. Examples would include:
Examine your Research Question: Types of Data for Analysis
Are the variables (DV especially) continuous or categorical?
Now you look at what you're asking in your question. Are you asking about relationships? Take the left-hand path and ask yourself:
On the other hand, if your research question asks about differences, take the right-hand fork and ask, "What sort of differences?"
Let's take a look at the correlation analysis path first. Then we'll return to the difference-between-multiple-means kind of question.
Going beyond the descriptive and interpretive goals of many other qualitative models, grounded theory's goal is building a theory: It seeks explanation, not simply description.
It uses a constant comparison method of data analysis that begins as soon as the researcher starts collecting data. Each data collection event (for example, an interview) is analyzed immediately, and later data collection events can be modified to seek more information on emerging themes. In other words, analysis goes on during each step of the data collection, not merely after data collection.
The heart of the grounded theory analysis is coding, which is analogous to but more rigorous than coding in thematic analysis.
This kind of research question asks for differences between groups, basically. It's the sort of question often asked in a quasi-experiment, for instance, where we're interested in how a particular independent variable might have a differential effect on a dependent variable. The next question to ask is: How many groups? The answer, of course, is either two groups or more than two groups.
Okay, let's start with a two-group comparison.
If we're comparing two groups, we ask whether our data meet parametric assumptions. You'll have to know what this means!
Suppose they do not? We will take a little side trip by asking whether the data can be transformed to meet parametric assumptions. If the data meet the assumptions, or if we can answer Yes, the data can be transformed, then great: we will use Student's t-test as our statistic. This is the most commonly used statistic for a two-group comparison of means when the data are parametric.
But if we answer No, the data cannot be transformed, then we can use the Mann-Whitney U test or the Wilcoxon rank-sum test. These are two forms of the same rank-based procedure, so either one is acceptable.
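To make the two-group statistics a little more concrete, here is a minimal sketch that computes both test statistics by hand using only Python's standard library. The function names and the two small groups are hypothetical, invented for illustration. The t statistic assumes equal group variances (the pooled form), and the sketch stops at the test statistics: obtaining p-values is left to your statistics software.

```python
import math
import statistics


def pooled_t_statistic(a, b):
    """Student's t statistic for two independent groups, assuming equal variances."""
    n1, n2 = len(a), len(b)
    pooled_var = (
        (n1 - 1) * statistics.variance(a) + (n2 - 1) * statistics.variance(b)
    ) / (n1 + n2 - 2)
    standard_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (statistics.mean(a) - statistics.mean(b)) / standard_error


def mann_whitney_u(a, b):
    """Mann-Whitney U statistic (the smaller of the two U values), with midranks for ties."""
    combined = sorted(list(a) + list(b))
    ranks = {}
    i = 0
    while i < len(combined):
        j = i
        while j < len(combined) and combined[j] == combined[i]:
            j += 1
        # Tied values share the average of the ranks i+1 through j.
        ranks[combined[i]] = (i + 1 + j) / 2
        i = j
    n1, n2 = len(a), len(b)
    rank_sum_a = sum(ranks[x] for x in a)  # this is the Wilcoxon rank sum for group a
    u_a = rank_sum_a - n1 * (n1 + 1) / 2
    return min(u_a, n1 * n2 - u_a)


# Hypothetical data, invented for illustration.
group_a = [1, 2, 3, 4, 5]
group_b = [6, 7, 8, 9, 10]

t_value = pooled_t_statistic(group_a, group_b)
u_value = mann_whitney_u(group_a, group_b)
print("t =", t_value, "U =", u_value)
```

The line that converts the rank sum into U shows why the Mann-Whitney U test and the Wilcoxon rank-sum test lead to the same conclusion: one statistic is a simple rescaling of the other.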
Suppose that we have more than two groups? Let's look at that part of the flowchart next.
Suppose a study asks about differences on a score among three or more groups? The process is the same, but it leads to different statistical tests.
The first question to ask is whether the data satisfy parametric assumptions.
If we can answer Yes, or if we can transform the data, then we can use the one-way ANOVA (analysis of variance) to compare the means.
But if we have to answer No, and we cannot transform the data, then we can use the Kruskal-Wallis test to compare medians.
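As an illustration of what the one-way ANOVA actually compares, here is a minimal sketch of the F statistic computed by hand with Python's standard library. The function name and the three small groups are hypothetical, invented for illustration, and the sketch assumes independent groups; it does not compute a p-value.

```python
import statistics


def one_way_anova_f(*groups):
    """F statistic for a one-way ANOVA: between-group variance over within-group variance."""
    all_values = [x for g in groups for x in g]
    grand_mean = statistics.mean(all_values)
    k = len(groups)       # number of groups
    n = len(all_values)   # total number of observations

    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (statistics.mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of individual scores around their own group mean.
    ss_within = sum((x - statistics.mean(g)) ** 2 for g in groups for x in g)

    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n - k)
    return ms_between / ms_within


# Hypothetical scores for three groups, invented for illustration.
f_value = one_way_anova_f([1, 2, 3], [2, 3, 4], [3, 4, 5])
print("F =", f_value)
```

A large F means the group means differ by more than you would expect from the spread of scores within the groups.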
Last step, if the ANOVA or Kruskal-Wallis test is significant: do a post hoc test to determine which groups differ from one another. There are a number of different post hoc tests available. You and your mentor can determine which is best for your study.
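The difference-question branch of the flowchart described above can be summarized as a small decision function. This is only a sketch of the reasoning in this document, not a substitute for the flowchart itself; the function name and its parameters are hypothetical.

```python
def choose_difference_test(n_groups, meets_parametric, transformable=False):
    """Return the test suggested by the difference-between-groups path of the flowchart.

    n_groups: number of groups being compared (2 or more)
    meets_parametric: True if the data meet parametric assumptions
    transformable: True if the data can be transformed to meet those assumptions
    """
    if n_groups < 2:
        raise ValueError("A difference question needs at least two groups.")

    # Data that can be transformed are treated the same as data that already qualify.
    parametric_ok = meets_parametric or transformable

    if n_groups == 2:
        if parametric_ok:
            return "Student's t-test"
        return "Mann-Whitney U test (Wilcoxon rank-sum test)"

    if parametric_ok:
        return "one-way ANOVA"
    return "Kruskal-Wallis test"


# Example: three groups whose data fail parametric assumptions and cannot be transformed.
print(choose_difference_test(3, meets_parametric=False))
```

Remember that for more than two groups, a significant result is followed by a post hoc test, which this sketch does not choose for you.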
As you probably know quite well, these are not the only kinds of statistics you can use to analyze quantitative data. However, they are perhaps the most common statistics used in social science dissertations. When you are designing your data analysis plan, use the flowchart or your statistics textbooks to zero in on the best available method.
If your research is more complex and the opening questions do not yield a pathway for you, chances are your analysis methods will be more complex and your statistics will also be more sophisticated.
Doc. reference: phd_t3_u04s3_quantanalysis.html