Quantitative Data Analysis Methods

The following is based largely on a flowchart designed by Dr. Robert Gerwein, formerly of Bates College in Maine. If you want to open Dr. Gerwein's flowchart, you can retrieve it from the Resources area of the courseroom or you can find it at Dr. Gerwein's Web site.

Objectives

This is not a primer on statistics. In Track 3, your grasp of basic statistics should be growing, and, if you would prefer a refresher, you can open your favorite statistics textbook and brush up!

In fact, Dr. Gerwein's website is a good, basic primer (remember, though, that your dissertation will probably require more sophistication).

At the end of this document you will be able to use your research question and the flowchart to identify the kind of analysis methods you need, and to decide on the appropriate descriptive and inferential statistics to use.

Data Analysis in Quantitative Studies

While we need to spend a little time on the topic here, the following is not going to teach statistics. This document takes for granted that you know the concepts and terms used in statistics, such as distributions, parameter assumptions, significance levels, type-I and type-II errors, and so on. We focus on actually designing a plan for your data analysis, which will be statistical in nature. To build an analysis plan, we start with the research question, as you have heard so often before!

Let's begin.

Choosing Descriptive Statistics

You will always compute central tendency (that is, the mean, the median, and the mode), the range, and the standard deviation, and sometimes the variance (which is the standard deviation squared) if you need it for the inferential statistics.
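
As a minimal sketch of these descriptive statistics, the example below uses Python's standard statistics module. The scores are made-up illustrative values, not data from any study, and the sample (n - 1) versions of the standard deviation and variance are assumed.

```python
import statistics

# Hypothetical sample of test scores (illustrative values only).
scores = [72, 85, 85, 90, 68, 77, 85, 94, 61, 80]

mean = statistics.mean(scores)
median = statistics.median(scores)
modes = statistics.multimode(scores)      # a list, because data can have several modes
value_range = max(scores) - min(scores)
std_dev = statistics.stdev(scores)        # sample standard deviation (n - 1 denominator)
variance = statistics.variance(scores)    # sample variance, i.e. std_dev squared

print("mean:", mean, "median:", median, "mode(s):", modes)
print("range:", value_range, "sd:", round(std_dev, 2), "variance:", round(variance, 2))
```

If you are describing a whole population rather than a sample, statistics.pstdev and statistics.pvariance would be the corresponding calls.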

When choosing your central tendency statistics, consider two things:

Will your data satisfy parametric assumptions? Some common parametric assumptions are that the sample data are distributed normally or like the population distribution, that the sample size is large enough, and that you're dealing with continuous data (that is, not categorical data, frequencies, or rankings).

If you answer Yes (your data will satisfy parametric assumptions), you will most likely use the mean.
If you answer No (your data will not satisfy parametric assumptions), you will most likely use the median or mode.

What central tendency statistic will the inferential statistical test require? To answer this question, you'll first need to figure out what your inferential statistics are going to be, which is where we'll turn our attention now.

Using the Research Question to Build the Analysis Plan

Take a look at the flowchart. The very first question you should ask is:

What type of data will you be analyzing?

It will be either continuous or categorical.

"How do you find out?" is the obvious next question, and the answer is to look at the variables in your research question: not the sample variables, but the independent and dependent variables (or the predictor and outcome variables).

Background: Types of Data

Continuous data are data that express a range of values. Examples would include such things as:

Scores on a measure or test.
Ratings on a Likert-type scale.
Age expressed as a number of years (e.g., 21, 24, 30, 32).
Temperature on the thermometer.
Rates or percentages.

Categorical data are data that occur in discrete chunks. Examples would include:

Gender (either male or female).
Affiliation with a group (for example, one checks off "Evangelical Christian" when asked for religious affiliation).
Presence or absence of something (for instance, one has been treated for a condition or one has not).
Age expressed in chunks (e.g., 18–25, 26–30, 31–35, 36–40, etc.).

Examine your Research Question: Types of Data for Analysis

Are the variables (DV especially) continuous or categorical?

Continuous data? Use the left-hand path. It leads to a series of subsequent questions.
Categorical (discrete) data? Use the right-hand path. And, basically, you're done! You'll probably want to use chi-square analysis.
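
To make the chi-square path concrete, here is a rough sketch that computes the chi-square statistic for a table of observed counts. The function name chi_square_statistic and the counts are hypothetical, chosen only for illustration. The sketch stops at the statistic and degrees of freedom; a statistics package would normally supply the p-value.

```python
def chi_square_statistic(table):
    """Return (chi2, degrees_of_freedom) for a table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)

    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if the row and column variables were independent.
            expected = row_totals[i] * col_totals[j] / grand_total
            chi2 += (observed - expected) ** 2 / expected

    dof = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, dof


# Hypothetical counts: rows = treated / not treated, columns = improved / not improved.
observed = [[30, 20],
            [20, 30]]
chi2, dof = chi_square_statistic(observed)
print("chi-square:", chi2, "df:", dof)
```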

Next Question: What Does Your Research Question Want?

Now you look at what you're asking in your question. Are you asking about relationships? Take the left-hand path and ask yourself:

Do you have independent and dependent variables, or predictor and outcome variables?

If the answer is Yes, think about some sort of regression analysis.
If your answer is No, you'll do some kind of correlation analysis.
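
The difference between the two answers can be sketched in a few lines. The example assumes the simplest cases, a Pearson correlation and a one-predictor linear regression; your own study may call for other forms of either analysis. The function names and the data are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation: strength of the linear relationship, no direction of prediction."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def simple_linear_regression(x, y):
    """Least-squares slope and intercept for predicting the outcome y from the predictor x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data: hours of study (predictor) and test score (outcome).
hours = [1, 2, 3, 4, 5]
scores = [52, 55, 61, 64, 68]

r = pearson_r(hours, scores)
slope, intercept = simple_linear_regression(hours, scores)
print("r:", round(r, 3), "slope:", round(slope, 2), "intercept:", round(intercept, 2))
```

Correlation treats the two variables symmetrically, while regression names one as the predictor, which is why the presence of independent and dependent variables decides the path.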

On the other hand, if your research question asks about differences, take the right-hand fork and ask, "What sort of differences?"

Are you comparing the difference between a single mean score and an existing or hypothetical value? Use a one-sample t-test.
Are you comparing variances among scores? You can see the two options that might work best for your analysis on the flowchart. You might be done!
Are you comparing two or more mean scores on a single dependent variable? For this, you have a few more questions to answer!
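
For the first of these cases, a one-sample t-test, the statistic itself is simple enough to sketch by hand. The function name one_sample_t, the sample values, and the hypothesized value of 100 are all assumptions made for illustration. The sketch returns only the t statistic and degrees of freedom, which you would then compare against a t distribution.

```python
import math
import statistics


def one_sample_t(sample, hypothesized_mean):
    """Return (t, degrees_of_freedom) for a one-sample t-test."""
    n = len(sample)
    standard_error = statistics.stdev(sample) / math.sqrt(n)
    t = (statistics.mean(sample) - hypothesized_mean) / standard_error
    return t, n - 1


# Hypothetical scores compared against a hypothetical reference value of 100.
sample = [102, 98, 105, 110, 95]
t, df = one_sample_t(sample, 100)
print("t:", round(t, 3), "df:", df)
```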

Let's take a look at the correlation analysis path first. Then we'll return to the difference-between-multiple-means kind of question.

Research Questions About Relationship Among Variables

This kind of research question asks whether, and how strongly, variables are related to one another. As described above, the path depends on whether your question distinguishes independent and dependent variables (or predictor and outcome variables): if it does, think about some sort of regression analysis; if it does not, you'll do some kind of correlation analysis.

Research Questions About Differences Between Multiple Means

This kind of research question asks for differences between groups, basically. It's the sort of question often asked in a quasi-experiment, for instance, where we're interested in how a particular independent variable might have a differential effect on a dependent variable. The next question to ask is: How many groups? The answer, of course, is either two groups or more than two groups.

Okay, let's start with a two-group comparison.

If we're comparing two groups, we ask whether our data meet parametric assumptions. You'll have to know what this means!

If they do, we will use Student's t-test as our statistic. This is the most used statistic for a two-group comparison of means when the data are parametric.

Suppose they do not? We will take a little side trip by asking whether the data can be transformed to meet parametric assumptions. If we can answer Yes, the data can be transformed, then great: we will again use Student's t-test.

But if we answer No, the data cannot be transformed, then we can decide whether to use the Mann-Whitney U test or the Wilcoxon rank-sum test.
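
Both branches of the two-group comparison can be sketched side by side. The example assumes the pooled-variance (equal variances) form of Student's t-test and counts ties as half a point in the Mann-Whitney U statistic. The function names and the two small groups are hypothetical, and neither function computes a p-value.

```python
import math
import statistics


def students_t(a, b):
    """Pooled-variance Student's t statistic and degrees of freedom for two independent groups."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * statistics.variance(a) +
                  (nb - 1) * statistics.variance(b)) / (na + nb - 2)
    standard_error = math.sqrt(pooled_var * (1 / na + 1 / nb))
    t = (statistics.mean(a) - statistics.mean(b)) / standard_error
    return t, na + nb - 2


def mann_whitney_u(a, b):
    """Mann-Whitney U: the smaller of the two pairwise-comparison counts (ties count 0.5)."""
    u_a = sum(1.0 if x > y else 0.5 if x == y else 0.0 for x in a for y in b)
    u_b = len(a) * len(b) - u_a
    return min(u_a, u_b)


# Hypothetical scores for two independent groups.
group_a = [5, 6, 7, 8, 9]
group_b = [1, 2, 3, 4, 10]

t, df = students_t(group_a, group_b)
u = mann_whitney_u(group_a, group_b)
print("t:", round(t, 3), "df:", df, "U:", u)
```

Notice that the single extreme value in the second group pulls its mean and inflates its variance, which affects t, while U depends only on the ordering of the scores.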

Suppose that we have more than two groups? Let's look at that part of the flowchart next.

Differences Between More than Two Groups

Suppose a study asks about differences on a score among three or more groups? The process is the same, but it leads to different statistical tests.

The first question to ask is whether the data satisfy parametric assumptions.

If we can answer Yes or if we can transform the data, then we can use the one-way ANOVA—analysis of variance—to compare the means.

But if we have to answer No, or if we cannot transform the data, then we can use the Kruskal-Wallis test to compare medians.

Last step if the ANOVA or Kruskal-Wallis test is significant: Do a post-hoc test to determine which groups differ from one another. There are a number of different post-hoc tests available. You and your mentor can determine which is best for your study.
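
The two omnibus statistics for more than two groups can be sketched in the same hand-rolled way. The function names and the three tiny groups are hypothetical, the Kruskal-Wallis sketch uses average ranks for ties without the usual tie correction, and neither function computes a p-value or a post-hoc test.

```python
import statistics


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    all_values = [v for g in groups for v in g]
    grand_mean = statistics.mean(all_values)
    k, n = len(groups), len(all_values)
    ss_between = sum(len(g) * (statistics.mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((v - statistics.mean(g)) ** 2 for g in groups for v in g)
    f = (ss_between / (k - 1)) / (ss_within / (n - k))
    return f, k - 1, n - k


def kruskal_wallis_h(groups):
    """Return (H, df) using average ranks for ties (no tie correction applied)."""
    pooled = sorted(v for g in groups for v in g)
    rank_of = {}
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j] == pooled[i]:
            j += 1
        rank_of[pooled[i]] = (i + 1 + j) / 2   # mean of the 1-based ranks i+1 .. j
        i = j
    n = len(pooled)
    rank_term = sum(sum(rank_of[v] for v in g) ** 2 / len(g) for g in groups)
    h = 12 / (n * (n + 1)) * rank_term - 3 * (n + 1)
    return h, len(groups) - 1


# Hypothetical scores for three independent groups.
groups = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

f, df_between, df_within = one_way_anova_f(groups)
h, df_h = kruskal_wallis_h(groups)
print("F:", round(f, 2), "df:", (df_between, df_within))
print("H:", round(h, 2), "df:", df_h)
```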

Wrapping Up

As you probably know quite well, these are not the only kinds of statistics you can use to analyze quantitative data. However, they are perhaps the most common statistics used in social science dissertations. When you are designing your data analysis plan, use the flowchart or your statistics textbooks to zero in on the best available method.
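
As a recap, the decision path walked through in this document can be written down as a small function. This is only a sketch of the text above, not a reproduction of Dr. Gerwein's flowchart: the function name choose_test and its parameter names are hypothetical, and the parametric flag is assumed to mean "meets parametric assumptions, or can be transformed to meet them."

```python
def choose_test(data_type, question=None, has_iv_dv=False,
                comparison=None, n_groups=2, parametric=True):
    """Suggest an analysis based on the decision path described in this document."""
    if data_type == "categorical":
        return "chi-square analysis"

    # Continuous data: what does the research question ask?
    if question == "relationship":
        return "regression analysis" if has_iv_dv else "correlation analysis"

    if question == "difference":
        if comparison == "single_mean":
            return "one-sample t-test"
        if comparison == "variances":
            # The document defers to the flowchart here, so the sketch does too.
            return "see the flowchart for the variance-comparison options"
        if comparison == "multiple_means":
            if n_groups == 2:
                return ("Student's t-test" if parametric
                        else "Mann-Whitney U test or Wilcoxon rank-sum test")
            return "one-way ANOVA" if parametric else "Kruskal-Wallis test"

    raise ValueError("This combination is not covered by the decision path.")


# Example: three groups, continuous scores that cannot be made parametric.
print(choose_test("continuous", question="difference",
                  comparison="multiple_means", n_groups=3, parametric=False))
```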

If your research is more complex and the opening questions do not yield a pathway for you, chances are your analysis methods will be more complex and your statistics will also be more sophisticated.

Doc. reference: phd_t3_u04s3_quantanalysis.html