Having your cake… part 2

August 3rd, 2010 by Brad

[This is the second post in a series dealing with the representativeness of the YourMorals data; see here to read the first post.]

Last time, I gave a broad overview of the descriptive representation of the YourMorals dataset. In a nutshell, we discovered that the YourMorals respondents were much more educated, more likely to self-identify as liberal, and more likely to be white than the population.

In this post, I will explore the question of whether the YourMorals respondents are representative of the population after we condition on observable characteristics. Put another way, would we expect two individuals, one randomly chosen from the population and one drawn from the YourMorals data, who share all the same demographic characteristics (age, race, education, political ideology, place of residence) to look the same in terms of their scores on the Moral Foundations Questionnaire?

To conduct this kind of analysis, first we need a benchmark against which to compare the YourMorals data. As I mentioned in my previous post, the gold standard is a randomly drawn sample from the population. Luckily, we have just such a survey. Prior to the 2008 election, Knowledge Networks* fielded a version of the Moral Foundations Questionnaire to a representative sample of the U.S. population. This provides a good point of comparison for our (much larger) convenience sample.

The first task is to process the YourMorals data so that it looks more like the general population. I used a basic sample matching technique to match individuals from the YourMorals data to individuals in the Knowledge Networks data. This is a crude technique, but effective. Basically, for each individual in the Knowledge Networks sample (the “match target”), I found an individual (or individuals) in the YourMorals data who matched the demographic information for the “match target.” These cases then become the comparison group. After the samples have been balanced in terms of observable characteristics, any differences we observe between the two can be ascribed to the confounding factors that we cannot observe.**
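To make the matching step concrete, here is a minimal sketch of exact matching in plain Python. It is not the code used for this post: the field names (`age_group`, `race`, `education`, `ideology`, `state`) and the toy records are assumptions made up for illustration, and the sketch simply keeps every YourMorals row whose demographic profile is identical to a Knowledge Networks row.

```python
from collections import defaultdict

# Hypothetical field names; the post only says that matching used categories
# of age, race, education, ideology, and state of residence.
MATCH_KEYS = ("age_group", "race", "education", "ideology", "state")


def exact_match(targets, pool, keys=MATCH_KEYS):
    """Pair each target row with every pool row that is identical on `keys`.

    Returns (matched, unmatched): `matched` is a list of
    (target_row, [pool_rows]) tuples, `unmatched` lists targets with no match.
    """
    # Index the pool by demographic profile so each lookup is a dict access.
    index = defaultdict(list)
    for row in pool:
        index[tuple(row[k] for k in keys)].append(row)

    matched, unmatched = [], []
    for target in targets:
        profile = tuple(target[k] for k in keys)
        candidates = index.get(profile, [])
        if candidates:
            matched.append((target, candidates))
        else:
            unmatched.append(target)
    return matched, unmatched


# Toy records, invented purely for illustration.
kn = [
    {"id": "kn1", "age_group": "30-44", "race": "white",
     "education": "college", "ideology": "liberal", "state": "CA"},
    {"id": "kn2", "age_group": "65+", "race": "black",
     "education": "hs", "ideology": "conservative", "state": "GA"},
]
ym = [
    {"id": "ym1", "age_group": "30-44", "race": "white",
     "education": "college", "ideology": "liberal", "state": "CA"},
    {"id": "ym2", "age_group": "30-44", "race": "white",
     "education": "college", "ideology": "liberal", "state": "CA"},
    {"id": "ym3", "age_group": "18-29", "race": "white",
     "education": "college", "ideology": "liberal", "state": "NY"},
]

matched, unmatched = exact_match(kn, ym)
```

In this toy example the first target picks up two matches and the second picks up none. Targets without a match are returned separately rather than silently dropped, since with exact matching on five variables some profiles in the target sample may have no counterpart in the convenience sample.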

The following figures show how the distributions of the matched YourMorals data compare with the distributions in the sample from Knowledge Networks. The dashed lines show the distribution for Knowledge Networks; the solid lines represent the YourMorals data.

Figure 1

The distributions of the foundations in the two data sources look very similar for the Fairness/Reciprocity foundation, but for all of the others, there are significant differences between the YourMorals and the Knowledge Networks respondents.
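One simple way to put a number on a gap like this is a standardized mean difference for each foundation. The sketch below is only an illustration of that idea, not the comparison behind the figures; the function name and the toy scores are assumptions.

```python
from math import sqrt
from statistics import mean, variance


def standardized_mean_difference(a, b):
    """Difference in means divided by the pooled standard deviation.

    Uses the simple average of the two sample variances, which is a common
    convention when checking balance between a matched sample and its target.
    """
    pooled_sd = sqrt((variance(a) + variance(b)) / 2)
    return (mean(a) - mean(b)) / pooled_sd


# Toy scores on a single foundation, invented for illustration only.
ym_scores = [1, 2, 3, 4, 5]
kn_scores = [2, 3, 4, 5, 6]

smd = standardized_mean_difference(ym_scores, kn_scores)
```

A value near zero means the two samples have similar averages on that foundation. It says nothing about differences in the shape of the distributions, which is what the density plots are for.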

A little more digging reveals some interesting patterns. Splitting up the sample by ideology yields:

Liberals:

Figure 2 - Liberals only

Conservatives:

Figure 3 - Conservatives only

Two of the foundations seem to stand out in these comparisons. Liberals in the YourMorals data are particularly low on the Purity foundation (when compared against liberals in the Knowledge Networks data), and conservatives from the YourMorals sample seem to score lower on the Harm foundation. In both cases, YourMorals liberals seem more like population liberals on the first two foundations (Harm and Fairness), and the conservatives in the sample seem more like population conservatives on the last two foundations (Authority and Purity). No matter how the data is cut, the YourMorals sample seems to score lower on the Ingroup foundation.

The comparisons between the general population sample and the convenience sample in this post raise some significant questions about the possibility of using the self-selected respondents in the YourMorals sample to make inferences about the population. These problems in the data are particularly evident in the Ingroup foundation, the Purity foundation (for liberals), and the Harm foundation (for conservatives).

As was the case with demographics, all is not lost. One last look at the data shows that, again, the foundations are more or less proportionally correct. Liberals score higher on the Harm and Fairness foundations in relation to their scores on the other three, and conservatives show more or less equal scores across each of the foundations. The bar chart below shows the average scores of the foundations broken out by survey source (KN and YM for Knowledge Networks and YourMorals respectively) and ideology:

Figure 4
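For readers who want to build a similar summary from their own data, the averages behind a chart like this can be computed by grouping on survey source and ideology. This is a rough sketch under assumed field names (`source`, `ideology`, and one field per foundation), with invented toy rows; it is not the code or the data behind the figure.

```python
from collections import defaultdict
from statistics import mean

# The five foundations discussed in this post; the lowercase keys are assumed.
FOUNDATIONS = ("harm", "fairness", "ingroup", "authority", "purity")


def mean_scores(rows):
    """Average each foundation within every (source, ideology) group."""
    groups = defaultdict(list)
    for row in rows:
        groups[(row["source"], row["ideology"])].append(row)
    return {
        key: {f: mean(r[f] for r in members) for f in FOUNDATIONS}
        for key, members in groups.items()
    }


# Toy rows, invented for illustration only.
rows = [
    {"source": "KN", "ideology": "liberal", "harm": 4.0, "fairness": 4.0,
     "ingroup": 2.0, "authority": 2.0, "purity": 2.0},
    {"source": "KN", "ideology": "liberal", "harm": 3.0, "fairness": 4.0,
     "ingroup": 3.0, "authority": 2.0, "purity": 1.0},
    {"source": "YM", "ideology": "conservative", "harm": 3.0, "fairness": 3.0,
     "ingroup": 3.0, "authority": 3.0, "purity": 3.0},
]

summary = mean_scores(rows)
```

Each entry of `summary` corresponds to one cluster of bars: one group per source and ideology pair, with one average per foundation.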

Next time, I’ll discuss how we might correct for some of these demographic and attitudinal biases in the data.

*For the uninitiated, Knowledge Networks is a survey research firm that has gone to great lengths to put together a panel of internet users that is nationally representative. They have recruited a large panel of individuals to take internet surveys. These individuals were generally contacted by telephone, and in cases where the respondent did not have internet access, Knowledge Networks provided access. See this link for more information.

**For a quick primer on the theory behind sample matching, see this Wikipedia entry. I am using exact matching on categories of age, race, education, ideology, and state of residence.


