A Workbook For Arguments, Part 2: Generalizations and Statistics

In this series, I’m walking through the critical thinking rules laid out in the book A Workbook for Arguments: A Complete Course in Critical Thinking. Most of the material in this series is borrowed from that book, though paraphrases and original examples are mixed in alongside it.

My aim for this blog (as you guys know), before I get into any actual philosophical debates, is to equip you with the tools necessary for how to think, not what to think.

The key is to be as specific as possible in explaining the ways in which an argument does or does not follow each rule. So, without further ado, let’s get into the second installment in this series!

Some arguments offer one or more examples in support of a generalization. A generalization is a claim about some or all things of a certain type.

Women in earlier times were married very young. Juliet in Shakespeare’s Romeo and Juliet was not even fourteen years old. In the Middle Ages, thirteen was the normal age of marriage for a Jewish woman. And during the Roman Empire, many Roman women were married at age thirteen or younger.

This argument generalizes from three examples to “many” or most women in earlier times. In premise-by-premise form:

Juliet in Shakespeare’s play was not even fourteen years old.
Jewish women during the Middle Ages were normally married at thirteen.
Many Roman women during the Roman Empire were married at age thirteen or younger.
Therefore, women in earlier times married very young.

The question, then, is: when do premises like these adequately support a generalization?

Even when the examples are accurate, generalizing from them can be tricky. The rules below will help you analyze and evaluate arguments that generalize from examples.

Rule 7: Use more than one example

A single example offers next to no support for a generalization. In a generalization about a small set of things, the strongest argument should consider all, or at least many, of the examples.

Generalizations about larger sets of things require picking out a sample. How many examples are required depends partly on how representative they are (explored in more depth below). It also depends partly on the size of the set being generalized about. Large sets usually require more examples.

When thinking about generalizations, it’s helpful to ask yourself two questions: First, what type of thing is the generalization about? Second, what does the generalization say about the things of that type?

Rule 8: Use representative examples

Even a large number of examples may still misrepresent the set being generalized about. A large number of ancient Roman women, for instance, might establish very little about women generally, since ancient Roman women are not necessarily representative of other women. The argument needs to consider women from other early times and from other parts of the world as well.

Everyone in my neighborhood favors McGraw for president. Therefore, McGraw is sure to win.

This argument is weak because single neighborhoods seldom represent the voting population as a whole. A well-to-do neighborhood may favor a candidate who is unpopular with everyone else. Student wards in university towns regularly are carried by candidates who do poorly elsewhere. Besides, we seldom have good evidence even about neighborhood views. The set of people eager to display their political preferences to the world is probably not a representative cross-section of the neighborhood as a whole.

In general, look for the most accurate cross-section you can find of the population being generalized about. For instance, if you want to know what people in other countries think about the United States, don’t just ask tourists — for of course they are the ones who chose to come here! A careful look at a range of foreign media will give you a more representative picture.

Many generalizations are about diverse groups. Consider, for instance, an opinion poll showing that Europeans disapprove of capital punishment. Europeans are a diverse group of people. To find representative examples, then, we need to find or assemble a group of people that is, on the whole, representative of all Europeans. That is, we need to select examples so that our group has roughly the same (distribution of) characteristics as the group of all Europeans — the same mix of ages and the same proportion of men to women, of college-educated people to non-college-educated people, of native-born people to immigrants, of wealthy people to poor people, and so on.

A sample that fails to represent the group accurately is known as a biased sample. The best way to ensure that our sample is unbiased is a random sample — a sample in which every member of the group has an equal chance of being included in the sample.

In order to construct a random sample, consider the following:

First, ensure that you are sampling from the entire group about which you’re making a generalization.

Furthermore, Rule 8 requires choosing your examples in ways that ensure a truly proportionate sample. If you contact students at random from the college’s email directory, you’ll miss students who don’t use their college email address. If you approach students who are on campus during the day, you’ll miss students who only take evening classes. When you design your methods for choosing and contacting members of the group, think carefully about whether your methods overlook, or under- or over-represent, any part of the group. Try to ensure that each member of the group has an equal chance of being in the sample.

Second, (generally) do not let individual members of the group decide for themselves whether they want to be in the sample.
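The equal-chance requirement can be sketched in a few lines of Python. This is only an illustration: the roster of 5,000 students, the sample size of 200, and the function name `draw_random_sample` are all made up for the example. The point is that the sample is drawn from a list of the whole group, and that chance, not the members themselves, decides who is included.

```python
import random


def draw_random_sample(population, sample_size, seed=None):
    """Draw a simple random sample without replacement.

    Every member of `population` has the same chance of being picked.
    """
    rng = random.Random(seed)
    return rng.sample(list(population), sample_size)


# Hypothetical roster of the ENTIRE student body, not just the students
# who check their college email or who are on campus during the day.
students = [f"student_{i}" for i in range(1, 5001)]

# The seed is fixed only so that the example is repeatable.
sample = draw_random_sample(students, 200, seed=42)
print(len(sample), "students selected")
```

Note that the code can only be as good as the roster it draws from. If the list itself leaves out part of the group, the sample will be biased no matter how random the draw is.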

Rule 9: Background rates may be crucial

To persuade you that I am a professional archer, it is obviously insufficient to show you a bull’s-eye I have made. We need to know how many times I missed. Getting a bull’s-eye in one shot tells quite a different story than getting a bull’s-eye in, say, a thousand.

Horoscopes are another common example. To properly evaluate the “evidence” adduced by believers in astrology, we need to know something else as well: how many horoscopes don’t come true.

To evaluate the reliability of an argument featuring a few vivid examples, then, we need to know the ratio between the number of “hits” (so to speak) and the number of tries. It’s a question of representativeness again. Are the featured examples the only ones there are? Is the rate impressively high or low?

The “Bermuda Triangle” area off Bermuda is famous as a place where many ships and planes have mysteriously disappeared. Avoid it at all costs! There have been several dozen disappearances in the past decade alone.

But several dozen out of how many ships and planes that passed through the area? Several dozen, or several hundred thousand? If only twenty, say, have disappeared out of maybe two hundred thousand, then the disappearance rate in the Bermuda Triangle may well be normal, or even unusually low — certainly not mysterious.
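To make the arithmetic concrete, here is a minimal sketch using the illustrative figures from the paragraph above (twenty disappearances out of two hundred thousand crossings). Those figures are a "say" in the text, not real data, and the function name `occurrence_rate` is invented for this example.

```python
def occurrence_rate(hits, tries):
    """Return the share of tries that were hits."""
    if tries <= 0:
        raise ValueError("tries must be positive")
    return hits / tries


# Illustrative figures from the text, not real data.
disappearances = 20
crossings = 200_000

rate = occurrence_rate(disappearances, crossings)
print(f"{rate:.2%} of crossings, or 1 in {crossings // disappearances:,}")
```

With these numbers, one crossing in ten thousand ends in a disappearance. Whether that counts as high or low still depends on the rate for comparable stretches of ocean, which the argument never supplies.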

The takeaway is that non-events or non-examples are actually just as important as examples in evaluating the generalization we might make from them. That is what the occurrence rate tells you: how significant the examples really are against the relevant background.

When dealing with an argument for a generalization, think about which background rates are relevant for deciding how well the particular examples or statistics in the argument support its conclusion. In the Bermuda Triangle example above, for instance, the relevant rate is the rate at which planes and ships disappear in the Bermuda Triangle (and, for comparison, the rate at which they disappear in any randomly chosen area of ocean the same size as the Bermuda Triangle).

Once the rate has been specified, ask yourself what further information you would need to calculate that rate. In the Bermuda Triangle example, you would need to know how many planes and ships pass through that area.

Sometimes background rates matter in a more subtle way. Consider this little puzzle:

Tanya is a talented card player with the most impassive poker face you’ve ever seen. Which is she more likely to be: a high school teacher or a professional poker player?

Tanya sounds a great deal like a professional poker player, and since this doesn’t appear to be an argument by generalization, you might not think to consider background rates. If you do consider background rates, though, you’ll realize that there are a very large number of high school teachers — many of whom could be excellent poker players — and almost no professional poker players at all. Thus, regardless of Tanya’s poker skills, she’s much more likely to be a high school teacher. The lesson here is to think about background rates even when the argument does not obviously invoke any generalizations.
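The Tanya puzzle can be worked through with numbers. Everything in the sketch below is an assumption made up for illustration: the head counts for each profession, the share of each group that fits Tanya's description, and the function name `share_of_matches`. Only the reasoning pattern comes from the puzzle.

```python
def share_of_matches(group_sizes, match_rates):
    """For each group, return its share of all people who fit the description."""
    matching = {g: group_sizes[g] * match_rates[g] for g in group_sizes}
    total = sum(matching.values())
    return {g: count / total for g, count in matching.items()}


# Invented background rates: teachers vastly outnumber poker professionals.
group_sizes = {
    "high school teacher": 1_000_000,
    "professional poker player": 1_000,
}

# Invented guesses at how often each group fits the description
# (talented card player with an impassive poker face).
match_rates = {
    "high school teacher": 0.01,
    "professional poker player": 0.90,
}

shares = share_of_matches(group_sizes, match_rates)
for group, share in shares.items():
    print(f"{group}: {share:.1%}")
```

Under these assumptions, about 10,000 teachers fit the description against about 900 professionals, so roughly nine out of ten people who sound like Tanya are teachers. The description fits poker professionals far better, but there are so few of them that the background rate wins.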

Sample

In a recent experiment, some students used a studying technique called “retrieval practice essays.” After reading a passage, the students wrote down what they remembered from it, without the passage in front of them. A week later, these students answered about two out of three questions about the passage correctly. Therefore, writing retrieval practice essays is a good way to study.

We need to know, however, how well students did if they studied using different techniques — or even if they didn’t study at all. That is, we need to know the proportion of questions that students got right if they used other forms of studying besides the “retrieval practice essay.”

You might think that this argument gives the only background rate that you need. However, in claiming that retrieval practice essays are a good way to study, the argument is implicitly comparing retrieval practice essays to other forms of studying. So, we need to be able to compare the rates for the various alternatives, including the rate for students who don’t study at all.

Sample

In the second half of 2010, the University of Western Ontario did not have a single car stolen on campus. The campus police must be doing an outstanding job protecting the university.

Look at the number of cars stolen further back in the past, even as far back as the early 1900s. If the number has always been zero, then there may be nothing particularly extraordinary about the current campus police.
Look at other types of crime, such as murder, rape, other kinds of theft, and more. The police may be good at protecting the university from car theft, but it does not follow that they are doing a good job of protecting the campus in other respects.
Look at the surrounding city, town, or province. Maybe it is an extremely stable and safe area with low crime and little car theft all around. In that case, there is likely nothing special about the campus police.
Look at how many cars there actually are on campus. Maybe 98% of the students and faculty bike, walk, or take the bus to classes. In that case, there will be very few cars there in the first place, so of course the number of stolen cars will be very small.

First, we need to know how many cars there are on campus: that is, we need to know (or readily be able to calculate) the theft rate on campus. Second, we would need to know what the rate of car thefts is in the surrounding area. If there are few cars on campus or a very low theft rate in the surrounding area, then it’s not as big an accomplishment if there have been no car thefts on campus. We could get even more precise here by figuring out how we’re going to count the number of cars on campus. Are we looking at the number of cars on campus on, say, an average weekday morning at 10 a.m.? The average number of cars parked on campus overnight? The total number of cars parked on campus at any point during the month? All of these suggestions are ways of refining the basic idea that we need to know how many cars there are on campus.

Sample

Exactly zero children died from measles in the United States between 2004 and 2015. But in that same time period, 106 infants died after exhibiting reactions to measles vaccines. Obviously, the real danger here is not measles; it’s the measles vaccine.

To determine which is more dangerous — measles or the measles vaccine — we’d need to know what fraction of people who get measles died (i.e., the death rate from measles) and what fraction of people who get a measles vaccine died (i.e., the death rate from the vaccine). Far more people get the measles vaccine than get measles. So, even if the death rate from the measles vaccine were much lower than the death rate from measles, the total number of deaths from the vaccine would probably be higher. Hence, a higher total number of deaths from the vaccine does not by itself constitute evidence that the real danger is the measles vaccine.
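A small calculation shows how a lower death rate can still produce a higher death count. The numbers below are invented purely to show the arithmetic; they are not real figures for measles, for any vaccine, or for anything else. The function name `expected_deaths` is also made up for this sketch.

```python
def expected_deaths(people_exposed, deaths_per_million):
    """Expected number of deaths for a given exposure and death rate."""
    return people_exposed * deaths_per_million / 1_000_000


# Invented numbers, chosen only to illustrate rate versus total.
# Rare exposure with a comparatively high death rate (1 in 1,000).
rare_exposure = expected_deaths(people_exposed=1_000, deaths_per_million=1_000)

# Common exposure with a death rate 1,000 times lower (1 in 1,000,000).
common_exposure = expected_deaths(people_exposed=10_000_000, deaths_per_million=1)

print("Rare exposure, high rate:", rare_exposure)
print("Common exposure, low rate:", common_exposure)
```

In this made-up case the common exposure is a thousand times safer per person, yet it accounts for ten times as many deaths in total, simply because so many more people are exposed to it. Comparing totals says nothing about which one is more dangerous to any individual.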

Sample

Jennifer’s financial troubles began when she lost her job. After ordering supplies online to perform some hoodoo rituals, her financial situation has turned around. Tammie and Angela have similar stories: When they fell on hard times, they turned to hoodoo rituals and found their financial problems disappearing. Therefore, people in financial trouble who perform hoodoo rituals are likely to recover from their financial problems.

We would need to know several things to justify this conclusion. First, we would need to know how many other financially troubled people who turned to hoodoo saw an improvement in their financial situation. That is, we need a rate, and not merely a few, probably unrepresentative examples. After all, those whose hoodoo was followed by financial success are precisely the ones most likely to spread the news about it, whereas those whose hoodoo was not followed by financial success are less likely to mention it.

Second, we would need to know the rate at which people who did not turn to hoodoo also saw an improvement in their financial situation. Only by comparing the two rates can we decide whether people who turn to hoodoo are more likely to recover from financial troubles. We’re implicitly comparing people who turned to hoodoo to people who did not turn to hoodoo. Thus, if we’re going to say that hoodoo is connected to financial recovery, we need to know the rate of financial improvements for both groups (and, plausibly, financial downfalls for both groups).

Moreover, extreme or unusual results tend to be followed by results closer to the average. This is called regression toward the mean. So, it was already likely that after severe, unusual, or extraordinary financial difficulties, such individuals would drift back toward their usual situation and hence do better financially, with or without hoodoo.

Rule 10: Statistics need a critical eye

After an era when some athletic powerhouse universities were accused of exploiting student athletes, leaving them to flunk out once their eligibility expired, college athletes are now graduating at higher rates. Many schools are now graduating more than 50 percent of their athletes.

First, though “many” schools graduate more than 50 percent of their athletes, it appears that some do not — so this figure may well exclude the most exploitative schools that really concerned people in the first place.

The argument does offer graduation rates. However, it would be useful to know how a “more than 50 percent” graduation rate compares with the graduation rate for all students at the same institutions. If it is significantly lower, athletes may still be getting the shaft.

Most importantly, this argument offers no reason to believe that college athletes’ graduation rates are actually improving, because no comparison to any previous rate is offered. The conclusion claims that the graduation rate is now “higher,” but without knowing the previous rates this is a wholly unjustified assertion.

Another statistical error is overprecision, particularly when it concerns matters where precision is physically or practically impossible, such as someone claiming to know the exact number of water molecules in the Arctic Ocean.

Additionally, be wary of numbers that are easily manipulated or statistics that are based on guesswork or extrapolation (such as data about semi-legal or illegal activities). Such activities, by their very nature, give people an incentive not to report the truth.

When analyzing and evaluating statistical arguments, ask the following questions:

What exactly are these statistics saying?
Are these statistics believable?
Do these statistics really show what the argument claims they show?
Are rates or percentages offered without relevant background information?
Are the statistics over-precise?
Are the extrapolations (if there are extrapolations) legitimate?
If the argument is trying to make conclusions based on the rate of Y among a population of X’s, compare this information to:
- The rate of non-Y in the population of X’s
- The rate of Y in the population of non-X’s
- The rate of non-Y in the population of non-X’s
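The comparison in that last question amounts to filling in a two-by-two table and comparing rates across its rows. Here is a minimal sketch; the counts are invented for illustration and the function name `rate` is an assumption of this example.

```python
def rate(with_outcome, without_outcome):
    """Share of a group that has the outcome."""
    total = with_outcome + without_outcome
    if total == 0:
        raise ValueError("group is empty")
    return with_outcome / total


# Invented counts for the four cells of the table.
x_and_y, x_and_not_y = 30, 70              # X's with and without Y
not_x_and_y, not_x_and_not_y = 300, 700    # non-X's with and without Y

rate_y_among_x = rate(x_and_y, x_and_not_y)
rate_y_among_non_x = rate(not_x_and_y, not_x_and_not_y)

print(f"Rate of Y among X's:     {rate_y_among_x:.0%}")
print(f"Rate of Y among non-X's: {rate_y_among_non_x:.0%}")
```

With these counts, Y turns up in 30 percent of X's and in 30 percent of non-X's. An argument that quoted only the first figure would make Y sound tied to being an X, when being an X makes no difference at all.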

A further question to ask is: how would (or could) someone have learned that particular statistic? And how reliable is that method? Suppose you are told that 90 percent of people wash their hands upon leaving the bathroom. How would anyone know that? Most likely, a pollster asked people whether they washed their hands the last time they left the bathroom (or asked whether they generally do so). People sometimes lie (or, let’s just say, shade the truth) for pollsters, especially when they are embarrassed about the true answer, don’t quite want to face it themselves, or otherwise fear that the true answer is not “socially desirable.” So, that figure of 90 percent probably overestimates the percentage of people who wash their hands upon leaving the bathroom.

Also be wary of organizations with vested interests, biased samples, loaded questions, discarded or discounted data or trials, or any other version of data manipulation or misrepresentation.

Sample

According to U.S. News & World Report’s compilation of statistics provided by law schools, 93 percent of law school graduates have a job nine months after finishing law school. That’s up nearly ten percentage points from 1997, when law schools reported an average employment rate of 84 percent. The employment picture for law school graduates is better than ever!

This argument cites two “employment rates” for recent law school graduates to show that the employment picture for law school graduates is “better than ever.” There are several reasons to be skeptical of this argument. First of all, it’s worth noting that these statistics come from the law schools themselves, who have an incentive to inflate employment rates. Second, the argument doesn’t specify that 93 percent of graduates are employed as lawyers, which is what we really want to know about. It could be that only a small portion are employed as lawyers and the vast majority have minimum wage or other non-lawyer jobs. Third, the argument claims that the employment picture is “better than ever,” but it offers only one point of comparison: 1997. It could be that 1997 was a particularly bad year for law school graduates. We need more background information to evaluate the relevance of that statistic.

Rule 11: Consider counterexamples

Counterexamples are examples that contradict a generalization. Seek them out on purpose and systematically.

In light of one or more counterexamples to a generalization, you may need to weaken your conclusion from, say, “all X’s” to “most X’s.” Ask whether the conclusion might have to be revised, limited, or rethought in subtler and more complex directions.

Sample

Of the 3,141 counties in the United States, the 314 counties with the lowest rates of kidney cancer per year are almost all very rural counties. Furthermore, based on data from 2004, the only counties in which no one developed kidney cancer all had populations of less than 100,000. Thus, the counties where people have the lowest risk of kidney cancer are sparsely populated, rural counties.

This is a weak argument, but for subtle reasons. On the one hand, it gives lots of examples — 314 of them, to be exact (Rule 7). And because it’s looking at all 3,141 counties in the United States, it’s not “cherry picking” just the counties that support its conclusion (Rule 8) or ignoring counterexamples (Rule 11).

The problem here has to do with background rates (Rule 9) and the use of statistics (Rule 10). Because of the very low background rate of people developing kidney cancer, the statistics given in the argument don’t support the conclusion that people in rural counties have a lower risk of developing kidney cancer. In densely populated counties, even a low background rate will lead to lots of cases of kidney cancer. But in sparsely populated counties, that same rate would often mean that no one in the county develops kidney cancer in a given year. So, it’s no surprise that those counties have the lowest rates of kidney cancer. But that’s different from saying that people in those counties have a lower risk of kidney cancer.

Furthermore, rural communities tend to have less access to resources including hospitals, in which case many actual cases of kidney cancer may go unreported.

The flaw in this argument illustrates a common mistake based on “small sample sizes” — that is, statistics generated by looking at small groups. When you have a random process at work in a small group, you’re far more likely to get “extreme” results than when you have that same process at work in a large group.

For instance, suppose that you toss four quarters in the air, and your professor tosses a hundred quarters in the air. When the coins land, it would be much more likely that all of your coins landed heads than that all of your professor’s coins did. But this doesn’t mean that your quarters have a greater chance (or greater “risk”) of landing heads.
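The coin example can be checked exactly, and the same arithmetic carries over to the counties. In the sketch below, the per-person rate of 1 in 10,000 and the two county populations are assumptions chosen for illustration, not figures from the argument, and both function names are invented.

```python
from fractions import Fraction


def prob_all_heads(num_coins):
    """Exact probability that every one of `num_coins` fair coins lands heads."""
    return Fraction(1, 2) ** num_coins


def prob_no_cases(population, per_person_rate):
    """Probability that nobody in a county develops the disease in a year,
    assuming each person independently faces the same per-person rate."""
    return (1 - per_person_rate) ** population


print("4 coins, all heads:  ", prob_all_heads(4))
print("100 coins, all heads:", float(prob_all_heads(100)))

# Assumed figures: the same risk for every person, in two counties of
# very different size.
assumed_rate = 1 / 10_000
print("Small county, zero cases:", prob_no_cases(1_000, assumed_rate))
print("Large county, zero cases:", prob_no_cases(1_000_000, assumed_rate))
```

Four coins all land heads one time in sixteen, while a hundred coins essentially never do, even though every coin is equally fair. Likewise, under the assumed rate the small county will usually report zero cases and the large county almost never will, even though each resident faces exactly the same risk.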

Sample

A power company in the state of Georgia is trying to build a new nuclear power plant. They plan to use a safer, more efficient nuclear reactor design from Westinghouse, called the AP1000. China started building a power plant with an AP1000 reactor last year, and the construction costs there have skyrocketed. So far, they’re already more than three times higher than expected. Therefore, building power plants with AP1000 reactors will usually lead to cost overruns.

The problem with this argument is that it gives only one example, and that example is likely unrepresentative (given that business operations in China differ from those in the United States). It also gives no relevant background rates for other companies and their successes or failures with the AP1000.

In the next post we will explore analogies and the critical thinking required in their analysis and evaluation.

Author: Joe
