In seminar yesterday we were discussing the following argument, which purports to be an a priori argument that if most Xs are Ys, then all Xs are Ys. (This is a slightly simplified version of the argument in Induction and Supposition, but I think the simplifications are irrelevant to what I’m saying here.)

1. Assume most Xs are Ys, for conditional proof.
2. Assume a is an X.
3. Then a is a Y. (By statistical syllogism.)
4. So if a is an X, then a is a Y. (By conditional proof, discharging assumption 2.)
5. So for all x, if x is an X, then x is a Y. I.e., all Xs are Ys. (By universal introduction, since ‘a’ was arbitrary.)
6. So if most Xs are Ys, then all Xs are Ys. (By conditional proof, discharging assumption 1.)

The conclusion is absurd, so the issue is which is the mistaken step. My conclusion is that the mistake is to apply ampliative inference rules, like statistical syllogism, inside the scope of a supposition. Indeed, I think the core mistake is to think that we can formalise inference rules as being things that can slot into natural deduction proofs. Proofs are things that tell you about implication, and inference rules are things that tell you about good inference, and implication is not, after all, inference.

But the conclusion of the last paragraph would be better supported if I could claim there is nothing else wrong with the proof, save for the use of an inference rule at a point in the proof where only a rule of implication is permitted. And that was being disputed.

We know that statistical syllogism has defeaters. It isn’t good to infer that a is Y from Most Xs are Ys and a is X, if you have strong independent evidence that a is not Y. I wanted to reason as follows. The inference from Most Xs are Ys and a is X to a is Y goes through in the absence of any reason to think that a is especially likely to be not Y. You don’t need to have a positive reason to think that a is a ‘normal’ X (with respect to Y-hood). You just need an absence of reason to think it is abnormal. And of course you have an absence of such a reason. We’re doing this all a priori, and we don’t know anything about a. So the conditions for using statistical syllogism in **inference** are met.

The reply that my students came up with was two-fold. (I think the reply was primarily due to Una Stojnic, Lisa Miracchi and Tom Donaldson, though there was a fairly wide ranging discussion.) First, if ‘a’ is a dummy name, or as it were the name of an arbitrary object, then we can’t really say that this condition is satisfied. We know that it’s not true that the arbitrary object is not normal. After all, some Xs are not Ys. Or, at least, we have no reason to think they all are. So we must be treating ‘a’ as the name of a real object, not a ‘dummy name’, or the name of an ‘arbitrary object’. But there’s an issue about which kinds of objects we can even refer to in a priori reasoning. Perhaps the only objects we can refer to a priori are abstract mathematical objects (like the null set, or the number 2). And the problem then is that we may well have reason to defeat the statistical inference from 1 and 2 to 3, since a priori we may know that a is a special case. For instance, the following reasoning is bad a priori.

- Assume most primes are odd.
- Assume two is prime.
- So two is odd. (By statistical syllogism.)
- So if two is prime, two is odd.
- Since two is arbitrary, all primes are odd.
- So if most primes are odd, all primes are odd.

That’s bad reasoning because (perhaps inter alia) it’s a bad use of statistical syllogism. And it’s a bad use of statistical syllogism because even a priori we have reason to think that two is an ‘abnormal’ prime with respect to parity.

So there’s a dilemma for the reasoning I was using. If ‘a’ is a genuinely referring expression, then it isn’t clear that the preconditions for statistical syllogism are satisfied, because the only things it could refer to in a priori reasoning are things that we have a priori knowledge about. But if ‘a’ isn’t a referring expression, then it seems surely true that the step from 1 and 2 to 3 fails. Either way, we have reason to think the argument to 3 is bad, and that reason is independent of my general view that you can’t use ampliative inference rules (if such things exist) in suppositional reasoning.

I’m not sure I agree with the objection to using dummy names in probabilistic reasoning. I suspect there’s a different lesson to be drawn from this line of thought: “First, if ‘a’ is a dummy name, or as it were the name of an arbitrary object, then we can’t really say that this condition is satisfied. We know that it’s not true that the arbitrary object is not normal. After all, some Xs are not Ys. Or, at least, we have no reason to think they all are.” That lesson is: if you’re going to use statements like “most Xs are Ys” in your reasoning, you need to be very careful about applying the rule of universal generalization to formulas with arbitrary names in them. That’s what the thought about “some Xs are not Ys” helps teach us. But other sorts of reasoning will be fine.

For example, changing the probabilistic premise slightly, to “Xs tend strongly to be Ys”, one can suppose there is at least one X, a; infer that a is a Y; apply EG to get that there is at least one Y; and then discharge the assumption to get “if there is at least one X, then there is at least one Y.” Nothing looks at all fishy about that line of reasoning. The important thing about arbitrary names is what you plan to do with them later: if you want to apply UG, then you had better not apply any information to formulas with that dummy name that doesn’t hold universally. If you’re using probabilistic information, then you can still legitimately use probabilistic forms of generalization. (Consider the chain of reasoning that will get you from the premise “Most Xs are Ys”, via a path that includes suppositional arguments with ampliative components, to “if the Zs are the Xs, then most Zs are Ys”.) It’s not really about the use of dummy names per se.

It seems to me that what is going on here is just a case of the various restrictions on the conditions under which one can use universal generalization. Supposition isn’t the problem; the problem is using universal-introduction on a formula generated in a bad way for that rule of inference. You can’t generally use universal-introduction on a formula introduced by existential-introduction, right? And for the same kinds of reason, you cannot use it on a formula derived via statistical syllogism. That’s the misstep in the original argument: using universal introduction in the penultimate step, on a premise that had been “contaminated” by the earlier use of statistical syllogism.
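One way to make the “contamination” point concrete is with a tiny finite model. The sketch below is only illustrative: the model (ten Xs, nine of them Ys), the reading of “most” as “more than half”, and the helper names are all assumptions introduced here, not part of the original argument. It shows that the premise holds, that statistical syllogism gets the right answer for most individual choices of a, and that the universally generalized conclusion is nonetheless false in the model.

```python
# Hypothetical finite model (an assumption for illustration):
# ten Xs, exactly nine of which are Ys.
xs = set(range(10))
ys = set(range(9))


def most_are(xs, ys, threshold=0.5):
    """True when the share of Xs that are Ys exceeds `threshold`.

    Reading "most" as "more than half" is an assumption.
    """
    return len(xs & ys) / len(xs) > threshold


def all_are(xs, ys):
    """True when every X is a Y."""
    return xs <= ys


premise = most_are(xs, ys)

# Apply statistical syllogism to each X in turn: conclude "a is a Y",
# then record whether that conclusion is actually true in the model.
instance_correct = {a: (a in ys) for a in xs}
share_correct = sum(instance_correct.values()) / len(xs)

conclusion = all_are(xs, ys)

# The material conditional "if most Xs are Ys, then all Xs are Ys".
conditional_holds = (not premise) or conclusion

print("premise (most Xs are Ys):", premise)
print("share of instances where the syllogism is right:", share_correct)
print("generalized conclusion (all Xs are Ys):", conclusion)
print("conditional true in this model:", conditional_holds)
```

The step from each instance to the generalization is where the model and the derivation come apart: the instance-level conclusion is reliable but not universal, so generalizing over the name carries the exceptions along silently.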

The problem with the reductio is that the conclusion is not absurd, or at least, there’s nothing absurd about our being able to reason to this conclusion. It’s important to remember that when you apply defeasible inference rules within the scope of a supposition, you are not doing conditional proof. You can’t prove the conditional by using a defeasible inference rule. Suppositional reasoning with defeasible inference rules generates only a defeasible reason to believe the conditional. So the argument yields only a defeasible reason to believe (6).

Statistical syllogism requires ‘most’ to indicate a very high percentage. Now suppose you know that most crows are black. I claim you thereby have a defeasible reason to believe all crows are black. Your reason would be defeated if you had a rebutting defeater, i.e. a reason to think that some crow is not black, or an undercutting defeater, i.e. a reason to think the sample is biased. But absent your having such a defeater, your knowing that most crows are black constitutes a good reason to believe all crows are black. Isn’t that why we do (presumably rationally) believe all crows are black? So if most Xs are Ys is a defeasible reason to believe all Xs are Ys, then one can reason suppositionally to (6), and thereby acquire a defeasible reason for (6). There’s no absurdity here.

One other thing: One reason to think that it’s okay to apply defeasible inference rules within the scope of a supposition is that we do it all the time. Here’s just one example. Suppose we know that we are looking at Fido the Pit Bull. You ask me whether Pit Bulls tend to be dangerous. I (noticing Fido coming toward us) reply, “I don’t really know, but if most Pit Bulls are dangerous, then we’d better run.” How did I arrive at this conclusion? I used statistical syllogism within the scope of a conditional to infer (defeasibly) that if most Pit Bulls are dangerous, then Fido is dangerous. Then I used some practical reasoning to infer that if most Pit Bulls are dangerous, we’d better run. Surely this is completely unobjectionable reasoning.

I really don’t see why this should be so: “Now suppose you know that most crows are black. I claim you thereby have a defeasible reason to believe all crows are black.” I don’t see why I wouldn’t have at least as good, or even better, reason to believe other contrary claims like “almost (but not quite) all crows are black”. I think in general these sorts of inferences have to be massively informed by oodles & oodles of information about how the sample was selected & measured, and what we know about the general kind of thing the Xs and Ys are. We can often do this, but that’s just because we really do often have lots of information of that sort.

Relatedly, I don’t think this is right: “Isn’t that why we do (presumably rationally) believe all crows are black?” It’s not like we’ve actually observed a majority of crows, after all. We reason here from the uniformity of the sample we’ve observed, and lots of background theory about biological kinds, bird coloration, and so on, and from all *that* to the claim that all crows are black.

I totally agree with the last paragraph of Stew’s comment, though. (And I agree with the general spirit of the first paragraph, except I do think that there should be restrictions on using “all”-introduction as discussed in my earlier comment.)

We often make these kinds of inferences without very much information at all—certainly not on the basis of massive amounts of information. Presumably, prescientific people were rational to think all crows are black, even though they had no background theory (at least, not “lots of” such theory) about biological kinds.

But we can sidestep these issues by thinking about a simple enumerative case. An urn contains 1,000 balls. I have drawn 990 and they are all black. So I know most of the balls are black. I claim I now have a defeasible reason to believe that all the balls are black. If I get evidence that one of the balls is not black (either by reexamining the ones I’ve drawn or by getting some information about the remaining balls) my reason will be defeated. The reason would also be defeated if I had some reason to believe the sample was not representative. But absent these kinds of defeaters, I have a good reason to believe all the balls are black. And if I can acquire such a reason for the color of balls in an urn, why not for the color of crows (assuming we know most of them are black)?

Jonathan claims, “I don’t see why I wouldn’t have at least as good, or even better, reason to believe other contrary claims like ‘almost (but not quite) all crows are black’.” So presumably in my urn case, he would think that I have at least as good, or even better, reason to believe that almost, but not quite, all the balls are black. This would entail that after drawing 990 black balls, I have some reason to believe that one of the remaining 10 is not black. But what would such a reason be based on? Moreover, if I did have a reason to believe not quite all the balls are black, it would trivially be at least as good as, or even better than, my reason to believe they are all black, since my reason for this latter claim would be defeated.

I’m not convinced any of the steps in the reasoning is incorrect. Jonathan, could you explain what you mean by “You can’t generally use universal-introduction on a formula introduced by existential-introduction” and how this is analogous to performing universal generalization on a formula derived via statistical syllogism?

“We often make these kinds of inferences without very much information at all—certainly not on the basis of massive amounts of information. Presumably, prescientific people were rational to think all crows are black, even though they had no background theory (at least, not “lots of” such theory) about biological kinds.” We have to distinguish between having this sort of information, and having it in the sort of ample, explicit, and high-grade form we can get from science. As Scott Atran among others has shown, prescientific peoples actually have rather rich bodies of information about the biological kinds around them. Moreover, our innate biases towards essentialist inferences regarding (whatever we take to be) biological kinds play an important role in facilitating such inferences, though they also lead us, sometimes massively, to over-apply such inferences. Which leads to a second line of response here: how much & when people *make* these inferences isn’t the relevant explanandum here, but rather how much & when people *correctly* make them. If someone (like me) is arguing that the demands on successful ampliative inference are somewhat high, that’s not a problem to the extent that (i) we do often have adequate resources for meeting those demands, and (ii) in the large number of cases where we don’t, we are in fact not doing well, epistemically speaking, in making those inferences.

I think the balls/urn example is not a counter-example to my point, but an illustration of it! The needed kind of “oodles & oodles of information” is not just about the underlying kinds, but also about “how the sample was selected & measured”. Your example won’t work if you were simply informed, without any other information (including any implicatures from discourse, etc.) that at least 990 of the 1,000 balls are black. From that proposition all by itself, nothing whatsoever follows about the other 10 balls. So it’s really important, in your case, that we select those 990 balls in some random way. That’s what can make it pretty unlikely, given the stipulated observations, that any of the remaining balls are nonblack. You don’t actually say anything about random draws but I do take it as implicit in these ball/urn cases. And also, without some such assumption, the intended inference to “all balls are black” will fail.
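The role of the random-draw assumption in the urn case can be made concrete with a short calculation. The sketch below assumes the 990 balls are drawn uniformly at random without replacement (exactly the assumption flagged above as implicit), and the function name is hypothetical. It computes how likely an all-black sample of 990 would be if the urn of 1,000 in fact held k nonblack balls. Note that this is a likelihood, not the probability that all the balls are black; getting the latter would also need a prior over k, which the example does not supply.

```python
from fractions import Fraction
from math import comb


def prob_all_drawn_black(total, drawn, nonblack):
    """Chance that a uniformly random sample of `drawn` balls,
    taken without replacement from `total` balls, contains none
    of the `nonblack` balls (hypergeometric probability)."""
    black = total - nonblack
    if drawn > black:
        # Not enough black balls to fill the sample.
        return Fraction(0)
    return Fraction(comb(black, drawn), comb(total, drawn))


for k in (0, 1, 2, 3):
    p = prob_all_drawn_black(1000, 990, k)
    print(f"{k} nonblack ball(s): P(990 draws all black) = {float(p):.6g}")
```

Under random drawing, even a single nonblack ball would make the observed run of 990 black balls a 1-in-100 event. If instead one is merely told that at least 990 balls are black, no such calculation is available, since nothing fixes how those 990 were picked out.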

For the stuff about existential-introduction and universal-introduction, see e.g.

http://en.wikipedia.org/wiki/Generalization_(logic)

If a name gets introduced as (speaking very metaphorically here) “attached” to any information that is not fully general, then that name cannot be used in a universal generalization. Using an instance of statistical syllogism is to bring in information that applies only to some (perhaps large) subset of the universe. So it gives us a decent but defeasible inference to the token case, but properly rules out applying universal generalization from that token case later.

Hi All,

Two points I’d like to add: (1) Stew’s position (I hope it’s alright if I use first names) is a little more radical than he is indicating. (2) And Jonathan’s position is not quite as in line with logical orthodoxy as he is making it seem.

On point (1):

Stew’s position is that Brian’s argument actually succeeds in generating a defeasible reason to believe its conclusion (contrary to Brian’s characterization of the argument as a reductio). Stew defends his position in part by analogizing the defeasible reason he sees generated by Brian’s argument to the defeasible reason ordinary people might have to believe all crows are black. The point I want to add is that Brian’s line of argument cannot be so innocent, because it doesn’t just generate a reason to believe something as mild as that all crows are black. Brian’s line of argument threatens to lead to the conclusion that statistical inference is logically valid! It requires a bit more dressing up to make it rigorous how Brian’s trick leads to exactly that conclusion, but I think the entailment is somewhat intuitive to recognize once the idea is explicitly pointed out. (For the rigorous dress up, see my Nous paper, “Knowledge of Validity”, pp.8-10 of the version available on my personal website, http://www.sinandogramaci.net). I don’t have a knock-down argument against the claim (if Stew even wanted to make it) that we can acquire a defeasible reason to believe statistical reasoning is valid, but clearly it’s a lot more radical of a claim than that we have a defeasible reason to think all crows are black or all 1000 marbles are black.

On point (2):

Jonathan’s inclusion of a link to Wikipedia gives the casual impression he’s just defending textbook restrictions on universal generalization. But the textbook restriction on generalizing “Fa” is only that there should be no undischarged premises containing “a”, and that the resulting generalization not contain “a”. That restriction has been fully met in Brian’s argument. Jonathan is actually promoting a novel restriction on universal generalization, which I take to be something along the lines of: only generalize when you know that your earlier reasoning was generally truth-preserving. That paper of mine I already mentioned is dedicated to arguing for this position. (Great minds must think alike!) But it’s certainly not an orthodox view. As I explain in the paper, a consequence of the position is that one of the most famous “textbook” proofs, namely the soundness proof(s) in metalogic, is as fallacious as Brian’s reductio!

Jonathan says “Your example won’t work if you were simply informed, without any other information (including any implicatures from discourse, etc.) that at least 990 of the 1,000 balls are black. From that proposition all by itself, nothing whatsoever follows about the other 10 balls.” Let me remind you that we are talking about *defeasible* reasons, so of course nothing follows. The reason has to be undefeated. As near as I can tell, here’s the fundamental difference between me and Jonathan. He thinks that when you reason by statistical induction, you need positive reason to believe that the sample is representative. In effect one of your premises is that the sample is representative. Following Pollock, I think that the composition of the sample by itself provides a defeasible reason for the inference. Issues about whether the sample is fair figure in the reasoning only as (undercutting) defeaters. So I think the fact that a very high percentage of Fs are Gs is, absent some positive reason to think that the sample is not fair (and absent any rebutting defeaters), a good reason to believe that all Fs are Gs. Now we could argue about how high that percentage has to be, but Jonathan’s position entails that no matter how high the percentage of Fs that are Gs, absent positive information about the fairness of the sample of Fs, one does not even have a defeasible reason to believe that all Fs are Gs. So even if I know that 99.99999…% of the Fs are Gs, unless, e.g., I know how that sample was obtained, I don’t have even a defeasible reason to believe all Fs are Gs. That strikes me as implausible.

I took a look at the argument in Sinan’s paper. And he does show that I’m committed to Brian’s initial argument providing a defeasible reason to believe statistical inference is logically valid. While this certainly sounds strange, I see no reason in principle why one couldn’t have such a reason. Part of what makes this sound odd is that in all but the most unusual of cases, a moment’s reflection will immediately provide the reasoner with a defeater. Sinan’s example is somewhat artificial in that he stipulates that the subject is considering for the first time whether statistical inference is valid. And she has no independent information that bears on the matter.
Now suppose this subject, instead of going through the reasoning described by Sinan, is told by her logic teacher that statistical inference is valid. I don’t find it counterintuitive to suppose that she thereby has a defeasible reason to believe that statistical inference is valid. Even then, unless she is an extremely impaired reasoner, she will be able, with a little reflection, to come up with a defeater. So I think that the strangeness of having a defeasible reason to believe that statistical inference is valid, can be explained by noting that in the typical case, the subject will have a defeater of that reason.

So, I’m inclined to think that if Brian’s initial argument fails because of restrictions on UG, it will have to be for reasons independent of Sinan’s argument.

How is the assumption that Most Xs are Ys supposed to be interpreted? Here are two options:

(1) Most, but not all, of the Xs in the world are also Ys. That is, we know that there are N = n + m total Xs, we know that n of the Xs are Ys, we know that m of the Xs are not-Ys, and we know that n > m. (Maybe n >> m.)

(2) All of the Xs we have observed so far are also Ys, and we know that we have already seen more than half of the Xs. That is, we know that there are N = n + m total Xs, we observe that n Xs are Ys, and we know that n > m. (Maybe n >> m.) However, we do not know anything about the m Xs that we have not yet observed.

If the claim that Most Xs are Ys is interpreted like (1), then the inference to All Xs are Ys is indeed absurd. However, if the claim that Most Xs are Ys is interpreted like (2), then the inference to All Xs are Ys might be perfectly reasonable. My sense is that the ordinary language reading of Most Xs are Ys is (1), which fits with Brian’s claim that the conclusion is absurd.

Incidentally, it is not true that all crows are black. Albinism is pretty widespread in nature, and crows are not exempt. A quick Google image search turns up lots of examples of white crows. Here’s one: http://www.jpvpk.gov.my/2008/English/Jul07%2015e.htm

‘Most Xs are Ys’ does not semantically entail that some Xs are not Ys. But in most contexts, asserting that most Xs are Ys will pragmatically implicate that some Xs are not Ys.

Stewart, won’t Brian’s reasoning work just as well if his premiss 1 is “Exactly 90% of Xs are Ys”? From that premiss and the fact that this object is an X, we infer that this object is a Y by statistical syllogism, right? In that case, we already have a defeater for the generalization. We know that it is not true that all Xs are Ys, because exactly 90% of the Xs are Ys.

In the case I just described, Brian’s initial premiss looks like my (1). Your defense of Brian’s reasoning as non-absurd requires (2). But Brian’s argument is going to work just fine with (1). Hence, I think it is important to distinguish the two cases carefully. For one but not the other, the conclusion of Brian’s reasoning is absurd.

I’m also not at all sure how good the reasoning is from “Most Xs are Ys” to “All Xs are Ys.” If we set up the problem like a classical probabilist would, then the inference is very sensitive to how many objects are in the universe of discourse. (What probability replaces “most” also matters, of course.) For example, if we mean by “Most Xs are Ys” that “More than 50% of Xs are Ys,” then in a universe with only 10 objects, the inference that “All Xs are Ys” will only be true about 13% of the time. Even if we mean “More than 80% of Xs are Ys,” the inference will only be good about 67% of the time in a universe with 10 objects. But in any interesting case I’ve ever seen (like your examples on crows or dogs), the number of objects in the universe of discourse is in the thousands (or more), not in the teens.
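The dependence on the size of the universe can be illustrated with a small calculation, though the exact figures depend heavily on the probabilistic model chosen. The comment above does not state the model behind its percentages, so the sketch below does not attempt to reproduce them. It assumes, purely for illustration, a uniform prior over how many of the n Xs are Ys, and the function name is hypothetical. Under that assumption it computes the probability that all Xs are Ys given only that more than a stated fraction of them are.

```python
from fractions import Fraction


def prob_all_given_most(n_objects, threshold):
    """P(all n Xs are Ys | more than `threshold` of them are Ys),
    assuming a uniform prior over the number of Ys (0..n).

    The uniform prior is an illustrative assumption; other priors
    give different figures.
    """
    qualifying = [
        k for k in range(n_objects + 1)
        if Fraction(k, n_objects) > threshold
    ]
    # Each qualifying count is equally likely, and exactly one of
    # them (k == n_objects) makes "all Xs are Ys" true.
    return Fraction(1, len(qualifying))


half = Fraction(1, 2)
four_fifths = Fraction(4, 5)
for n in (10, 100, 1000):
    print(
        n,
        float(prob_all_given_most(n, half)),
        float(prob_all_given_most(n, four_fifths)),
    )
```

On this model the probability shrinks roughly in proportion to 1/n, so the move from “most” to “all” looks weaker, not stronger, as the universe grows into the thousands.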