Reading Hints for Carnap's "On Inductive Logic"

Kevin T. Kelly

Department of Philosophy

Carnegie Mellon University


A predicate describes a property:  e.g., "is black".  Predicates are symbolized by P, Q, R, ....

An individual constant names an individual: e.g., "Sam".  Individual constants are symbolized by a, b, c, ....

A state description asserts or denies each  predicate of each individual constant.  In other words, it says as much as can be said in a language including a given collection of predicates and individuals.  For example, the predicates P, Q and individuals a, b, give rise to state descriptions of the form:

P(a) & -P(b) & Q(a) & -Q(b),

where the "not" signs may be distributed however you please.  Think of a state description as specifying everything the language can say about some possible world.  There may be other facts, but they are inexpressible.
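Since a state description is just an assignment of "true" or "false" to every atomic sentence, it is easy to enumerate them mechanically.  Here is a rough Python sketch (not from Carnap, just an illustration) for the two predicates and two individuals above:

    from itertools import product

    predicates = ["P", "Q"]
    individuals = ["a", "b"]
    atoms = [(p, i) for p in predicates for i in individuals]   # P(a), P(b), Q(a), Q(b)

    # A state description assigns True or False to every atomic sentence.
    state_descriptions = [dict(zip(atoms, values))
                          for values in product([True, False], repeat=len(atoms))]

    print(len(state_descriptions))   # 16 = 2**4 state descriptions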

The range of a sentence is the set of all state descriptions that entail it (make it true).  Thus, the range of sentence

(for all x, P(x))

in a language containing just P, Q, and individuals a, b, would be the set:

{P(a) & P(b) & Q(a) & Q(b),
P(a) & P(b) & Q(a) & -Q(b),
P(a) & P(b) & -Q(a) & Q(b),
P(a) & P(b) & -Q(a) & -Q(b)}.

The smaller the range, the more possibilities the sentence excludes.  A sentence whose range is empty is logically false, and a sentence whose range includes all state descriptions is logically true.  One sentence entails another if the range of the first is a subset of  the range of the second.  Carnap learned this from Wittgenstein.
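To make the subset test concrete, here is a small Python sketch (continuing the toy language of P, Q, a, b) that represents a sentence by its truth condition on state descriptions and checks entailment as range inclusion:

    from itertools import product

    predicates = ["P", "Q"]
    individuals = ["a", "b"]
    atoms = [(p, i) for p in predicates for i in individuals]
    state_descriptions = [dict(zip(atoms, vals))
                          for vals in product([True, False], repeat=len(atoms))]

    def rng(sentence):
        # Range: the set of state descriptions (here, their indices) in which the sentence holds.
        return {k for k, sd in enumerate(state_descriptions) if sentence(sd)}

    all_P = lambda sd: all(sd[("P", i)] for i in individuals)   # (for all x, P(x))
    P_a   = lambda sd: sd[("P", "a")]                           # P(a)

    print(len(rng(all_P)))           # 4, matching the list above
    print(rng(all_P) <= rng(P_a))    # True: (for all x, P(x)) entails P(a)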

A measure function m assigns a non-negative number to each state description so that the numbers all add up to 1.  To find the measure value of a sentence, add up the measure values of all the state descriptions in the range of the sentence.

If m is a measure function that assigns a positive value to every state description, then a regular c-function is defined from m according to the definition of conditional probability:

c(h, e) = m(h & e)/m(e).
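Here is a short Python sketch of how these two definitions fit together.  The flat weighting used below is only a placeholder for m (it is not Carnap's m*); any positive weights summing to 1 would do:

    from itertools import product
    from fractions import Fraction

    predicates = ["P", "Q"]
    individuals = ["a", "b"]
    atoms = [(p, i) for p in predicates for i in individuals]
    state_descriptions = [dict(zip(atoms, vals))
                          for vals in product([True, False], repeat=len(atoms))]

    # Placeholder measure function: equal weight to every state description.
    m_weights = [Fraction(1, len(state_descriptions))] * len(state_descriptions)

    def measure(sentence):
        # m-value of a sentence: sum the weights of the state descriptions in its range.
        return sum(m_weights[k] for k, sd in enumerate(state_descriptions) if sentence(sd))

    def c(h, e):
        # Degree of confirmation of h on evidence e, as a conditional probability.
        return measure(lambda sd: h(sd) and e(sd)) / measure(e)

    P_a = lambda sd: sd[("P", "a")]
    Q_a = lambda sd: sd[("Q", "a")]
    print(measure(P_a))    # 1/2 under the flat weighting
    print(c(P_a, Q_a))     # 1/2: under the flat weighting, the evidence is irrelevant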

Two state descriptions are isomorphic or "structurally the same" just in case you can get from one to the other by uniformly substituting individual constants for one another, one-to-one (that is, by permuting the individual constants).  For example,

P(a) & -P(b) & Q(a) & -Q(b) is isomorphic to
P(b) & -P(a) & Q(b) & -Q(a).

An isomorphism class of state descriptions consists of the set of all state descriptions isomorphic to a given state description.
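One way to check isomorphism mechanically is just to try every permutation of the individual constants.  A Python sketch, using the example above:

    from itertools import permutations

    predicates = ["P", "Q"]
    individuals = ["a", "b"]

    def isomorphic(sd1, sd2):
        # Isomorphic: some permutation of the individual constants carries sd1 onto sd2.
        for perm in permutations(individuals):
            mapping = dict(zip(individuals, perm))
            if all(sd1[(p, i)] == sd2[(p, mapping[i])]
                   for p in predicates for i in individuals):
                return True
        return False

    sd1 = {("P", "a"): True,  ("P", "b"): False, ("Q", "a"): True,  ("Q", "b"): False}
    sd2 = {("P", "a"): False, ("P", "b"): True,  ("Q", "a"): False, ("Q", "b"): True}
    print(isomorphic(sd1, sd2))   # True: swapping a and b does it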

A c-function is symmetrical if its measure function assigns the same values to isomorphic state descriptions.
Carnap's favorite c-function, called c*, is defined in terms of the measure function m* that assigns:

  1. equal weight to each isomorphism class and
  2. equal values to the state descriptions within an isomorphism class.

This is what Carnap takes to be the "true logical remnant" of the classical principle of indifference relative to a given language.  The idea is basically "method II" illustrated in the figure on p.285.
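Here is a Python sketch of m* for the toy two-predicate language.  The structure() function exploits the fact that two state descriptions are isomorphic exactly when the same pattern of predicates is realized, regardless of which individuals realize it:

    from itertools import product
    from fractions import Fraction

    predicates = ["P", "Q"]
    individuals = ["a", "b"]
    atoms = [(p, i) for p in predicates for i in individuals]
    state_descriptions = [dict(zip(atoms, vals))
                          for vals in product([True, False], repeat=len(atoms))]

    def structure(sd):
        # Forget which individual has which combination of predicates; keep only
        # the multiset of combinations.  This labels the isomorphism class.
        return tuple(sorted(tuple(sd[(p, i)] for p in predicates) for i in individuals))

    class_of = [structure(sd) for sd in state_descriptions]
    n_classes = len(set(class_of))
    class_size = {c: class_of.count(c) for c in set(class_of)}

    # m*: split 1 evenly over the isomorphism classes, then evenly within each class.
    m_star = [Fraction(1, n_classes) / class_size[c] for c in class_of]

    print(n_classes)      # 10 isomorphism classes for two predicates and two individuals
    print(sum(m_star))    # 1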

A maximal property is a conjunction in which every predicate of the language appears exactly once, either asserted or denied.  In our two-predicate language, one maximal property is P(x) & -Q(x).  Carnap's beautiful terminology is to call maximal properties "Q-properties"!

Fact:  Every property can be rewritten as a disjunction of maximal properties.

The width of a property is the number of maximal properties required to write it in the preceding form.  Thus the width of P(x) & -Q(x) is one, since it consists of just one maximal property.  But the width of P(x) or -Q(x) is three, since it gets rewritten as:

(P(x) & -Q(x)) or (-P(x) & -Q(x)) or (P(x) & Q(x)).
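Widths in the toy language can also be counted mechanically.  A Python sketch, where a property is represented by its truth condition on the predicate values of a single individual:

    from itertools import product

    predicates = ["P", "Q"]

    # The maximal properties: one conjunction for each way of asserting or denying every predicate.
    maximal_properties = list(product([True, False], repeat=len(predicates)))

    def width(prop):
        # Width: the number of maximal properties that entail the property,
        # i.e. the number of disjuncts in its expansion into maximal properties.
        return sum(1 for q in maximal_properties if prop(dict(zip(predicates, q))))

    print(width(lambda v: v["P"] and not v["Q"]))   # 1
    print(width(lambda v: v["P"] or not v["Q"]))    # 3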

That's about it for the first eight sections of the paper.  Pick up at section 9.


More on Carnap's Paper

Let's consider the toy example of a single unary predicate P and three individuals a, b, c.

Now we can think of P as partitioning a, b, c into two classes, P and -P.  I will put P individuals to the left of a slash and -P individuals to the right.  There are four isomorphism classes (three, two, one, or zero individuals having P), so m* gives each class 1/4 and splits that weight evenly within the class.  Then m* weights state descriptions as follows:

  abc /      1/4
  ab / c     1/12
  ac / b     1/12
  bc / a     1/12
  a / bc     1/12
  b / ac     1/12
  c / ab     1/12
  / abc      1/4

Now to find c*(P(a) | no information), just add up the probabilities of all the state descriptions in which a is to the left of the slash: 1/4 + 1/12 + 1/12 + 1/12 = 1/2.

To see how we can learn from experience, suppose we learn that P(b).  Now we add up the probabilities of the rows in which a and b are both on the left and divide by the total probability of the rows in which b is on the left: c*(P(a) | P(b)) = (1/4 + 1/12) / (1/2) = 2/3.

Now suppose we learn, in addition, that P(c).  Then repeating the procedure we get c*(P(a) | P(b) & P(c)) = (1/4) / (1/4 + 1/12) = 3/4.  So the probability of P(a) climbs from 1/2 to 2/3 to 3/4 as the positive instances come in.
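The whole calculation can be checked mechanically.  Here is a rough Python sketch of m* and c* for the one-predicate, three-individual language:

    from itertools import product
    from fractions import Fraction

    individuals = ["a", "b", "c"]
    # With one predicate P, a state description just says, for each individual, whether P holds of it.
    state_descriptions = [dict(zip(individuals, vals))
                          for vals in product([True, False], repeat=len(individuals))]

    def structure(sd):
        # With a single predicate, the isomorphism class is fixed by how many individuals have P.
        return sum(sd.values())

    classes = [structure(sd) for sd in state_descriptions]
    n_classes = len(set(classes))                                  # 4: zero, one, two, or three P's
    class_size = {c: classes.count(c) for c in set(classes)}
    m_star = [Fraction(1, n_classes) / class_size[c] for c in classes]

    def m(sentence):
        return sum(m_star[k] for k, sd in enumerate(state_descriptions) if sentence(sd))

    def c_star(h, e=lambda sd: True):
        return m(lambda sd: h(sd) and e(sd)) / m(e)

    P = lambda name: (lambda sd: sd[name])
    print(c_star(P("a")))                                   # 1/2
    print(c_star(P("a"), P("b")))                           # 2/3
    print(c_star(P("a"), lambda sd: sd["b"] and sd["c"]))   # 3/4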

So what justifies this? There are lots of ways to put numbers down.

As Carnap observes in the paper, if we weight state descriptions evenly, the confirmation value of P(a) never changes, no matter what we see. All data become irrelevant.
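For instance, with each of the eight state descriptions weighted 1/8 (writing c for the resulting confirmation function):

c(P(a) | no information) = 4/8 = 1/2,
c(P(a) | P(b)) = (2/8) / (4/8) = 1/2,
c(P(a) | P(b) & P(c)) = (1/8) / (2/8) = 1/2.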

By putting proportionally less mass on the uniform worlds, we can get anti-learning from experience!

Suppose, for instance, that each of the two uniform state descriptions (abc / and / abc) gets weight 1/20 and each of the six mixed ones gets weight 3/20.  Now we have c(P(a) | no information) = 1/20 + 3/20 + 3/20 + 3/20 = 1/2, just as before.

But after seeing two positive instances of P, our expectation goes down by half: c(P(a) | P(b) & P(c)) = (1/20) / (1/20 + 3/20) = 1/4.

Carnap justifies induction by making uniform worlds more probable. But aside from obtaining the result that we expect similarity, what justifies that kind of choice of weighting? What makes it logical and the counterinductive weighting non-logical?