Predicate Calculus with Identity: Syntax and Proof Theory

*The predicate calculus is more tricky that PC. I strongly urge you to study chapter 3 of Lemmon. None of the proofs or questions below are provided in Lemmon.

1. Introduction

The job of logic is to codify valid inference. We have seen that PC is adequate to this task where validity solely turns on the occurrence of coordinate expressions, such as and, not, etc. Clearly, however, not every valid inference turns on such expressions. That is, there is more to validity than PC mandates.

Consider the following arguments:

(1) Every footballer is blonde.

Pele is a foootballer.

Pele is blonde.

(2) Pele is a footballer.

Therefore, someone is a footballer.

(3) No footballer is blonde.

Pele is a footballer.

Therefore, Pele is not blonde.

Transparently, each of these arguments is valid, but PC is unable to represent them as valid. Because each sentence of the arguments is distinct, and none include constants, the best PC can do is to notate each sentence with a distinct propositional variable. Thus:

(4) P

Q

R

(4) is not a valid argument of PC.

The essential problem is that PC abstracts away from all structure a given sentence possesses. But some valid inferences depend upon such structure. The predicate calculus (PredC) is designed to represent such structure so as to capture the valid inferences which depend upon such structure.

2. Syntax

The syntax of PredC is somewhat more complicated than that of PC. Consider the arguments above. Whereas PC just has propositional variables and constants, to represent the above arguments we appear to require at least names (‘Pele’) and predicates (‘is a footballer’, ‘is blonde’). Do we need more? Yes.

First off, note that the validity of the arguments (1)-(3) do not depend on the fact that they are about footballers and Pele, etc. Witness, e.g.,

(5) Every A is B

a is A

a is B

Whatever we uniformly substitute for ‘A’, ‘B’, and ‘a’, the result is a valid argument. But note that we can’t substitute ‘every’ for another similar expression:

(6) No A is B

a is A

a is B

Any uniform substitution of the terms of (6) results in an invalid argument: e.g.,

(7) No footballer is blonde.

Pele is a footballer.

Pele is blonde.

We call expressions such as every, some, no, etc. quantifiers (this will be explained). How, then, are we to treat expressions such as ‘Every/No footballer’? Can we treat them like names? No.

(i) As just seen, we can substitute into the positions of names and retain validity. We can’t do that with quantifiers.

(ii) The sentence, ‘Pele is a footballer’ attributes a property - being a footballer - to a thing - Pele. Something else appears to be going on with ‘Every footballer is blonde’. This sentence does not attribute a property to a thing - every footballer is not a thing, and even if we were to think it was, heaven knows what it would be for it to be blonde. ‘Every footballer is blonde’ says ‘Everything which is a footballer is also blonde’. Alternatively, ‘For all things, if it is a footballer, then it is blonde’.

It seems that not only do we need something more than names to represent quantifiers, we also require the logical constants of PC. Let us introduce some symbols of the formal language of PredC.

Symbols

(i) Arbitrary Names

Since PredC abstracts away from what particular names refer to, we may use arbitrary names - a, b, c, d, etc. We call them arbitrary for they stand for any given object, not a particular object. Thus, ‘a’ and ‘b’ do not necessarily ‘refer’ to distinct things.

(ii) Predicates

Again, PredC abstracts away from the properties expressed by particular predicates. Our symbolism thus adopts predicate letters - F, G, H, I, etc. So far, then, we can represent ‘Pele is a footballer’ as ‘Fa’ (Note: ‘Ia’ would also do as well as ‘Fd’, etc.).

(iii) PC Constants

We adopt wholesale the logical constants of PC. Thus, the sentence ‘Pele is a  footballer and Tony Blair is a politician’ may be represented as ‘Fa & Gb’.

(iv) Quantifiers (Constants)

In English, quantifiers constitute a mixed and complex category of expression. Let us just think about ‘every’ and ‘some’. We said above that we can think of a sentence of the form ‘Every F is G’ as expressing a proposition of the form ‘For every thing, if it is F, then it is G’. Let us replace the conditional here with the constant ‘→’ and substitute the variable x for the pronoun ‘it’ and the dummy generic noun ‘thing’. This gives us

(6) For every x, x is F → x is G.

We can drop the copula (is) in line with our convention of notating predication (i.e., putting the predicate letter before its subject). This gives us:

(7) For every x, FxGx.

Finally, let us notate the prefix ‘For every x’. Here we employ a variable binding operator (the universal quantifier):

(8) ("x)(FxGx)

‘Every x is such that if x is F then x is G’

The procedure is analogous for something/someone, with one crucial difference. Consider, ‘Some footballers are blonde’. Could we notate this as (9)?

(9) For some x, FxGx.

No. (9) doesn’t say that any x at all is F (a footballer). It simply says that there is something such that if that thing is F, then it is G. We get the right reading if we substitute ‘&’ for ‘→’:

(10) For some x, Fx & Gx.

We fully formalize by introducing the existential quantifier:

(11) (\$x)(Fx & Gx).

‘There is an (at least one) x such that x is F and x is G’.

Variables ≠ Names

Variables are not names. Names - a, b, etc. - are read as referring to arbitrary objects. Variables do not refer at all. The difference may be seen in English between proper names and pronouns. Compare:

(12)a. Bill is tall.

b. It/he/she is tall.

In a sense, b. doesn’t say anything determinate, for we don’t know what it is. Pronouns can, however, be bound by a preceding quantifier, e.g.,

(13) The car skidded, then it hit a lamp-post.

Here, ‘it’ is bound by the preceding nominal phrase ‘The car’. We can’t bind names in the same way. Imagine you have a car called Herbie:

(14) The car skidded, then Herbie hit a lamp-post.

Here, it seems as if there are two cars.

Variables substitute names in the formation of PredC formulae. They allow us to generalise over the positions of names. The following procedure gives an example:

(15)a. Fab

b. (\$x)(Fxb)

c. ("y)(\$x)(Fxy)

Let ‘F’ = ‘loves’.

We go from ‘a loves b’ to ‘Someone loves b’ to ‘Everyone is loved by someone’. Note, this is not valid. It is just an example of how variables allow for generalisation.

Differences Between the Quantifiers.

(16) ¬(\$x)Fx

‘Nothing is F’ (Here we are denying that something is F, which means that nothing at

all is F.)

(17) (\$x) ¬Fx

‘Something is not F’ (Here we are saying that there is something or other, but it is not

F.)

(18) ¬("x)Fx

‘Something is not F’ (Here we are saying that F-ness doesn’t hold universally, i.e.,

there is something which lacks F.)

(19) ("x) ¬Fx

‘Nothing is F’ (This is not consistent with something being F. It doesn’t say

‘Not every thing is F, but something might be’ - it denies F-ness, as it were, of

everything.)

The differences of interpretation are here to do with the scope of ‘¬’. If another reason were needed, this is why quantifier phrases are not names, for names are insensitive to the scope of ‘¬’: ¬a is F ↔ a is ¬F.

It may be noted from above that

(20)a. ("x)fdf ¬(\$x)¬f

b. (\$x)fdf ¬("xf

c. Everything is f iff it is not the case that something is not f.

d. Something is f iff it is not the case that everything is not f.

2.1. Recursive Definition

As with PC, we want a recursive definition of ‘formula of PredC’.

(PredC-Df)

(i)    x, y, z,… are terms of PredC (variables).

(ii)    a, b, c, …. are terms of PredC (arbitrary names).

(iii)   F, G, H,… are predicates letters of PC.

(iv)   If F is a predicate letter and <a1, …, an> is a sequence of terms (where n ˜1), then

Fa1, …, an’ is a formula of PredC.

(v)    If ‘Fa1,… v,…an’ is a formula is PredC, with at least every occurrence of v

unbound, then

(a) ‘("x)Fa1,… x,…an  and

(b) ‘(\$x)Fa1,… x,…an  are formulae of PredC.

(vi)   If ’Fa1,t,…an  is a formula is PredC, with t as a name, then

(a) ‘("x)Fa1,… xan’ and

(b) ‘(\$x) Fa1,… xan’ are formulae of PredC.

(vii)  If X is a formula of PredC, then ¬X,

X & X,

X v X,

X→X,

and X ↔ X are formulae of PredC.

(viii) These are all the formulae of PredC.

Q: From the definition, which of the following are formulae of PredC?

(i)   Fa & ¬Fz

(ii)  (\$x)("x)Faxz

(iii) Fx → ("y)(FxGb)

(iv) ("x)Fxx

3. Rules of Inference for PredC

We take over all the rules of PC and in addition have four new rules for, respectively, the introduction and elimination of the universal and existential quantifiers.

(i) Universal Elimination (UE)

For any formula, either derived or assumed, ‘("x)Fx’, which contains at least one occurrence of ‘x’ bound, one can derive ‘Fa’, which rests on all the assumptions upon which ‘("x)Fx’ rests.

Justification: If everything is F, then any arbitrary object (a) will also be F.

(ii) Universal Introduction (UI)

If one can derive ‘Fa’ resting upon no assumptions which feature ‘a’, then one can derive‘("x)Fx’, which binds every occurrence of ‘x’ substituting ‘a’, and rests upon all assumptions upon which ‘Fa’ rests.

Justification: If  F holds of an arbitrary object (a) resting upon no assumptions about the object, then F holds of everything.

(iii) Existential Introduction (EI)

For any formula, either derived or assumed, ‘Fa’, one can derive ‘(\$x)Fx’, which binds every occurrence of ‘x’ substituting ‘a’, and rests upon all assumptions upon which ‘(\$x)Fx’ rests.

Justification: If an arbitrary object (a) is F, then something or other is F.

(iv) Existential Elimination (EE)

For any formula, ‘(\$x)Fx’, if, from assumption ‘Fa’, where ‘a’ substitutes every occurrence of x, one can derive C, then one can conclude C, resting upon all assumptions upon which ‘(\$x)Fx’, ‘Fa’, and C rest, minus any assumptions upon which two or more formulae rest (in effect, this will mean that C follows from ‘(\$x)Fx’ minus ‘Fa’.

Justification: if C follows from an arbitrary object (a) being F, then it follows from something or other being F.

Examples

We saw above that  ¬(\$x)Fx’ says the same thing as ‘("x) ¬Fx’. We should thus be able to prove one from the other.

(i) ¬(\$x)Fx ┤├ ("x) ¬Fx

1   (1)  ¬(\$x)Fx                   A

2   (2)           Fa                   A

2   (3)    (\$x)Fx                   2EI

1,2(4)  ¬(\$x)Fx & (\$x)Fx   1,3&I

1   (5)         ¬Fa                   2,4RAA

1   (6) ("x) ¬Fx                   5UI

The second proof is quite complex; its elegance will repay scrutiny:

(ii) ("x) ¬Fx ┤├ ¬(\$x)Fx

1   (1)   ("x) ¬Fx                           A

2   (2)       (\$x)Fx                           A

3   (3)              Fa                           A

1   (4)            ¬Fa                           1UE

1,3(5)              Fa & ¬Fa                3,4&I

3   (6) ¬("x) ¬Fx                           1,5RAA

2   (7) ¬("x) ¬Fx                            2,3,6EE

1,2(8)   ("x) ¬Fx & ¬("x) ¬Fx      1,7&I

1   (9)    ¬(\$x)Fx                             2,8RAA

We also saw that ‘(\$x) ¬Fx’ says the same thing as ‘¬("x)Fx. We should thus be able to prove one from the other.

(i) (\$x) ¬Fx ┤├ ¬("x)Fx

1   (1) (\$x) ¬Fx            A

2   (2)    ("x)Fx            A

3   (3)         ¬Fa            A

2   (4)           Fa            2UE

2,3(5)        ¬Fa & Fa    3,4&I

3   (6) ¬("x)Fx            2,5RAA

1   (7) ¬("x)Fx            1,3,6EE

As before, the second proof is quite complex. First, consider this putative proof:

*(ii) ¬("x)Fx ┤├ (\$x) ¬Fx

1   (1) ¬("x)Fx              A

2   (2)           Fa              A

1   (3)         ¬Fa             1UE

1,2(4)           Fa & ¬Fa  2,3&I

1   (5)         ¬Fa             2,4RAA

1   (6) (\$x) ¬Fx             5EI

Q: What is wrong with this proof?

The correct proof is complex, but exceedingly pretty:

(ii) ¬("x)Fx ┤├ (\$x) ¬Fx

1   (1)     ¬("x) Fx                          A

2   (2)     ¬(\$xFx                         A

3   (3)             ¬ Fa                         A

3   (4)       (\$xFx                         3EI

2,3(5)     ¬(\$xFx & (\$x) ¬Fx      2,4&I

2   (6)            ¬¬Fa                         3,5RAA

2   (7)                 Fa                        6DN

2   (8)         ("x)Fx                        7UI

1,2(9)         ("x)Fx & ¬("x)Fx     1,8&I

1  (10) ¬¬(\$x) ¬Fx                        2,9RAA

1  (11)     (\$x)  ¬Fx                       10DN

Q: Prove the following:

(i) ("x)(FxGx) ┤├ ("x) ¬Gx → ("x) ¬Fx

(ii) ("x)(FxGx) ┤├ (\$x) ¬Gx → (\$x) ¬Fx

(iii) ("x)(Fx → ¬Gx) ┤├ ¬(\$x)(Fx & Gx)

Examples of relations

So far, we have just established results about monadic predication, i.e., of the form ‘Fa’. The proof theory set out also deals with relations, e.g., ‘Fab’, ‘Fabc’, etc. One basically proceeds in the same way, with a few minor complications. First, one is dealing with distinct variables - not just ‘x’: distinct arbitrary names must substitute distinct variables. It doesn’t follow that one is talking about distinct things - see the section on identity. Second, one has quantifiers in the scope of other quantifiers: always proceed from the widest scope quantifier in - from left to right.  The proof below gives an example:

(i) ("x)(\$y)("z)Fxyz ┤├ ("x)("z)(\$y)Fxyz

1(1)  ("x)(\$y)("z)Fxyz                A

1(2)         (\$y)("z)Fayz                1UE

3(3)               ("z)Fabz                A

3(4)                      Fabc               3UE

3(5)               (\$y)Fayc                4EI

3(6)        ("z)(\$y)Fayz                5UI

1(7)        ("z)(\$y)Fayz                1,3,6EE

1(8) ("x)("z)(\$y)Fxyz                7UI

Q1: We here apply EE at line (7) to eliminate assumption (3). Could we have first derived ‘("x)("z)(\$y)Fxyz’ at line (7) from line (6), then applied EE?

Q2: Prove the following:

(i) ("x)("y)("z)Fxyz  ┤├ ("z)("y)("x)Fxyz

(ii) (\$x)(\$y)("z)Fxyz ┤├ ("z)(\$y)(\$x)Fxyz

4. Identity and Rules of Inference

The concept of identity adds great expressive power to our logic. Let us say that we want to express the proposition that a given equation has a unique solution, a case common in mathematics. How would we represent this in PredC? The best we could do so far would be (21), where ‘F’ = ‘is a solution to equation E’:

(21) (\$x)Fx

But this just says that the equation has some solution, not a unique one. We might try

(22) (\$x)Fx & ¬(\$y)Fy

But this just says, in effect, that not everything is a solution to the equation.

With identity, however, we can express the uniqueness claim:

(22) (\$x)(Fx & ("y)(Fyx = y))

This says that the equation has a solution and ‘anything else’ which is a solution is (in fact) identical with the given solution.

Axioms for Identity

Axioms are logical truths which we take to be definitive of our logical concepts:

A1: ("x)x = x

‘Everything is identical with itself’.

A2: ("x)("y)((x = y & Fx) → Fy)

‘Identical things share all their properties’.

It follows from the axioms that everything is identical with itself and nothing else. Why? Because identity itself is a property.  One can use an indefinite number of variables to range over a ‘universe of objects’, even if the universe contains just one object.

From the axioms, we may introduce two rules of inference.

(i) Identity introduction (=I)

At any stage of a proof, one introduce ‘a = a’ as a logical truth which rests upon no assumptions.

Justification: Everything is identical with itself as a matter of logic, so ‘a = a’ need rest on no assumptions.

(ii) Identity elimination (=E)

If, as assumptions or derived formulae, ‘Fa’ and ‘a = b’, then one may derive ‘Fb’, resting upon all assumptions upon which ‘Fa’ and ‘a = b’ rest.

Justification: Since identities share all properties, if a = b, then b has any properties a does.

Since these rules simply encode our axioms, we should be able to derive the axioms as theorems:

(i) ├ ("x)(x = x)

(1)          a = a              =I

(2) ("x)(x = x)            1UI

(Note: Normally, we can apply UI to an arbitrary name formula only if it rests upon no assumptions which include the arbitrary name; thus, we can’t apply it directly to an underived formula. Yet since the formula on line 1 has no assumptions, we have not derived it from any assumptions about a, so the restriction on UI isn’t in play.)

(ii) ├ ("x)("y)((x = y & Fx) → Fy)

1(1)                   a = b & Fa                    A

1(2)                   a = b                            1&E

1(3)                               Fa                    1&E

1(4)                                          Fb         2,3=E

(5)                 (a = b & Fa) → Fb        1,4CP

(6)         ("y)((a = y & Fa) → Fy)       5UI

(7) ("x)("y)((x = y & Fx) → Fy)       6UI

(Note:  Again, we can apply UI without its restriction being applicable.)

Examples of other proofs

It follows from our axioms that ("x)(x = a → Fx) ↔ Fa. That is, if everything which is a is F, then a is F, and if a is F, then everything which is a is F. Thus, we should be able to establish: ("x)(x = a → Fx) ┤├ Fa

(i) ("x)(x = a → Fx) ┤├ Fa

1(1) ("x)(x = a → Fx)      A

1(2)          a = a → Fa       1UI

(3)          a = a                 =I

1(4)                       Fa       2,3MPP

Consider this putative proof of the reverse entailment:

*(ii) Fa ┤├ ("x)(x = a → Fx)

1   (1) Fa                            A

2   (2) x = a                        A

1,2(3) Fx                            2,2=E

1   (4) x = a  Fx              2,3CP

1   (5) ("x)(x = a →  Fx)    4UI

Q: There are two fundamental errors in this ‘proof’- what are they? (Hint: Think about the difference between names and variables and the restriction on UI.)

The correct proof is

(ii) Fa d ("x)(x = a →  Fx)

1   (1) Fa                             A

2   (2) b = a                         A

1,2(3) Fb                            1,2=E

1   (4) b = a     Fb             2,3CP

1   (5) ("x)(x = a →  Fx)     4UI

Study the difference between the proofs -most enlightening on the application of UI.

Q: Prove the following:

(i) b = c & c = a ┤├  b = c

(ii) a = b ┤├  FaFb

(iii) a = b ┤├  a =a ↔ c = b