Formal Epistemology is the multidisciplinary field of epistemology that approaches epistemic problems in philosophy through the use of formal systems such as discrete mathematics, computer science, formal logic, probability theory, decision theory and game theory. [1]
Formal Epistemology tends to set aside broader, more general problems within epistemology, such as radical skepticism, and instead focuses on narrower issues that lend themselves to the application of formal systems.
Bayesian Epistemology is a sub-field within formal epistemology that involves the application of Bayes' theorem to epistemic problems related to hypothesis confirmation and ampliative reasoning (i.e. non-deductive reasoning).
This is done through the use of Bayes' rule, which is the following...
P(H|E) = (P(E|H) × P(H)) / P(E)
Where...
- P(H) = the probability of the hypothesis.
- P(E) = the probability of the evidence.
- P(E|H) = the conditional probability of the evidence given the hypothesis.
- P(H|E) = the conditional probability of the hypothesis given the evidence.
What this amounts to is that the probability of a hypothesis given the evidence is equal to the probability of the evidence given the hypothesis multiplied by the probability of the hypothesis itself, with that product divided by the probability of the evidence itself. Of course, for this to work the probability of the evidence cannot equal zero. Appropriate application of Bayes' rule helps one avoid the pitfalls of the base rate fallacy, and as such, failing to understand Bayes' rule is often considered a big risk for irrational belief formation.
An application of this would be determining the likelihood of a disease given an observed set of symptoms: multiply the probability of the symptoms given the disease by the probability of the disease itself, and divide that by the probability of the set of symptoms occurring. Failure to apply this reasoning can lead to concluding a false positive for a given diagnosis.
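To make the arithmetic concrete, here is a minimal Python sketch of the diagnosis example. The prevalence and symptom probabilities are invented purely for illustration; the point is only how a low base rate drags down P(H|E) even for fairly telling symptoms.

```python
# A minimal sketch of Bayes' rule applied to the diagnosis example above.
# The disease prevalence and the symptom probabilities are made-up numbers
# used only for illustration.

def posterior(prior_h, p_e_given_h, p_e_given_not_h):
    """Return P(H|E) via Bayes' rule, with P(E) expanded by total probability."""
    p_e = p_e_given_h * prior_h + p_e_given_not_h * (1 - prior_h)
    if p_e == 0:
        raise ValueError("P(E) must be non-zero for Bayes' rule to apply")
    return p_e_given_h * prior_h / p_e

# Hypothetical numbers: a rare disease (1 in 1,000), symptoms that almost
# always accompany the disease but also show up in 5% of healthy people.
p_disease = 0.001
p_symptoms_given_disease = 0.99
p_symptoms_given_healthy = 0.05

print(posterior(p_disease, p_symptoms_given_disease, p_symptoms_given_healthy))
# ≈ 0.019: despite the seemingly telling symptoms, the low base rate keeps
# P(disease | symptoms) under 2%, which is exactly what the base rate fallacy misses.
```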
Bayesian epistemologists often reject the idea that beliefs are either wholly justified or not justified at all, or that we simply believe and/or disbelieve propositions. Instead the focus is on the strength or degree of a belief, i.e. its credence. Bayesians hold that a rational agent proportions the credences of their beliefs according to values calculated in accordance with Bayes' theorem. [2]
In many cases, a known probability for the initial evidence or for the hypothesis isn't available, and so a degree of arbitrariness comes in when one picks the probabilities to start with. This is known as the problem of priors, "priors" in this case being prior probabilities. [2]
Different Bayesians have different answers to how the values of one's priors should be selected, and this in turn leads to debates in the field of Bayesian epistemology. Subjective Bayesians argue that any value for a prior is acceptable provided it is coherent with one's background beliefs and knowledge. [2] Objective Bayesians believe that whatever the priors are, they should be free of bias and of any strongly held stance. [2]
The concept of Dutch books is used in favor of Bayesian epistemology, extending the simple intuition that beliefs guide our actions. [2] A Dutch book is a set of bets, each of which looks acceptable to the bettor on its own, but which taken together guarantee a sure loss no matter the outcome. [2] Bayesians argue that their system of updating one's beliefs on the basis of Bayes' theorem makes them especially resistant to Dutch-book scenarios. The philosopher Bas van Fraassen has made explicit arguments against abductive reasoning, also known as inference to the best explanation, on the grounds that it violates Bayesian norms and makes one susceptible to dynamic Dutch books, and thus that inference to the best explanation should be seen as irrational. [3]
This is a controversial argument, as inference to the best explanation is often thought of as essential to science, and especially to medicine, where doctors routinely diagnose a patient's symptoms on the basis of what best explains them.
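To make the Dutch-book idea concrete, here is a toy sketch with invented credences and stakes: an agent whose credences violate the probability axioms (assigning 0.6 to both a proposition and its negation) can be sold a pair of bets, each priced at the agent's own credence, that together guarantee a loss.

```python
# A toy Dutch-book sketch (invented stakes): an agent with incoherent credences
# P(rain) = 0.6 and P(no rain) = 0.6 accepts a $1-stake bet on each proposition,
# priced at its credence. Whatever happens, the agent loses money overall.

credence_rain = 0.6
credence_no_rain = 0.6           # incoherent: the two should sum to 1

stake = 1.0                       # each bet pays $1 if it wins
price_rain = credence_rain * stake        # agent pays $0.60 for the "rain" bet
price_no_rain = credence_no_rain * stake  # agent pays $0.60 for the "no rain" bet

for it_rains in (True, False):
    payoff = stake if it_rains else 0.0          # the bet on rain
    payoff += 0.0 if it_rains else stake         # the bet on no rain
    net = payoff - (price_rain + price_no_rain)
    print(f"rains={it_rains}: net = {net:.2f}")  # -0.20 either way: a sure loss
```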
When it comes to how a given set of evidence comes to confirm a given hypothesis or support a particular theory (or, alternatively, how the evidence discounts certain hypotheses), formal epistemologists often try to develop theories or apply formal systems that give formalized accounts of how the evidence relates to a given hypothesis. This may be through application of Bayes' theorem as described above, through some other theory of probability, or even by treating the observed evidence as propositions that are entailed by a hypothesis in a system of formal logic. [1]
The raven paradox is an odd paradox developed by Carl Hempel, in which the statement "all ravens are black" implies "if something is a raven then it is black", which by the rule of contraposition is logically equivalent to "if something is not black, then it is not a raven". This seems intuitive enough on the face of things, but on that basis not only are instances of black ravens evidence for the hypothesis "all ravens are black", so too are instances of non-black things that are not ravens. Hempel intended this paradox as an argument that inductive logic contradicts our intuitions. The statement "all ravens are black" is also used to illustrate the importance of falsification, given that confirming cases of black ravens do not entail that all ravens are black, whereas a single white raven is enough to refute the hypothesis.
Below are explanations of how each approach deals with theory confirmation as it relates to evidence.
If we take a statement like "all foxes are gray" or "all men are mortal", we can represent it in first-order logic as ∀x(Px ⊃ Qx). If we then have an object "a" such that P(a), we can conclude Q(a) on the basis of ∀x(Px ⊃ Qx). Essentially, if we see a man and all men are mortal, then we can logically conclude that this man is mortal. The same inference applies when we observe an electron and conclude that it has a negative charge from the statement "all electrons have a negative electric charge".
Of course this is simply entailment, and not how we confirm a hypothesis in science, where the goal is often to conclude whether or not all electrons have a negative charge in the first place. The philosopher Jean Nicod took theory confirmation to be an inverse of logical deduction, with entailments amounting to a hypothesis's predictions. [1] This gives rise to what is called Nicod's criterion, which states that a universal statement within a scientific hypothesis or theory is confirmed by every observed instance of an object possessing the properties the universal statement entails, provided that there are no counterexamples. [1]
In essence, every observed mortal man confirms the hypothesis that all men are mortal, provided that no counterexamples (immortal men) are observed. This runs into a problem with the raven paradox, however, as the statement ∀x(Px ⊃ Qx) also entails ∀x(~Qx ⊃ ~Px) by contraposition. So not only do electrons with negative charge confirm the theory that all electrons have negative charge, so too do things without negative charge that are not electrons. [1] This presents a serious problem for the deductive approach, as it would mean that neutrons simply existing with neutral charge is itself evidence that all electrons have negative charge, when the charge of the neutron is sort of irrelevant.
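The contraposition driving the paradox can be checked mechanically. The sketch below (the domain and predicates are arbitrary stand-ins) brute-forces every interpretation over a tiny domain and confirms that ∀x(Px ⊃ Qx) and ∀x(~Qx ⊃ ~Px) are true in exactly the same ones.

```python
# Checking that ∀x(Px ⊃ Qx) and its contrapositive ∀x(~Qx ⊃ ~Px) hold in
# exactly the same interpretations over a small finite domain. The domain
# and predicate assignments are arbitrary; only the equivalence matters.
from itertools import product

domain = range(3)  # three arbitrary objects

# Every possible way of assigning the predicates P and Q over the domain.
for p_vals in product((True, False), repeat=len(domain)):
    for q_vals in product((True, False), repeat=len(domain)):
        P = dict(zip(domain, p_vals))
        Q = dict(zip(domain, q_vals))
        original       = all((not P[x]) or Q[x] for x in domain)  # Px ⊃ Qx
        contrapositive = all(Q[x] or (not P[x]) for x in domain)  # ~Qx ⊃ ~Px
        assert original == contrapositive  # never fails: the two are equivalent

print("∀x(Px ⊃ Qx) and ∀x(~Qx ⊃ ~Px) agree in every interpretation")
```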
Another issue arises with statements like "only half of all crows are black", given that there is no obvious means to express that in first-order logic. [1]
Bayesian approaches have already been covered a bit in the section above, but it is important to look at probabilistic accounts more broadly. First, some basic axioms of probability...
- Every probability is a real number between 0 and 1.
- Any tautology (a statement that cannot fail to be true) has a probability of 1.
- The probability of a disjunction of mutually exclusive statements is the sum of their individual probabilities.
A conditional probability like P(A|B) is calculated like so...
P(A|B) = P(A ^ B) / P(B). [1]
Alternatively, if the situation calls for it, you can use Bayes' rule as described above. A theory of confirmation then comes from calculating the degree of confirmation, c(H,E), through the following calculation...
c(H,E) = P(H|E) - P(H). [1]
For many reasons, probabilistic accounts are generally preferred to deductive accounts of theory confirmation, as they avoid many of the limitations of the deductive approach in handling statements like "50% of all crows are black". They also don't result in the same entailments of the raven paradox, since no universal statement in first-order logic is doing the entailing. There are still many areas of interaction between the probabilistic approach and deductive logic, however. The first is that any logical tautology must have a probability of 1, as stated in the above axioms, and in addition any logical contradiction such as (A ^ ~A) has a probability of 0, as it is impossible.
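As a quick illustration of the difference measure c(H,E) = P(H|E) - P(H), the sketch below plugs in some invented probabilities; a positive value means the evidence confirms the hypothesis, a negative value means it disconfirms it.

```python
# A minimal sketch of the difference measure of confirmation,
# c(H, E) = P(H|E) - P(H). All probabilities below are invented for illustration.

def confirmation(prior_h, p_e_given_h, p_e_given_not_h):
    p_e = p_e_given_h * prior_h + p_e_given_not_h * (1 - prior_h)
    posterior_h = p_e_given_h * prior_h / p_e        # Bayes' rule
    return posterior_h - prior_h                     # c(H, E)

# Evidence the hypothesis makes likely confirms it (c > 0)...
print(confirmation(prior_h=0.3, p_e_given_h=0.9, p_e_given_not_h=0.2))   # ≈ +0.36
# ...while evidence the hypothesis makes unlikely disconfirms it (c < 0).
print(confirmation(prior_h=0.3, p_e_given_h=0.1, p_e_given_not_h=0.8))   # ≈ -0.25
```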
The philosopher and cognitive scientist Paul Thagard has done appreciable work in developing formal systems for explanatory coherence. This notion of explanatory coherence is not only useful for epistemologists; it also has value for scientists by giving a direct means to compare the explanatory coherence of competing hypotheses or theories. [4] It works with the basic intuition that if a given theory creates a picture that is incoherent given the current evidence and background theories, then something is likely wrong with the theory. Theories need not be classed as simply coherent or incoherent, as coherence can be measured in degrees. Regardless, one would arguably prefer more coherent theories to explain our experimental observations over less coherent ones.
To make this measurement formal, Thagard uses a connectionist model based on seven basic principles (symmetry, explanation, analogy, data prioritization, contradiction, acceptability, and system coherence) to determine whether any given element presented to the network coheres or incoheres with the other elements presented. [4]
Formally these principles are presented as...
"Principle 1. Symmetry.
(a) If P and Q cohere, then Q and P cohere.
(b) If P and Q incohere, then Q and P incohere.
Princple 2. Explanation
If P1 . . . P m explain Q then…
(a) For each Pi in P1 . . . Pm , Pi and Q cohere.
(b) For each Pi and Pj in P1 . . . Pm , Pi and Pj cohere.
(c) In (a) and (b), the degree of coherence is inversely proportional to the number of propositions P1 . . . Pm.
Principle 3. Analogy
(a)If P1 explains Q1 , P2 explains Q2 , P1 is analogous to P2 , and Q1 is analogous to Q2, then P1 and P2 cohere, and Q1 and Q2 cohere.
(b) If P1 explains Q1 , P2 explains Q2 , , and Q1 is analogous to Q2 , but P1 and P2 are disanalogous, then P1 and P2 decohere
Principle 4. Data Prioritization
(a) If P describes observational results then P has a bit of acceptability all on its own.
Principle 5. Contradiction
(a)If P contradicts Q then P and Q incohere.
Principle 6. Acceptability
(a) The acceptability of a proposition P in a system S depends on it’s coherence with the proposition in S.
(b)If many results of relevant experimental observations are unexplained, then the acceptability of a proposition P that explains only a few of them is reduced.
Principle 7. System Coherence
(a)The global explanatory coherence of a system S of propositions is a function of the pairwise local coherence of those propositions” - Thegard (1989)[4]
With these principles, the neural network was able to determine that Lavoisier's oxygen theory of combustion had greater explanatory coherence with the given evidence than did the phlogiston theory defended by Priestley.[4] Additionally, the model was able to demonstrate the greater explanatory coherence of the theory of natural selection over creationism. [4]
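For a rough feel of how such a network behaves, here is a greatly simplified connectionist-style sketch in the spirit of the model described above, not Thagard's actual ECHO program: the propositions, link weights, and update rule are all invented, with coherence as excitatory links, incoherence as inhibitory links, and the evidence units clamped to reflect data prioritization.

```python
# A greatly simplified connectionist-style sketch, not Thagard's ECHO program.
# Each proposition is a unit whose activation is repeatedly nudged by excitatory
# links (coherence) and inhibitory links (incoherence) until the network settles.
# Propositions, link weights, and the update rule are simplified inventions.

units = ["oxygen_theory", "phlogiston_theory", "evidence_1", "evidence_2"]

# Positive weights for coherence, negative for incoherence (invented values):
# the oxygen theory explains both pieces of evidence, phlogiston only one,
# and the two theories contradict each other.
links = {
    ("oxygen_theory", "evidence_1"): 0.4,
    ("oxygen_theory", "evidence_2"): 0.4,
    ("phlogiston_theory", "evidence_1"): 0.4,
    ("oxygen_theory", "phlogiston_theory"): -0.6,
}

activation = {u: 0.01 for u in units}
activation["evidence_1"] = activation["evidence_2"] = 1.0   # data prioritization

def net_input(unit):
    """Weighted sum of the activations flowing into a unit over its links."""
    total = 0.0
    for (a, b), w in links.items():
        if unit == a:
            total += w * activation[b]
        elif unit == b:
            total += w * activation[a]
    return total

for _ in range(100):                      # let the network settle
    new = {}
    for u in units:
        if u.startswith("evidence"):      # keep the data units clamped
            new[u] = activation[u]
        else:
            n = net_input(u)
            a = activation[u] * 0.95      # decay toward zero
            a += n * (1 - a) if n > 0 else n * (a + 1)
            new[u] = max(-1.0, min(1.0, a))
    activation = new

print(activation)  # the oxygen theory settles at a higher activation than phlogiston
```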
Of course, a critical question to ask is whether or not this network is an internally and externally valid measure of coherence. Thagard works with his own set-theoretic definition of coherence; writing with his colleague Verbeurgt, he argues that coherence is best thought of as a type of constraint satisfaction, formalized as follows...
“…Let E be a finite set of elements {ei} and C be a set of constraints on E, understood as a set {(ei, ej)} of pairs of elements of E. C divides into C+, the positive constraints on E, and C-, the negative constraints on E. With each constraint is associated a number w, which is the weight (strength) of the constraint. The problem is to partition E into two sets, A and R, in a way that maximizes compliance with the following two coherence conditions:
- if (ei, ej) is in C+, then ei is in A if and only if ej is in A.
- if (ei, ej) is in C-, then ei is in A if and only if ej is in R.
Let W be the weight of the partition, that is, the sum of the weights of the satisfied constraints. The coherence problem is then to partition E into A and R in a way that maximizes W.” - Thagard & Verbeurgt (1998) [5]
Such a conception of coherence can be applied in certain algorithmic models, such as the connectionist model Thagard himself used to compare the explanatory coherence of different explanations and theories, but the formal account can also be used by other computer models.
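For very small sets of elements, the partition maximizing W can simply be brute-forced. The sketch below enumerates every partition of an invented set E under invented constraints and weights, and returns the partition with the highest total weight of satisfied constraints.

```python
# A brute-force sketch of the coherence problem quoted above: partition the
# elements E into accepted (A) and rejected (R) so as to maximize the total
# weight W of satisfied constraints. Elements, constraints, and weights are
# invented examples.
from itertools import product

elements = ["h1", "h2", "e1", "e2"]

positive = {("h1", "e1"): 2.0, ("h1", "e2"): 1.0}   # C+: pairs that belong together
negative = {("h1", "h2"): 3.0, ("h2", "e1"): 1.0}   # C-: pairs that should be split up

def weight(accepted):
    """Total weight W of constraints satisfied by a given accepted set A."""
    w = 0.0
    for (ei, ej), wt in positive.items():
        if (ei in accepted) == (ej in accepted):     # both in A, or both in R
            w += wt
    for (ei, ej), wt in negative.items():
        if (ei in accepted) != (ej in accepted):     # one in A, the other in R
            w += wt
    return w

# Enumerate every possible accepted set and keep the one with maximal W.
best = max((set(e for e, keep in zip(elements, choice) if keep)
            for choice in product((True, False), repeat=len(elements))),
           key=weight)
print("accepted:", best, "rejected:", set(elements) - best, "W =", weight(best))
```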
Hume's problem of induction is considered to be especially problematic for the theoretical justification of science. Induction in the way Hume described it can be thought of as taking observed instances of some phenomenon or kind of entity (swans, days, slices of bread, etc.), making an observation such as "every swan that has been observed has been white", and then generalizing to the conclusion "all swans are white", or less strongly "all swans are probably white". Hume argues that these kinds of inference are only justified if there exists a principle of generalizability or uniformity in nature. The only problem is that there is no means of deductively proving that such a principle exists, and you can't use induction to justify that the principle is probably the case, because induction already assumes the principle a priori. This would make any inductive inference irrational or unjustifiable.
Some, like Karl Popper, try to argue that science does not actually rely on inductive inferences, but this notion can be dismissed with a simple counter-example. Take the claim that all people with Down syndrome have an extra chromosome on the 21st pair. This is concluded on the basis that every person with Down syndrome who has been examined has been observed to have trisomy 21, which makes the inference an inductive one. [6]
The formal epistemologist Gerhard Schurz has a unique response to the problem of induction utilizing computer simulations, machine learning, and some fairly involved mathematics. Schurz argues that what Hume demonstrated is that object-induction cannot be proven reliable, whether on the basis of deduction or of induction (at least without pre-assuming the uniformity of nature).[7] "Object-induction" here denotes the induction described above, where observations of observed instances are used to generalize about unobserved instances of the same category. Meta-induction, on the other hand, is an entirely different beast, and Schurz is of the view that meta-induction can be rationally justified. [7] Meta-induction is of a higher order: it looks at the track records of all accessible prediction methods and bases its own predictions on what the most successful of them are predicting.
According to Schurz, meta-induction can be mathematically proven to be the optimal prediction method (depending on the specific strategy the meta-inductivist uses), and this without assuming the uniformity of nature.[7] Schurz uses a series of theorems and a prediction game based on machine learning to run simulations of various players making predictions about future events across multiple possible worlds, with some worlds possessing predictive uniformity while others lack it. [7] The other players are not restricted to naturalistic axioms: in order to effectively deal with Hume's problem, meta-induction needs to perform well even in the absence of generalizability and uniformity in nature, so supernatural worlds and supernatural opponents need to be allowed. [7]
The formal systems that Schurz uses demonstrate that if the meta-inductive player were simply to copy the best-performing player, the clairvoyant players could throw them off by purposely and strategically making false guesses. To compensate, the meta-inductive player has to use a different strategy, namely attractivity weighting. [7] Doing so, the meta-inductive player can be proven to be the optimal player in obtaining successful predictions, and with that an a priori justification for the use of meta-induction can be provided.
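A drastically simplified sketch of the idea behind success-based meta-induction is given below. It is loosely inspired by the attractivity-weighting strategy described above rather than a reproduction of Schurz's formalism, and the event sequence and candidate prediction methods are invented for illustration.

```python
# A drastically simplified sketch of success-based meta-induction, not Schurz's
# exact attractivity-weighting formalism. Several prediction methods guess the
# next value of a binary sequence; the meta-inductive player weights each method
# by how much its past success exceeds the meta player's own, and predicts the
# weighted average. The event sequence and the methods are invented.
import random

random.seed(0)
rounds = 2000
events = [1 if random.random() < 0.7 else 0 for _ in range(rounds)]

methods = {
    "always_one":  lambda history: 1.0,
    "always_zero": lambda history: 0.0,
    "frequentist": lambda history: sum(history) / len(history) if history else 0.5,
    "contrarian":  lambda history: 1.0 - history[-1] if history else 0.5,
}

success = {name: 0.0 for name in methods}   # cumulative success of each method
meta_success = 0.0

for t, outcome in enumerate(events, start=1):
    history = events[:t - 1]
    predictions = {name: m(history) for name, m in methods.items()}

    # Attractivity: how much a method's success so far exceeds the meta player's.
    att = {name: max(0.0, (success[name] - meta_success) / t) for name in methods}
    total_att = sum(att.values())
    if total_att > 0:
        meta_prediction = sum(att[n] * predictions[n] for n in methods) / total_att
    else:
        # No method currently beats the meta player: just copy the best one.
        best = max(methods, key=lambda n: success[n])
        meta_prediction = predictions[best]

    # Success in a round = closeness of the prediction to the actual outcome.
    for name in methods:
        success[name] += 1.0 - abs(predictions[name] - outcome)
    meta_success += 1.0 - abs(meta_prediction - outcome)

for name in sorted(methods, key=lambda n: -success[n]):
    print(f"{name:12s} {success[name] / rounds:.3f}")
print(f"{'meta':12s} {meta_success / rounds:.3f}")
# The meta-inductive player's long-run success rate approaches that of the best
# method, without its knowing in advance which method that would be.
```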
Of course, the actual technical details of this demonstration, with its various mathematical proofs, theorems, and convoluted graphs, cannot be adequately summarized here; the details can only be found in Schurz's book Hume's Problem Solved: The Optimality of Meta-Induction.
The way in which this "resolves" Hume's problem is that if we were to apply meta-induction in our world, we would have the means to demonstrate the reliability of object-induction without making any assumptions about the uniformity of our reality, and from the optimality of meta-induction we then get an a posteriori justification of object-induction. [7]
This, however, arguably leads to another kind of problem of induction, as meta-induction only illustrates the superior reliability of scientific object-induction compared to other prediction methods so far. Absent a uniformity of nature, what is stopping this trend from changing tomorrow? [8]
Igor Douven is not a man to let the Bayesians have the last word on the rational status of abductive reasoning. As stated earlier, philosophers such as Bas van Fraassen have appealed to the problem of dynamic Dutch books to argue that abductive reasoning (aka inference to the best explanation) violates Bayes' theorem and as such should be classed as irrational. Douven no doubt disagrees, but he does so by challenging the notion that there exists one universal standard for what classifies something as rational or irrational; he does not believe that violating Bayes' theorem per se makes something inherently irrational. It all depends on context. [9]
Douven grants that in the context of dynamic Dutch books abduction does produce false positives more often than the application of Bayes' rule, but he argues that in other contexts abductive reasoning can be shown to be more reliable. [9] Douven uses machine learning to run a series of computer simulations in which various explanatory methods compete against each other. In this case Douven devises a hypothetical situation of a patient arriving at the hospital with a specific set of symptoms. The doctor must determine the cause of these symptoms and whether or not it is a rare disease. If the doctor gives a false positive diagnosis the patient will still live, but if the doctor gives a false negative the patient will not receive treatment and will die. The hypothetical situation is played round by round, with the presence or absence of the deadly disease varying across each simulated trial. Each method acts as its own separate player, and once a player concludes a false negative it is eliminated from the game. Running this simulation, the player using abductive reasoning, though not the best performing, nonetheless outperforms the player using Bayes' rule, who is always the first to get eliminated. [9]
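A minimal sketch of the contrast between the two kinds of updater is given below. The "explanationist" rule here, which gives a small bonus to whichever hypothesis best explains the evidence before renormalizing, is one common way of modelling probabilistic inference to the best explanation; it is not necessarily Douven's exact rule, and the prior, likelihoods, and bonus are invented.

```python
# A minimal sketch contrasting strict Bayesian conditionalization with an
# "explanationist" update rule of the general kind discussed in this literature:
# the best-explaining hypothesis gets a small bonus before renormalizing.
# The prior, likelihoods, and bonus below are invented for illustration.

def bayes_update(prior, like_h, like_not_h):
    """Strict conditionalization: P(H|E) = P(E|H)P(H) / P(E)."""
    return like_h * prior / (like_h * prior + like_not_h * (1 - prior))

def explanationist_update(prior, like_h, like_not_h, bonus=0.1):
    """Like Bayes, but the hypothesis that best explains E gets an extra bonus."""
    score_h = like_h * prior
    score_not_h = like_not_h * (1 - prior)
    if like_h >= like_not_h:          # H makes the evidence most likely
        score_h += bonus
    else:
        score_not_h += bonus
    return score_h / (score_h + score_not_h)

# A rare disease (invented numbers) that explains the observed symptom well:
prior, like_disease, like_healthy = 0.05, 0.8, 0.3

print(bayes_update(prior, like_disease, like_healthy))          # ≈ 0.123
print(explanationist_update(prior, like_disease, like_healthy)) # ≈ 0.330
# The explanationist moves toward the best-explaining hypothesis faster than the
# strict Bayesian does. In a repeated diagnosis game like the one described above,
# that is what lets it reach a "disease" verdict on less evidence, trading extra
# false positives (harmless in that scenario) for fewer fatal false negatives.
```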
Douven could easily have turned this around to argue that Bayesian reasoning is the truly irrational method, but he rightly sees that as an inappropriate conclusion. Instead Douven appeals to the concept of bounded rationality, the idea that whether or not something is "rational" depends on the environmental context and on what the method actually produces. [9] From that, Douven argues that in certain contexts it is rational to use Bayes' rule, especially in many gambling situations involving Dutch books, but that in other contexts abduction is the preferable method.