Derek Sivers
Innumeracy - by John Allen Paulos

ISBN: 0809058405
Date read: 2017-10-09
How strongly I recommend it: 8/10
(See my list of 360+ books, for more.)

Why are so many people so mathematically illiterate? (Hence the title: illiteracy → innumeracy.) I wish I was an expert at this. I love it when someone is able to blow apart a claim in a minute, or know a good versus bad deal, just by running the numbers. I’d love to get great at this, then re-learn almost everything in life, but now with this additional lens.

my notes


* how fast human hair grows in miles per hour
* approximately how many people die on earth each day
* how many cigarettes are smoked annually in this country
* what the population of the United States is
* the approximate distance from coast to coast
* what percentage of the world is Chinese
* How many pizzas are consumed each year in the United States?
* How many words have you spoken in your life?
* How many different people’s names appear in The New York Times each year?
* How many watermelons would fit inside the U.S. Capitol building?
* How long would it take dump trucks to cart away Mount Fuji, to ground level? Assume trucks come every fifteen minutes, twenty-four hours a day, are instantaneously filled with mountain dirt and rock, and leave without getting in each other’s way.

Practice estimating whatever quantity piques your curiosity.
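
As a minimal sketch of this kind of estimate, here is the hair-growth question in Python. The growth rate of about half an inch per month is an assumed rule of thumb, not a figure from these notes, and the constant names are illustrative.

```python
# Assumption: hair grows roughly half an inch per month (rule of thumb).
INCHES_PER_MONTH = 0.5
HOURS_PER_MONTH = 30 * 24      # treating a month as 30 days
INCHES_PER_MILE = 12 * 5280    # 63,360 inches in a mile

miles_per_hour = INCHES_PER_MONTH / HOURS_PER_MONTH / INCHES_PER_MILE
print(f"{miles_per_hour:.1e} miles per hour")
```

Under that assumption the answer is on the order of 10⁻⁸ miles per hour. The point is the order of magnitude, not the exact digits.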

Information can be gleaned from the barest numerical facts, and claims can often be refuted on the basis of these raw numbers alone.

To get a handle on big numbers, it’s useful to come up with one or two collections corresponding to each power of ten, up to maybe 13 or 14.

Taking a human being to be spherical and about a meter in diameter:

* The size of a human cell is to that of a person as a person’s size is to that of Rhode Island.
* A virus is to a person as a person is to the earth
* An atom is to a person as a person is to the earth’s orbit around the sun
* A proton is to a person as a person is to the distance to Alpha Centauri.

Genesis says of the Flood that “… all the high hills that were under the whole heaven were covered…” Taken literally, this seems to indicate that there were 10,000 to 20,000 feet of water on the surface of the earth, equivalent to more than half a billion cubic miles of liquid! Since, according to biblical accounts, it rained for forty days and forty nights, or for only 960 hours, the rain must have fallen at a rate of at least fifteen feet per hour, certainly enough to sink any aircraft carrier.
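
The rainfall rate can be checked by dividing depth by hours. A small sketch (the function name is illustrative) using only the depths and duration quoted above:

```python
HOURS_OF_RAIN = 40 * 24  # forty days and forty nights = 960 hours

def rain_rate(depth_feet: float) -> float:
    """Average rainfall in feet per hour needed to reach the given depth."""
    return depth_feet / HOURS_OF_RAIN

for depth_feet in (10_000, 20_000):
    print(f"{depth_feet:,} ft of water needs {rain_rate(depth_feet):.1f} ft of rain per hour")
```

The two depths give a range that brackets the quoted fifteen feet per hour.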

The multiplication principle is deceptively simple and very important:
If some choice can be made in M different ways and some subsequent choice can be made in N different ways, then there are M × N different ways these choices can be made in succession.

The number of possible license plates in a state whose plates all have two letters followed by four numbers is 26² × 10⁴.
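
Multiplying it out, as a quick sketch (variable names are illustrative):

```python
letters, digits = 26, 10

# Two letter positions, then four digit positions, chosen in succession.
plates = letters**2 * digits**4
print(f"{plates:,}")  # 6,760,000
```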

People don’t generally appreciate how large such seemingly tidy collections can be.

31 different flavors of ice cream? The number of possible triple-scoop cones without any repetition of flavors is 31 × 30 × 29 = 26,970.
If we’re not interested in how the flavors are arranged on the cone but merely in how many three-flavored cones there are, we divide 26,970 by 6, to get 4,495 cones.
The reason we divide by 6 is that there are 6 = 3 × 2 × 1 different ways to arrange the three flavors

A lottery chooses six numbers out of a possible forty. If the order matters, there are 40 × 39 × 38 × 37 × 36 × 35 ways to choose them.
If, however, we are not interested in the order, then we divide by 720, since there are 720 (= 6 × 5 × 4 × 3 × 2 × 1) ways to arrange the six numbers.

Same in all three examples:
(31 × 30 × 29)/(3 × 2 × 1) different three-flavored ice-cream cones
(40 × 39 × 38 × 37 × 36 × 35)/(6 × 5 × 4 × 3 × 2 × 1) different ways to choose six numbers out of forty
(52 × 51 × 50 × 49 × 48)/(5 × 4 × 3 × 2 × 1) different poker hands.

Numbers obtained in this way are called combinatorial coefficients.
They arise when we’re interested in the number of ways of choosing R elements out of N elements and we’re not interested in the order of the R elements chosen.
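
These coefficients can be computed directly. A minimal sketch using Python's standard `math.comb` and `math.perm` (Python 3.8 or later), with illustrative variable names:

```python
import math

ordered_cones = math.perm(31, 3)  # order of scoops matters: 31 × 30 × 29
cones = math.comb(31, 3)          # order ignored: divide by 3 × 2 × 1
lottery = math.comb(40, 6)        # six numbers out of forty
poker = math.comb(52, 5)          # five cards out of fifty-two

print(ordered_cones, cones, lottery, poker)
```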

To calculate probabilities:
If two events are independent in the sense that the outcome of one event has no influence on the outcome of the other, then the probability that they both occur is computed by multiplying the probabilities of the individual events.

The probability of obtaining two heads in two flips of a coin is ½ × ½ = ¼. Likewise, the probability of five straight coin flips resulting in heads is (½)⁵ = 1⁄32.

The probability that an event doesn’t occur is equal to 1 minus the probability that it does (a 20 percent chance of rain implies an 80 percent chance of no rain).

(5⁄6)⁴ is the probability of not rolling a 6 in four rolls of the die.
Hence, subtracting this number from 1 gives us the probability that this latter event (no 6s) doesn’t occur or, in other words, of there being at least one 6 rolled in the four tries: 1 – (5⁄6)⁴ = .52.
Likewise, the probability of rolling at least one 12 in twenty-four rolls of a pair of dice is seen to be 1 – (35⁄36)²⁴ = .49.
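
Both results follow the same pattern: one minus the chance of missing every time. A short sketch (the helper name is illustrative):

```python
def at_least_once(p_single: float, tries: int) -> float:
    """Probability that an event with chance p_single happens at least once in independent tries."""
    return 1 - (1 - p_single) ** tries

one_six_in_four_rolls = at_least_once(1 / 6, 4)
one_twelve_in_24_rolls = at_least_once(1 / 36, 24)
print(f"{one_six_in_four_rolls:.2f} {one_twelve_in_24_rolls:.2f}")
```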

Binomial probability distribution arises whenever a procedure or trial may result in “success” or “failure” and one is interested in the probability of obtaining R successes in N trials.

Half of the time that twenty-three randomly selected people are gathered together, two or more of them will share a birthday.
Dividing the product 365 × 364 × 363 × 362 × 361 by 365⁵ gives the probability that five people chosen at random will have no birthday in common.
Now, if we subtract this probability from 1, we get the complementary probability that at least two of the five people do have a birthday in common.
A similar calculation using 23 rather than 5 yields ½, or 50 percent, as the probability that at least two of twenty-three people will have a common birthday.
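
The calculation generalizes to any group size. A minimal sketch, assuming 365 equally likely birthdays and ignoring leap years (the function name is illustrative):

```python
def shared_birthday_probability(people: int, days: int = 365) -> float:
    """Probability that at least two of the given people share a birthday."""
    p_all_different = 1.0
    for i in range(people):
        # The (i+1)-th person must avoid the i birthdays already taken.
        p_all_different *= (days - i) / days
    return 1 - p_all_different

for group in (5, 22, 23):
    print(group, round(shared_birthday_probability(group), 3))
```

Twenty-three is the smallest group for which the probability passes one half.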

It would be very unlikely for unlikely events not to occur.
If you don’t specify a predicted event precisely, there are an indeterminate number of ways for an event of that general kind to take place.

The average value of a large collection of measurements is about the same as the average value of a small collection,
whereas the extreme value of a large collection is considerably more extreme than that of a small collection.
The average scientist in tiny Belgium will be comparable to the average scientist in the United States,
even though the best scientist in the United States will in general be better than Belgium’s best.

The expected value of a quantity is simply the average of its values weighted according to their probabilities.
For example: If ...
* 1/4 of the time a quantity equals 2
* 1/3 of the time it equals 6
* 1/3 of the time it equals 15
* remaining 1⁄12 of the time it equals 54
... then its expected value equals 12.
This is so since [12 = (2 × 1/4) + (6 × 1⁄3) + (15 × 1⁄3) + (54 × 1⁄12)].

Consider a home-insurance company: On average, each year...
* one out of every 10,000 of its policies will result in a claim of $200,000
* one out of 1,000 policies will result in a claim of $50,000
* one out of 50 will result in a claim of $2,000
* the remainder will result in a claim of $0.
The insurance company would like to know what its average payout is per policy written.
The answer is the expected value, which in this case is
($200,000 × 1/10,000) +
($50,000 × 1/1,000) +
($2,000 × 1/50) +
($0 × 9,789/10,000) =
$20 + $50 + $40 = $110.
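
Both expected-value examples above can be checked with one small helper. This is a sketch, not anything from the book: the function name is illustrative, and exact fractions are used to avoid rounding.

```python
from fractions import Fraction

def expected_value(outcomes):
    """outcomes: list of (value, probability) pairs whose probabilities sum to 1."""
    assert sum(p for _, p in outcomes) == 1, "probabilities must sum to 1"
    return sum(value * p for value, p in outcomes)

simple = expected_value([
    (2, Fraction(1, 4)),
    (6, Fraction(1, 3)),
    (15, Fraction(1, 3)),
    (54, Fraction(1, 12)),
])

insurance = expected_value([
    (200_000, Fraction(1, 10_000)),
    (50_000, Fraction(1, 1_000)),
    (2_000, Fraction(1, 50)),
    (0, Fraction(9_789, 10_000)),
])

print(simple, insurance)  # 12 110
```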

You pick a number from 1 to 6 and the operator rolls three dice.
If the number you pick comes up on all three dice, the operator pays you $3
if it comes up on two of the three dice, he pays you $2
if it comes up on just one of the three dice, he pays you $1.
Only if the number you picked doesn’t come up at all do you pay him anything - just $1.
Say you pick the number 4.
Chances that a 4 will come up on all three dice?
= 1⁄6 × 1⁄6 × 1⁄6 = 1⁄216; so about 1⁄216 of the time you win $3.
Chances of a 4 coming up only twice?
= Use the binomial probability distribution: there are three arrangements, X44, 4X4, or 44X, the X indicating a non-4.
The probability of the first is 5⁄6 × 1⁄6 × 1⁄6 = 5⁄216, and likewise for the other two,
so the probability of a 4 coming up on exactly two of the three dice is 3 × 5⁄216 = 15⁄216.
Chances of a 4 coming up exactly once?
= Again there are three arrangements: 4XX, X4X, or XX4. The probability of 4XX is 1⁄6 × 5⁄6 × 5⁄6 = 25⁄216,
so adding the three, we get 75⁄216.
Chances that no 4s come up when we roll three dice?
= find how much probability is left over.
Subtract (1⁄216 + 15⁄216 + 75⁄216) from 1, which leaves 125⁄216.
The expected value of your winnings is thus:
($3 × 1⁄216) + ($2 × 15⁄216) + ($1 × 75⁄216) + (–$1 × 125⁄216)
= $(–17⁄216) = – $.08
And so, on average, you would lose approximately eight cents every time you played this seemingly attractive game.
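
The whole game can be worked out with the binomial distribution in a few lines. A sketch with illustrative names, using exact fractions:

```python
from fractions import Fraction
from math import comb

def binomial_pmf(successes: int, trials: int, p: Fraction) -> Fraction:
    """Probability of exactly `successes` successes in `trials` independent trials."""
    return comb(trials, successes) * p**successes * (1 - p) ** (trials - successes)

p_four = Fraction(1, 6)
payoff = {3: 3, 2: 2, 1: 1, 0: -1}  # dollars won for each count of 4s showing

probabilities = {k: binomial_pmf(k, 3, p_four) for k in payoff}
expected_winnings = sum(payoff[k] * probabilities[k] for k in payoff)

print(expected_winnings, float(expected_winnings))
```

The exact result is –17⁄216 of a dollar per play, matching the hand calculation.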

Gambler’s fallacy is the mistaken belief that because a coin has come up heads several times in a row, it’s more likely to come up tails on its next flip.

Consider Peter and Paul, who flip a coin at the rate of once a day and who bet on heads and tails respectively. Whoever is ahead will probably have been ahead almost the whole time. If Peter is ahead at the end, it’s considerably more likely that he’s been ahead more than 96 percent of the time than that he’s been ahead between 48 percent and 52 percent of the time.

It can take a long, long time for the lead to switch.
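
One rough way to put numbers on this is the arcsine law for long coin-flipping games, a standard result that these notes do not name. The sketch below applies it without the extra condition that Peter is ahead at the end, so treat the figures as a ballpark only. Function and variable names are illustrative.

```python
from math import asin, pi, sqrt

def lead_fraction_cdf(x: float) -> float:
    """Arcsine law: long-run probability that a player leads for at most a fraction x of the time."""
    return (2 / pi) * asin(sqrt(x))

p_lopsided = 1 - lead_fraction_cdf(0.96)                         # ahead more than 96% of the time
p_balanced = lead_fraction_cdf(0.52) - lead_fraction_cdf(0.48)   # ahead 48% to 52% of the time

print(f"{p_lopsided:.3f} vs {p_balanced:.3f}")
```

Even in this unconditioned version, the lopsided outcome is about five times as likely as the evenly split one.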

The number of accidents each year at a certain intersection, the number of rainstorms per year in a given desert, the number of cases of leukemia in a specified county, have all been described quite accurately by the so-called Poisson probability distribution. It’s necessary first to know roughly how rare the event is. But if you do know, you can use this information along with the Poisson formula to get a quite accurate idea of the percentage of years in which there would be no desert rainstorms, one such storm, two storms, three, and so on. In this sense, even very rare events are quite predictable.
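
A minimal sketch of the Poisson formula in use. The rate of two rainstorms per year is a made-up figure for illustration, not a number from the book, and the function name is illustrative.

```python
from math import exp, factorial

def poisson_pmf(k: int, rate: float) -> float:
    """Probability of exactly k events when the long-run average is `rate` per period."""
    return exp(-rate) * rate**k / factorial(k)

RATE = 2.0  # hypothetical: an average of two desert rainstorms per year

for storms in range(5):
    print(f"{storms} storms: {poisson_pmf(storms, RATE):.1%} of years")
```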

Assume the probability to be one out of 10,000 that a particular dream matches in a few vivid details some sequence of events in real life.

Since (9,999/10,000)³⁶⁵ is about .964, we can conclude that about 96.4 percent of the people who dream every night will have only nonmatching dreams during a one-year span. But that means that about 3.6 percent of the people who dream every night will have a predictive dream.
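
The arithmetic, as a quick sketch (variable names are illustrative):

```python
p_match = 1 / 10_000  # assumed chance that one night's dream matches real events

p_no_match_all_year = (1 - p_match) ** 365
p_at_least_one_match = 1 - p_no_match_all_year

print(f"{p_no_match_all_year:.3f} {p_at_least_one_match:.3f}")
```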

Consider incredible coincidences whose probability, let’s say, is estimated to be one in a trillion (1 divided by 10¹², or 10⁻¹²). Should we be impressed? Not necessarily. Since by the multiplication principle there are (2.5 × 10⁸ × 2.5 × 10⁸)/2 or 3.13 × 10¹⁶ different pairs of people in the United States, and since we’re assuming the probability of this collection of coincidences to be about 10⁻¹², the average number of “incredible” linkages we can expect is 3.13 × 10¹⁶ times 10⁻¹², or about 30,000.
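
The same estimate in code, as a sketch with illustrative names. It follows the approximation in the text, n × n / 2 pairs, rather than the exact n × (n – 1) / 2, which makes no practical difference at this scale.

```python
population = 2.5e8        # U.S. population figure used in the text
p_coincidence = 1e-12     # one in a trillion

pairs = population * population / 2
expected_linkages = pairs * p_coincidence

print(f"{pairs:.3g} pairs, about {expected_linkages:,.0f} linkages")
```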

The gravitational pull of the delivering obstetrician far outweighs that of the planet or planets involved.
Does this mean that fat obstetricians deliver babies that have one set of personality characteristics, and skinny ones deliver babies that have quite different characteristics?
There is no correlation between the date of one’s birth and scores on any standard personality test.
Birth dates of more than 16,000 scientists and 6,000 politicians found the distribution of their signs was random, the signs uniformly distributed throughout the year.
The records of 3,000 married couples found no correlation between their signs and astrologers’ predictions about compatible pairs of signs.

Many mundane mistakes in reasoning can be traced to a shaky grasp of the notion of conditional probability.
Unless the events A and B are independent, the probability of A is different from the probability of A given that B has occurred.

The probability of rolling a pair of dice and getting a 12 is 1⁄36.
The conditional probability of getting a 12 when you know you have gotten at least an 11 is 1⁄3.
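
Both numbers can be confirmed by listing all 36 rolls and counting. A small sketch (the helper name is illustrative):

```python
from fractions import Fraction
from itertools import product

rolls = list(product(range(1, 7), repeat=2))  # all 36 equally likely outcomes

def probability(event, given=lambda roll: True):
    """P(event | given), by counting equally likely rolls."""
    possible = [roll for roll in rolls if given(roll)]
    return Fraction(sum(1 for roll in possible if event(roll)), len(possible))

p_twelve = probability(lambda r: sum(r) == 12)
p_twelve_given_11_plus = probability(lambda r: sum(r) == 12, given=lambda r: sum(r) >= 11)

print(p_twelve, p_twelve_given_11_plus)  # 1/36 1/3
```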
A confusion between the probability of A given B and the probability of B given A is also quite common.

Imagine a man with three cards.
One is black on both sides, one red on both sides, and one black on one side and red on the other.
He drops the cards into a hat and asks you to pick one, but only to look at one side; let’s assume it’s red.
The man notes that the card you picked couldn’t possibly be the card that was black on both sides, and therefore it must be one of the other two cards - the red-red card or the red-black card.
He offers to bet you even money that it is the red-red card.
Is this a fair bet?
At first glance, it seems so. There are two cards it could be; he’s betting on one, and you’re betting on the other.
But the rub is that there are two ways he can win and only one way you can win.
His chances of winning are thus 2⁄3.
The conditional probability of the card being red-red given that it’s not black-black is ½, but that’s not the situation here.
We know more than just that the card is not black-black; we also know a red side is showing.
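
Counting sides rather than cards makes the 2⁄3 visible. A minimal sketch, assuming each card and each side is equally likely to end up showing (names are illustrative):

```python
from fractions import Fraction

cards = [("black", "black"), ("red", "red"), ("red", "black")]

# Six equally likely situations: 3 cards × 2 possible visible sides.
showings = [(card, card[side]) for card in cards for side in (0, 1)]

# Keep only the situations consistent with what we saw: a red side.
red_showing = [card for card, visible in showings if visible == "red"]
p_red_red = Fraction(red_showing.count(("red", "red")), len(red_showing))

print(p_red_red)  # 2/3
```

Three red sides could be showing, and two of them belong to the red-red card.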

Confusing a conditional statement - if A, then B - with its converse - if B, then A - is a very common mistake.
A slightly unusual version of it occurs when people reason that if X cures Y, then lack of X must cause Y.
If the drug dopamine, e.g., brings about a decrease in the tremors of Parkinson’s disease, then lack of dopamine must cause tremors.
If some other drug relieves the symptoms of schizophrenia, then an excess of it must cause schizophrenia.
One is not as likely to make this mistake when the situation is more familiar. Not too many people believe that since aspirin cures headaches, lack of aspirin in the bloodstream must cause them.

* Combinatorics (which studies various ways of counting the permutations and combinations of objects)
* graph theory (which studies networks of lines and vertices and the phenomena which can be modeled by such)
* game theory (the mathematical analysis of games of all sorts)
* and especially probability, are increasingly important.

To teach calculus is wrongheaded if it leads to the exclusion of the above topics in finite mathematics.

Most priesthoods (mathematicians included) are inclined to hide behind a wall of mystery and to commune only with their fellow priests.

People personalize events excessively, resisting an external perspective.
Since numbers and an impersonal view of the world are intimately related, this resistance contributes to an almost willful innumeracy.

They’re often attracted to New Age beliefs, since these provide them with personally customized pronouncements.

Questions arise naturally when one transcends one’s self, family, and friends.
How many?
How long ago?
How far away?
How fast?
What links this to that?
Which is more likely?

Our innate desire for meaning and pattern can lead us astray if we don’t remind ourselves of the ubiquity of coincidence.

Regression to the mean is the natural behavior of any random quantity.
Behavior is most likely to improve after punishment and to deteriorate after reward.
The sequel to a great movie is usually not as good as the original.
The same can be said of the novel after the best-seller, the album that follows the gold record.
Simply another instance of regression to the mean.

Choose between a sure $30,000 or an 80 percent chance of winning $40,000 and a 20 percent chance of winning nothing?
The average expected gain in the latter choice is $32,000 (40,000 × .8).

People tend to avoid risk when seeking gains, but choose risk to avoid losses.

Imagine four dice, A, B, C, and D, strangely numbered as follows:
* A has 4 on four faces and 0 on two faces
* B has 3s on all six faces
* C has four faces with 2 and two faces with 6
* D has 5 on three faces and 1 on three faces.
If die A is rolled against die B, die A will win - by showing a higher number - two-thirds of the time.
If die B is rolled against die C, B will win two-thirds of the time.
If die C is rolled against die D, it will win two-thirds of the time.
Nevertheless, and here’s the punch line, if die D is rolled against die A, it will win two-thirds of the time.
A beats B beats C beats D beats A, all two-thirds of the time.
That die C beats die D may require some explanation:
Half of the time, a 1 will turn up on die D, in which case die C will certainly win.
The other half of the time, a 5 will turn up on die D, in which case die C will win one-third of the time.
Thus, since C can win in these two different ways, it beats D exactly ½ + (½ × 1⁄3) = 2⁄3 of the time.
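
All four matchups can be checked by brute force over the 36 face pairings. A sketch with illustrative names:

```python
from fractions import Fraction
from itertools import product

dice = {
    "A": [4, 4, 4, 4, 0, 0],
    "B": [3, 3, 3, 3, 3, 3],
    "C": [2, 2, 2, 2, 6, 6],
    "D": [5, 5, 5, 1, 1, 1],
}

def win_probability(first: str, second: str) -> Fraction:
    """Chance that die `first` shows a higher number than die `second`."""
    wins = sum(1 for a, b in product(dice[first], dice[second]) if a > b)
    return Fraction(wins, 36)

for first, second in [("A", "B"), ("B", "C"), ("C", "D"), ("D", "A")]:
    print(f"{first} beats {second} with probability {win_probability(first, second)}")
```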

Social irrationality rests on a base of individual rationality.
Assume that one-third of the electorate prefers Dukakis to Gore to Jackson, that another one-third prefers Gore to Jackson to Dukakis, and that the last one-third prefers Jackson to Dukakis to Gore.
Dukakis will boast that two-thirds of the electorate prefer him to Gore, whereupon Jackson will respond that two-thirds of the electorate prefer him to Dukakis.
Finally, Gore will counter by noting that two-thirds of the electorate prefer him to Jackson.
If societal preferences are determined by majority vote, “society” prefers Dukakis over Gore, Gore over Jackson, and Jackson over Dukakis.
Even if the preferences of all the individual voters are rational, it doesn’t necessarily follow that the societal preferences determined by majority rule are transitive, too.
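
The cycle shows up as soon as the pairwise majorities are tallied. A small sketch, treating each third of the electorate as one bloc (names are illustrative):

```python
from itertools import permutations

# Each ranking runs from most to least preferred; each bloc is one-third of the voters.
blocs = [
    ("Dukakis", "Gore", "Jackson"),
    ("Gore", "Jackson", "Dukakis"),
    ("Jackson", "Dukakis", "Gore"),
]

def blocs_preferring(x: str, y: str) -> int:
    """Number of blocs that rank x above y."""
    return sum(1 for ranking in blocs if ranking.index(x) < ranking.index(y))

majorities = [
    (x, y) for x, y in permutations(blocs[0], 2) if blocs_preferring(x, y) >= 2
]
for winner, loser in majorities:
    print(f"two-thirds prefer {winner} to {loser}")
```

Every candidate wins one pairing and loses another, so no one comes out on top.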

There is never a way to derive societal preferences from individual preferences that can be absolutely guaranteed to satisfy these four minimal conditions:
1. the societal preferences must be transitive
2. the preferences (individual and societal) must be restricted to available alternatives
3. if every individual prefers X to Y, then the societal preference must be for X over Y
4. and no individual’s preferences automatically determine the societal preferences.

Whether we’re businessmen in a competitive market or spouses in a marriage or superpowers in an arms race, our choices can often be phrased in terms of the prisoner’s dilemma.
The parties involved will be better off as a pair if each resists the temptation to double-cross the other and instead cooperates with or remains loyal to him or her.
If both parties pursue their own interests exclusively, the outcome is worse than if both cooperate.
Adam Smith’s invisible hand ensuring that individual pursuits bring about group wellbeing is in these situations quite paralyzed.

The character of a society is reflected in which such transactions lead to cooperation between parties and which don’t.
If the members of a particular “society” never behave cooperatively, their lives are likely to be, in Thomas Hobbes’s words, “solitary, poor, nasty, brutish and short.”

Statistics is to probability as engineering is to physics - an applied science based on a more intellectually stimulating foundational discipline.

A Type I error occurs when a true hypothesis is rejected.
A Type II error occurs when a false hypothesis is accepted.
When money is being distributed...
the stereotypical liberal tries especially hard to avoid Type I errors (the deserving not receiving their share)
the stereotypical conservative is more concerned with avoiding Type II errors (the undeserving receiving more than their share).
When punishment is being meted out...
the stereotypical conservative is more concerned with avoiding Type I errors (the deserving or guilty not receiving their due)
the stereotypical liberal worries more about avoiding Type II errors (the undeserving or innocent receiving undue punishment).

Capture-recapture method:
Assume we want to know how many fish are in a certain lake.
We capture one hundred of them, mark them, and then let them go.
After allowing them to disperse about the lake, we catch another hundred fish and see what fraction of them are marked.
If eight of the hundred we capture are marked, then a reasonable estimate of the fraction of marked fish in the whole lake is 8 percent. Since one hundred fish were marked in all, the lake holds roughly 100 ÷ 0.08 = 1,250 fish.
Of course, care must be taken that the marked fish don’t die as a result of the marking, that they’re more or less uniformly distributed about the lake, that the marked ones aren’t only the slower or more gullible among the fish, etc.
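
A minimal sketch of the estimate, assuming the second catch is a fair sample of the lake. The function and parameter names are illustrative.

```python
def estimate_population(marked_first: int, caught_second: int, marked_in_second: int) -> float:
    """Estimate total population from the fraction of marked animals in a second sample."""
    if marked_in_second == 0:
        raise ValueError("no marked fish recaptured, so no estimate is possible")
    # marked_in_second / caught_second approximates marked_first / total
    return marked_first * caught_second / marked_in_second

print(estimate_population(100, 100, 8))  # 1250.0
```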

Central limit theorem states that the sum (or the average) of a large bunch of measurements follows a normal curve even if the individual measurements themselves do not.

Quite often, two quantities are correlated without either one being the cause of the other.
Changes in both quantities may be the result of a third factor.
Body lice were once considered a cause of good health.
When people took sick, their temperatures rose and caused the body lice to seek more hospitable abodes.
The lice and good health both departed because of the fever.
The correlation between the quality of a state’s day-care programs and the reported rate of child sex abuse in them is certainly not causal, but merely indicates that better supervision results in more diligent reporting of the incidents which do occur.

A technically correct yet misleading statistic is the fact that heart disease and cancer are the two leading killers of Americans. This is undoubtedly true, but according to the Centers for Disease Control, accidental deaths - in car accidents, poisonings, drownings, falls, fires, and gun mishaps - result in more lost years of potential life, since the average age of these victims is considerably lower than that of the victims of cancer and heart disease.

A dress whose price has been “slashed” 40 percent and then another 40 percent has been reduced in price by 64 percent, not 80.
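
Successive discounts multiply rather than add. A quick sketch (the function name is illustrative):

```python
def total_discount(*cuts: float) -> float:
    """Overall fraction taken off after applying each cut to the already-reduced price."""
    remaining = 1.0
    for cut in cuts:
        remaining *= 1 - cut
    return 1 - remaining

print(f"{total_discount(0.40, 0.40):.0%}")  # 64%
```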

Always ask yourself: “Percentage of what?”
If profits are 12 percent, for example, is this 12 percent of costs, of sales, of last year’s profits, or of what?
When I hear that something or other is selling at a fraction of its normal cost, I comment that the fraction is probably 4⁄3.

The “broad base” fallacy: quoting the absolute number rather than the probability.
“Holiday Carnage Kills 500 Over Four-Day Weekend” (this is about the number killed in any four-day period).

Someone offers you a choice of two envelopes and tells you one has twice as much money in it as the other.
You pick envelope A, open it, and find $100.
Envelope B must, thus, have either $200 or $50.
When the proposer permits you to change your mind, you figure you have $100 to gain and only $50 to lose by switching your choice, so you take envelope B instead.
The question is: Why didn’t you choose B in the first place?
It’s clear that no matter what amount of money was in the envelope originally chosen, given permission to change your mind, you would always do so and take the other envelope.
Without any knowledge of the probability of there being various amounts of money in the envelopes, there is no way out of this impasse.
Variations of it account for some of the “grass is always greener” mentality.

What’s required in many situations is not more facts - we’re inundated already - but a better command of known facts.

Part of the motivation for any book is anger.

The beauty of pure mathematics is “cold and austere”.
The sham romanticism inherent in the trite phrase “coldly rational” (as if “warmly rational” were some kind of oxymoron).