Also available at http://amc.imfm.si
ISSN 1855-3966 (printed edn.), ISSN 1855-3974 (electronic edn.)
ARS MATHEMATICA CONTEMPORANEA 4 (2011) 29–62

How to lose as little as possible*

Vittorio Addona, Macalester College, St. Paul, MN, USA, 55105
Stan Wagon, Macalester College, St. Paul, MN, USA, 55105
Herb Wilf, University of Pennsylvania, Philadelphia, PA, USA, 19104-6395

Received 14 June 2010, accepted 31 August 2010, published online 5 March 2011

*To Michael Albertson: We miss you.
E-mail addresses: addona@macalester.edu (Vittorio Addona), wagon@macalester.edu (Stan Wagon), wilf@math.upenn.edu (Herb Wilf)
Copyright © 2011 DMFA Slovenije

Abstract

Suppose Alice has a coin with heads probability q and Bob has one with heads probability p > q. Now each of them will toss their coin n times, and Alice will win iff she gets more heads than Bob does. Evidently the game favors Bob, but for the given p, q, what is the choice of n that maximizes Alice's chances of winning? We show that there is an essentially unique value N(q, p) of n that maximizes the probability f(n) that the weak coin will win, and it satisfies
$$\left\lfloor \frac{1}{2(p-q)} - \frac{1}{2} \right\rfloor \;\le\; N(q,p) \;\le\; \left\lceil \frac{\max(1-p,\,q)}{p-q} \right\rceil.$$
The analysis uses the multivariate form of Zeilberger's algorithm to find an indicator function $J_n(q,p)$ such that $J > 0$ iff $n < N(q,p)$, followed by a close study of this function, which is a linear combination of two Legendre polynomials. An integration-based algorithm is given for computing N(q, p).

Keywords: Legendre polynomials, symbolic summation, probability.
Math. Subj. Class.: 60C05, 33D45, 42C05

1 The problem

Suppose Alice has a coin with heads probability q and Bob has one with heads probability p. Suppose q < p. Now each of them will toss their coin n times, and Alice wins iff she gets more heads than Bob does (n.b.: in case of a tie, Bob wins). Evidently the game favors Bob, but for the given p, q, what is the choice of n that maximizes Alice's chances of winning? Interestingly, there is a nontrivial (i.e., in general > 1) unique value of n that maximizes her probability of winning. For example, in the case p = 0.2, q = 0.18, Figure 1 is a plot of Alice's win probability as a function of n.

Figure 1: Probability that Alice wins vs. n.

In this example, if each player flips their coin 26 times, which is the best choice for her, Alice's chance of winning will be about 0.36, compared to a chance of 0.14 if each coin is tossed only once. In general, her chances of winning are
$$f(n) = f(n, q, p) \stackrel{\text{def}}{=} \sum_{r \ge 0} \binom{n}{r} p^r (1-p)^{n-r} \sum_{s > r} \binom{n}{s} q^s (1-q)^{n-s}. \tag{1.1}$$
This problem, which first appeared in [6], arose from a consideration of real-world events in the National Football League, where teams play a season of 16 games and do not play all other teams. If teams A and B have probabilities q and p > q, respectively, of winning any game and never play each other, one can wonder about the chance that A's season record will be strictly better than that of B. That is easy to answer, but then one is led to the question of whether the season length, 16, is favorable or not to such an outcome and what the optimal choice would be. A study of a related topic, where the central issue is the chance that the underdog beats a certain point-spread, has been published by T. Lengyel [2].
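To make equation (1.1) and the example of Figure 1 concrete, here is a small Mathematica sketch (Mathematica is the system used for the paper's electronic supplement [7], though the code below is an independent illustration, not the supplement's code) that evaluates f(n, q, p) directly from the double sum. The values in the comments are the ones quoted in the text for p = 0.2, q = 0.18.

```mathematica
(* Alice's win probability f(n, q, p) of (1.1): Bob (heads probability p) gets r heads,
   Alice (heads probability q) gets s > r heads. *)
f[n_, q_, p_] := Sum[Binomial[n, r] p^r (1 - p)^(n - r) *
    Sum[Binomial[n, s] q^s (1 - q)^(n - s), {s, r + 1, n}], {r, 0, n}]

f[1, 0.18, 0.2]    (* 0.144; the text rounds this to 0.14 *)
f[26, 0.18, 0.2]   (* about 0.36, her chance at the optimal length of 26 tosses *)

(* This should reproduce the shape of Figure 1. *)
ListPlot[Table[{n, f[n, 0.18, 0.2]}, {n, 1, 80}], AxesLabel -> {"n", "P(Alice wins)"}]
```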
We will also give, in Section 10, an algorithm that uses repeated numerical integra- tion to compute the optimum value N(q, p). Mathematica code for various computations, graphics, and algorithms (e.g., the generation of graphs of pn or computation of N(q, p)) is available in the electronic supplement at [7]. Much of the work here has relied on computing power, both for numerical experiments and for proofs using symbolic computation. Some sophisticated algorithms in Maple and Mathematica (Zeilberger’s MultiZeil, and cylindrical algebra reduction of polynomial V. Addona, S. Wagon and H. Wilf: How to lose as little as possible 31 systems) played crucial roles; without them the discoveries and proofs would have been difficult, if not impossible, to find. 1.1 Acknowledgments Professor Bruno Salvy, of INRIA, France, has kindly supplied to us some highly refined asymptotic results, which were of great assistance in this work. We thank Rob Knapp and Tamas Lengyel for some helpful discussions. 2 Overview of methods and results It develops that there is, in this problem, a nice indicator function Jn(q, p), which is simply a linear combination of two consecutive Legendre polynomials, with the property that the sign of f(n+ 1)− f(n) is the same as the sign of Jn(q, p). We will find this indicator by using the multivariate form of Zeilberger’s algorithm [1]. We will then show that for small n, J is positive and for large enough n, J is negative, and that there is only a single integer value of n, or a consecutive pair (n, n + 1), at which the sign of J changes. Thus f has a unique maximum, at n = N(q, p), say. Here is the precise result. Theorem 2.1. With f(n) defined by (1.1) we have f(n+ 1)− f(n) ((1− p)(1− q))n+1 = ( y + 1 2 (1 + xy) ) φn(xy)− 1 2 φn+1(xy), (2.1) where x = p/(1− p), y = q/(1− q), and φn(z) = n∑ r=0 ( n r )2 zr = (1− z)nPn ( 1 + z 1− z ) , (2.2) and Pn(t) is the classical Legendre polynomial. Therefore the indicator function Jn(q, p) = ( y + 1 2 (1 + xy) ) φn(xy)− 1 2 φn+1(xy) has the desired properties. We remark that, once found, the recurrence (2.1) can be proved directly, i.e., without Zeilberger’s algorithm, with little difficulty. Next, in Section 5 we will prove uniqueness of and find upper and lower bounds for N(q, p) by using various properties of the Legendre polynomials and by a close study of a function pn(q) which for each q ∈ (0, n/(2n + 1)), is the unique value of p for which Jn(q, p) = 0. The properties of the curves p = pn(q) in the (p, q) plane play crucial roles here. First, concerning uniqueness, we have Theorem 2.2. (Unimodality) Given probabilities p > q with p + q 6= 1, there are either one, or two consecutive, values of n such that 1. f(n) ≥ f(n− 1), and 2. f(n+ 1) ≤ f(n), and 3. at least one of the above two inequalities is strict. 32 Ars Math. Contemp. 4 (2011) 29–62 Definition. Given q < p, let N(q, p) be the value of n that maximizes f(n, p, q). When the value is not unique, define N to be the smaller of the two possible values that yield the maximum. It follows from Theorem 2.2 and the definition of N that N(q, p) is the smallest integer n such that Jn(q, p) ≤ 0. The resulting upper and lower bounds for N(q, p) are given by Theorem 2.3. If N(q, p) is the choice of n that maximizes the probability that the player with the weaker coin will win (and with ties going to the lower value) we have: 1. b 12(p−q) − 1 2c ≤ N(q, p), but if p+ q 6= 1, then b 1 2(p−q) + 1 2c ≤ N(q, p). 2. N(q, p) ≤ ⌈ max (1−p,q) p−q ⌉ . 
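The following sketch turns Theorem 2.1 and the characterization after Theorem 2.2 into a naive procedure for N(q, p): build φ_n and the indicator J_n, and return the smallest n with J_n ≤ 0. The names phi, J and NOpt are chosen here for convenience (N itself is a reserved symbol in Mathematica); exact rational inputs keep the test J_n > 0 exact. For very small p − q this linear search is impractical, and the integration-based method of Section 10 should be used instead.

```mathematica
(* phi_n(z) = Sum_r Binomial[n, r]^2 z^r, as in (2.2), and the indicator of Theorem 2.1. *)
phi[n_, z_] := Sum[Binomial[n, r]^2 z^r, {r, 0, n}]
J[n_, q_, p_] := With[{x = p/(1 - p), y = q/(1 - q)},
  (y + (1 + x y)/2) phi[n, x y] - phi[n + 1, x y]/2]

(* N(q, p) is the smallest n with J_n(q, p) <= 0 (Theorem 2.2 and the definition of N). *)
NOpt[q_, p_] := Module[{n = 1}, While[J[n, q, p] > 0, n++]; n]

NOpt[18/100, 2/10]   (* 26, the optimal game length of Figure 1 *)
(* and 26 lies between the bounds of Theorem 2.3:
   Floor[1/(2 (2/10 - 18/100)) - 1/2] = 24,  Ceiling[Max[1 - 2/10, 18/100]/(2/10 - 18/100)] = 40 *)
```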
Section 9 contains proofs of various properties of the graphs of pn, and they are used to obtain improvements to the upper and lower bounds on N . 2.1 Definitions and notation The heads probabilities of the two coins are p and q, with p > q. We write x = p/(1− p), y = q/(1−q), z = xy, u = (1+z)/(1−z) = 1+2pq/(1−p−q), ρ = (1−p+q)/(1−p−q). Further, Pn is the nth Legendre polynomial and rn = rn(u) = Pn(u)/Pn−1(u). The Pn’s satisfy the well known recurrence Pn+1(u) = 2n+ 1 n+ 1 uPn(u)− n n+ 1 Pn−1(u). (2.3) If we divide through by Pn(u), we obtain the ratio recurrence rn+1(u) = 2n+ 1 n+ 1 u− n (n+ 1)rn(u) . (2.4) which will be of use in the sequel. The indicator function Jn(q, p), which has the sign of f(n+ 1)− f(n), is Jn(q, p) = yφn(z)− ψn(z) = 1 2 (1− p− q)n ((1− p)(1− q))n+1 ((1− p+ q)Pn(u)− (1− p− q)Pn+1(u)), where φn is given by (2.2) and1 ψn(z) = n∑ r=0 ( n r + 1 )( n r ) zr+1 = 1 2 (φn+1(z)− (1 + z)φn(z)) (2.5) T will denote the interior of the triangle in the (p, q) plane whose vertices are (0, 0), (1, 0), ( 12 , 1 2 ). Tn will be the open interval (0, n/(2n + 1)). The line Lk(q) in the (p, q) plane is the line p = 12k+1 + q, and the line Mn(q) is Mn(q) : p = 1 n+ 1 + n n+ 1 q. (2.6) 1For a quick proof of (2.5), square both sides of the Pascal triangle recurrence. V. Addona, S. Wagon and H. Wilf: How to lose as little as possible 33 3 Finding the indicator function Our first task will be to find a recurrence for f(n). To do this we will use the multivariate form of Zeilberger’s algorithm, MulZeil [1]. As usual the results that are returned by the algorithm can be easily verified by substitution. Remarkably, this recurrence will show that f(n + 1) − f(n) is simply expressible in terms of Legendre polynomials; this will enable us to identify the values of n for which f(n+ 1) ≥ f(n) and those for which f(n+ 1) ≤ f(n). In view of eq. (2.1), Alice’s probability of winning increases with n as long as yφn(xy)− ψn(xy) = ( y + 1 2 (1 + xy) ) φn(xy)− 1 2 φn+1(xy) (3.1) is positive, and decreases otherwise. We will show that for fixed x, y there is a unique value of n at which this function changes its sign. 4 Finding the recurrence for f(n) In this section we will find the recurrence that is satisfied by f(n), the sum in eq. (1.1), using the multidimensional version of Zeilberger’s algorithm. This will prove Theorem 2.1. 4.1 Finding the recurrence for the summand With x = p/(1− p), y = q/(1− q), g(n) = f(n)/((1− p)n(1− q)n), (4.1) the definition (1.1) of f(n) becomes g(n) = ∑ r≥0 ∑ s>r ( n r )( n s ) xrys. Let G(n, r, s) = ( n r )( n s ) xrys, be the summand. We use Zeilberger’s algorithm, and his program MulZeil returns a recurrence G(n+ 1, r, s)− (x+ 1)(y + 1)G(n, r, s) = (Kr − 1)(c1(n, r, s)G(n, r, s)) + (Ks − 1)(c2(n, r, s)G(n, r, s)), (4.2) where Kr,Ks are forward shift operators in their subscripts, and the ci are given by c1 = c1(n, r, s) = r(1 + y) r − n− 1 ; c2 = c2(n, r, s) = s(n+ 1) (s− n− 1)(n− r + 1) . (4.3) This is the recurrence for the summand, and it can be quickly verified by dividing through by G(n, r, s), canceling all of the factorials, and noting that the resulting polynomial iden- tity states that 0 = 0. 34 Ars Math. Contemp. 4 (2011) 29–62 4.2 Finding the recurrence for the sum To find the recurrence for the sum, we sum the recurrence (4.2) over s > r, and then sum the result over r ≥ 0. 
To do this we have first, for every function φ of compact support,∑ r≥0 ∑ s>r (Kr − 1)φ(r, s) = − ∑ s≥1 φ(0, s) + ∑ r≥1 φ(r, r), (4.4) and ∑ r≥0 ∑ s>r (Ks − 1)φ(r, s) = − ∑ r≥0 φ(r, r + 1). (4.5) Consequently, if we sum the recurrence (4.2) there results g(n+ 1)− (x+ 1)(y + 1)g(n) = − ∑ s≥1 c1(n, 0, s)G(n, 0, s) + ∑ r≥1 c1(n, r, r)G(n, r, r) − ∑ r≥0 c2(n, r, r + 1)G(n, r, r + 1). Next we insert the values, from (4.3), c1(n, 0, s) = 0; c1(n, r, r) = r(1 + y) r − n− 1 ; c2(n, r, r + 1) = (r + 1)(n+ 1) (r − n)(n− r + 1) , which gives g(n+ 1)− (x+ 1)(y + 1)g(n) =∑ r≥1 r(1 + y) r − n− 1 G(n, r, r)− ∑ r≥0 (r + 1)(n+ 1) (r − n)(n− r + 1) G(n, r, r + 1). Now substitute the values G(n, r, r) = ( n r )2 (xy)r, and G(n, r, r + 1) = ( n r )( n r+1 ) xryr+1, and simplify the result, to obtain g(n+ 1)−(x+ 1)(y + 1)g(n) = ∑ r≥1 r(1 + y) r − n− 1 ( n r )2 (xy)r − ∑ r≥0 (r + 1)(n+ 1) (r − n)(n− r + 1) ( n r )( n r + 1 ) xryr+1 = −(y + 1) ∑ r≥0 ( n r + 1 )( n r ) xr+1yr+1 + ∑ r≥0 ( n+ 1 r )( n r ) xryr+1 = yφn(xy)− ψn(xy). Next, replace g(n) by f(n)/((1− p)n(1− q)n), noting that (x+ 1)(y+ 1) = 1/((1− V. Addona, S. Wagon and H. Wilf: How to lose as little as possible 35 p)(1− q)), to get the final result, namely that the recurrence for f(n) is f(n+ 1)− f(n) ((1− p)(1− q))n+1 = yφn(xy)− ψn(xy) = q 1− q φn ( pq (1− p)(1− q) ) − ψn ( pq (1− p)(1− q) ) . (4.6) It is easy to find the generating function of the sequence {f(n)}∞n=0 from the recurrence (4.6). It is ∑ n≥0 f(n)tn = 1 2(1− t) ( 1− 1− (1− p+ q)t√ (1− (1− p− q)t)2 − 4pqt ) . (4.7) Corollary 4.1. (Symmetry) For all p, q we have f(n, q, p) = f(n, 1 − p, 1 − q), and consequently for all 0 < q < p < 1 we have N(q, p) = N(1− p, 1− q). (4.8) Proof 1. For a first proof, replace p by 1−q and q by 1−p in the generating function (4.7), and check that it remains unchanged. Proof 2. For a more earthy proof, Alice’s winning of the (q, p) game means she had more heads. This is identical to Bob’s having more tails. That occurs when Bob wins the tails game where he has a coin that comes up tails with probability 1 − p and Alice has a coin that comes up tails with probability 1− q. The probability of the latter is f(n, 1− p, 1− q) while the former happens with probability f(n, q, p). 4.3 Remarks on the identity (2.1) The identity in equation (2.1) relates two different forms of the function f(n), namely the form (1.1), of its original definition, and the form on the right side of (2.1). Our first comment is that although the identity was discovered by Zeilberger’s algorithm, it can be given a straightforward human proof, which we will now sketch. First, if we substitute (1.1) into the left side of (2.1) it takes the form∑ r ∑ s>r {( n+ 1 r )( n+ 1 s ) − (1 + x)(1 + y) ( n r )( n s )} xrys, (4.9) when we express it solely in terms of the variables x and y. Now by inspection of the right side of (1.1) we see that only terms xayb appear in which b − a = 0 or 1. Thus to prove the identity we might show that all monomials xayb on the left side, i.e., in (4.9) above, vanish if b − a 6= 0 or 1, and that the remaining terms agree with those on the right. We omit the details. Our second comment is that from the identity (2.1) we can find a new formula for f(n) itself, the probability that Alice wins if n tosses are done. To do this, multiply both sides of (2.1) by the denominator on the left, and sum over n. The left side telescopes and we find f(n) = 1 2 (1− (1− p− q)nPn(u)− (p− q) n−1∑ k=0 (1− p− q)kPk(u)), (4.10) 36 Ars Math. Contemp. 
4 (2011) 29–62 in which, as usual, we have put u = 1 + 2pq/(1− p− q). This formula is well adapted to computation of f(n). Let’s define Yn = 1 1− p− q + n−1∑ k=0 (1− p− q)kPk(u). Then it’s not hard to show that {Yn} satisfies the recurrence (n− 1)Yn = (3p+ 3q − 6pq − 4 + n(3− 2p− 2q + 4pq))Yn−1 − (7p− 2p2 + 7q − 10pq − 2q2 − 5 + n(3− 4p+ p2 − 4q + 6pq + q2))Yn−2 + (n− 2)(1− p− q)2Yn−3, (n ≥ 2), with Y−1 = 0, Y0 = 1/(1−p−q), Y1 = 1+1/(1−p−q). Using this recurrence in (4.10) is the only way we know to compute accurate values of the probability when n is large. 5 Proof of the unimodality theorem In this section we will prove Theorem 2.2, the unimodality theorem for the optimum value of n. According to (3.1), we have f(n+1) > f(n) precisely for those n such that yφn(xy)− ψn(xy) > 0, i.e., as long as( y + 1 2 (1 + xy) ) φn(xy)− 1 2 φn+1(xy) > 0, or equivalently, as long as φn+1(xy) φn(xy) < 1 + (x+ 2)y, or (1− xy) Pn+1 ( 1+xy 1−xy ) Pn ( 1+xy 1−xy ) < 1 + (x+ 2)y. (5.1) First suppose that xy < 1, i.e., that p+ q < 1. We claim Theorem 5.1. Fix a number x > 1. Then the ratios Pn+1(x) Pn(x) (n = 0, 1, 2, . . . ) strictly increase with n. While Theorem 5.1 can be proved by induction on n, we use a more general technique which shows that the result holds not only for the Legendre polynomials, but for any se- quence of functions of n that are representable as ∫ b a g(t)nh(t)dt, with positive g, h. See Lemma 5.2 below. To prove this we start with a definition and lemma. Definition. A function g(t), defined on an interval a ≤ t ≤ b, is admissible for that interval if g(t) ≥ 0 for all t ∈ (a, b), and there does not exist a real polynomial P (x) such that P (g(t)) = 0 for all t ∈ [a, b]. V. Addona, S. Wagon and H. Wilf: How to lose as little as possible 37 Lemma 5.2. Suppose g(t) is admissible for (a, b), and h(t) ≥ 0 for all t ∈ (a, b). Let µn = ∫ b a g(t)nh(t)dt, and suppose that all µi > 0. Then µi+1 µi is a strictly increasing function of i = 0, 1, 2, . . . . Proof. Let H be the infinite Hankel matrix {µi+j}i,j≥0. Consider the principal subma- trix formed by the first n rows and columns of H . If x0, x1, . . . , xn−1 are arbitrary real numbers, not all zero, then the quadratic form Qn = n−1∑ i,j=0 xiHi,jxj = n−1∑ i,j=0 xixj ∫ b a g(t)i+jh(t)dt = ∫ b a ( n−1∑ i=0 xig(t) i )2 h(t)dt, is clearly positive. Hence H is a positive definite matrix, whence its 2× 2 principal minors µ2iµ2i+2 − µ22i+1 are all positive, i.e., µ1 µ0 < µ2 µ1 ; µ3 µ2 < µ4 µ3 ; . . . (5.2) Next replace h(t) by g(t)h(t). Then the sequence {µi}i≥0 is replaced by µi+1 (i ≥ 0) and the Hankel matrix H is replaced by one whose (i, j) entry is µi+j+1. We apply the conclusion (5.2) to this new situation and we discover that µ2 µ1 < µ3 µ2 ; µ4 µ3 < µ5 µ4 ; . . . (5.3) If we combine (5.2) and (5.3) we obtain the result stated in Lemma 5.2.  To prove Theorem 5.1 we have the integral representation Pn(x) = 1 π ∫ π 0 (x+ √ x2 − 1 cos t)ndt (5.4) for the Legendre polynomials. We can take g(t) = x + √ x2 − 1 cos t and h(t) = 1 in Lemma 5.2 and the conclusion of Theorem 5.1 follows. We remark that this is the reversal of a celebrated inequality of Turán which holds inside the interval of orthogonality. 
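Before returning to the proof of Theorem 2.2, here is a quick numerical illustration of Theorem 5.1 and of the integral representation (5.4) on which its proof rests; the sample value x = 1.1 and the cutoff n ≤ 10 are arbitrary choices made only for this check.

```mathematica
(* Theorem 5.1, numerically: for fixed x > 1 the ratios P_{n+1}(x)/P_n(x) increase with n. *)
ratios = Table[LegendreP[n + 1, 1.1]/LegendreP[n, 1.1], {n, 0, 10}];
And @@ Positive[Differences[ratios]]   (* True; the ratios approach x + Sqrt[x^2 - 1] *)

(* The integral representation (5.4) used in the proof. *)
pInt[n_, x_] := NIntegrate[(x + Sqrt[x^2 - 1] Cos[t])^n, {t, 0, Pi}]/Pi
pInt[6, 1.1] - LegendreP[6, 1.1]       (* essentially 0 *)
```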
Now that the ratio of the Legendre polynomials on the left side of (5.1) is known to be a strictly increasing function of n, we observe that when n = 0 the left side has the value 1 + xy, and when n→∞, the well known asymptotic behavior of Pn(x) for fixed x > 1 and large n shows that the left side approaches (1+ √ xy)2, which is larger than 1 + (x + 2)y. Hence there is a unique n for which the left side of (5.1) is ≤ the right side, but at n+ 1 the inequality is reversed. The case where xy > 1 is reduced to the case xy < 1, which we have just handled, by equation (4.8). If xy = 1, i.e., if p + q = 1, we discuss the situation in the next section. The proof of Theorem 2.2 is now complete.  6 The interesting special case p + q = 1 Consider the special case in which p + q = 1. Then xy = p(1 − p)/((1 − p)p) = 1, and we can carry out the calculations analytically in full. Indeed, we now have φn(1) = ( 2n n ) 38 Ars Math. Contemp. 4 (2011) 29–62 and ψn(1) = n ( 2n n ) /(n+ 1), from which we get Jn(q, 1− q) = q 1− q φn(1)− ψn(1) = q 1− q ( 2n n ) − n n+ 1 ( 2n n ) = ( q 1− q − n n+ 1 )( 2n n ) . (6.1) This last vanishes iff q/(1 − q) = n/(n + 1), which is q = n/(2n + 1). The sign of Jn then equals that of q − n/(2n+ 1). This proves the following. Lemma 6.1. Jn ( n 2n+ 1 , 1− n 2n+ 1 ) = 0. Now the unimodality theorem, Theorem 2.2, gives an explicit formula for N on this line. Theorem 6.2. (The Diagonal Formula) For 0 < q < 1/2, we have N(q, 1− q) = ⌈ q 1− 2q ⌉ . Proof. By the uniqueness theorem, and the tie-breaking aspect of the definition ofN(q, p), N(q, p) is always the least integer n such that Jn ≤ 0. Because of the agreement of the sign of Jn and that of q− n/(2n+ 1), the result follows. Note that when q = n/(2n+ 1), the J = 0 condition means that there is a tie between the two values dq/(1− 2q)e and dq/(1− 2q)e+ 1 for the optimal choice. 7 A general lower bound Theorem 7.1. Let N(q, p) denote the optimum choice of n, i.e., the one that maximizes Alice’s chance of winning. Then we have N(q, p) ≥ ⌊ 1 2(p− q) − 1 2 ⌋ . (7.1) First we need the following Lemma 7.2. In the trapezoid τn, defined by the lines q = 0, p = q, p + q = 1, and the inequality p ≤ 1/(2n+ 1) + q, the indicator function Jn is positive. That is, if p ≤ Ln(q) and (q, p) ∈ T , then Jn(q, p) > 0. Proof of the Lemma. We use induction on n. Figure 2 shows the trapezoid. Since τn ⊆ τn−1, the induction is valid. The case n = 1 follows easily from J(1, q, p)(1 − p)(1 − q)2/q = 1 − q + p(−2 + 3q), L1 = 1/3 + q, and q < 1/2. Suppose that Jn ≤ 0. Because p < Ln−1(q), the induction hypothesis applies, giving Jn−1 > 0. Therefore rn+1 ≥ ρ > rn. The ratio recurrence tells us that (n+ 1)rn+1 − (2n+ 1)u+ n/rn = 0. Therefore (n + 1)ρ − (2n + 1)u + n/ρ < 0, or 1 − (2n + 1)p + 2nq + q < 0. Thus 1/(2n+ 1) + q < p, contradicting p ≤ Ln(q). V. Addona, S. Wagon and H. Wilf: How to lose as little as possible 39 Τn 0.1 0.2 0.3 0.4 0.5 0.2 0.4 0.6 0.8 1. q p Figure 2: The trapezoid τn Proof of Theorem 7.1. Again, suppose first that xy < 1, i.e., that p + q < 1. Lemma 7.2 says that if p ≤ Ln(q), then n cannot be N(q, p). Therefore for n ≤ 1/(2(p− q))−1/2, n is not N(q, p) and so N(q, p) > 1/(2(p− q))− 1/2. More precisely, N ≥ 1/(2(p− q))− 1/2 ≥ b1/(2(p− q)) + 1/2c. But on the p+ q = 1 line, N(q, p) = ⌈ q 1− 2q ⌉ = ⌈ 1 2(1− 2q) − 1 2 ⌉ ≥ ⌊ 1 2(1− 2q) − 1 2 ⌋ = ⌊ 1 2(p− q) − 1 2 ⌋ . So the claimed formula works as a lower bound in both cases. 
Finally, if p + q > 1, then the symmetry formula of Corollary 4.1 yields N(q, p) = N(1− p, 1− q), a transformation that leaves the bound invariant. 8 The upper bound In this section we study in detail the curves J = 0 and use the results to obtain a simple upper bound on N(q, p) which is roughly twice the lower bound of Theorem 7.1. Theorem 8.1. (Upper bound on N ) N(q, p) ≤ max (1− p, q) p− q . 8.1 Curves on which Jn vanishes The key idea underlying our analysis of N(q, p) is an understanding of the vanishing sets of Jn. Figure 3 shows these curves, together with some of the lines Mn and Ln. The upper black curve is J1 = 0, and so on down. The blue lines are Mn and the red ones, Ln. So we know that below the uppermost black curve J1 > 0 and so N ≥ 2. In 40 Ars Math. Contemp. 4 (2011) 29–62 0 0.1 0.2 0.5 1 2 1 3 1 4 1 0 1 7 1 3 2 5 2 3 3 5 Figure 3: The black curves are where Jn(q, p) vanishes. The blue lines are Mn and the red lines are Ln, for n = 1, 2, 3. V. Addona, S. Wagon and H. Wilf: How to lose as little as possible 41 fact, the N = 1 region is just the region above the first black curve. But now we need to prove various properties evident from the diagram. 8.2 The function pn(q) Our first task will be to define, and to verify the correctness of the definition, of a function pn(q) which for each q ∈ (0, n/(2n+ 1)], is the unique value of p for which Jn(q, p) = 0. We will therefore start by proving existence and uniqueness of such a p. Lemma 8.2. For q ∈ Tn, Jn(q, 1− q) < 0. Proof. As in eq. (6.1) we have Jn(q, 1− q) = ( q 1− q − n n+ 1 )( 2n n ) , which is negative iff q < n/(2n+ 1). Lemma 8.3. For q ∈ Tn, Jn(q, q) > 0. Proof. The condition that Jn(q, q) > 0 is the same as rn+1 < ρ = 1/(1 − 2q), so we must prove that rn+1 < 1/(1 − 2q) when q ∈ Tn. We will prove more, namely that rn < 1/(1− 2q) whenever 0 < q < 1/2. Let r be the fixed point that is > 1 of the Legendre polynomial recurrence, i.e. the root of the quadratic equation r = (2n+ 1)u n+ 1 − n (n+ 1)r that is > 1. This root is Wn = √ (2n+ 1)2u2 − 4n(n+ 1) + u(2n+ 1) 2(n+ 1) . It is easy to check that 1 < Wn ≤ 1/(1 − 2q), and tedious, but routine, to check that Wn < Wn+1 for 0 < q < 1/2. Now we can prove by induction that rn < Wn whenever 0 < q < 1/2, which suffices. (Note the change from q ∈ Tn to q ∈ (0, 1/2); this is essential to allow the induction to carry through.) The base case can be taken to be W1 − r1 = 1 8 ( 2q + 2 √ −9(1− q)q + 9 4(1− 2q)2 − 5 4 − 1 1− 2q − 1 ) whose positivity is follows from the fact that the expression is positively infinite at q = 1/2 and 0 only when q = 0. For the inductive step, take the recurrence rn+1 = (2n+ 1)u n+ 1 − n (n+ 1)rn and assume inductively that rn < Wn. Then rn+1 < (2n+ 1)u n+ 1 − n (n+ 1)Wn , 42 Ars Math. Contemp. 4 (2011) 29–62 but this last equals Wn, a fixed point of the recurrence. Therefore rn+1 < Wn, and the proof of Lemma 8.3 is complete because Wn < Wn+1 for 0 < q < 1/2. The algebraic part of the proof actually yields the more general result that rn < 1/(1− p− q). Theorem 8.4. (Existence theorem) Given n ≥ 1 and q ≤ n/(2n + 1), there is a value p with q < p ≤ 1− q such that Jn(q, p) vanishes. Proof. When q < n/(2(n + 1)) this follows from Lemmas 8.2 and 8.3, which tell us that Jn(q, p) is a strictly decreasing function of p, going from a positive value to a negative one. At the endpoint the fact that Jn(q, 1−n/(2(2n+1)) = 0 gives the existence of the desired p. 
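Since J_n(q, ·) changes sign between p = q and p = 1 − q when q ∈ T_n (Lemmas 8.2 and 8.3), the root whose existence Theorem 8.4 asserts can be located by bisection. The sketch below works with the Legendre-polynomial form of J_n from Section 2.1, dropping the positive prefactor, which does not affect the sign inside T; the names bracket and pnRoot, and the tolerance, are illustrative choices. By the uniqueness result proved next (Theorem 8.6), the zero it converges to is p_n(q).

```mathematica
(* Sign-equivalent core of J_n(q, p) inside T: (1-p+q) P_n(u) - (1-p-q) P_{n+1}(u). *)
bracket[n_, q_, p_] := With[{u = 1 + 2 p q/(1 - p - q)},
  (1 - p + q) LegendreP[n, u] - (1 - p - q) LegendreP[n + 1, u]]

(* Bisection for the root of J_n(q, .) in (q, 1 - q); assumes q < n/(2n + 1). *)
pnRoot[n_, q_, tol_: 10^-12] := Module[{lo = q, hi = 1 - q, mid},
  While[hi - lo > tol,
   mid = (lo + hi)/2;
   If[bracket[n, q, mid] > 0, lo = mid, hi = mid]];
  N[(lo + hi)/2]]

pnRoot[1, 0.25]   (* 0.6, i.e. p_1(1/4) = 3/5 *)
```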
Definition: Given n ≥ 1 and q ∈ Tn, we write pn(q) for the largest value of p between q and 1− q such that Jn(q, p) = 0. We now turn to the important proof that, in all cases, there is only one value p so that Jn(q, p) = 0. The key is the following lemma about ∂J/∂p. Lemma 8.5. We have ∂ ∂p Jn(q, p) < 0 for q ≤ p ≤ 1− q. Proof. The derivative inequality, after multiplication by 2 q (1− p)3p(1− q)2 ( −p− q + 1 (1− p)(1− q) )1−n , becomes Pn(u) ( n ( 2p2 − p(2q + 1) + q − 1 ) − 2pq + p+ q − 1 ) + (n+ 1)(−p− q + 1)Pn+1(u) < 0, and so holds precisely when rn+1 < V , where V = n ( 2p2 − p(2q + 1) + q − 1 ) − 2pq + p+ q − 1 (n+ 1)(p+ q − 1) . (The inequality was reversed because −(n+ 1)(1− p− q) < 0.) Now rn+1 < V can be proved by an easy induction, since the domain of truth does not depend on n. For the base case examine V − r1, which works out to be the positive quantity 2np(1− p) (n+ 1)(1− p− q) . The induction step uses the usual recurrence. Suppose rn < V ; then rn+1 = (2n+ 1)u n+ 1 − n (n+ 1)rn < (2n+ 1)u n+ 1 − n (n+ 1)V . V. Addona, S. Wagon and H. Wilf: How to lose as little as possible 43 But this last is less than V because the difference V − ((2n+1)u)/(n+1)−n/((n+1)V ) works out to 2n(p− q)− 2q + 1 n (−2p2 + 2pq + p− q + 1) + 2pq − p− q + 1 in which all the grouped terms are positive. Theorem 8.6. (Uniqueness Theorem) Given n ≥ 1 and 0 < q ≤ n/(2n + 1). If q ≤ p < pn(q), then Jn(q, p) > 0. Thus there is only one vanishing value for Jn. Proof. This follows from Lemma 8.5 which implies that Jn(q, p) is a strictly decreasing function of q; thus it cannot return to 0 after it has once taken that value. 8.3 Properties of J Now we have the functions pn defined on Tn, with the property that Jn(q, pn(q)) = 0. We will call the graph of pn(q) a J-nullcline, and denote it simply by pn, or often just p. We need several properties of pn. Note that most of the properties below have one-line proofs thanks to the efficient definition of pn and the uniqueness result (Theorem 8.6). Lemma 8.7. The function pn is continuously differentiable (C1) on Tn. Proof. This is a consequence of the fact that Jn(q, p) is differentiable (it is a rational func- tion) and Lemma 8.5, which states that ∂∂pJn > 0. These facts show that the hypotheses of the implicit function theorem are satisfied. The preceding result is about the triangle T . But it also works on the p + q = 1 line, for on that line Jn(q, p) = Jn(q, 1− q) = ( q 1− q − n n+ 1 )( 2n n ) = ( q p − n n+ 1 )( 2n n ) , and the partial derivative ∂∂pJn(q, p) is just −q ( 2n n ) /p2, which is nonzero. By symmetry the same proof works on the opposite side of the p+ q = 1 line. Lemma 8.8. (Derivative formula) For any point (q, p) ∈ T and on the graph of pn, we have p′n(q) = p(1− p)(np− nq + p− 1) q(1− q)(n(q − p) + q) . Proof. By the implicit function theorem, p′n = − ∂ ∂qJn(q, p) ∂ ∂pJn(q, p) . (8.1) Taking the derivatives, using the recurrence to eliminate Pn+2, and using the Jn(q, p) = 0 relation to eliminate Pn+1 in favor of Pn leads immediately to the formula. Lemma 8.9. (Linear vanishing condition) p′n(q) = 0 iff pn lies on the line Mn. Proof. Immediate from the numerator of the derivative formula (Lemma 8.8). Lemma 8.10. Given n ≥ 1 and q < p, if Jn(q, p) = 0 then the partial derivative (∂/∂q)Jn(q, p) is not zero. 44 Ars Math. Contemp. 4 (2011) 29–62 Proof. We work first in T . The partial derivative of Jn(q, p) can be taken and simplified using the standard recurrence for Pn+2, and also the relationship derived from Jn = 0, to replace Pn+1 by ρPn. 
This leads to ∂ ∂q Jn(q, p) = − Pn(u)(n(p− q) + p− 1) ( −p−q+1 (p−1)(q−1) )n (1− q)2(−p− q + 1) . The denominator is nonzero and xy ≥ 1 in T so Pn(u) ≥ 1; therefore the partial derivative vanishes at a point on a J-nullcline iff n(p−q)+p−1 = 0 iff p = 1n+1 + n n+1q = Mn(q). So we must show that this value of p cannot lead to a point at which Jn vanishes. Define h(q) = Jn(q,Mn(q)). This evaluates to( −2nq+n−q n(q−1)2 )n ((n+ q)Pn(u) + (n(2q − 1) + q)Pn+1(u)) 2n(1− q)2 . When q = 0, we have u = 1 and so Pn(u) = 1 and h(0) = 0. We also have (limits are from the left) lim q→n/(2n+1) h(q) = lim q→n/(2n+1) Jn(q,Mn(q)) = Jn ( n 2n+ 1 , n+ 1 2n+ 1 ) , by continuity of Jn. But this last vanishes, by the diagonal formula of Theorem 6.2. Thus lim q→n/(2n+1) h(q) = 0. Hence h(q) is a differentiable function of q which vanishes at both ends of the interval (0, n/(2n + 1)). Figure 4 shows the graphs of h for n = 1, 2, 3, 4, 5. It remains to show that h cannot vanish for any q ∈ Tn. 0 1 3 2 5 3 7 4 9 0.5 1 0 Figure 4: The graphs of Jn(q,Mn(q)) for n = 1, 2, 3, 4, 5. Suppose h(q) vanishes at some q ∈ Tn. Then, because of the vanishing at the endpoints, there must be a point q1 such that h′(q1) = 0 and h(q1) ≥ 0. More precisely, if there is V. Addona, S. Wagon and H. Wilf: How to lose as little as possible 45 a true crossing at q then q1 would be given by the Mean Value Theorem applied to either [0, q] or [q, n/(2n + 1)]; if there is a tangency to the axis at q (or even if the function is identically 0), then q1 = q. So at the point (q1,Mn(q1)) (which we denote by just (q,Mn(q)) in the expressions below) we would have two relations for rn+1: The h(q1) = Jn(q1,Mn(q1)) ≥ 0 condition means that rn+1 ≤ ρ = (1− p+ q1)/(1− p− q1), where in the last fraction p is to be replaced by Mn(q1); and taking the q-derivative of h(q) = Jn(q,Mn(q)) and recursively removing Pn+2 leads to the following equation: −(n+ 1) ( n− q − 2nq n(q − 1)2 )n c1Pn(u) + c2Pn+1(u) 2n(q − 1)3(nq + 1)(n(2q − 1) + q) = 0, where c1 = 2n 3q + n2 ( 4q3 − 4q2 + 5q + 1 ) + n ( 2q3 + q2 + 2q + 1 ) + q(q + 1), c2 = (n+ 1)(1 + q + 2nq)(−n+ q + 2nq). Clearing the nonzero factors tells us that rn+1 = −c1/c2, and therefore −c1 c2 ≤ − nqn+1 − 1 n+1 + q + 1 − nqn+1 − 1 n+1 − q + 1 which reduces to 2n(1− q)q (n+ 1)(2nq + q + 1) ≤ 0, a clear contradiction, which establishes the theorem for the triangle T . But the same proof works on the p+ q = 1 line, for on that line Jn(q, p) = Jn(q, 1− q) = ( q 1− q − n n+ 1 )( 2n n ) = ( q p − n n+ 1 )( 2n n ) , and the partial derivative (∂/∂q)Jn(q, p) is just ( 2n n ) /p which is nonzero. By symmetry the same proof works on the opposite side of the p+ q = 1 line. Lemma 8.11. For q ∈ Tn we have p′n(q) 6= 0. Proof. Because of (8.1), the claim follows from Lemma 8.10, which shows that the numer- ator does not vanish on the graph of pn. Lemma 8.12. pn ( n 2n+ 1 ) = 1− n 2n+ 1 . Proof. Follows from the diagonal equation of Lemma 6.1. Lemma 8.13. p′n ( n 2n+ 1 ) = 1. 46 Ars Math. Contemp. 4 (2011) 29–62 Proof. The definition of pn can be carried over by symmetry to the other side of the p+q = 1 line to yield a differentiable function. The implicit function theorem applies on the line itself as noted in the proof. But then, by symmetry of all the probabilities, and therefore of the vanishing of Jn, pn is symmetric across the line. Thus differentiability implies that the derivative must be 1 to avoid a cusp. Theorem 8.14. (Upper bound on pn) For every q ∈ Tn, we have pn(q) < Mn(q), where Mn(q) is the line (2.6) above. Proof. 
Because pn and Mn agree on the line p+ q = 1 (Lemma 8.12), and because p′n ( n 2n+ 1 ) = 1, (Lemma 8.13) while the slope of Mn is n/(n + 1), the fact that pn is C1 (Lemma 8.7) means that pn is under Mn when q is just left of the p + q = 1 line. Lemma 8.11 tells us that p′n is never 0, and therefore, by Lemma 8.9, pn can never cross the line Mn. Lemma 8.15. N(q, pn(q)) = n. Proof. When (q, p) ∈ T , this follows from Theorem 2.2 because Jn(q, pn(q)) = 0 and so Jn−1(q, pn(q)) > 0. Therefore n is the least m such that Jm(q, pn(q)) = 0. If pn(q) = 1−q then it must be that q = n/(2n+1) and, by the proof of Theorem 2.2, n = q/(1−2q) is the unique integer such that Jn(q, 1− q) = 0, establishing the result. Lemma 8.16. For every q we have p1(q) > p2(q) > p3(q) > . . . . Proof. The graphs pn can never cross because if pn(q) = pm(q), N(q, p) would be simul- taneously n and m by Lemma 8.15. and the right end of pn is above the right end of pn+1 (Lemma 8.12). So continuity of the graphs yields the result. Lemma 8.17. The graphs of pn(q) determine the value of N(q, p) exactly as follows: N(q, p) is the least n such that p ≥ pn(q). Proof. We know this is correct when we are on the graph pn (Lemma 8.15). But if (q, p) is between pn−1 and pn then Jn−1 > 0 and Jn < 0, by Theorem 8.4, and this means N(q, p) = n. Figure 5 shows how the graphs of pn divide triangle T into regions that define the optimal N -values, and how pn is bounded by the two lines Mn and Ln. Corollary 8.18. N(q, p) is a nonincreasing function of p. Thus if the p-coin becomes stronger, then the optimal choice of game length for the holder of the q-coin cannot get larger. We can now prove Theorem 8.1, the upper bound on N(q, p). Proof. Assume first that (q, p) ∈ T , in which case the claimed bound is d(1− p)/(p− q)e. Let n be the smallest value so that p lies at or above Mn(q). Then n = d(1− p)/(p− q)e and for this to be a bound we need, by Lemma 8.17, that pn(q) < Mn(q). But this is exactly what Theorem 8.14 tells us. For the special case on the diagonal line: N(q, 1−q) = dq/(1− 2q)e, by the diagonal formula. But this is identical to d(1− p)/(p− q)e, which therefore works in both cases. The case in which p + q > 1 is handled by symmetry (see Corollary 4.1), with q taking the role of 1− p. V. Addona, S. Wagon and H. Wilf: How to lose as little as possible 47 1 2 3 p3 M3 L3 0 0.1 0.2 0.53 7 1 2 1 3 1 4 1 0 1 7 4 7 Figure 5: The graphs of pn separate T into regions where N = 1, 2, 3, . . . . The two red lines M3 and L3 form bounds on p3, and this relationship underlies the lower and upper bounds on N(q, p). 48 Ars Math. Contemp. 4 (2011) 29–62 The two proved bounds in Theorem 2.3 are equal 47% of the time because they must agree if it happens that Mn(q) < p < Ln−1(q); these conditions define a collection of triangles whose area, in proportion to T , is easily computed to be π2/4 − 2. In all such cases, then, N(q, p) equals d(1− p)/(p− q)e. 9 Deeper analysis of the nullclines In Section 8 we proved many properties of pn that were evident from the graphs. We continue that here, gaining information that leads to improved bounds on N and to an efficient algorithm for computing N (Section 10). We first observe that when n is small, pn is given by simple formulas. Such formulas are useful as a check on computations. Lemma 9.1. p1(q) = 1− q 2− 3q , p2(q) = 1− q 3− 12q + 10q2 (2− 4q − √ 1− 4q + 6q2). Proof. Just solve the polynomial equation Jn(q, p) = 0. 
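As a small sanity check of Lemma 9.1, one can substitute the closed form for p_1(q) into the indicator of Theorem 2.1 and simplify; the symbol names below are chosen for this sketch, and the check for p_2 is analogous (numerically, or with FullSimplify because of the radical).

```mathematica
phi[n_, z_] := Sum[Binomial[n, r]^2 z^r, {r, 0, n}];
J[n_, q_, p_] := With[{x = p/(1 - p), y = q/(1 - q)},
  (y + (1 + x y)/2) phi[n, x y] - phi[n + 1, x y]/2];

(* Lemma 9.1: the closed form for p_1(q) annihilates J_1 identically. *)
Simplify[J[1, q, (1 - q)/(2 - 3 q)]]   (* 0 *)
```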
There are more complicated radical expressions for p3 and p4 which arise from cubic and quartic polynomial equations, respectively. Because Jn is infinitely differentiable in T , the fact that the hypothesis of the implicit function theorem is met (Lemmas 8.5 and 8.7) means that pn is infinitely differentiable. The second derivative is easily computed by implicitly differentiating the derivative for- mula (Lemma 8.8). Lemma 9.2. We have p′′n(q) = p(1− p)(1− p− q) q2(q − 1)2(n(p− q)− q)3 Z, where Z is given by 2n3(p− q)3 + 2n2(2p3 − p2(7q + 1) + pq(7q + 3)− 2q2(q + 1)) +n(2p3 − p2(11q + 2) + pq(11q + 10)− q(2q2 + 7q + 1))− (p− 1)q(3p− 3q − 1). But we can prove the weaker and still very useful assertion that the slope never rises beyond its value at the right end. Lemma 9.3. For q ∈ Tn we have p′n(q) > 0. Proof. Because the slope is 1 at the right end (Lemma 8.13), but never vanishes (Lemma 8.11), it is always positive. We next move to a proof of the (computationally evident) fact that the J-nullclines are convex. Figure 6 shows the second derivatives of pn for n ≤ 15. They are evidently unbounded at q = 0, converging to 0 at the right, and always positive. We will now prove positivity (i.e., convexity of the J-nullcline curves), which will be important as a source of new approximations to N(q, p). V. Addona, S. Wagon and H. Wilf: How to lose as little as possible 49 n  15 n  1 0 0.1 0.2 0.33 0.4 0.5 2 3 4 5 6 0 34 Figure 6: The graphs of p′′n(q) for n ≤ 15. Theorem 9.4. (Convexity) p′′n(q) > 0 in Tn. Proof. By Lemma 7.2, any point (q, pn(q)) lies above Ln(q), so the line connecting (q, pn(q)) to (n/(2n+1), 1−n/(2n+1)) has slope less than 1. By the Mean Value Theorem, there is some q1 > q for which p′n(q) < 1. But the slope is 1 at n/(2n + 1), so there is a point at which the second derivative is positive. Since p′′n is continuous, the proof will be complete once we show that the second derivative cannot vanish. The second derivative is given by the formula of Lemma 9.2. We can eliminate nonzero factors, thus reducing its vanishing to the following equation in p. ( 2n3 + 4n2 + 2n ) p3 + p2 ( −6n3q − 2n2(7q + 1)− n(11q + 2)− 3q ) +p ( 6n3q2 + 2n2q(7q + 3) + nq(11q + 10) + 3q2 + 4q ) − nq ( 2q2 + 7q + 1 ) −3q2 − q − 2n3q3 − 4n2q2(q + 1) = 0 Fixing n and q, the vanishing condition is a cubic in p. The cubic always has a real root and checking the critical points assuming 0 < q < 1/2 and n ≥ 1 shows that they never straddle 0, which means that the other two roots are never real. Call the unique real root p−n (q). The theorem will be proved once we show that Jn(q, p − n (q)) 6= 0 in Tn, so that the inflection point is not on the graph of pn. We can get a closed form for p−n (q) by solving the cubic, but we do not need the explicit representation as the needed algebra can be worked out implicitly from the cubic relation. Yet it is instructive to look at p−n (q). Figure 7 shows its graph for n ≤ 5, together with the graphs of pn in pink, and also the base-10 log of the difference for n = 1, 2, 3, 10, 20, 100. The inflection curve is barely below the J-nullcline and the two curves are visually indistinguishable. We can use implicit differentiation on the defining cubic to get a formula for ddqp − n (q). Since p−n (q) is given explicitly by radicals we know its derivative exists. The implicit derivative formula p−n (q) = −∂q/∂p then gives the following representation for the deriva- 50 Ars Math. Contemp. 4 (2011) 29–62 0 13 25 12 13 14 16 q pn and pn  n1 n10 n20 n100 0. 
0.1 0.2 0.3 0.4 0.5 10 8 6 4 2 q log10p p  Figure 7: Left: The graphs of pn (pink) and p−n (q) (black). Right: The difference in the graphs viewed through a logarithmic lens. tive ddqp − n (q):( 6n2 + 8n+ 3 ) p2 − 2p ( 6n2q + n(8q + 3) + 3q + 2 ) + 6n(n+ 1)q2 + (8n+ 6)q + 1 6n2(p− q)2 + 2n (3p2 − 2p(4q + 1) + q(4q + 3)) + q(−6p+ 3q + 4) . (9.1) Now we follow the proof idea of Lemma 8.10. We need to show that, given n, the point (q, p−n (q)) cannot lie on the Jn-nullcline. Define the function h(q) = Jn(q, p − n (q)), which we claim does not vanish in Tn. Figure 8 shows the first few graphs of h. 0 1 3 2 5 1 2 3 7 0. 0.002 0.004 0.006 Figure 8: The graphs of Jn(q, p−n (q)) for n ≤ 5. We observe that h vanishes at both ends. At the left this is because Jn(0, p) is always 0 (the Legendre terms become just 1). At the right we find that the defining cubic, when q is V. Addona, S. Wagon and H. Wilf: How to lose as little as possible 51 set to n/(2n+ 1), has the factor 2np−n+ p− 1, which means that p = 1−n/(2n+ 1) is a root. The cubic has exactly one real root in the domain of interest, so 1− n/(2n+ 1) = p−n ( n 2n+1 )) and we know (Lemma 6.1) that Jn((n/(2n+ 1), 1− n/(2n+ 1)) = 0. Now suppose the graph is 0 at some q1 ∈ Tn. Then looking left and right of q1 and using the Mean Value Theorem, we get a value, call it q, such that h(q) ≤ 0 and h′(q) = 0. Recall that h(q) ≤ 0 means that rn+1 = Pn+1/Pn ≥ ρ = (1− p+ q)/(1− p− q), where p denotes p−n (q). When we form h ′(q) leaving p−n (q) undefined and then substitute the derivative formula (9.1) for the derivatives of p−n (q) that appear, we obtain the familiar form c1Pn + c2Pn+1 = 0. This becomes −c1/c2 = rn+1, so we have the relation −c1/c2 ≥ ρ. If we work out c1, c2 in terms of n, p, q we get a rational function of these three variables which is too large to reproduce here, but which can be found at [7]. A call to Mathematica’s Reduce function shows, in two seconds, that the five conditions: 1. −c1/c2 ≥ ρ, 2. the vanishing of the second derivative formula at p, 3. 0 < q < p < 1− q, 4. n ≥ 1, and 5. q < n/(2(n+ 1)) are contradictory. For such polynomial systems Reduce uses a cylindrical algebra de- composition [4]; this example requires showing that there is no solution in each of 1062 cylindrical cells. Working this out by hand might be extremely difficult, if not impossi- ble. Corollary 9.5. For q ∈ Tn, p′n(q) < 1. Proof. The second derivative is positive in Tn, so the first derivative is always less than its value of 1 at n/(2n+ 1). While the original probability formulation makes no sense when q = 0 (there is no optimal choice of N when the underdog loses each play), the limit limq→0 pn(q), evident from Figure 3, is quite simple. Lemma 9.6. limq→0+ pn(q) = 1n+1 Proof. Because p′n(q) > 0 (Lemma 9.3), the values pn(q) decrease as q → 0; the values are bounded and so the claimed limit exists. Thus we will use p(0) to denote limq→0+ pn(q). Further, 0 < p(0) < 1, for by Lemma 7.2, pn(q) > 1/(2n+ 1) + q so p(0) ≥ 1/(2n+ 1); and p(0) < 1 − n/(2n + 1) because the derivative is positive and pn(n/(2n + 1)) < 1 (Lemma 8.12). Now, the derivative formula tells us that ((p− 1)p(np− nq + p− 1)) ((q − 1)q(n(q − p) + q)) − p′(q) = 0, where p denotes pn. Multiplying both sides by the denominator turns this into (p− 1)p(np− nq + p− 1)− (q − 1)q(n(q − p) + q)p′(q) = 0. But the boundedness of p′n means that the limit of qp ′(q) is 0, giving lim q→0+ p(p− 1)(np+ p− 1) = 0, 52 Ars Math. Contemp. 4 (2011) 29–62 or (p(0) − 1)p(0)(p(0)(n + 1) − 1) = 0. 
Only the last factor can vanish, giving p(0) = 1/(n+ 1). Corollary 9.7. (Slope convergence) The limit limq→0+ p′n(q) exists. Proof. The slopes are bounded below (Lemma 9.3) and monotonic by the convexity of pn. Corollary 9.8. (Slope-limit formula) limq→0+ p′n(q) = n/(2(n+ 1)). Proof. Let S denote the limit. Because the derivative formula p′ = ((1 − p)p(np − nq + p−1))/((1−q)q(nq−np+q)) holds for the slopes, we want the limit of this expression as q → 0. Now, as q → 0, p→ 1/(n+ 1). So we can look at the numerator and denominator separately and see that we have a 0/0 form, and we can use l’Hopital’s rule to get the limit, where we use p(q) for p. Forming the l’Hopital quotient, using lim p(q) = 1/(n + 1), lim p′(q) = S, and then letting q be 0 yields (n− (n− 1)S)/(n+ 1). Setting this to equal to S and solving gives the formula. Corollary 9.8 allows us to think of pn as a C1 function on all of R as follows. Let Sn be the limit of the slopes that the corollary provides. Define p∗n to agree with pn on [0, n/(2n + 1)) and to be the linear function through (0, pn(0)) of slope Sn on (−∞, 0], and the similar tangent-line extension on the right. It is easy to see using the Mean Value Theorem and the limit definition of Sn that the limit of the slopes of the secants connecting (0, pn(0)) to (q, pn(q)) is Sn, giving the continuous differentiability of p∗n. Corollary 9.9. (Slope bound) p′n(q) > n2(n+1) in Tn. Proof. By the previous corollary, because the slopes are monotonically increasing, by con- vexity. Corollary 9.10. (Linear lower bound) In Tn, pn > Kn(q) = def 1 n+1 + n 2(n+1)q, and N(q, p) ≥ d(1− p)/(p− q/2)e. Proof. The linear function Kn agrees with pn at q = 0 by Lemma 9.6. If pn dipped below Kn, the Mean Value Theorem would provide a point contradicting Corollary 9.9. Inverting the bound on p gives a bound on N . The upper bound on N given by Theorem 8.1 and the piecewise lower bound obtained by combining Theorem 7.1 and Corollary 9.10 are useful computationally. For when the two bounds agree we know immediately that N(q, p) = b1/(2(p− q)) + 1/2c. And in cases where N equals the piecewise lower bound, that can be verified by a single J- evaluation: just check that Jn(q, p) is negative when n is the lower bound. The subset of T for which the two bounds agree is a union of triangles – the green region in Figure 9 – and the total area of these triangles can be determined by integrating Mathematica’s Boole function to get an expression for each level and then summing the results symboli- cally. The total area of the region in which N equals the lower bound can be approximated by experimentation using thousands of points. The results are summarized in the next theorem. V. Addona, S. Wagon and H. Wilf: How to lose as little as possible 53 Theorem 9.11. 1. For (q, p) chosen uniformly from T , the probability that the upper bound of Theorem 8.1 and the combined lower bounds of Theorem 7.1 and Corollary 9.9 agree is (1− i)ψ(3− i) + (1 + i)ψ(3 + i) + π 2 4 + 2γ − 115 27 , where ψ is the digamma function Γ′/Γ, and γ is Euler’s constant. This is roughly 0.60. 2. For (q, p) as above, the probability that N(q, p) = max (⌈ 1− p p− q2 ⌉ , ⌊ 1 2(p− q) − 1 2 ⌋) (9.2) is approximately 0.87. Corollary 9.12. 
(Improved bounds on N) In T , N−(q, p) ≤ N(q, p) ≤ N+(q, p), where N− andN+ are, respectively, the positive-radical solutions, n, to the quadratic equations: 2n2(p− q)3 + 2n ( p2 − 3pq − p+ q2 + 2q ) (p− q)− (1− p)q(1− 3p+ 3q) = 0, and the equation DF= n/(2(n+ 1)), where DF is the derivative formula of Lemma 8.8. Proof. For the lower bound, the equation is equivalent to setting the second derivative formula (Lemma 9.2) to 0 and clearing nonzero terms. First define ker = 2n2(p− q)3 + 2n ( p3 − p2(4q + 1) + pq(4q + 3)− q2(q + 2) ) − (p− 1)q(3p− 3q − 1), the result of clearing nonzero terms from the second derivative formula. Then the second derivative formula vanishes iff ker does. Further, ∂nker > 0 in T when we add the condition Ln(q) < p. Because N(q, p) is not less than the least n such that p′′n(q) = 0, the result follows. The upper bound is obtained the same way, using the fact that the slope is not less than n/(2(n+ 1)), which becomes a quadratic relation. Expanding the rational expression for N− in a series in powers of p − q shows that it equals 1/(2(p−q))−3/2+1/(4p(1−p))+O(p−q), which relates it nicely to the simpler bound of Theorem 7.1. Define H(q, p) = d1/(2(p− q))− 3/2 + 1/(4p(1− p))e; while H is not a lower bound on N , it is a useful approximation when p is close to q and appears to be asymptotically perfect when the domain is rescaled (see Subsection 9.1). One can view the quadratic equations that give N± as cubic equations in p, in which case solving gives radical expressions for p±n , which bracket the curve pn. Such bounds are useful when generating images such as Figure 10, and also theoretically, as they are used in the proof of Theorem 9.13. We can measure how good the improved bounds are by assuming that the point (q, p) is uniformly distributed in T . Then one can ask: (1) How often does N− = N+? (2) How often does N = N? The answers are: “remarkably often.” Figure 10 shows points in T colored green if both bounds agree, red if the lower bound is correct, and yellow otherwise. The upper bound is sometimes correct, but not often. Of course, when the bounds agree 54 Ars Math. Contemp. 4 (2011) 29–62 we know N immediately, and when the lower bound is correct, then that can be proved by verifying that JN−(q, p) < 0 (more on computing J in Section 10 below). The blue curves are the graphs of p±1 ; they bracket p1, defined by the yellow-red boundary. In Figure 10 the green area where the two bounds agree is 75% of the triangle and the green-plus-red area where N equals the lower bound is 97% of the area. 0. 0.1 0.2 0.3 0.4 0.5 1 5 1 4 1 3 1 2 1 q p Figure 9: The green region is where the two bounds on N agree; the red region is where N agrees with the lower bound. Thus in the combined red and green regions N(q, p) is expressible exactly as max (⌈ 1−p p− q2 ⌉ , ⌊ 1 2(p−q) − 1 2 ⌋) . V. Addona, S. Wagon and H. Wilf: How to lose as little as possible 55 0. 0.1 0.2 0.3 0.4 0.5 15 14 13 12 q p Figure 10: The green region is where N− = N+; red is where N− = N < N+, and yellow is the rest. The blue curves are the graphs of p−1 and p + 1 . 9.1 A harmonic rescaling The views of Figures 9 and 10 do not show clearly what happens near the p = q line. We can take a microscope to that area by harmonically rescaling the domain. We do this by first rotating T clockwise 45◦ and then stretching out the vertical scale: precisely, after the rotation we change each y-coordinate to −(1/(2 √ 2y)) − 1/2. 
This is essentially just a change of coordinates from (q, p) to (p + q, 1/(p − q)); Figures 11 and 12 show the view through this microscope. The two approximationsN− andH exactly equalN a large percentage of the time. For the region defined by N ≤ 5000 we found that N = N− in > 99% of the region while N and H agree 96% of the time. Thus we can conjecture that the probability of either equation holding is asymptotically 1. 9.2 The situation when p and q are close The various diagrams suggest we look more closely at the situation near the p = q line. The structure can be discerned by looking at the regions in which the excess of N(q, p) over the simplest lower bound (Theorem 2.3) is constant. From this we will obtain the result that for any point (q, q) there is an integer δ such that, close to (q, q), we have N(q, p)− b 12(p−q) + 1 2c ≤ δ. 56 Ars Math. Contemp. 4 (2011) 29–62  1 2  1 2 0 0 1 2 3 4 5 6 7 8 9 10 1  p  q 1 2 2 pq  1 2 Figure 11: A rotated and vertically stretched view of the (q, p) domain. The colors have the same meanings as in Figure 10. We see here that the region (yellow) where N is not equal to N− is very small. Definition. For (q, p) ∈ T let ∆(q, p) = N(q, p) − b 12(p−q) + 1 2c, the amount by which the optimal n exceeds the lower bound derived from Ln ≤ pn. Suppose (q, p) is such that pn+j(q) < p < Ln(q) and also Ln+1(q) < p. The first inequality tells us that n+ 1 ≤ N ≤ n+ j. The last inequality means that the lower bound derived from Ln is n + 1. So ∆ ≤ j − 1. Now we can profitably examine the regions determined by the intersection points of the L-lines with the nullclines pn+j . Getting the intersection points is easy numerically using the function pn, computed by a differential equation (Section 10); they are plotted as large black joined points in Figure 13. Note that each Ln is tangent to pn at its right edge, strikes p2n at its left, and, because the slope of each nullcline is under 1 (Lemma 9.3), hits each in-between p-graph in exactly one point. Figure 13 tells the story. In the left image the colored regions correspond to constant values of ∆. The uppermost black arc connects points common to: L1 and p2; L2 and p3; and in general Ln and pn+1. The second-highest black arc connects points common to: L2 and p4; L3 and p5; L4 and p6; in general Ln+1 and pn+3. These arcs, which we cannot compute without computing pn, divide T into regions (right image of Figure 13) in each of which ∆ takes on only two values. Definition. For n, j ≥ 1, let qn,j be the q-value of the intersection point of Ln+j+1 and pn+2j−1 and let q−n,j be the same with p replaced by the lower bound p − n+2j−1 derived from the convexity theorem (Corollary 9.12). Thus the arcs in Figure 13 are obtained by fixing j and joining the points determined by {qn,j} as n = 1, 2, 3, . . . . Because we used a lower bound on p, we know that q−n,j < qn,j . V. Addona, S. Wagon and H. Wilf: How to lose as little as possible 57  1 2  1 2 0 0 20 40 1  p  q 1 2 2 pq  1 2 Figure 12: A rotated and vertically stretched view of the (q, p) domain in the region where N ≤ 50. Here red is where N(q, p) = H(q, p), yellow where they are not equal. The proportion of the space in which H is correct is 76%, but it appears to converge to 100% as N →∞. Further, we can express q−n,j quite simply by solving the equation ker = 0 (see Corollary 9.12) after substituting p = Ln+j−1(q), qn,j = − √ j(j2+j(2n−1)+(n−1)2) j+1 + j + n− 1 2j + 2n− 1 . The limit of this expression is easy to find; we use it to define q∞,j = 12 (1− √ j/(j + 1)). 
A derivative shows easily that qn,j is increasing in n, and therefore qn,j approaches q∞,j from the left. The points (q∞,j , q∞,j) are the limits of the arcs in Figure 13; we now turn to a proof of this useful fact. Because q−n,j < qn,j and approaches the limit from the left, we need only show that qn,j < q∞,j . This inequality is not true in general (though it does appear to be true when j = 1) but we can show that, for each fixed j, it is true for sufficiently large n, and that suffices for the limit. Theorem 9.13. For fixed j and sufficiently large n, qn,j < q∞,j . Proof. To say that qn,j is to the left of q∞,j is the same as saying that Ln+j−1(q∞,j) lies above pn+2j−1(q∞,j). This in turn is the same as the assertion that Jn+2j−1(q∞,j , Ln+j−1(q∞,j)) < 0. And this is equivalent to rn+2j(u) > ρ, where q∞,j and Ln+j−1(q∞,j) are used for q and p in ρ and u. 58 Ars Math. Contemp. 4 (2011) 29–62   0   1 2 q1,1 0.50 q,1q,3 17 15 14 13 12 1 0    1 1    2 23 0 2 2 4 0.5 17 15 13 12 1 Figure 13: The left image shows regions where ∆ is constant (between the red J-nullclines and black lines Ln). The right image shows how the dividing arcs separate T into regions where the discrepancy from the lower bound takes on one of two values. V. Addona, S. Wagon and H. Wilf: How to lose as little as possible 59 Now define g(n) = ( 2n n ) 4−n and let P ∗n,m(z) = z ng(n) m∑ ν=0 g(ν)  ν∏ j=1 2j − 1 2n− 1j + 1  z−2ν(1− z−2)−ν− 12 . This is the asymptotic series of Laplace-Heine [5, Thm. 8.21.3] for the Legendre polyno- mials Pn(u) for u > 1. Here z = u+ √ u2 − 1. Thus Pn(u) = P ∗ n,m(u) +O(n −m− 32 zn). Therefore, forming the ratio, we have rn(u) = P ∗n,m(u) P ∗n−1,m(u) +O(n−m− 3 2 ). To show that rn+2j(u)− ρ > 0 we must show that rn+2j ( 2n+ 4jn− 1 + 4j2 − 2 √ j(j + 1) 2(1 + j + √ j(j + 1)(2n+ 2j − 1)) ) > 2(j + 1)(n+ j − 1)√ j(j + 1)(2n+ 2j − 1)− (j + 1) . We can now apply the asymptotic series using three terms (m = 3), and take three terms of the Taylor series of the result, centered at∞. Fewer than three terms are not sufficient. When we do this using Mathematica’s Series command and some further simplification we get the following expression, whose positivity concludes the proof- rn+2j(u)− ρ = 1 2 (j + 1)2 ( 1− √ j/(j + 1) ) n−3 +O(n−4). Given q; let j0(q) = d(1− 2q)2/(4q(1− q))e, the largest j such that q∞,j ≤ q, let n0(j) be the least integer n guaranteed by the theorem, i.e., qn,j < q∞,j for n ≥ n0. It appears that n0 < 3j2, but we have no proved bound. Corollary 9.14. If q < p < Ln0(j0(q)) then N(q, p) ≤ b 12(p−q) + 1 2c+ d (1−2q)2 4q(1−q)e. Proof. The definition of n0 means that as one starts at the point (q, Ln0(j0(q))) and moves down, one strikes, in alternating order, the graphs of pn and the lines Ln. This means that for these points ∆ equals its value just under (q, Ln0(j0(q))), or 1 greater. Since the value of ∆ just below the line Ln0 is at most d(1− 2q)2/(4(1− q)q)e−1 (see comments at start of this section), the result is proved. Corollary 9.15. Given q0, with 0 < q0 ≤ 12 , we have lim(p,q)→(q0,q0)(q − p)N(q, p) = 1 2 . Proof. Immediate from the previous corollary, which implies that 1 (2(p− q)) − 1 2 ≤ N(q, p) ≤ 1 2(p− q) + (1− 2q)2 4(1− q)q + 2. 60 Ars Math. Contemp. 4 (2011) 29–62 If in Theorem 9.13 one replaces the sharp q∞,j by simply 1/(2j + 1) one obtains a much weaker theorem, which can be proved in the same way; i.e., qn,j < 1/(2j + 1) for sufficiently large n. However, in this case it appears that the result is true for all n, as is evident from Figure 13. 
A proof of this would yield a new upper bound onN(q, p), weaker than the one in Corollary 9.14, but with the advantage of being true for all (q, p). Open problems. Prove that rn+2j(u) < ρ, where 1/(2j+1) and Ln+j−1(1/(2j+1)) are, respectively, used for q and p in ρ and u. Find estimates for n0(j). Show that n0(1) = 1. 10 An algorithm for computing N(q, p) A straightforward algorithm to compute N(q, p) first uses symmetry to restrict to T and then finds the smallest integer n such that p ≥ pn(q); that value of n is N(q, p) by Lemma 8.17. One can start with the simple or improved bounds and then use either bisection or the secant method, repeatedly checking whether Jn(q, p) is positive or negative. This method works fine whenN is of modest size, but whenN is large the Legendre polynomials cannot be explicitly computed. A solution is to use the integral formula given in (5.4), which is a fine substitute for Pn. That formula means that we can determine the sign of Jn for each trial by using numerical integration on (1− p+ q) ∫ π 0 ( u+ √ u2 − 1 cos t )n dt− (1− p− q) ∫ π 0 ( u+ √ u2 − 1 cos t )n+1 dt. Of course, high-precision must be used as appropriate. One needs enough accuracy to account for the full precision of n which will be used as a trial in the root-finding process. Further, the expression √ u2 − 1 can be numerically unstable for extreme values of p and q and one should use the equivalent form 2 √ (1− p)p(1− q)q/(1− p− q). But one needs only the sign of the integral above. In Mathematica this means that when computing the integral numerically one needs a large working precision, but the accuracy goal can be quite small. The method is robust and takes only a few seconds to compute N(10−100, 2 · 10−100), which is 72768 90317 94675 98852 95987 53552 38752 84521 10838 88022 00705 28794 63897 19626 49789 77512 24788 32188 39061 36928. q p N(q, p) 10−5 2 · 10−5 72768 10−10 2 · 10−10 7276890317 10−15 2 · 10−15 727689031794675 10−20 2 · 10−20 72768903179467598852 10−25 2 · 10−25 7276890317946759885295987 10−30 2 · 10−30 727689031794675988529598753552 Table 1. The integration algorithm allows one to get giant values of N(q, p). V. Addona, S. Wagon and H. Wilf: How to lose as little as possible 61 When one wants not just the sign of Jn, but a numerical approximation to the full Jn-nullcline – the graph of pn – one can use a numerical differential equation approach. Because of the derivative formula and the known values at 0, we can set up the initial-value problem as p(0) = 1/(n+ 1) and p′(q) = p(1− p)(np− nq + p− 1) q(1− q)(n(q − p) + q) if q > 0 and n/(2(n+ 1)) if q = 0. This approach is a quick way to generate graphs of pn, such as those shown in various figures in this paper. It can also be used in an algorithm for computing N(q, p) where it can sometimes be faster than the use of numerical integration because the solution is needed only on the interval [0, q], while the integrals are computed from 0 to π. 11 Some open questions 1. Improve the bounds on N(q, p). 2. Find a more efficient algorithm for computing N(q, p) when (q, p) is near the origin. 3. Generalize, in a natural way, these results to the case of three players. 4. Prove the second derivative conjectures, which we obtained heuristically by manip- ulating Taylor polynomials: (a) lim q→0+ p′′n(q) = 2n2 + 5n+ 2 6(n+ 1) , and (b) lim q→( n2n+1 ) − p′′n(q) = 4(2n+ 1) n(2n2 + n− 1) . 5. 
(Asymptotic closed form conjectures)
(a) If q and p are chosen uniformly from the harmonically rescaled domain then the probability that N(q, p) = N⁻(q, p) approaches 1 as N → ∞.
(b) Same as above, but with the conclusion that $N = \lceil 1/(2(p-q)) - 3/2 + 1/(4p(1-p)) \rceil$ with asymptotic probability 1.

References

[1] M. Apagodu and D. Zeilberger, Multi-Variable Zeilberger and Almkvist-Zeilberger algorithms and the sharpening of Wilf-Zeilberger theory, Adv. Appl. Math. 37 (2006), 139–152.
[2] T. Lengyel, On approximating point spread distributions, J. Stat. Computation and Simulation, to appear, 2010.
[3] E. D. Rainville, Special Functions, Macmillan, New York, 1960.
[4] A. Strzebonski, Cylindrical algebraic decomposition using validated numerics, J. Symbolic Comp. 41 (2006), 1021–1038.
[5] G. Szegö, Orthogonal Polynomials, American Mathematical Society, Providence, R.I., 1939.
[6] S. Wagon, Macalester College Problem of the Week 1128, Dec. 2009, http://mathforum.org/wagon/fall09/p1128.html.
[7] www.stanwagon.com/public/HowToLoseAsLittleAsPossibleSupplement.nb.
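As a companion to the algorithm of Section 10 (and in the spirit of the authors' Mathematica supplement [7], though the code below is an independent sketch), here is one way to implement the integration-based computation of N(q, p). The precision settings and function names are illustrative assumptions; as Section 10 notes, q and p should be supplied exactly (for example as rationals) so that a large working precision can be used, and the settings may need adjusting near the crossover value of n.

```mathematica
(* Sign of J_n(q, p) for p + q < 1, via the integral representation (5.4); the two
   integrals of Section 10 are combined here into a single integrand. *)
signJ[n_, q_, p_, prec_: 80] := Module[{u, s},
  u = 1 + 2 p q/(1 - p - q);
  s = 2 Sqrt[p q (1 - p) (1 - q)]/(1 - p - q);   (* stable form of Sqrt[u^2 - 1] *)
  Sign[NIntegrate[((1 - p + q) - (1 - p - q) (u + s Cos[t])) (u + s Cos[t])^n,
    {t, 0, Pi}, WorkingPrecision -> prec, PrecisionGoal -> 10]]]

(* N(q, p) as the smallest n with J_n <= 0, found by binary search between the proved
   bounds of Theorems 7.1 and 8.1; for p + q > 1 apply the symmetry (4.8) first. *)
NViaIntegration[q_, p_] /; 0 < q < p && p + q < 1 := Module[{lo, hi, mid},
  lo = Max[1, Floor[1/(2 (p - q)) - 1/2]];
  hi = Ceiling[Max[1 - p, q]/(p - q)];
  While[lo < hi,
   mid = Quotient[lo + hi, 2];
   If[signJ[mid, q, p] > 0, lo = mid + 1, hi = mid]];
  lo]

NViaIntegration[18/100, 2/10]      (* 26, as in Figure 1 *)
NViaIntegration[10^-5, 2 10^-5]    (* 72768, the value reported in Table 1 *)
```

The nullcline p_n(q) can likewise be traced by solving the initial-value problem described at the end of Section 10; starting just to the right of q = 0 with the limiting value 1/(n + 1) and slope n/(2(n + 1)) avoids the 0/0 form at the origin. The starting offset 10^-6 is an arbitrary choice for this sketch.

```mathematica
(* The graph of p_n on (0, qmax) from the derivative formula of Lemma 8.8;
   the dependent variable is called w to avoid clashing with other symbols. *)
pnODE[n_, qmax_] := NDSolveValue[
  {w'[q] == w[q] (1 - w[q]) (n w[q] - n q + w[q] - 1)/(q (1 - q) (n (q - w[q]) + q)),
   w[10^-6] == 1/(n + 1) + (n/(2 (n + 1))) 10^-6},
  w, {q, 10^-6, qmax}]

pnODE[1, 0.3][0.25]   (* approximately 0.6 = p_1(1/4), matching Lemma 9.1 *)
```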