Maximizing Win Probability in the Game of Farkle

Image by Matt Busche using modified povray source by Piotr Borys

Neller and Presser modeled a simple dice game called pig as a Markov Decision Process (MDP) and used value iteration to find the optimal game winning strategy1. Inspired by their approach, I've constructed a variant of an MDP which can be used to calculate the strategy that maximizes the chances of winning 2-player farkle. Due to the three consecutive farkle penalty, an unfortunate or foolish player can farkle repeatedly to achieve an arbitrarily large negative score. For this reason the number of game states is unbounded and a complete MDP model of farkle is not possible. To bound the problem, a limit on the lowest possible banked score is enforced. The calculated strategy is shown to converge exponentially to the optimal strategy as this bound on banked scores is lowered.

Each farkle turn proceeds by iteratively making a pre-roll banking decision, a (contingent) roll of the dice, and a post-roll scoring decision. I modified the classic MDP to include a secondary (post-roll) action to fit this turn model. A reward function that incentivizes winning the game is applied. A similarly modified version of value-iteration (that maximizes the value function for both the pre-roll banking decision, and the post-roll scoring decision) is then used to find an optimal farkle strategy.

With a lower bound of -2500 points for banked scores, there are 423,765,000 distinct game states and so it is not convenient to share the entire strategy in printed form. Instead, I provide some general characterizations of the strategy. For example, if both players use this same strategy, the player going first will win 53.487% of the time. I also provide samples of complete single-turn strategies for various initial banked scores. Currently, only the strategy for Facebook Farkle has been calculated, but the strategy for other scoring variants of farkle could easily be deduced using the same software.

The Rules of Farkle

Farkle rules differ only slightly from the rules of Zilch, but are provided here for completeness.

Farkle is played with two or more players and six six-sided dice. Each player takes turns rolling the dice. The dice in a roll can be worth points either individually or in combination. If any points are available from the roll, the player must set aside some or all of those scoring dice, adding the score from those dice to their point total for the turn. After each roll, a player may either re-roll the remaining dice to try for more points or may bank the points accumulated this turn (though you can never bank less than 300 points). When a player banks his points, the player's turn is ended and the dice are passed to the next player.

If no dice in a roll score, then the player loses all points accumulated this turn and their turn is ended. This is called a farkle, a sorrowful event indeed.

If all dice in a roll score, the player gets to continue his turn with all six dice. This is called hot dice and is guaranteed to brighten your day.

A player may continue rolling again and again accumulating ever more points until he either decides to bank those points or loses them all to a farkle.

If a player ends three consecutive turns with a farkle, they not only lose their points from the turn but also lose 500 points from their banked game score. (This is the only way to lose banked points.) After a triple farkle, your consecutive farkle count is reset to zero so you're safe from another triple farkle penalty for at least three more turns.

The game ends when one player has banked a total of 10,000. Unlike zilch, other players do not get a final turn.

Scoring is as follows:

• Each 1 is worth 100 points.
• Each 5 is worth 50 points.
• A set of three 1s is worth 1000 points.
A set of three 2s is worth 200 points.
A set of three 3s is worth 300 points.
A set of three 4s is worth 400 points.
A set of three 5s is worth 500 points.
A set of three 6s is worth 600 points.
Unlike zilch, each extra die in a set increases the value of the set by a like amount. So four 4s are worth 800 points, five 5s are worth 1500, and six 1s are worth 4000.
• Three pair is worth 750 points. Unlike Zilch, a set of four 2s and two 4s may not be scored as three pairs.
• A six die straight is worth 1500 points.
• Each die can only be used once when scoring. (If you roll two 1s, two 2s, and two 3s you can either count the two 1s for 200 or use all six dice for three-pair and 1500 points — you can't use the ones both ways for 1700 points.)

Markov Decision Processes and Value Iteration

Note: those familiar with MDPs may find the non-standard variable names I use to present this standard subject distracting. My aim is to use variable names most meaningful in the final value-iteration equation. I beg your indulgence. As a courtesy, I've included mouse-hover popups where each such non-standard variable name is introduced (highlighted in blue) explaining the motivation for the change.

A Markov Decision Process (MDP)2 is a system having a finite set of states $S$. For each state $s \in S$, there are a set of actions that may be taken $A(s)$. For each action $a \in A(s)$, there is a set of transition probabilities $P_a(s, s')$ defining the probability of transitioning from $s$ to each state $s' \in S$ given that action $a$ was taken while in state $s$. When action $a$ is taken from state $s$, the MDP responds by randomly moving to a new state $s'$ as governed by the transition probabilities $P_a(s, s')$ and then assigning the decision maker a reward $D_a(s, s')$.

The objective is to find a strategy function $G(s)$ that returns the particular action at each state $s \in S$ that will maximize the expected cumulative reward given by:

$$\sum_{k=0}^{\infty} \gamma^k D_{a_k}(s_k, s_{k+1})$$

where $k$ is a discreet time variable, $s_k$ is the game state at time $k$, $a_k$ is the player action taken from from state $s_k$, and $\gamma \in [0,1)$ is a constant discount factor for future rewards. The expectation must be taken over all possible state transition paths, and maximized over all possible choices for the actions $a_k$ taken in each state $s_k$. Then $G(s)$ will be defined by the $a_k$ taken from each state $s_k$ that maximizes this expectation.

One technique for solving this problem is value iteration. With this technique, each state $s$ is given a decimal value $W(s)$ which is an estimate of the expected discounted sum of all future rewards gained from state $s$. The estimate for $W(s)$ is iteratively refined for all $s$ by applying this update equation sequentially to all states $s$: $$W_{i+1}(s) := \max_{a \in A(s)} \left[ \sum_{s'} P_a(s, s')(D_a(s, s') + \gamma W_i(s'))\right]$$

Note that at each state $s$, the action which provides maximum total reward is selected. So not only is the estimate of $W(s)$ iteratively refined, but the best action $a$ taken for each state $s$ is also simultaneously improved. Iteration continues until $W_{i+1}(s)$ and $W_i(s)$ converge for all $s$, and $G(s)$ is then the set of actions $a$ selected for each state $s$ in the final iteration.

Extending the MDP to support Farkle

During a farkle turn, a player must iteratively

1. decide whether to bank his current points,
2. then, if he decides not to bank, roll the dice,
3. then decide how to score the dice.
So that's potentially two decisions (actions) for each state transition: a pre-roll banking decision (to bank, or to roll), and a post-roll scoring decision (how to score the dice just rolled). But an MDP (as described above) consists of only one action followed by a random transition. So one may reasonably ask if farkle can even be modeled with an MDP.

To fit farkle to an MDP model, one need only consider the result of a roll as part of the game state, and then reorder the turn sequence to that of a combined scoring-and-bank/roll action followed by either an end-of-turn-event or another random-roll-event. But this increases the number of game states by at least a few orders of magnitude and makes the problem unsolvable without a commensurate increase in computer resources.

Alternatively, the roll-decision could be splintered to include any conceivable combination of scoring instructions, directing the MDP how to score each potential roll before the roll is even made, thereby allowing the state machine to transition without additional input from the player once the roll is made. Aside from being a painful way of thinking about the problem, the number of possible scoring instructions is enormous, and the approach is again not feasible.

Rather than forcing the game to fit the MDP model, I instead define an extended MDP (EMDP) to more naturally model the game. Like an MDP, an EMDP has a finite set of states $S$. For each state $s \in S$, there are a set of primary actions that may be taken $A(s)$. For each primary action $a \in A(s)$, there are a set of sets of secondary actions $R_a(s)$. Once primary action $a$ is taken, one set of secondary actions $r \in R_a(s)$ is selected randomly by the EMDP according to a probability distribution $P_a(s)$. So instead of transitioning, the EMDP responds to action $a$ by offering a randomly chosen set of secondary actions that may be taken. For each secondary action $c \in r$, a deterministic transition state $s'$ is defined by a transition matrix $s' = X_{a,c}(s)$. Selection of action $c$ causes the EMDP to transition to $s'$ and reward $D_{a,c}(s, s')$ is granted.

Because there are two actions to be taken in a turn, the optimal strategy also has two parts: $G_A(s)$ is the optimal primary action in state $s$, and $G_C(s, r)$ is the optimal secondary action to take in state $s$ given that secondary action set $r$ was randomly offered by the EMDP in response to action $a$.

Value iteration processing is also extended to account for the secondary action: $$W_{i+1}(s) := \max_{a \in A(s)} \left[ \sum_{r \in R_a(s)} P_{a,r}(s) \max_{c \in r} \Big[ D_{a,c}(s, s') + \gamma W_i(s') \Big] \right]$$

Neller and Presser chose a reward function to incentivize winning the game1. For any transition from a non-winning game state $s$ to a winning game state $s_w$, $D_{a,c}(s, s_w) = 1$ (where $s_w$ is any state where the player's banked points plus his turn points meets or exceeds the game goal of 10,000 points, and where his turn points meets or exceeds the minimum banking threshold). For transitions to any other non-winning state $s_v$, $D_{a,c}(s, s_v) = 0$. Because all game winning states $s_w$ are terminal, all future rewards from such states must be zero, so $W(s_w) = 0$. Because $W$ is known for all game winning states $s_w$, $W(s_w)$ is never updated during value iteration. (I.e., although a game winning state can appear on the right side of the value iteration update equation, it never appears on the left.)

Normally for an MDP, $0 \le \gamma < 1$, but we do not wish to value a game you win in 30 turns, less than a game you win in 10. Following Neller and Presser's approach, I instead set $\gamma = 1$. In general this can prevent value iteration from converging, but it does not cause a problem for farkle. (I think this is because there is no circular state transition path offering unbounded rewards, and because only the player that gets to a game winning state first actually wins, which ensures that the optimal strategy can't be attracted to some infinitely long path to a game winning state.) With $\gamma = 1$, $W(s)$ converges to the probability of winning from any non-terminal state $s$ when using an optimal strategy; the $a$ selected in the $\max_a$ operation converges to the optimal banking strategy from state $s$: $a = G_A(s)$; and the $c$ selected by the $\max_c$ operation converges to the optimal scoring strategy from state $s$ given that roll $r$ was thrown: $c = G_C(s,r)$.

Given that $\gamma = 1$, simplifications can be achieved if you move the reward for transitions to a game winning state out of the reward function, and into the value function for those same game winning states. That is, instead of defining $W(s_w) = 0$, define $W(s_w) = 1$, and set $D_{a,c}(s, s') = 0$ everywhere, allowing the reward function to be completely eliminated from the update equation. This definition also results in $W(s)$ consistently being the probability of winning the game from any state $s$ (including terminal game winning states).

Applying the simplifications of $\gamma = 1$, $D_{a,c}(s, s') = 0$, and $W(s_w) = 1$ reduces the extended value iteration update formula to:

$$W_{i+1}(s) := \max_{a \in A(s)} \left[ \sum_{r \in R_a(s)} P_{a,r}(s) \max_{c \in r} W_i(s') \right]$$

This general approach for finding the optimal play strategy for games having both pre-roll and post-roll actions, is further detailed for the specifics of 2-player farkle in the sections below.

Game State Characterization

The current game state $s$ is characterized by six component state variables:

$$s = (t, n, b, d, f, e)$$

where $t$ is the number of points accumulated only from your current turn. Once $b + t \gt 9950$ (which means you've hit the goal of 10,000 points) and you've met the minimum requirement to bank $t > 250$, the game is over, so for non-terminal game states we have:

$$t \in \{0, 50, 100, ..., \max [250, 9950-b]\}\text{.}$$

$n$ is the number of dice you have to roll

$$n \in \{1, 2, 3, 4, 5, 6\}\text{,}$$

$b$ is your banked score for which I enforce a lower bound $L$

$$b \in \{L, L+50, ..., -100, -50, 0, 50, 100, ..., 9950\}\text{,}$$

$d$ is your opponent's banked score which is also lower bounded to $L$

$$d \in \{L, L+50, ..., -100, -50, 0, 50, 100, ..., 9950\}\text{,}$$

$f$ is your consecutive farkle count (from previous turns)

$$f \in \{0, 1, 2\}\text{, and}$$

$e$ is your opponent's consecutive farkle count

$$e \in \{0, 1, 2\}\text{.}$$

The Farkle Value Iteration Equation

In this section we apply the farkle component state variables and rules defined in previous sections to the EMDP value iteration equation:

$$W_{i+1}(s) := \max_{a \in A(s)} \left[ \sum_{r \in R_a(s)} P_{a,r}(s) \max_{c \in r} W_i(s') \right]$$

Let's detail $A(s)$ first. For farkle, $A(s)$ is the set of available banking actions having at most two members: BANK and ROLL. For game states where you have so far accumulated less than 300 points, you have only one pre-roll (primary) action available to you: ROLL the dice. But for all other game states you have two pre-roll actions available: BANK, or ROLL. This yields:

$$W_{i+1}(s) := \left\{ \begin{array}{ll} \sum\limits_{r \in R_{\text{ROLL}}(s)} P_{{\text{ROLL}},r}(s) \max\limits_{c \in r} W_i(s'), & \text{if t < 300}.\\ \\ \max \bigg[\sum\limits_{r \in R_{\text{BANK}}(s)} P_{{\text{BANK}},r}(s) \max\limits_{c \in r} W_i(s'), \\ \hspace{30pt} \sum\limits_{r \in R_{\text{ROLL}}(s)} P_{{\text{ROLL}},r}(s) \max\limits_{c \in r} W_i(s') \bigg], & \text{if t \ge 300}.\\ \\ \end{array} \right.$$

In the case of a bank action the equation collapses. $R_{\text{BANK}}(s)$ has only one entry: $r_{\text{BANK}}$. There's only one member of the probability distribution: $P_{{\text{BANK}}, r_{\text{BANK}}}(s) = 1$. Also, $r_{\text{BANK}}$ has only one entry: $c_{\text{BANK}}$. So for the case of $t \ge 300$ , the first member of the outer max operation reduces to just $W_i(s')$. All the mathematical machinery in this case is flexible enough to handle the banking case, but is entirely unnecessary. But what exactly is $s'$ after you bank?

Looking back at the game state characterization from the previous section, there is no variable that encodes whose turn it is. Everything I've written so far is from the perspective of the player who controls the dice, and there is no $s'$ expressible in terms of our six component state variables that identifies your state after a banking operation. (This is by design and is consistent with Neller and Presser's approach for optimizing pig game play strategy.) After you bank, it is your opponent's turn who we assume is also playing the optimal strategy, and we can express his state after you bank. If your game state just before you banked was $s = (t, n, b, d, f, e)$, then your opponent's state after you bank will be $o' = (0, 6, d, b + t, e, 0)$. To be clear, after you bank your opponent will have 0 turn points, 6 dice to roll, a banked score of $d$, an opponent's banked score of $b + t$ (which is your new banked score), $e$ consecutive farkles, and his opponent will have $0$ consecutive farkles (your farkle count reverting to zero having just banked). Your opponent's win probability is $W(o')$, which means our win probability after the bank must be: $W_i(s') = 1 - W_i(o') = 1 - W_i(0, 6, d, b + t, e, 0)$, which yields: $$W_{i+1}(s) := \left\{ \begin{array}{ll} \sum\limits_{r \in R_{\text{ROLL}}(s)} P_{{\text{ROLL}},r}(s) \max\limits_{c \in r} W_i(s'), & \text{if t < 300}.\\ \\ \max \bigg[\Big(1 - W_i(0,6,d,b+t,e,0)\Big), \\ \hspace{30pt} \sum\limits_{r \in R_{\text{ROLL}}(s)} P_{{\text{ROLL}},r}(s) \max\limits_{c \in r} W_i(s') \bigg], & \text{if t \ge 300}.\\ \\ \end{array} \right.$$

In the case of a ROLL action, $R_{\text{ROLL}}(s)$ corresponds to the set of all possible rolls of the dice from state $s$. More precisely, each roll $r$ is defined as a set of possible scoring decisions for some permutation of thrown dice. The $\sum_r$ operation is summing over each possible roll $r \in R_{\text{ROLL}}(s)$. $P_{\text{ROLL},r}(s)$ is the probability of making roll $r$ from game state $s$. And each $c \in r$ is one possible scoring decision given that roll $r$ was thrown. Given scoring decision $c$, the new game state $s'$ is determined. The $\max_c$ operation maximizes the expected win probability $W_i(s')$ given that roll $r$ was thrown over these possible scoring decisions.

First note that the set of potential rolls from game state $s = (t, n, b, d, f, e)$ is only dependent on the number of dice you are rolling, so:

$$R_{\text{ROLL}}(s) = R_{\text{ROLL}}(n)$$

Second note that

$$P_{\text{ROLL},r}(s) = {1 \over {6^n}}$$

Third, because the expression for $W_i(s')$ is fundamentally different for the case of farkling rolls vs. scoring rolls, it is convenient to partition $R_{\text{ROLL}}(n)$ into two subsets: the subset of all farkling rolls $R_{\text{FARKLE}}(n)$, and the subset of all scoring rolls $R_{\text{SCORE}}(n)$:

$$R_{\text{ROLL}}(n) = R_{\text{FARKLE}}(n) \bigcup R_{\text{SCORE}}(n)$$

Fourth, if roll $r$ is a farkle, then $r$ will have only one member: a zero point farkle scoring decision and the $max_c$ operation can be dropped.

Applying these four observations gives:

$$\begin{array}{rl} \sum\limits_{r \in R_{\text{ROLL}}(s)} P_{{\text{ROLL}},r}(s) \max\limits_{c \in r} W_i(s') &= \sum\limits_{r \in R_{\text{FARKLE}}(n)} {1 \over {6^n}} W_i(s') + \sum\limits_{r \in R_{\text{SCORE}}(n)} {1 \over {6^n}} \max\limits_{c \in r} W_i(s') \\ &= {1 \over {6^n}} \sum\limits_{r \in R_{\text{FARKLE}}(n)} W_i(s') + {1 \over {6^n}} \sum\limits_{r \in R_{\text{SCORE}}(n)} \max\limits_{c \in r} W_i(s') \end{array}$$

After a farkle, it becomes your opponent's turn and there is again no expression for the new game state. We again instead express $W_i(s')$ in terms of your opponents win probability after your farkle:

$$W_i(s') = 1 - W_i(0, 6, d, b', e, f')$$

where $b'$ is your new banked score after your farkle, which may have decreased due to the three consecutive farkle penalty, but is enforced to always be at least $L$ to keep the problem tractable:

$$b' = \max [ L, b - Y_f ]\text{;}$$ where $Y_f$ is the number of points you lose from your banked score when you farkle while already having $f$ consecutive farkles

\begin{align*} Y_0 &= 0 \\ Y_1 &= 0 \\ Y_2 &= 500\text{;} \end{align*}

and where $f'$ is your new consecutive farkle count which normally just increments, but is reset back to zero if you just had your third consecutive farkle

$$f' = (f+1) \mod 3\text{.}$$

Note also that for all farkling rolls the expression inside the sum is the same, so we can replace the sum with a multiplicative factor equaling the number of ways to roll a farkle with $n$ dice. Combining that count with the ${1 \over {6^n}}$ simplifies to the probability of farkling with $n$ dice3:

\begin{align*} F_1 &= 2/3 \\ F_2 &= 4/9 \\ F_3 &= 5/18 \\ F_4 &= 17/108 \\ F_5 &= 25/324 \\ F_6 &= 5/216 \\ \end{align*}

Combining the above observations gives this substitution:

$${1 \over {6^n}} \sum\limits_{r \in R_{\text{FARKLE}}(n)} W_i(s') = F_n (1 - W_i(0,6,d,b',e,f'))$$

To detail the expression for the case of scoring rolls, first let $C_T(c)$ be the number of points taken with scoring combination $c$, and let $C_N(c)$ be the number of dice used with scoring combination $c$. So after rolling roll $r \in R_{\text{SCORE}}(n)$ and selecting scoring action $c \in r$ the state transitions from $s = (t, n, b, d, f, e)$ to

$$s' = (t', n', b, d, f, e)$$

where

\begin{align*} t' &= t+C_T(c) \\ n' &= h(n-C_N(c)) \end{align*}

and where $h(x)$ is a hot-dice function for resetting the number of available dice back to $6$ when all dice are successfully scored:

$$h(n)=\begin{cases}6, & \text{for n = 0.} \\ n, & \text{otherwise.}\end{cases}$$

This gives our final value iteration equation, where (below) I repeat all the supporting equations for convenience:

$$W_{i+1}(t,n,b,d,f,e) := \left\{ \begin{array}{ll} F_n (1 - W_i(0,6,d,b-Y_f,e,f')) + \\ \hspace{30pt} {1 \over 6^n} \sum\limits_{r \in R_{\text{SCORE}}(n)} \max\limits_{c \in r} \left[ W_i(t',n',b,d,f,e)\right], & \text{if t < 300}.\\ \\ \max \bigg[ \Big(1 - W_i(0,6,d,b+t,e,0)\Big), \\ \hspace{30pt} \Big( F_n (1 - W_i(0,6,d,b',e,f')) + \\ \hspace{60pt} {1 \over 6^n} \sum\limits_{r \in R_{\text{SCORE}}(n)} \max\limits_{c \in r} \left[ W_i(t',n',b,d,f,e) \right] \Big) \bigg], & \text{if t \ge 300}. \end{array} \right.$$

where

\begin{align*} b' &= \max [ L, b - Y_f ] \\ f' &= (f+1) \mod 3 \\ t' &= t+C_T(c) \\ n' &= h(n-C_N(c)) \\ h(n) &=\begin{cases}6, & \text{for n = 0.} \\ n, & \text{otherwise.}\end{cases} \\ Y_0 &= 0 \\ Y_1 &= 0 \\ Y_2 &= 500 \\ F_1 &= 2/3 \\ F_2 &= 4/9 \\ F_3 &= 5/18 \\ F_4 &= 17/108 \\ F_5 &= 25/324 \\ F_6 &= 5/216 \\ \end{align*}

Performance

With a lower bound of $L=-2500$ there are 423,765,000 game states. Each state is modeled with a double precision floating point number requiring 8 bytes, so the entire state matrix requires 3.39012 billion bytes of RAM (plus some array overhead). With game goal $g$ the number of states grows as $(g-L)^3$. Lowering $L$ from -2500 to, say, -10,000 (to eliminate all reasonable doubt that you'll ever venture into portions of the calculated strategy that are non-optimal) will increase the number of game states to over 1.7 billion and memory requirements to 14 GB (which is more than I have on any of my computers).

The value iteration software to solve the optimal farkle problem was written in C++. Obvious optimizations were made. For example, I don't actually iterate over possible rolls, but only over a precalculated set of unique score sets (which is orders of magnitude smaller in count than rolls), and weighting each score set by the number of rolls that share that same set. I let the program iterate over all game states until the maximum relative change over all states from one iteration to the next was less than 1 part in a billion. (I.e., iterations continued until no state had a change in value from one iteration to the next of more than 1 part in a billion, so that even those states with extremely small win probabilities were calculated with precision.) With the floor set to -2500 points, it took 62 iterations for the matrix to converge. Using one core of a dual-core Intel I3 4130T, the software performed 1.30 million state updates per second, each pass of the matrix taking 5 minutes 26 seconds, and convergence taking 5 hours 37 minutes.

The Strategy

It is not practical to list the win probabilities and banking rules for all of the half billion game states in two player farkle. Here I provide only a limited view into the complete strategy by means of five examples. Each of the five subsections below provide a 2 dimensional slice of the 6 dimensional strategy matrix, sufficient for optimal play of a single turn for a particular start-of-turn game state.

Optimal Strategy Case: $b = 0, d = 0, f = 0, e = 0$

Table 1 shows the win probabilities and banking actions needed to play an opening turn optimally. Each cell shows the win probability for different turn scores (starting at $t=0$ in the top row and increasing down the page) and/or different number of dice to roll (starting at $n=6$ in the first column and decreasing as you move to the right). For all entries in this table, the other four component state variables are fixed: your banked score (b) is fixed at 0 points, your opponent's banked score (d) at 0 points, your consecutive farkle count (f) at 0, and your opponent's consecutive farkle count (e) at zero. States shaded green are states from which your optimal banking action is to roll. States shaded red are states from which your optimal banking action is to bank. States with an asterisk are inaccessible and can be ignored (although the software still calculates your win probability in-, and optimal play out of- these states which would be useful if, say through a disruption in the laws of the universe, you somehow find yourself in such a state).

Table 1. Win probabilities and banking actions for optimal turn play for b=0, d=0, f=0, and e=0.

tn
654321
00.5348700.506721*0.493622*0.487801*0.486163*0.489950*
500.540099*0.5110050.495849*0.489177*0.487586*0.491718*
1000.545359*0.5157760.4993740.490798*0.489034*0.493525*
1500.550709*0.520616*0.5038150.4931350.490494*0.495375*
2000.556198*0.525472*0.5085640.4971580.4924620.497250*
2500.561811*0.530484*0.513285*0.5020740.4954390.499128
3000.5674480.535760*0.518054*0.5066800.5032900.503290
3500.5730820.5411810.523166*0.511296*0.5097110.509711
4000.5787100.5466040.5285790.5161460.5161460.516146
4500.5843320.5520280.5339960.5225920.5225920.522592
5000.5899660.5574520.5394180.5290480.5290480.529048
5500.5956750.5628730.5448420.5355130.5355130.535513
6000.6014360.5683380.5502680.5419860.5419860.541986
6500.6071950.5739370.5556930.5484630.5484630.548463
7000.6129410.5795880.5611150.5549440.5549440.554944
7500.6186710.5852320.5665340.5614270.5614270.561427
8000.6244000.5908660.5719480.5679100.5679100.567910
8500.6301730.5964890.5773540.5743920.5743920.574392
9000.6359870.6021370.5827530.5808700.5808700.580870
9500.6417930.6079140.5881420.5873430.5873430.587343
10000.6476150.6137820.5938090.5938090.5938090.593809
10500.6534170.6196330.6002660.6002660.6002660.600266
11000.6591920.6254660.6067130.6067130.6067130.606713
11500.6649390.6312800.6131470.6131470.6131470.613147
12000.6706570.6370720.6195670.6195670.6195670.619567
12500.6763450.6428420.6259700.6259700.6259700.625970
13000.6820020.6485870.6323560.6323560.6323560.632356
13500.6876250.6543060.6387210.6387210.6387210.638721
14000.6932120.6599980.6450650.6450650.6450650.645065
14500.6987620.6656600.6513850.6513850.6513850.651385
15000.7042890.6712920.6576810.6576810.6576810.657681
15500.7098400.6768920.6639500.6639500.6639500.663950
16000.7153500.6824580.6701900.6701900.6701900.670190
16500.7208180.6879890.6764000.6764000.6764000.676400
17000.7262430.6934830.6825790.6825790.6825790.682579
17500.7316220.6989380.6887230.6887230.6887230.688723
18000.7369550.7043540.6948320.6948320.6948320.694832
18500.7422390.7097270.7009040.7009040.7009040.700904
19000.7474720.7150580.7069360.7069360.7069360.706936
19500.7526550.7203450.7129270.7129270.7129270.712927
20000.7578460.7255870.7188750.7188750.7188750.718875
20500.7630280.7307820.7247810.7247810.7247810.724781
21000.7681620.7359290.7306400.7306400.7306400.730640
21500.7732410.7410260.7364530.7364530.7364530.736453
22000.7782620.7460720.7422170.7422170.7422170.742217
22500.7832240.7510660.7479320.7479320.7479320.747932
23000.7881270.7560060.7535940.7535940.7535940.753594
23500.7929690.7608920.7592040.7592040.7592040.759204
24000.7977880.7657220.7647590.7647590.7647590.764759
24500.8026000.7704950.7702580.7702580.7702580.770258
25000.8073670.7757000.7757000.7757000.7757000.775700
25500.8120690.7810840.7810840.7810840.7810840.781084
26000.8167070.7864080.7864080.7864080.7864080.786408
26500.8212780.7916710.7916710.7916710.7916710.791671
27000.8257810.7968710.7968710.7968710.7968710.796871
27500.8302180.8020080.8020080.8020080.8020080.802008
28000.8345860.8070810.8070810.8070810.8070810.807081
28500.8388860.8120870.8120870.8120870.8120870.812087
29000.8431160.8170250.8170250.8170250.8170250.817025
29500.8472770.8218940.8218940.8218940.8218940.821894
30000.8513670.8266960.8266960.8266960.8266960.826696
30500.8553870.8314280.8314280.8314280.8314280.831428
31000.8593360.8360900.8360900.8360900.8360900.836090
31500.8632140.8406820.8406820.8406820.8406820.840682
32000.8670190.8452020.8452020.8452020.8452020.845202
32500.8707530.8496500.8496500.8496500.8496500.849650
33000.8744150.8540250.8540250.8540250.8540250.854025
33500.8780050.8583270.8583270.8583270.8583270.858327
34000.8815220.8625550.8625550.8625550.8625550.862555
34500.8849670.8667090.8667090.8667090.8667090.866709
35000.8883430.8707880.8707880.8707880.8707880.870788
35500.8916520.8747910.8747910.8747910.8747910.874791
36000.8948880.8787190.8787190.8787190.8787190.878719
36500.8980520.8825710.8825710.8825710.8825710.882571
37000.9011450.8863470.8863470.8863470.8863470.886347
37500.9041670.8900470.8900470.8900470.8900470.890047
38000.9071180.8936710.8936710.8936710.8936710.893671
38500.9099990.8972190.8972190.8972190.8972190.897219
39000.9128090.9006910.9006910.9006910.9006910.900691
39500.9155480.9040880.9040880.9040880.9040880.904088
40000.9182190.9074090.9074090.9074090.9074090.907409
40500.9208200.9106550.9106550.9106550.9106550.910655
41000.9233530.9138260.9138260.9138260.9138260.913826
41500.9258190.9169200.9169200.9169200.9169200.916920
42000.9282190.9199390.9199390.9199390.9199390.919939
42500.9305600.9228840.9228840.9228840.9228840.922884
43000.9328470.9257560.9257560.9257560.9257560.925756
43500.9350680.9285550.9285550.9285550.9285550.928555
44000.9372260.9312820.9312820.9312820.9312820.931282
44500.9393210.9339370.9339370.9339370.9339370.933937
45000.9413540.9365210.9365210.9365210.9365210.936521
45500.9433260.9390350.9390350.9390350.9390350.939035
46000.9452360.9414790.9414790.9414790.9414790.941479
46500.9470870.9438540.9438540.9438540.9438540.943854
47000.9488790.9461600.9461600.9461600.9461600.946160
47500.9506120.9484000.9484000.9484000.9484000.948400
48000.9522870.9505730.9505730.9505730.9505730.950573
48500.9539050.9526790.9526790.9526790.9526790.952679
49000.9554680.9547210.9547210.9547210.9547210.954721
49500.9569770.9566990.9566990.9566990.9566990.956699
50000.9586140.9586140.9586140.9586140.9586140.958614

The game begins in the upper left cell at $t=0$ and $n=6$. The win probability listed here means that if both players play optimally, the player going first wins 53.487% of the time. To play a turn, first you must choose your banking action. Since cell (t=0, n=6) is green, the optimal banking action is to roll.

Second, you roll the dice. Let's say for your opening roll, you throw:

$$6, 5, 3, 3, 3, 2$$

Third, you must choose how to score the dice. Here you have three choices:

• take the 5 for 50 points,
• take the three 3 for 300 points, or
• take the 5 and the three 3 for 350 points.

To determine your optimal scoring choice, you must look at the win probabilities from the three states corresponding to each scoring option:

\begin{align*} W(t=50, n=5) &= 0.511005\\ W(t=300, n=5) &= 0.506680\\ W(t=350, n=5) &= 0.509711 \end{align*}

So in this case, you should score only the 5 since it moves you to the state with the best probability of ultimately winning the game. This ends one primary-action, random-response, secondary-action sequence effecting one state change in the EMDP. This process is then simply repeated (starting with another banking decision from the new state (t=50, n=5) until you either roll a farkle, or end in a state where you're supposed to bank.

It is interesting to see how the optimal play strategy for the opening turn differs from the strategy that simply maximizes your expected farkle turn score. To maximize expected score you wouldn't bank with 6 dice available to roll unless you had 16,400 or more points on the turn. In contrast, to optimize your chances of winning a game, you would bank 5000 or more points on your opening turn even when you have six dice available to roll. The banking threshold for 5 dice is similarly reduced from 3050 to a more conservative 2500. The banking threshold for other available dice counts do not differ.

Note that if you manage to bank 1000 points on your opening turn, your chances of winning increase to almost 60%; banking 1850 points on your first turn leaves you with better than 70% chance of ultimately winning; bank 2750 points and your chances of winning are above 80%; and manage to put up 3900 points on the opening turn and you're win probability is over 90%.

Optimal Strategy Case: $b = 6000, d = 8000, f = 0, e = 0$

As a second example, consider the case of Table 2 which shows how to optimally play a turn where you start with 6000 points and are 2000 points behind to your opponent who has 8000 points.

Table 2. Win probabilities and banking actions for optimal turn play for b=6000, d=8000, f=0, and e=0.

tn
654321
00.1623650.135452*0.124658*0.118916*0.116823*0.120172*
500.167053*0.1380980.126521*0.120462*0.118340*0.121980*
1000.172282*0.1411840.1284850.122068*0.119912*0.123894*
1500.177898*0.144859*0.1305130.1237380.121540*0.125881*
2000.183766*0.149237*0.1331830.1254730.1232370.127935*
2500.189820*0.153963*0.136653*0.1272640.1250050.130069
3000.1960920.158915*0.140895*0.1300170.1268380.132298
3500.2026230.1640300.145391*0.133939*0.1287280.134616
4000.2094670.1693630.1500240.1381630.1339910.137010
4500.2165330.1749750.1547980.1425030.1394570.139475
5000.2238310.1808970.1598280.1469750.1451150.145115
5500.2313220.1870280.1652480.1516030.1508760.150876
6000.2390350.1933620.1710120.1568430.1568430.156843
6500.2469180.1999760.1769750.1630150.1630150.163015
7000.2550370.2068170.1831540.1694530.1694530.169453
7500.2633570.2138300.1895330.1761090.1761090.176109
8000.2719450.2210700.1961230.1830590.1830590.183059
8500.2807410.2284600.2028740.1902420.1902420.190242
9000.2897900.2360970.2098240.1977120.1977120.197712
9500.2989540.2439180.2169380.2052810.2052810.205281
10000.3083580.2519430.2242420.2132450.2132450.213245
10500.3178950.2600870.2317400.2211460.2211460.221146
11000.3277850.2685090.2394620.2294040.2294040.229404
11500.3379370.2772740.2473800.2379230.2379230.237923
12000.3483930.2863860.2555290.2467640.2467640.246764
12500.3589850.2957330.2638970.2557510.2557510.255751
13000.3697430.3053370.2725440.2649890.2649890.264989
13500.3805410.3150570.2814030.2744040.2744040.274404
14000.3916420.3249610.2904660.2843530.2843530.284353
14500.4029330.3349640.2996330.2945030.2945030.294503
15000.4146600.3452570.3089580.3050910.3050910.305091
15500.4264840.3557490.3184230.3156210.3156210.315621
16000.4385860.3665900.3281540.3263600.3263600.326360
16500.4507060.3776960.3381630.3371120.3371120.337112
17000.4630150.3892780.3485760.3481730.3481730.348173
17500.4753290.4011170.3595730.3595730.3595730.359573
18000.4880070.4132440.3717590.3717590.3717590.371759
18500.5008480.4254570.3841340.3841340.3841340.384134
19000.5141510.4379070.3972730.3972730.3972730.397273
19500.5275460.4502910.4103460.4103460.4103460.410346
20000.5411690.4630320.4239100.4239100.4239100.423910
20500.5538920.4756500.4358480.4358480.4358480.435848
21000.5666870.4885320.4486710.4486710.4486710.448671
21500.5795440.5013720.4621120.4621120.4621120.462112
22000.5930750.5144120.4766470.4766470.4766470.476647
22500.6068450.5275680.4908600.4908600.4908600.490860
23000.6208740.5412350.5050440.5050440.5050440.505044
23500.6342300.5549620.5185250.5185250.5185250.518525
24000.6473800.5689600.5326480.5326480.5326480.532648
24500.6598300.5824040.5471600.5471600.5471600.547160
25000.6729220.5955740.5635710.5635710.5635710.563571
25500.6855770.6080650.5794710.5794710.5794710.579471
26000.6990460.6206400.5948280.5948280.5948280.594828
26500.7126770.6333580.6083150.6083150.6083150.608315
27000.7272730.6473720.6213040.6213040.6213040.621304
27500.7405660.6619980.6327810.6327810.6327810.632781
28000.7528720.6775390.6465470.6465470.6465470.646547
28500.7637400.6917310.6619280.6619280.6619280.661928
29000.7752670.7046450.6835750.6835750.6835750.683575
29500.7874280.7165950.7040220.7040220.7040220.704022
30000.8012930.7296170.7220260.7220260.7220260.722026
30500.8124680.7405740.7329150.7329150.7329150.732915
31000.8240090.7524520.7443210.7443210.7443210.744321
31500.8343120.7640490.7546370.7546370.7546370.754637
32000.8444650.7757800.7685520.7685520.7685520.768552
32500.8543920.7859900.7826120.7826120.7826120.782612
33000.8652090.8006090.8006090.8006090.8006090.800609
33500.8751450.8117120.8117120.8117120.8117120.811712
34000.8874170.8263570.8263570.8263570.8263570.826357
34500.8960880.8334810.8334810.8334810.8334810.833481
35000.9070110.8472630.8472630.8472630.8472630.847263
35500.9129960.8643400.8643400.8643400.8643400.864340
36000.9180910.8839700.8839700.8839700.8839700.883970
36500.9206040.8948320.8948320.8948320.8948320.894832
37000.9253510.9030170.9030170.9030170.9030170.903017
37500.9339560.9030450.9030450.9030450.9030450.903045
38000.9504360.9030780.9030780.9030780.9030780.903078
38500.9628530.9030980.9030980.9030980.9030980.903098
39000.9736750.9174970.9031240.9031240.9031240.903124
39500.9790610.9302030.9031360.9031360.9031360.903136

Given your 2000 point defecit, it is interesting to see how much more agressively this turn should be played compared to an opening turn: the banking thresholds here for 6, 5, 4, 3, 2 and 1 dice are, respectively, 4000, 3350, 1750, 600, 400, and 500 points. Compare that to the banking thresholds for an opening turn at 5000, 2500, 1000, 350, 300, and 300 points.

Another interesting observation about this strategy is that although you should bank between 3300 and 3850 points when you have 5 dice to roll, you should instead roll if you have 3900 to 3950 points (because you have such a high probability of closing out the game).

Optimal Strategy Case: $b = 8000, d = 6000, f = 0, e = 0$

Now consider the reverse of the situation in the previous example. Table 3 shows how to optimally play a turn where you start with 8000 points and are 2000 points ahead of your opponent who has 6000 points.

Table 3. Win probabilities and banking actions for optimal turn play for b=8000, d=6000, f=0, and e=0.

tn
654321
00.9034220.878090*0.862982*0.857184*0.856210*0.860061*
500.907970*0.8835200.866464*0.858208*0.857345*0.861539*
1000.912448*0.8885200.8720040.860637*0.858466*0.863007*
1500.916837*0.893230*0.8777340.8649940.859560*0.864460*
2000.921258*0.898001*0.8828990.8712130.8636500.865889*
2500.925550*0.902725*0.887727*0.8771800.8688290.867273
3000.9298350.907504*0.892777*0.8822760.8822760.882276
3500.9338510.9121270.897734*0.888668*0.8886680.888668
4000.9376760.9167960.9027260.8949050.8949050.894905
4500.9412870.9212500.9076160.9008240.9008240.900824
5000.9449220.9254130.9122870.9072680.9072680.907268
5500.9483550.9293700.9165320.9135950.9135950.913595
6000.9517840.9333910.9204300.9199220.9199220.919922
6500.9550730.9371770.9254070.9254070.9254070.925407
7000.9582900.9409610.9304500.9304500.9304500.930450
7500.9613680.9447160.9345540.9345540.9345540.934554
8000.9642390.9485160.9389140.9389140.9389140.938914
8500.9665940.9520080.9431510.9431510.9431510.943151
9000.9689860.9552110.9483120.9483120.9483120.948312
9500.9713350.9580000.9532390.9532390.9532390.953239
10000.9737800.9607160.9581240.9581240.9581240.958124
10500.9759450.9631650.9612020.9612020.9612020.961202
11000.9780760.9656820.9642910.9642910.9642910.964291
11500.9799770.9680290.9668120.9668120.9668120.966812
12000.9817630.9701750.9701140.9701140.9701140.970114
12500.9833280.9734080.9734080.9734080.9734080.973408
13000.9847400.9770980.9770980.9770980.9770980.977098
13500.9860910.9791250.9791250.9791250.9791250.979125
14000.9876920.9813440.9813440.9813440.9813440.981344
14500.9889930.9822260.9822260.9822260.9822260.982226
15000.9903730.9840290.9840290.9840290.9840290.984029
15500.9912180.9859990.9859990.9859990.9859990.985999
16000.9917370.9899140.9899140.9899140.9899140.989914
16500.9920730.9920730.9920730.9920730.9920730.992073
17000.9929470.9929470.9929470.9929470.9929470.992947
17500.9929640.9929640.9929640.9929640.9929640.992964
18000.9939110.9929810.9929810.9929810.9929810.992981
18500.9947220.9929900.9929900.9929900.9929900.992990
19000.9956400.9929980.9929980.9929980.9929980.992998
19500.9961800.9930000.9930000.9930000.9930000.993000

In this example, you have a 2000 point lead and are yourself only 2000 points from winning the game. It is interesting to see how much more conservatively this turn should be played compared to an opening turn: the banking thresholds here for the different number of dice to throw are 1650, 1250, 650, 300, 300, and 300 points. Compare that to the banking thresholds for an opening turn at 5000, 2500, 1000, 350, 300, and 300 points.

As in the previous example, there is another banking rule inversion in this table, but this time for the case of 6 dice, where at point values above 1750 it again becomes advantageous to roll to try to end the game.

Optimal Strategy Case: $b = 9000, d = 9500, f = 0, e = 0$

As a fourth example, consider the case of Table 4 which shows how to optimally play a turn where you start with 9000 points and are 500 points behind your opponent who has 9500 points.

Table 4. Win probabilities and banking actions for optimal turn play for b=9000, d=9500, f=0, and e=0.

tn
654321
00.4543660.351125*0.303308*0.277271*0.270069*0.285659*
500.463700*0.3585760.309820*0.283131*0.274544*0.289393*
1000.475303*0.3668470.3163690.289394*0.280312*0.294362*
1500.486102*0.376478*0.3232860.2953740.286679*0.300901*
2000.505119*0.388392*0.3317700.3014900.2929060.309194*
2500.525335*0.399404*0.342403*0.3083350.2989480.317272
3000.5548790.417884*0.354687*0.3177570.3051970.325207
3500.5738000.4322250.366843*0.329936*0.3135160.332809
4000.6024930.4546550.3828380.3437460.3255620.341277
4500.6194050.4624710.3916610.3567920.3406600.354200
5000.6533070.4811580.4045180.3678970.3563450.372251
5500.6969390.4894520.4083920.3760250.3703890.393025
6000.7616170.5191260.4166280.3813810.3815050.412584
6500.8215800.5779040.4190500.3844550.3891210.429136
7000.8789740.6718730.4555510.3911370.3937010.441170
7500.9208890.7594740.5383520.3911410.3963260.448721
8000.9511790.8382370.6587730.4537550.3988290.452947
8500.9661920.8860610.7529050.5636890.4015590.455471
9000.9765360.9211410.8316140.6964070.5222150.459381
9500.9813360.9377880.8730880.7760380.6416610.462492

I find it interesting that the only time you should bank during such a turn is if you have exactly 3 dice to roll and either 700 or 750 points on the turn. Why does it make sense to roll just 1 or 2 dice with those same number of points? And why does it make sense to roll three dice when you have 800 or more points?

To answer the first question, observe that it's easier to ultimately hot-dice (score all dice thrown) when you are rolling only 1 or 2 dice than when you are rolling 3. If you can manage to get back to 6 dice, your chances of ending the game this turn increase dramatically. Furthermore, any additional game points you gain by rolling 3 dice are meaningless unless you hot-dice. If only 1 or 2 of the 3 thrown dice score, you will still find yourself below the game goal and (if you then decide to bank these few extra points) will not have even reduced the number of points you have to accumulate on your subsequent turn due to the minimum banking threshold.

To answer the second question, note that it's possible to reach 10,000 points without hot-dice: in particular there's a 7% chance of rolling two ones and a third non-scoring die. These rolls end the game if you have 9800 points.

Optimal Strategy Case: $b = 9500, d = 9000, f = 0, e = 0$

Now consider the reverse of the situation in the previous example. Table 3 shows how to optimally play a turn where you start with 9500 points and are 500 points ahead of your opponent who has 9000 points.

Table 5. Win probabilities and banking actions for optimal turn play for b=9500, d=9000, f=0, and e=0.

tn
654321
00.8010160.702303*0.658594*0.637598*0.630975*0.640081*
500.826173*0.7068460.660815*0.642258*0.639026*0.652003*
1000.863321*0.7240400.6650630.645329*0.645400*0.663218*
1500.897707*0.757952*0.6662170.6470910.649767*0.672708*
2000.930612*0.811876*0.6876170.6483620.6523920.679607*
2500.954643*0.862100*0.735325*0.6496570.6538970.683936
3000.9720090.907257*0.804365*0.6868230.6553320.686359
3500.9806170.9346750.858333*0.749851*0.6568970.687806
4000.9865480.9547880.9034600.8259420.7260730.690048
4500.9893000.9643320.9272380.8715970.7945550.691832

Here I found it surprising that even though you have a 500 point lead and control of the dice, the optimal strategy requires that you never bank until you've reached the game-winning 500 points. This is surprisingly more aggressive play than on an opening turn where the score is tied. I would have guessed that banking 300 points when faced with rolling 1 or 2 dice would have been better to avoid the high-probability farkle, then allow your opponent a chance to force his way to a 1000 point turn, and assuming he fails, follow up with a high-probability minimum bank turn to win. It turns out if your opponent plays perfectly, he has about a 1/3 chance of reaching 1000 points to steal the win, and so playing a little more aggressively to attempt to end the game and deny him that possible steal is advantageous.

Also note how for many point totals, you are better off with fewer dice. This is often the case when playing turns where your best play is to try to reach 10,000 points and end the game, and also in cases where you are just really far behind an opponent that is fast approaching a game winning score.

Strategy Validation

In this section I share the efforts I've made to sanity check the calculated strategies for reasonableness, and to determine the deviation between the calculated strategy and the truly optimal strategy. This section definitely gets into the weeds and is not for the faint of heart, but may be of interest to skeptics or the particularly nerdy (of which I am both).

Comparison to strategy that maximizes expected turn score

Strategies that maximize expected turn score for farkle variants have been independently calculated with consistent results4,5,6. These strategies do not consider your current banked score, your opponent's banked score or farkle count, or the end of game scoring goal. Still, one would expect that for an opening turn (where the score is tied at zero and where the end-of-game scoring goal is distant) that the game winning strategy documented here and the strategy that maximizes expected turn score would be similar for low turn score states. Furthermore, the two strategies should converge if the game scoring goal is increased from 10,000 points towards infinity. Unfortunately, the 10,000 point goal was already stretching both the memory limits and processing power of my computer. To reduce the size of the state space, I eliminated the three consecutive farkle penalty. This effectively eliminates the f and e state variables, reducing the size of the state space by a factor of 9, and also eliminates the need to model banked score states of less than 0 points, which shrinks things even more.

My Farkle Strategy Generator (FSG)6 produces farkle strategies that maximize expected turn score. The FSG outputs strategy tables that prioritize each turn state with a sequential preference number. Those states associated with higher expected turn scores have a higher preference number. To follow the strategy, you score thrown dice to move to the state with the highest number. This is exactly the same process you use to follow the game winning strategy presented in this document: you score your dice to move to the state with the highest probability of winning. By indexing the states from lowest win probability to highest, the two strategies can easily be compared.

First compare the two strategies for the case of a 10,000 point goal. Table 6W shows the strategy that maximizes your chances of winning the game (produced by value iteration) for an opening turn. Table 6E shows the strategy that maximizes expected turn score (produced by the FSG). Turn scores in both cases are truncated to a maximum of 1500. States with different preference numbers are highlighted. There are numerous slight differences, but they are obviously very similar: the banking thresholds are almost identical, and the state preference numbers deviate only slightly.

Table 6W. Preferred states for strategy that maximizes win probability for an opening turn of a 2-player game with a 10,000 point goal. Descrepancies with the strategy that maximizes expected score (shown in Table 6E) are highlighted.
Table 6E. Preferred states for strategy that maximizes expected score. Descrepancies with the strategy that maximizes win probability for an opening turn of a 2-player game with a 10,000 point goal (shown in Table 6W) are highlighted.

tn
654321
021*****
50*13****
100*155***
150**92**
200**1141*
250***736
30038**1088
3504224**1212
400462718161414
450503020171717
500543323191919
550573626222222
600604029252525
650634332282828
700654734313131
750685137353535
800715541393939
850745845444444
900776149484848
950806453525252
1000836756565656
1050867059595959
1100897262626262
1150927566666666
1200957869696969
1250978173737373
13001008476767676
13501038779797979
14001069082828282
14501099385858585
15001129688888888

tn
654321
021*****
50*13****
100*145***
150**92**
200**1141*
250***736
30039**1088
3504224**1212
400462718161515
450503020171717
500543323191919
550583626222222
600614029252525
650644332282828
700674735313131
750705137343434
800725541383838
850755945444444
900786249484848
950816553525252
1000846857565656
1050877160606060
1100907463636363
1150947666666666
1200977969696969
12501008273737373
13001038577777777
13501068880808080
14001099183838383
14501129386868686
15001159689898989

Tables 7W and 7E compare these same two strategies but with the scoring goal increased to 30,000 points. Here the two strategies are seen to be almost identical for these low turn score states. The two strategies do appear to be converging as the scoring goal increases. While this doesn't assure strategy correctness, it offers convincing evidence of correctness in at least this limiting case.

Table 7W. Preferred states for strategy that maximizes win probability for an opening turn of a 2-player game with a 30,000 point goal. Descrepancies with the strategy that maximizes expected score (shown in Table 7E) are highlighted.
Table 7E. Preferred states for strategy that maximizes expected score. Descrepancies with the strategy that maximizes win probability for an opening turn of a 2-player game with a 30,000 point goal (shown in Table 7W) are highlighted.

tn
654321
021*****
50*13****
100*145***
150**92**
200**1141*
250***736
30039**1088
3504224**1212
400462718161515
450503020171717
500543323191919
550583626222222
600614029252525
650644332282828
700674734313131
750705137353535
800725541383838
850755945444444
900786249484848
950816553525252
1000846857565656
1050877160606060
1100907463636363
1150937666666666
1200967969696969
1250998273737373
13001038577777777
13501068880808080
14001099183838383
14501119486868686
15001149789898989

tn
654321
021*****
50*13****
100*145***
150**92**
200**1141*
250***736
30039**1088
3504224**1212
400462718161515
450503020171717
500543323191919
550583626222222
600614029252525
650644332282828
700674735313131
750705137343434
800725541383838
850755945444444
900786249484848
950816553525252
1000846857565656
1050877160606060
1100907463636363
1150947666666666
1200977969696969
12501008273737373
13001038577777777
13501068880808080
14001099183838383
14501129386868686
15001159689898989

Effects of the banked score lower bound

The calculated strategy is non-optimal when playing in regions of the state space where either your banked score, or your opponent's banked score is near the lower bound. This discrepancy between the calculated strategy and the truly optimal strategy is further exacerbated as the consecutive farkle count associated with a banked score near the bound increases.

To estimate the extent of the discrepancy, I lowered the bound on banked scores from -2500 to -3000 and regenerated the strategy matrix. One expects the relative deviation between the win probabilities of corresponding states in the two strategies to decrease as the minimum of the player's banked scores increases. To see the rate of convergence, for each possible banking score $B$ ($-2500 <= B < 10,000$), I found the maximum relative deviation between win probabilities (calculated as ${{2|w1-w2|}\over{w1+w2}}$) over all game states satisfying $min(b,d) >= B$. Table 8 lists selected results. Note that each time B increases by 500 points (which just happens to be the triple farkle penalty), the maximum relative deviation consistently decreased by 2 orders of magnitude (with one of those two orders of magnitude coming with the last 50 point increase). Given this exponential convergence rate, for any given $B$, the maximum relative deviation between the strategies for $L=-2500$ and $L=-\infty$ must be only slightly greater (on the order of 1%) than the maximum relative deviation between the strategies for $L=-2500$ and $L=-3000$.

Table 8. For two different farkle strategies calculated with $L=-2500$ and $L=-3000$, the maximum relative deviation of win probabilities over all states satisfying $min(b,d) >= B$ is shown.

 B Maximum Relative Deviation -2500 0.296463834 -2050 0.028572943 -2000 0.003438359 -1550 0.000249319 -1500 0.000029516 -1050 0.000002074 -1000 0.000000234 -550 0.000000016 -500 0.000000001

Note that the state win probabilities for two calculated strategies do not have to be identical for the rules about when to bank, and when to roll, and how to score the dice in each situation to be to be identical. So how close do the calculated win probabilities and the optimal win probabilities need to be to make it likely we are playing with a truly optimal strategy? Using state preference charts like those shown in the previous section, I compared complete turn strategies for all combinations of state variables $b$, $d$, $f$, and $e$. If two state preference charts are identical, then the strategies are identical. Once I found a bizarre state preference discrepancy associated with a relative difference in win probability on the same order as the convergence threshold for the entire state matrix. I ignored this and any others like it that might be lurking in the tables. Table 9 shows the banked score and consecutive farkle count thresholds above which no more strategy discrepancies otherwise appeared. So for example if both you and your opponent have at most 1 consecutive farkle, and you both have banked scores of at least -1500 points, then there are no discrepancies between the two strategies.

So although there's no way to pick a lower bound $L$ that will assure all turn strategies will be truly optimal for a given working range of banked scores and consecutive farkle counts, driving the maximum relative deviation down to within about 1 part in 108 seems to assure that the vast majority of turn strategies (and likely all of them) are optimal. Given the convergence rate seen above, this means that as long as both players are about 2000 points above the lower bound on banked scores, all turn strategies are likely to be optimal.

Table 8. When both players banked scores and consecutive farkle counts satisfy the listed constraints the two strategies that maximize win probability for -2500 and -3000 point lower bound on banked scores are identical. The maximum relative deviation over all game states satisfying these constraints is also listed.

Minimum Banked ScoreMaximum Farkle CountMaximum Relative Deviation
-150000.000001175
-120010.000001529
-70020.000000070

Conclusions

Extended forms of a Markov Decision Process (MDP) and value iteration were devised that allow more efficient modeling of 2-player Farkle than is possible with a standard MDP. Using this model, the strategy that maximizes win probability using Facebook scoring rules has been determined under the constraint that banked scores are lower bounded to a large negative score. The strategy is shown to be unaffected by this bound in regions where banked scores are at least 2000 points above the bound. The strategy appears to converge (as expected) to the strategy that maximizes expected turn score as the end-of-game scoring goal is increased.

With nearly a half billion game states, and so many different dice rolls and scoring options possible from each of these different states, it wasn't immediately obvious whether the problem was solvable with a home computer. I was quite pleased when I saw the first value iteration update complete, and even more pleased when I saw the matrix was actually converging! It was a nice problem in the sense that it's complexity puts finding the solution at the limits of what can be done with a home computer (or at least at the limits of what I can do with a home computer).

The optimal strategy showed many expected behaviors including more aggressive play when behind, and more conservative play when ahead. But it also showed many behaviors that were unexpected to me, such as playing aggressively even with a sizable lead when nearing the game goal of 10,000 points. Numerous times I was convinced I had discovered a flaw in the strategy (and therefore either a bug in my code or an error in my analysis) only to realize later that the strategy was actually reasonable.

Next Steps

The very limited view I've offered into the game winning strategy is unsatisfying. Time permitting, I intend to develop a web page that will allow you to explore the gargantuan optimal Farkle strategy in detail, and perhaps even play a game of farkle against a perfect computer opponent. I am motivated by your interest, so please leave a comment if you'd like to see it.

A Facebook Farkle game is won by the first player to bank 10,000 or more points, but there are other variations to end-of-game rules. In Zilch, for example, the other player always gets a final go at the dice. Some people like to assure that both players get the same number of turns. In this latter case, the optimal strategies played by the two players are not even the same — with the player going second clearly having a strategic advantage. I think any of these could be modeled, but they are each very different, and not currently supported by my solver. These are ickier problems, and I don't intend to work on them anytime soon.

References

1. T Neller and C Presser, "Optimal Play of the Dice Game Pig," The UMAP Journal 25(1) (2004), pp. 25-47.
2. R. E. Bellman. Dynamic Programming. Princeton University Press, Princeton, NJ, 1957.
3. Cap Khoury, Multinomial Coefficients and Farkle, Jul. 2009.
4. M Busche, Maximizing Expected Scores in the Game of Zilch, Matt's Maniacal Musings, Aug. 2010.
5. E Farmer, Analysis of Farkle, Possibly Wrong, Apr. 2013.
6. Farkle Strategy Generator, Matt's Maniacal Musings.