Neller and Presser modeled a simple dice game called pig as a Markov Decision Process (MDP) and used value iteration to find the optimal game winning strategy1. Inspired by their approach, I've constructed a variant of an MDP which can be used to calculate the strategy that maximizes the chances of winning 2-player farkle. Due to the three consecutive farkle penalty, an unfortunate or foolish player can farkle repeatedly to achieve an arbitrarily large negative score. For this reason the number of game states is unbounded and a complete MDP model of farkle is not possible. To bound the problem, a limit on the lowest possible banked score is enforced. The calculated strategy is shown to converge exponentially to the optimal strategy as this bound on banked scores is lowered.
Each farkle turn proceeds by iteratively making a pre-roll banking decision, a (contingent) roll of the dice, and a post-roll scoring decision. I modified the classic MDP to include a secondary (post-roll) action to fit this turn model. A reward function that incentivizes winning the game is applied. A similarly modified version of value-iteration (that maximizes the value function for both the pre-roll banking decision, and the post-roll scoring decision) is then used to find an optimal farkle strategy.
With a lower bound of -2500 points for banked scores, there are 423,765,000 distinct game states and so it is not convenient to share the entire strategy in printed form. Instead, I provide some general characterizations of the strategy. For example, if both players use this same strategy, the player going first will win 53.487% of the time. I also provide samples of complete single-turn strategies for various initial banked scores. Currently, only the strategy for Facebook Farkle has been calculated, but the strategy for other scoring variants of farkle could easily be deduced using the same software.
Contents
The Rules of Farkle
Farkle rules differ only slightly from the rules of Zilch, but are provided here for completeness.
Farkle is played with two or more players and six six-sided dice. Each player takes turns rolling the dice. The dice in a roll can be worth points either individually or in combination. If any points are available from the roll, the player must set aside some or all of those scoring dice, adding the score from those dice to their point total for the turn. After each roll, a player may either re-roll the remaining dice to try for more points or may bank the points accumulated this turn (though you can never bank less than 300 points). When a player banks his points, the player's turn is ended and the dice are passed to the next player.
If no dice in a roll score, then the player loses all points accumulated this turn and their turn is ended. This is called a farkle, a sorrowful event indeed.
If all dice in a roll score, the player gets to continue his turn with all six dice. This is called hot dice and is guaranteed to brighten your day.
A player may continue rolling again and again accumulating ever more points until he either decides to bank those points or loses them all to a farkle.
If a player ends three consecutive turns with a farkle, they not only lose their points from the turn but also lose 500 points from their banked game score. (This is the only way to lose banked points.) After a triple farkle, your consecutive farkle count is reset to zero so you're safe from another triple farkle penalty for at least three more turns.
The game ends when one player has banked a total of 10,000. Unlike zilch, other players do not get a final turn.
Scoring is as follows:
- Each 1 is worth 100 points.
- Each 5 is worth 50 points.
- A set of three 1s is worth 1000 points.
A set of three 2s is worth 200 points.
A set of three 3s is worth 300 points.
A set of three 4s is worth 400 points.
A set of three 5s is worth 500 points.
A set of three 6s is worth 600 points.
Unlike zilch, each extra die in a set increases the value of the set by a like amount. So four 4s are worth 800 points, five 5s are worth 1500, and six 1s are worth 4000. - Three pair is worth 750 points. Unlike Zilch, a set of four 2s and two 4s may not be scored as three pairs.
- A six die straight is worth 1500 points.
- Each die can only be used once when scoring. (If you roll two 1s, two 2s, and two 3s you can either count the two 1s for 200 or use all six dice for three-pair and 1500 points — you can't use the ones both ways for 1700 points.)
Markov Decision Processes and Value Iteration
Note: those familiar with MDPs may find the non-standard variable names I use to present this standard subject distracting. My aim is to use variable names most meaningful in the final value-iteration equation. I beg your indulgence. As a courtesy, I've included mouse-hover popups where each such non-standard variable name is introduced (highlighted in blue) explaining the motivation for the change.
A Markov Decision Process (MDP)2 is a system having a finite set of states $S$. For each state $s \in S$, there are a set of actions that may be taken $A(s)$. For each action $a \in A(s)$, there is a set of transition probabilities $P_a(s, s')$ defining the probability of transitioning from $s$ to each state $s' \in S$ given that action $a$ was taken while in state $s$. When action $a$ is taken from state $s$, the MDP responds by randomly moving to a new state $s'$ as governed by the transition probabilities $P_a(s, s')$ and then assigning the decision maker a reward $D_a(s, s')$.
The objective is to find a strategy function $G(s)$ that returns the particular action at each state $s \in S$ that will maximize the expected cumulative reward given by:
$$\sum_{k=0}^{\infty} \gamma^k D_{a_k}(s_k, s_{k+1})$$where $k$ is a discreet time variable, $s_k$ is the game state at time $k$, $a_k$ is the player action taken from from state $s_k$, and $\gamma \in [0,1)$ is a constant discount factor for future rewards. The expectation must be taken over all possible state transition paths, and maximized over all possible choices for the actions $a_k$ taken in each state $s_k$. Then $G(s)$ will be defined by the $a_k$ taken from each state $s_k$ that maximizes this expectation.
One technique for solving this problem is value iteration. With this technique, each state $s$ is given a decimal value $W(s)$ which is an estimate of the expected discounted sum of all future rewards gained from state $s$. The estimate for $W(s)$ is iteratively refined for all $s$ by applying this update equation sequentially to all states $s$: $$W_{i+1}(s) := \max_{a \in A(s)} \left[ \sum_{s'} P_a(s, s')(D_a(s, s') + \gamma W_i(s'))\right]$$
Note that at each state $s$, the action which provides maximum total reward is selected. So not only is the estimate of $W(s)$ iteratively refined, but the best action $a$ taken for each state $s$ is also simultaneously improved. Iteration continues until $W_{i+1}(s)$ and $W_i(s)$ converge for all $s$, and $G(s)$ is then the set of actions $a$ selected for each state $s$ in the final iteration.
Extending the MDP to support Farkle
During a farkle turn, a player must iteratively
- decide whether to bank his current points,
- then, if he decides not to bank, roll the dice,
- then decide how to score the dice.
To fit farkle to an MDP model, one need only consider the result of a roll as part of the game state, and then reorder the turn sequence to that of a combined scoring-and-bank/roll action followed by either an end-of-turn-event or another random-roll-event. But this increases the number of game states by at least a few orders of magnitude and makes the problem unsolvable without a commensurate increase in computer resources.
Alternatively, the roll-decision could be splintered to include any conceivable combination of scoring instructions, directing the MDP how to score each potential roll before the roll is even made, thereby allowing the state machine to transition without additional input from the player once the roll is made. Aside from being a painful way of thinking about the problem, the number of possible scoring instructions is enormous, and the approach is again not feasible.
Rather than forcing the game to fit the MDP model, I instead define an extended MDP (EMDP) to more naturally model the game. Like an MDP, an EMDP has a finite set of states $S$. For each state $s \in S$, there are a set of primary actions that may be taken $A(s)$. For each primary action $a \in A(s)$, there are a set of sets of secondary actions $R_a(s)$. Once primary action $a$ is taken, one set of secondary actions $r \in R_a(s)$ is selected randomly by the EMDP according to a probability distribution $P_a(s)$. So instead of transitioning, the EMDP responds to action $a$ by offering a randomly chosen set of secondary actions that may be taken. For each secondary action $c \in r$, a deterministic transition state $s'$ is defined by a transition matrix $s' = X_{a,c}(s)$. Selection of action $c$ causes the EMDP to transition to $s'$ and reward $D_{a,c}(s, s')$ is granted.
Because there are two actions to be taken in a turn, the optimal strategy also has two parts: $G_A(s)$ is the optimal primary action in state $s$, and $G_C(s, r)$ is the optimal secondary action to take in state $s$ given that secondary action set $r$ was randomly offered by the EMDP in response to action $a$.
Value iteration processing is also extended to account for the secondary action: $$W_{i+1}(s) := \max_{a \in A(s)} \left[ \sum_{r \in R_a(s)} P_{a,r}(s) \max_{c \in r} \Big[ D_{a,c}(s, s') + \gamma W_i(s') \Big] \right]$$
Neller and Presser chose a reward function to incentivize winning the game1. For any transition from a non-winning game state $s$ to a winning game state $s_w$, $D_{a,c}(s, s_w) = 1$ (where $s_w$ is any state where the player's banked points plus his turn points meets or exceeds the game goal of 10,000 points, and where his turn points meets or exceeds the minimum banking threshold). For transitions to any other non-winning state $s_v$, $D_{a,c}(s, s_v) = 0$. Because all game winning states $s_w$ are terminal, all future rewards from such states must be zero, so $W(s_w) = 0$. Because $W$ is known for all game winning states $s_w$, $W(s_w)$ is never updated during value iteration. (I.e., although a game winning state can appear on the right side of the value iteration update equation, it never appears on the left.)
Normally for an MDP, $0 \le \gamma < 1$, but we do not wish to value a game you win in 30 turns, less than a game you win in 10. Following Neller and Presser's approach, I instead set $\gamma = 1$. In general this can prevent value iteration from converging, but it does not cause a problem for farkle. (I think this is because there is no circular state transition path offering unbounded rewards, and because only the player that gets to a game winning state first actually wins, which ensures that the optimal strategy can't be attracted to some infinitely long path to a game winning state.) With $\gamma = 1$, $W(s)$ converges to the probability of winning from any non-terminal state $s$ when using an optimal strategy; the $a$ selected in the $\max_a$ operation converges to the optimal banking strategy from state $s$: $a = G_A(s)$; and the $c$ selected by the $\max_c$ operation converges to the optimal scoring strategy from state $s$ given that roll $r$ was thrown: $c = G_C(s,r)$.
Given that $\gamma = 1$, simplifications can be achieved if you move the reward for transitions to a game winning state out of the reward function, and into the value function for those same game winning states. That is, instead of defining $W(s_w) = 0$, define $W(s_w) = 1$, and set $D_{a,c}(s, s') = 0$ everywhere, allowing the reward function to be completely eliminated from the update equation. This definition also results in $W(s)$ consistently being the probability of winning the game from any state $s$ (including terminal game winning states).
Applying the simplifications of $\gamma = 1$, $D_{a,c}(s, s') = 0$, and $W(s_w) = 1$ reduces the extended value iteration update formula to:
$$W_{i+1}(s) := \max_{a \in A(s)} \left[ \sum_{r \in R_a(s)} P_{a,r}(s) \max_{c \in r} W_i(s') \right]$$This general approach for finding the optimal play strategy for games having both pre-roll and post-roll actions, is further detailed for the specifics of 2-player farkle in the sections below.
Game State Characterization
The current game state $s$ is characterized by six component state variables:
$$s = (t, n, b, d, f, e)$$where $t$ is the number of points accumulated only from your current turn. Once $b + t \gt 9950$ (which means you've hit the goal of 10,000 points) and you've met the minimum requirement to bank $t > 250$, the game is over, so for non-terminal game states we have:
$$t \in \{0, 50, 100, ..., \max [250, 9950-b]\}\text{.}$$$n$ is the number of dice you have to roll
$$n \in \{1, 2, 3, 4, 5, 6\}\text{,}$$$b$ is your banked score for which I enforce a lower bound $L$
$$b \in \{L, L+50, ..., -100, -50, 0, 50, 100, ..., 9950\}\text{,}$$$d$ is your opponent's banked score which is also lower bounded to $L$
$$d \in \{L, L+50, ..., -100, -50, 0, 50, 100, ..., 9950\}\text{,}$$$f$ is your consecutive farkle count (from previous turns)
$$f \in \{0, 1, 2\}\text{, and}$$$e$ is your opponent's consecutive farkle count
$$e \in \{0, 1, 2\}\text{.}$$The Farkle Value Iteration Equation
In this section we apply the farkle component state variables and rules defined in previous sections to the EMDP value iteration equation:
$$W_{i+1}(s) := \max_{a \in A(s)} \left[ \sum_{r \in R_a(s)} P_{a,r}(s) \max_{c \in r} W_i(s') \right]$$Let's detail $A(s)$ first. For farkle, $A(s)$ is the set of available banking actions having at most two members: BANK and ROLL. For game states where you have so far accumulated less than 300 points, you have only one pre-roll (primary) action available to you: ROLL the dice. But for all other game states you have two pre-roll actions available: BANK, or ROLL. This yields:
\begin{equation} W_{i+1}(s) := \left\{ \begin{array}{ll} \sum\limits_{r \in R_{\text{ROLL}}(s)} P_{{\text{ROLL}},r}(s) \max\limits_{c \in r} W_i(s'), & \text{if $t < 300$}.\\ \\ \max \bigg[\sum\limits_{r \in R_{\text{BANK}}(s)} P_{{\text{BANK}},r}(s) \max\limits_{c \in r} W_i(s'), \\ \hspace{30pt} \sum\limits_{r \in R_{\text{ROLL}}(s)} P_{{\text{ROLL}},r}(s) \max\limits_{c \in r} W_i(s') \bigg], & \text{if $t \ge 300$}.\\ \\ \end{array} \right. \end{equation}In the case of a bank action the equation collapses. $R_{\text{BANK}}(s)$ has only one entry: $r_{\text{BANK}}$. There's only one member of the probability distribution: $P_{{\text{BANK}}, r_{\text{BANK}}}(s) = 1$. Also, $r_{\text{BANK}}$ has only one entry: $c_{\text{BANK}}$. So for the case of $t \ge 300$ , the first member of the outer max operation reduces to just $W_i(s')$. All the mathematical machinery in this case is flexible enough to handle the banking case, but is entirely unnecessary. But what exactly is $s'$ after you bank?
Looking back at the game state characterization from the previous section, there is no variable that encodes whose turn it is. Everything I've written so far is from the perspective of the player who controls the dice, and there is no $s'$ expressible in terms of our six component state variables that identifies your state after a banking operation. (This is by design and is consistent with Neller and Presser's approach for optimizing pig game play strategy.) After you bank, it is your opponent's turn who we assume is also playing the optimal strategy, and we can express his state after you bank. If your game state just before you banked was $s = (t, n, b, d, f, e)$, then your opponent's state after you bank will be $o' = (0, 6, d, b + t, e, 0)$. To be clear, after you bank your opponent will have 0 turn points, 6 dice to roll, a banked score of $d$, an opponent's banked score of $b + t$ (which is your new banked score), $e$ consecutive farkles, and his opponent will have $0$ consecutive farkles (your farkle count reverting to zero having just banked). Your opponent's win probability is $W(o')$, which means our win probability after the bank must be: $W_i(s') = 1 - W_i(o') = 1 - W_i(0, 6, d, b + t, e, 0)$, which yields: \begin{equation} W_{i+1}(s) := \left\{ \begin{array}{ll} \sum\limits_{r \in R_{\text{ROLL}}(s)} P_{{\text{ROLL}},r}(s) \max\limits_{c \in r} W_i(s'), & \text{if $t < 300$}.\\ \\ \max \bigg[\Big(1 - W_i(0,6,d,b+t,e,0)\Big), \\ \hspace{30pt} \sum\limits_{r \in R_{\text{ROLL}}(s)} P_{{\text{ROLL}},r}(s) \max\limits_{c \in r} W_i(s') \bigg], & \text{if $t \ge 300$}.\\ \\ \end{array} \right. \end{equation}
In the case of a ROLL action, $R_{\text{ROLL}}(s)$ corresponds to the set of all possible rolls of the dice from state $s$. More precisely, each roll $r$ is defined as a set of possible scoring decisions for some permutation of thrown dice. The $\sum_r$ operation is summing over each possible roll $r \in R_{\text{ROLL}}(s)$. $P_{\text{ROLL},r}(s)$ is the probability of making roll $r$ from game state $s$. And each $c \in r$ is one possible scoring decision given that roll $r$ was thrown. Given scoring decision $c$, the new game state $s'$ is determined. The $\max_c$ operation maximizes the expected win probability $W_i(s')$ given that roll $r$ was thrown over these possible scoring decisions.
First note that the set of potential rolls from game state $s = (t, n, b, d, f, e)$ is only dependent on the number of dice you are rolling, so:
$$R_{\text{ROLL}}(s) = R_{\text{ROLL}}(n)$$Second note that
$$P_{\text{ROLL},r}(s) = {1 \over {6^n}}$$Third, because the expression for $W_i(s')$ is fundamentally different for the case of farkling rolls vs. scoring rolls, it is convenient to partition $R_{\text{ROLL}}(n)$ into two subsets: the subset of all farkling rolls $R_{\text{FARKLE}}(n)$, and the subset of all scoring rolls $R_{\text{SCORE}}(n)$:
$$R_{\text{ROLL}}(n) = R_{\text{FARKLE}}(n) \bigcup R_{\text{SCORE}}(n)$$Fourth, if roll $r$ is a farkle, then $r$ will have only one member: a zero point farkle scoring decision and the $max_c$ operation can be dropped.
Applying these four observations gives:
\begin{equation} \begin{array}{rl} \sum\limits_{r \in R_{\text{ROLL}}(s)} P_{{\text{ROLL}},r}(s) \max\limits_{c \in r} W_i(s') &= \sum\limits_{r \in R_{\text{FARKLE}}(n)} {1 \over {6^n}} W_i(s') + \sum\limits_{r \in R_{\text{SCORE}}(n)} {1 \over {6^n}} \max\limits_{c \in r} W_i(s') \\ &= {1 \over {6^n}} \sum\limits_{r \in R_{\text{FARKLE}}(n)} W_i(s') + {1 \over {6^n}} \sum\limits_{r \in R_{\text{SCORE}}(n)} \max\limits_{c \in r} W_i(s') \end{array} \end{equation}After a farkle, it becomes your opponent's turn and there is again no expression for the new game state. We again instead express $W_i(s')$ in terms of your opponents win probability after your farkle:
$$W_i(s') = 1 - W_i(0, 6, d, b', e, f')$$where $b'$ is your new banked score after your farkle, which may have decreased due to the three consecutive farkle penalty, but is enforced to always be at least $L$ to keep the problem tractable:
$$b' = \max [ L, b - Y_f ]\text{;}$$ where $Y_f$ is the number of points you lose from your banked score when you farkle while already having $f$ consecutive farkles $$\begin{align*} Y_0 &= 0 \\ Y_1 &= 0 \\ Y_2 &= 500\text{;} \end{align*}$$and where $f'$ is your new consecutive farkle count which normally just increments, but is reset back to zero if you just had your third consecutive farkle
$$f' = (f+1) \mod 3\text{.}$$Note also that for all farkling rolls the expression inside the sum is the same, so we can replace the sum with a multiplicative factor equaling the number of ways to roll a farkle with $n$ dice. Combining that count with the ${1 \over {6^n}}$ simplifies to the probability of farkling with $n$ dice3:
$$\begin{align*} F_1 &= 2/3 \\ F_2 &= 4/9 \\ F_3 &= 5/18 \\ F_4 &= 17/108 \\ F_5 &= 25/324 \\ F_6 &= 5/216 \\ \end{align*}$$Combining the above observations gives this substitution:
$${1 \over {6^n}} \sum\limits_{r \in R_{\text{FARKLE}}(n)} W_i(s') = F_n (1 - W_i(0,6,d,b',e,f'))$$To detail the expression for the case of scoring rolls, first let $C_T(c)$ be the number of points taken with scoring combination $c$, and let $C_N(c)$ be the number of dice used with scoring combination $c$. So after rolling roll $r \in R_{\text{SCORE}}(n)$ and selecting scoring action $c \in r$ the state transitions from $s = (t, n, b, d, f, e)$ to
$$s' = (t', n', b, d, f, e)$$where
$$ \begin{align*} t' &= t+C_T(c) \\ n' &= h(n-C_N(c)) \end{align*} $$and where $h(x)$ is a hot-dice function for resetting the number of available dice back to $6$ when all dice are successfully scored:
$$h(n)=\begin{cases}6, & \text{for $n = 0$.} \\ n, & \text{otherwise.}\end{cases}$$This gives our final value iteration equation, where (below) I repeat all the supporting equations for convenience:
$$ W_{i+1}(t,n,b,d,f,e) := \left\{ \begin{array}{ll} F_n (1 - W_i(0,6,d,b-Y_f,e,f')) + \\ \hspace{30pt} {1 \over 6^n} \sum\limits_{r \in R_{\text{SCORE}}(n)} \max\limits_{c \in r} \left[ W_i(t',n',b,d,f,e)\right], & \text{if $t < 300$}.\\ \\ \max \bigg[ \Big(1 - W_i(0,6,d,b+t,e,0)\Big), \\ \hspace{30pt} \Big( F_n (1 - W_i(0,6,d,b',e,f')) + \\ \hspace{60pt} {1 \over 6^n} \sum\limits_{r \in R_{\text{SCORE}}(n)} \max\limits_{c \in r} \left[ W_i(t',n',b,d,f,e) \right] \Big) \bigg], & \text{if $t \ge 300$}. \end{array} \right. $$where
$$ \begin{align*} b' &= \max [ L, b - Y_f ] \\ f' &= (f+1) \mod 3 \\ t' &= t+C_T(c) \\ n' &= h(n-C_N(c)) \\ h(n) &=\begin{cases}6, & \text{for $n = 0$.} \\ n, & \text{otherwise.}\end{cases} \\ Y_0 &= 0 \\ Y_1 &= 0 \\ Y_2 &= 500 \\ F_1 &= 2/3 \\ F_2 &= 4/9 \\ F_3 &= 5/18 \\ F_4 &= 17/108 \\ F_5 &= 25/324 \\ F_6 &= 5/216 \\ \end{align*} $$Performance
With a lower bound of $L=-2500$ there are 423,765,000 game states. Each state is modeled with a double precision floating point number requiring 8 bytes, so the entire state matrix requires 3.39012 billion bytes of RAM (plus some array overhead). With game goal $g$ the number of states grows as $(g-L)^3$. Lowering $L$ from -2500 to, say, -10,000 (to eliminate all reasonable doubt that you'll ever venture into portions of the calculated strategy that are non-optimal) will increase the number of game states to over 1.7 billion and memory requirements to 14 GB (which is more than I have on any of my computers).
The value iteration software to solve the optimal farkle problem was written in C++. Obvious optimizations were made. For example, I don't actually iterate over possible rolls, but only over a precalculated set of unique score sets (which is orders of magnitude smaller in count than rolls), and weighting each score set by the number of rolls that share that same set. I let the program iterate over all game states until the maximum relative change over all states from one iteration to the next was less than 1 part in a billion. (I.e., iterations continued until no state had a change in value from one iteration to the next of more than 1 part in a billion, so that even those states with extremely small win probabilities were calculated with precision.) With the floor set to -2500 points, it took 62 iterations for the matrix to converge. Using one core of a dual-core Intel I3 4130T, the software performed 1.30 million state updates per second, each pass of the matrix taking 5 minutes 26 seconds, and convergence taking 5 hours 37 minutes.
The Strategy
It is not practical to list the win probabilities and banking rules for all of the half billion game states in two player farkle. Here I provide only a limited view into the complete strategy by means of five examples. Each of the five subsections below provide a 2 dimensional slice of the 6 dimensional strategy matrix, sufficient for optimal play of a single turn for a particular start-of-turn game state.
Optimal Strategy Case: $b = 0, d = 0, f = 0, e = 0$
Table 1 shows the win probabilities and banking actions needed to play an opening turn optimally. Each cell shows the win probability for different turn scores (starting at $t=0$ in the top row and increasing down the page) and/or different number of dice to roll (starting at $n=6$ in the first column and decreasing as you move to the right). For all entries in this table, the other four component state variables are fixed: your banked score (b) is fixed at 0 points, your opponent's banked score (d) at 0 points, your consecutive farkle count (f) at 0, and your opponent's consecutive farkle count (e) at zero. States shaded green are states from which your optimal banking action is to roll. States shaded red are states from which your optimal banking action is to bank. States with an asterisk are inaccessible and can be ignored (although the software still calculates your win probability in-, and optimal play out of- these states which would be useful if, say through a disruption in the laws of the universe, you somehow find yourself in such a state).
t | n | |||||
---|---|---|---|---|---|---|
6 | 5 | 4 | 3 | 2 | 1 | |
0 | 0.534870 | 0.506721* | 0.493622* | 0.487801* | 0.486163* | 0.489950* |
50 | 0.540099* | 0.511005 | 0.495849* | 0.489177* | 0.487586* | 0.491718* |
100 | 0.545359* | 0.515776 | 0.499374 | 0.490798* | 0.489034* | 0.493525* |
150 | 0.550709* | 0.520616* | 0.503815 | 0.493135 | 0.490494* | 0.495375* |
200 | 0.556198* | 0.525472* | 0.508564 | 0.497158 | 0.492462 | 0.497250* |
250 | 0.561811* | 0.530484* | 0.513285* | 0.502074 | 0.495439 | 0.499128 |
300 | 0.567448 | 0.535760* | 0.518054* | 0.506680 | 0.503290 | 0.503290 |
350 | 0.573082 | 0.541181 | 0.523166* | 0.511296* | 0.509711 | 0.509711 |
400 | 0.578710 | 0.546604 | 0.528579 | 0.516146 | 0.516146 | 0.516146 |
450 | 0.584332 | 0.552028 | 0.533996 | 0.522592 | 0.522592 | 0.522592 |
500 | 0.589966 | 0.557452 | 0.539418 | 0.529048 | 0.529048 | 0.529048 |
550 | 0.595675 | 0.562873 | 0.544842 | 0.535513 | 0.535513 | 0.535513 |
600 | 0.601436 | 0.568338 | 0.550268 | 0.541986 | 0.541986 | 0.541986 |
650 | 0.607195 | 0.573937 | 0.555693 | 0.548463 | 0.548463 | 0.548463 |
700 | 0.612941 | 0.579588 | 0.561115 | 0.554944 | 0.554944 | 0.554944 |
750 | 0.618671 | 0.585232 | 0.566534 | 0.561427 | 0.561427 | 0.561427 |
800 | 0.624400 | 0.590866 | 0.571948 | 0.567910 | 0.567910 | 0.567910 |
850 | 0.630173 | 0.596489 | 0.577354 | 0.574392 | 0.574392 | 0.574392 |
900 | 0.635987 | 0.602137 | 0.582753 | 0.580870 | 0.580870 | 0.580870 |
950 | 0.641793 | 0.607914 | 0.588142 | 0.587343 | 0.587343 | 0.587343 |
1000 | 0.647615 | 0.613782 | 0.593809 | 0.593809 | 0.593809 | 0.593809 |
1050 | 0.653417 | 0.619633 | 0.600266 | 0.600266 | 0.600266 | 0.600266 |
1100 | 0.659192 | 0.625466 | 0.606713 | 0.606713 | 0.606713 | 0.606713 |
1150 | 0.664939 | 0.631280 | 0.613147 | 0.613147 | 0.613147 | 0.613147 |
1200 | 0.670657 | 0.637072 | 0.619567 | 0.619567 | 0.619567 | 0.619567 |
1250 | 0.676345 | 0.642842 | 0.625970 | 0.625970 | 0.625970 | 0.625970 |
1300 | 0.682002 | 0.648587 | 0.632356 | 0.632356 | 0.632356 | 0.632356 |
1350 | 0.687625 | 0.654306 | 0.638721 | 0.638721 | 0.638721 | 0.638721 |
1400 | 0.693212 | 0.659998 | 0.645065 | 0.645065 | 0.645065 | 0.645065 |
1450 | 0.698762 | 0.665660 | 0.651385 | 0.651385 | 0.651385 | 0.651385 |
1500 | 0.704289 | 0.671292 | 0.657681 | 0.657681 | 0.657681 | 0.657681 |
1550 | 0.709840 | 0.676892 | 0.663950 | 0.663950 | 0.663950 | 0.663950 |
1600 | 0.715350 | 0.682458 | 0.670190 | 0.670190 | 0.670190 | 0.670190 |
1650 | 0.720818 | 0.687989 | 0.676400 | 0.676400 | 0.676400 | 0.676400 |
1700 | 0.726243 | 0.693483 | 0.682579 | 0.682579 | 0.682579 | 0.682579 |
1750 | 0.731622 | 0.698938 | 0.688723 | 0.688723 | 0.688723 | 0.688723 |
1800 | 0.736955 | 0.704354 | 0.694832 | 0.694832 | 0.694832 | 0.694832 |
1850 | 0.742239 | 0.709727 | 0.700904 | 0.700904 | 0.700904 | 0.700904 |
1900 | 0.747472 | 0.715058 | 0.706936 | 0.706936 | 0.706936 | 0.706936 |
1950 | 0.752655 | 0.720345 | 0.712927 | 0.712927 | 0.712927 | 0.712927 |
2000 | 0.757846 | 0.725587 | 0.718875 | 0.718875 | 0.718875 | 0.718875 |
2050 | 0.763028 | 0.730782 | 0.724781 | 0.724781 | 0.724781 | 0.724781 |
2100 | 0.768162 | 0.735929 | 0.730640 | 0.730640 | 0.730640 | 0.730640 |
2150 | 0.773241 | 0.741026 | 0.736453 | 0.736453 | 0.736453 | 0.736453 |
2200 | 0.778262 | 0.746072 | 0.742217 | 0.742217 | 0.742217 | 0.742217 |
2250 | 0.783224 | 0.751066 | 0.747932 | 0.747932 | 0.747932 | 0.747932 |
2300 | 0.788127 | 0.756006 | 0.753594 | 0.753594 | 0.753594 | 0.753594 |
2350 | 0.792969 | 0.760892 | 0.759204 | 0.759204 | 0.759204 | 0.759204 |
2400 | 0.797788 | 0.765722 | 0.764759 | 0.764759 | 0.764759 | 0.764759 |
2450 | 0.802600 | 0.770495 | 0.770258 | 0.770258 | 0.770258 | 0.770258 |
2500 | 0.807367 | 0.775700 | 0.775700 | 0.775700 | 0.775700 | 0.775700 |
2550 | 0.812069 | 0.781084 | 0.781084 | 0.781084 | 0.781084 | 0.781084 |
2600 | 0.816707 | 0.786408 | 0.786408 | 0.786408 | 0.786408 | 0.786408 |
2650 | 0.821278 | 0.791671 | 0.791671 | 0.791671 | 0.791671 | 0.791671 |
2700 | 0.825781 | 0.796871 | 0.796871 | 0.796871 | 0.796871 | 0.796871 |
2750 | 0.830218 | 0.802008 | 0.802008 | 0.802008 | 0.802008 | 0.802008 |
2800 | 0.834586 | 0.807081 | 0.807081 | 0.807081 | 0.807081 | 0.807081 |
2850 | 0.838886 | 0.812087 | 0.812087 | 0.812087 | 0.812087 | 0.812087 |
2900 | 0.843116 | 0.817025 | 0.817025 | 0.817025 | 0.817025 | 0.817025 |
2950 | 0.847277 | 0.821894 | 0.821894 | 0.821894 | 0.821894 | 0.821894 |
3000 | 0.851367 | 0.826696 | 0.826696 | 0.826696 | 0.826696 | 0.826696 |
3050 | 0.855387 | 0.831428 | 0.831428 | 0.831428 | 0.831428 | 0.831428 |
3100 | 0.859336 | 0.836090 | 0.836090 | 0.836090 | 0.836090 | 0.836090 |
3150 | 0.863214 | 0.840682 | 0.840682 | 0.840682 | 0.840682 | 0.840682 |
3200 | 0.867019 | 0.845202 | 0.845202 | 0.845202 | 0.845202 | 0.845202 |
3250 | 0.870753 | 0.849650 | 0.849650 | 0.849650 | 0.849650 | 0.849650 |
3300 | 0.874415 | 0.854025 | 0.854025 | 0.854025 | 0.854025 | 0.854025 |
3350 | 0.878005 | 0.858327 | 0.858327 | 0.858327 | 0.858327 | 0.858327 |
3400 | 0.881522 | 0.862555 | 0.862555 | 0.862555 | 0.862555 | 0.862555 |
3450 | 0.884967 | 0.866709 | 0.866709 | 0.866709 | 0.866709 | 0.866709 |
3500 | 0.888343 | 0.870788 | 0.870788 | 0.870788 | 0.870788 | 0.870788 |
3550 | 0.891652 | 0.874791 | 0.874791 | 0.874791 | 0.874791 | 0.874791 |
3600 | 0.894888 | 0.878719 | 0.878719 | 0.878719 | 0.878719 | 0.878719 |
3650 | 0.898052 | 0.882571 | 0.882571 | 0.882571 | 0.882571 | 0.882571 |
3700 | 0.901145 | 0.886347 | 0.886347 | 0.886347 | 0.886347 | 0.886347 |
3750 | 0.904167 | 0.890047 | 0.890047 | 0.890047 | 0.890047 | 0.890047 |
3800 | 0.907118 | 0.893671 | 0.893671 | 0.893671 | 0.893671 | 0.893671 |
3850 | 0.909999 | 0.897219 | 0.897219 | 0.897219 | 0.897219 | 0.897219 |
3900 | 0.912809 | 0.900691 | 0.900691 | 0.900691 | 0.900691 | 0.900691 |
3950 | 0.915548 | 0.904088 | 0.904088 | 0.904088 | 0.904088 | 0.904088 |
4000 | 0.918219 | 0.907409 | 0.907409 | 0.907409 | 0.907409 | 0.907409 |
4050 | 0.920820 | 0.910655 | 0.910655 | 0.910655 | 0.910655 | 0.910655 |
4100 | 0.923353 | 0.913826 | 0.913826 | 0.913826 | 0.913826 | 0.913826 |
4150 | 0.925819 | 0.916920 | 0.916920 | 0.916920 | 0.916920 | 0.916920 |
4200 | 0.928219 | 0.919939 | 0.919939 | 0.919939 | 0.919939 | 0.919939 |
4250 | 0.930560 | 0.922884 | 0.922884 | 0.922884 | 0.922884 | 0.922884 |
4300 | 0.932847 | 0.925756 | 0.925756 | 0.925756 | 0.925756 | 0.925756 |
4350 | 0.935068 | 0.928555 | 0.928555 | 0.928555 | 0.928555 | 0.928555 |
4400 | 0.937226 | 0.931282 | 0.931282 | 0.931282 | 0.931282 | 0.931282 |
4450 | 0.939321 | 0.933937 | 0.933937 | 0.933937 | 0.933937 | 0.933937 |
4500 | 0.941354 | 0.936521 | 0.936521 | 0.936521 | 0.936521 | 0.936521 |
4550 | 0.943326 | 0.939035 | 0.939035 | 0.939035 | 0.939035 | 0.939035 |
4600 | 0.945236 | 0.941479 | 0.941479 | 0.941479 | 0.941479 | 0.941479 |
4650 | 0.947087 | 0.943854 | 0.943854 | 0.943854 | 0.943854 | 0.943854 |
4700 | 0.948879 | 0.946160 | 0.946160 | 0.946160 | 0.946160 | 0.946160 |
4750 | 0.950612 | 0.948400 | 0.948400 | 0.948400 | 0.948400 | 0.948400 |
4800 | 0.952287 | 0.950573 | 0.950573 | 0.950573 | 0.950573 | 0.950573 |
4850 | 0.953905 | 0.952679 | 0.952679 | 0.952679 | 0.952679 | 0.952679 |
4900 | 0.955468 | 0.954721 | 0.954721 | 0.954721 | 0.954721 | 0.954721 |
4950 | 0.956977 | 0.956699 | 0.956699 | 0.956699 | 0.956699 | 0.956699 |
5000 | 0.958614 | 0.958614 | 0.958614 | 0.958614 | 0.958614 | 0.958614 |
The game begins in the upper left cell at $t=0$ and $n=6$. The win probability listed here means that if both players play optimally, the player going first wins 53.487% of the time. To play a turn, first you must choose your banking action. Since cell (t=0, n=6) is green, the optimal banking action is to roll.
Second, you roll the dice. Let's say for your opening roll, you throw:
$$6, 5, 3, 3, 3, 2$$Third, you must choose how to score the dice. Here you have three choices:
- take the 5 for 50 points,
- take the three 3 for 300 points, or
- take the 5 and the three 3 for 350 points.
To determine your optimal scoring choice, you must look at the win probabilities from the three states corresponding to each scoring option:
$$ \begin{align*} W(t=50, n=5) &= 0.511005\\ W(t=300, n=3) &= 0.506680\\ W(t=350, n=2) &= 0.509711 \end{align*} $$So in this case, you should score only the 5 since it moves you to the state with the best probability of ultimately winning the game. This ends one primary-action, random-response, secondary-action sequence effecting one state change in the EMDP. This process is then simply repeated (starting with another banking decision from the new state (t=50, n=5) until you either roll a farkle, or end in a state where you're supposed to bank.
It is interesting to see how the optimal play strategy for the opening turn differs from the strategy that simply maximizes your expected farkle turn score. To maximize expected score you wouldn't bank with 6 dice available to roll unless you had 16,400 or more points on the turn. In contrast, to optimize your chances of winning a game, you would bank 5000 or more points on your opening turn even when you have six dice available to roll. The banking threshold for 5 dice is similarly reduced from 3050 to a more conservative 2500. The banking threshold for other available dice counts do not differ.
Note that if you manage to bank 1000 points on your opening turn, your chances of winning increase to almost 60%; banking 1850 points on your first turn leaves you with better than 70% chance of ultimately winning; bank 2750 points and your chances of winning are above 80%; and manage to put up 3900 points on the opening turn and you're win probability is over 90%.
Optimal Strategy Case: $b = 6000, d = 8000, f = 0, e = 0$
As a second example, consider the case of Table 2 which shows how to optimally play a turn where you start with 6000 points and are 2000 points behind to your opponent who has 8000 points.
t | n | |||||
---|---|---|---|---|---|---|
6 | 5 | 4 | 3 | 2 | 1 | |
0 | 0.162365 | 0.135452* | 0.124658* | 0.118916* | 0.116823* | 0.120172* |
50 | 0.167053* | 0.138098 | 0.126521* | 0.120462* | 0.118340* | 0.121980* |
100 | 0.172282* | 0.141184 | 0.128485 | 0.122068* | 0.119912* | 0.123894* |
150 | 0.177898* | 0.144859* | 0.130513 | 0.123738 | 0.121540* | 0.125881* |
200 | 0.183766* | 0.149237* | 0.133183 | 0.125473 | 0.123237 | 0.127935* |
250 | 0.189820* | 0.153963* | 0.136653* | 0.127264 | 0.125005 | 0.130069 |
300 | 0.196092 | 0.158915* | 0.140895* | 0.130017 | 0.126838 | 0.132298 |
350 | 0.202623 | 0.164030 | 0.145391* | 0.133939* | 0.128728 | 0.134616 |
400 | 0.209467 | 0.169363 | 0.150024 | 0.138163 | 0.133991 | 0.137010 |
450 | 0.216533 | 0.174975 | 0.154798 | 0.142503 | 0.139457 | 0.139475 |
500 | 0.223831 | 0.180897 | 0.159828 | 0.146975 | 0.145115 | 0.145115 |
550 | 0.231322 | 0.187028 | 0.165248 | 0.151603 | 0.150876 | 0.150876 |
600 | 0.239035 | 0.193362 | 0.171012 | 0.156843 | 0.156843 | 0.156843 |
650 | 0.246918 | 0.199976 | 0.176975 | 0.163015 | 0.163015 | 0.163015 |
700 | 0.255037 | 0.206817 | 0.183154 | 0.169453 | 0.169453 | 0.169453 |
750 | 0.263357 | 0.213830 | 0.189533 | 0.176109 | 0.176109 | 0.176109 |
800 | 0.271945 | 0.221070 | 0.196123 | 0.183059 | 0.183059 | 0.183059 |
850 | 0.280741 | 0.228460 | 0.202874 | 0.190242 | 0.190242 | 0.190242 |
900 | 0.289790 | 0.236097 | 0.209824 | 0.197712 | 0.197712 | 0.197712 |
950 | 0.298954 | 0.243918 | 0.216938 | 0.205281 | 0.205281 | 0.205281 |
1000 | 0.308358 | 0.251943 | 0.224242 | 0.213245 | 0.213245 | 0.213245 |
1050 | 0.317895 | 0.260087 | 0.231740 | 0.221146 | 0.221146 | 0.221146 |
1100 | 0.327785 | 0.268509 | 0.239462 | 0.229404 | 0.229404 | 0.229404 |
1150 | 0.337937 | 0.277274 | 0.247380 | 0.237923 | 0.237923 | 0.237923 |
1200 | 0.348393 | 0.286386 | 0.255529 | 0.246764 | 0.246764 | 0.246764 |
1250 | 0.358985 | 0.295733 | 0.263897 | 0.255751 | 0.255751 | 0.255751 |
1300 | 0.369743 | 0.305337 | 0.272544 | 0.264989 | 0.264989 | 0.264989 |
1350 | 0.380541 | 0.315057 | 0.281403 | 0.274404 | 0.274404 | 0.274404 |
1400 | 0.391642 | 0.324961 | 0.290466 | 0.284353 | 0.284353 | 0.284353 |
1450 | 0.402933 | 0.334964 | 0.299633 | 0.294503 | 0.294503 | 0.294503 |
1500 | 0.414660 | 0.345257 | 0.308958 | 0.305091 | 0.305091 | 0.305091 |
1550 | 0.426484 | 0.355749 | 0.318423 | 0.315621 | 0.315621 | 0.315621 |
1600 | 0.438586 | 0.366590 | 0.328154 | 0.326360 | 0.326360 | 0.326360 |
1650 | 0.450706 | 0.377696 | 0.338163 | 0.337112 | 0.337112 | 0.337112 |
1700 | 0.463015 | 0.389278 | 0.348576 | 0.348173 | 0.348173 | 0.348173 |
1750 | 0.475329 | 0.401117 | 0.359573 | 0.359573 | 0.359573 | 0.359573 |
1800 | 0.488007 | 0.413244 | 0.371759 | 0.371759 | 0.371759 | 0.371759 |
1850 | 0.500848 | 0.425457 | 0.384134 | 0.384134 | 0.384134 | 0.384134 |
1900 | 0.514151 | 0.437907 | 0.397273 | 0.397273 | 0.397273 | 0.397273 |
1950 | 0.527546 | 0.450291 | 0.410346 | 0.410346 | 0.410346 | 0.410346 |
2000 | 0.541169 | 0.463032 | 0.423910 | 0.423910 | 0.423910 | 0.423910 |
2050 | 0.553892 | 0.475650 | 0.435848 | 0.435848 | 0.435848 | 0.435848 |
2100 | 0.566687 | 0.488532 | 0.448671 | 0.448671 | 0.448671 | 0.448671 |
2150 | 0.579544 | 0.501372 | 0.462112 | 0.462112 | 0.462112 | 0.462112 |
2200 | 0.593075 | 0.514412 | 0.476647 | 0.476647 | 0.476647 | 0.476647 |
2250 | 0.606845 | 0.527568 | 0.490860 | 0.490860 | 0.490860 | 0.490860 |
2300 | 0.620874 | 0.541235 | 0.505044 | 0.505044 | 0.505044 | 0.505044 |
2350 | 0.634230 | 0.554962 | 0.518525 | 0.518525 | 0.518525 | 0.518525 |
2400 | 0.647380 | 0.568960 | 0.532648 | 0.532648 | 0.532648 | 0.532648 |
2450 | 0.659830 | 0.582404 | 0.547160 | 0.547160 | 0.547160 | 0.547160 |
2500 | 0.672922 | 0.595574 | 0.563571 | 0.563571 | 0.563571 | 0.563571 |
2550 | 0.685577 | 0.608065 | 0.579471 | 0.579471 | 0.579471 | 0.579471 |
2600 | 0.699046 | 0.620640 | 0.594828 | 0.594828 | 0.594828 | 0.594828 |
2650 | 0.712677 | 0.633358 | 0.608315 | 0.608315 | 0.608315 | 0.608315 |
2700 | 0.727273 | 0.647372 | 0.621304 | 0.621304 | 0.621304 | 0.621304 |
2750 | 0.740566 | 0.661998 | 0.632781 | 0.632781 | 0.632781 | 0.632781 |
2800 | 0.752872 | 0.677539 | 0.646547 | 0.646547 | 0.646547 | 0.646547 |
2850 | 0.763740 | 0.691731 | 0.661928 | 0.661928 | 0.661928 | 0.661928 |
2900 | 0.775267 | 0.704645 | 0.683575 | 0.683575 | 0.683575 | 0.683575 |
2950 | 0.787428 | 0.716595 | 0.704022 | 0.704022 | 0.704022 | 0.704022 |
3000 | 0.801293 | 0.729617 | 0.722026 | 0.722026 | 0.722026 | 0.722026 |
3050 | 0.812468 | 0.740574 | 0.732915 | 0.732915 | 0.732915 | 0.732915 |
3100 | 0.824009 | 0.752452 | 0.744321 | 0.744321 | 0.744321 | 0.744321 |
3150 | 0.834312 | 0.764049 | 0.754637 | 0.754637 | 0.754637 | 0.754637 |
3200 | 0.844465 | 0.775780 | 0.768552 | 0.768552 | 0.768552 | 0.768552 |
3250 | 0.854392 | 0.785990 | 0.782612 | 0.782612 | 0.782612 | 0.782612 |
3300 | 0.865209 | 0.800609 | 0.800609 | 0.800609 | 0.800609 | 0.800609 |
3350 | 0.875145 | 0.811712 | 0.811712 | 0.811712 | 0.811712 | 0.811712 |
3400 | 0.887417 | 0.826357 | 0.826357 | 0.826357 | 0.826357 | 0.826357 |
3450 | 0.896088 | 0.833481 | 0.833481 | 0.833481 | 0.833481 | 0.833481 |
3500 | 0.907011 | 0.847263 | 0.847263 | 0.847263 | 0.847263 | 0.847263 |
3550 | 0.912996 | 0.864340 | 0.864340 | 0.864340 | 0.864340 | 0.864340 |
3600 | 0.918091 | 0.883970 | 0.883970 | 0.883970 | 0.883970 | 0.883970 |
3650 | 0.920604 | 0.894832 | 0.894832 | 0.894832 | 0.894832 | 0.894832 |
3700 | 0.925351 | 0.903017 | 0.903017 | 0.903017 | 0.903017 | 0.903017 |
3750 | 0.933956 | 0.903045 | 0.903045 | 0.903045 | 0.903045 | 0.903045 |
3800 | 0.950436 | 0.903078 | 0.903078 | 0.903078 | 0.903078 | 0.903078 |
3850 | 0.962853 | 0.903098 | 0.903098 | 0.903098 | 0.903098 | 0.903098 |
3900 | 0.973675 | 0.917497 | 0.903124 | 0.903124 | 0.903124 | 0.903124 |
3950 | 0.979061 | 0.930203 | 0.903136 | 0.903136 | 0.903136 | 0.903136 |
Given your 2000 point defecit, it is interesting to see how much more agressively this turn should be played compared to an opening turn: the banking thresholds here for 6, 5, 4, 3, 2 and 1 dice are, respectively, 4000, 3350, 1750, 600, 400, and 500 points. Compare that to the banking thresholds for an opening turn at 5000, 2500, 1000, 350, 300, and 300 points.
Another interesting observation about this strategy is that although you should bank between 3300 and 3850 points when you have 5 dice to roll, you should instead roll if you have 3900 to 3950 points (because you have such a high probability of closing out the game).
Optimal Strategy Case: $b = 8000, d = 6000, f = 0, e = 0$
Now consider the reverse of the situation in the previous example. Table 3 shows how to optimally play a turn where you start with 8000 points and are 2000 points ahead of your opponent who has 6000 points.
t | n | |||||
---|---|---|---|---|---|---|
6 | 5 | 4 | 3 | 2 | 1 | |
0 | 0.903422 | 0.878090* | 0.862982* | 0.857184* | 0.856210* | 0.860061* |
50 | 0.907970* | 0.883520 | 0.866464* | 0.858208* | 0.857345* | 0.861539* |
100 | 0.912448* | 0.888520 | 0.872004 | 0.860637* | 0.858466* | 0.863007* |
150 | 0.916837* | 0.893230* | 0.877734 | 0.864994 | 0.859560* | 0.864460* |
200 | 0.921258* | 0.898001* | 0.882899 | 0.871213 | 0.863650 | 0.865889* |
250 | 0.925550* | 0.902725* | 0.887727* | 0.877180 | 0.868829 | 0.867273 |
300 | 0.929835 | 0.907504* | 0.892777* | 0.882276 | 0.882276 | 0.882276 |
350 | 0.933851 | 0.912127 | 0.897734* | 0.888668* | 0.888668 | 0.888668 |
400 | 0.937676 | 0.916796 | 0.902726 | 0.894905 | 0.894905 | 0.894905 |
450 | 0.941287 | 0.921250 | 0.907616 | 0.900824 | 0.900824 | 0.900824 |
500 | 0.944922 | 0.925413 | 0.912287 | 0.907268 | 0.907268 | 0.907268 |
550 | 0.948355 | 0.929370 | 0.916532 | 0.913595 | 0.913595 | 0.913595 |
600 | 0.951784 | 0.933391 | 0.920430 | 0.919922 | 0.919922 | 0.919922 |
650 | 0.955073 | 0.937177 | 0.925407 | 0.925407 | 0.925407 | 0.925407 |
700 | 0.958290 | 0.940961 | 0.930450 | 0.930450 | 0.930450 | 0.930450 |
750 | 0.961368 | 0.944716 | 0.934554 | 0.934554 | 0.934554 | 0.934554 |
800 | 0.964239 | 0.948516 | 0.938914 | 0.938914 | 0.938914 | 0.938914 |
850 | 0.966594 | 0.952008 | 0.943151 | 0.943151 | 0.943151 | 0.943151 |
900 | 0.968986 | 0.955211 | 0.948312 | 0.948312 | 0.948312 | 0.948312 |
950 | 0.971335 | 0.958000 | 0.953239 | 0.953239 | 0.953239 | 0.953239 |
1000 | 0.973780 | 0.960716 | 0.958124 | 0.958124 | 0.958124 | 0.958124 |
1050 | 0.975945 | 0.963165 | 0.961202 | 0.961202 | 0.961202 | 0.961202 |
1100 | 0.978076 | 0.965682 | 0.964291 | 0.964291 | 0.964291 | 0.964291 |
1150 | 0.979977 | 0.968029 | 0.966812 | 0.966812 | 0.966812 | 0.966812 |
1200 | 0.981763 | 0.970175 | 0.970114 | 0.970114 | 0.970114 | 0.970114 |
1250 | 0.983328 | 0.973408 | 0.973408 | 0.973408 | 0.973408 | 0.973408 |
1300 | 0.984740 | 0.977098 | 0.977098 | 0.977098 | 0.977098 | 0.977098 |
1350 | 0.986091 | 0.979125 | 0.979125 | 0.979125 | 0.979125 | 0.979125 |
1400 | 0.987692 | 0.981344 | 0.981344 | 0.981344 | 0.981344 | 0.981344 |
1450 | 0.988993 | 0.982226 | 0.982226 | 0.982226 | 0.982226 | 0.982226 |
1500 | 0.990373 | 0.984029 | 0.984029 | 0.984029 | 0.984029 | 0.984029 |
1550 | 0.991218 | 0.985999 | 0.985999 | 0.985999 | 0.985999 | 0.985999 |
1600 | 0.991737 | 0.989914 | 0.989914 | 0.989914 | 0.989914 | 0.989914 |
1650 | 0.992073 | 0.992073 | 0.992073 | 0.992073 | 0.992073 | 0.992073 |
1700 | 0.992947 | 0.992947 | 0.992947 | 0.992947 | 0.992947 | 0.992947 |
1750 | 0.992964 | 0.992964 | 0.992964 | 0.992964 | 0.992964 | 0.992964 |
1800 | 0.993911 | 0.992981 | 0.992981 | 0.992981 | 0.992981 | 0.992981 |
1850 | 0.994722 | 0.992990 | 0.992990 | 0.992990 | 0.992990 | 0.992990 |
1900 | 0.995640 | 0.992998 | 0.992998 | 0.992998 | 0.992998 | 0.992998 |
1950 | 0.996180 | 0.993000 | 0.993000 | 0.993000 | 0.993000 | 0.993000 |
In this example, you have a 2000 point lead and are yourself only 2000 points from winning the game. It is interesting to see how much more conservatively this turn should be played compared to an opening turn: the banking thresholds here for the different number of dice to throw are 1650, 1250, 650, 300, 300, and 300 points. Compare that to the banking thresholds for an opening turn at 5000, 2500, 1000, 350, 300, and 300 points.
As in the previous example, there is another banking rule inversion in this table, but this time for the case of 6 dice, where at point values above 1750 it again becomes advantageous to roll to try to end the game.
Optimal Strategy Case: $b = 9000, d = 9500, f = 0, e = 0$
As a fourth example, consider the case of Table 4 which shows how to optimally play a turn where you start with 9000 points and are 500 points behind your opponent who has 9500 points.
t | n | |||||
---|---|---|---|---|---|---|
6 | 5 | 4 | 3 | 2 | 1 | |
0 | 0.454366 | 0.351125* | 0.303308* | 0.277271* | 0.270069* | 0.285659* |
50 | 0.463700* | 0.358576 | 0.309820* | 0.283131* | 0.274544* | 0.289393* |
100 | 0.475303* | 0.366847 | 0.316369 | 0.289394* | 0.280312* | 0.294362* |
150 | 0.486102* | 0.376478* | 0.323286 | 0.295374 | 0.286679* | 0.300901* |
200 | 0.505119* | 0.388392* | 0.331770 | 0.301490 | 0.292906 | 0.309194* |
250 | 0.525335* | 0.399404* | 0.342403* | 0.308335 | 0.298948 | 0.317272 |
300 | 0.554879 | 0.417884* | 0.354687* | 0.317757 | 0.305197 | 0.325207 |
350 | 0.573800 | 0.432225 | 0.366843* | 0.329936* | 0.313516 | 0.332809 |
400 | 0.602493 | 0.454655 | 0.382838 | 0.343746 | 0.325562 | 0.341277 |
450 | 0.619405 | 0.462471 | 0.391661 | 0.356792 | 0.340660 | 0.354200 |
500 | 0.653307 | 0.481158 | 0.404518 | 0.367897 | 0.356345 | 0.372251 |
550 | 0.696939 | 0.489452 | 0.408392 | 0.376025 | 0.370389 | 0.393025 |
600 | 0.761617 | 0.519126 | 0.416628 | 0.381381 | 0.381505 | 0.412584 |
650 | 0.821580 | 0.577904 | 0.419050 | 0.384455 | 0.389121 | 0.429136 |
700 | 0.878974 | 0.671873 | 0.455551 | 0.391137 | 0.393701 | 0.441170 |
750 | 0.920889 | 0.759474 | 0.538352 | 0.391141 | 0.396326 | 0.448721 |
800 | 0.951179 | 0.838237 | 0.658773 | 0.453755 | 0.398829 | 0.452947 |
850 | 0.966192 | 0.886061 | 0.752905 | 0.563689 | 0.401559 | 0.455471 |
900 | 0.976536 | 0.921141 | 0.831614 | 0.696407 | 0.522215 | 0.459381 |
950 | 0.981336 | 0.937788 | 0.873088 | 0.776038 | 0.641661 | 0.462492 |
I find it interesting that the only time you should bank during such a turn is if you have exactly 3 dice to roll and either 700 or 750 points on the turn. Why does it make sense to roll just 1 or 2 dice with those same number of points? And why does it make sense to roll three dice when you have 800 or more points?
To answer the first question, observe that it's easier to ultimately hot-dice (score all dice thrown) when you are rolling only 1 or 2 dice than when you are rolling 3. If you can manage to get back to 6 dice, your chances of ending the game this turn increase dramatically. Furthermore, any additional game points you gain by rolling 3 dice are meaningless unless you hot-dice. If only 1 or 2 of the 3 thrown dice score, you will still find yourself below the game goal and (if you then decide to bank these few extra points) will not have even reduced the number of points you have to accumulate on your subsequent turn due to the minimum banking threshold.
To answer the second question, note that it's possible to reach 10,000 points without hot-dice: in particular there's a 7% chance of rolling two ones and a third non-scoring die. These rolls end the game if you have 9800 points.
Optimal Strategy Case: $b = 9500, d = 9000, f = 0, e = 0$
Now consider the reverse of the situation in the previous example. Table 3 shows how to optimally play a turn where you start with 9500 points and are 500 points ahead of your opponent who has 9000 points.
t | n | |||||
---|---|---|---|---|---|---|
6 | 5 | 4 | 3 | 2 | 1 | |
0 | 0.801016 | 0.702303* | 0.658594* | 0.637598* | 0.630975* | 0.640081* |
50 | 0.826173* | 0.706846 | 0.660815* | 0.642258* | 0.639026* | 0.652003* |
100 | 0.863321* | 0.724040 | 0.665063 | 0.645329* | 0.645400* | 0.663218* |
150 | 0.897707* | 0.757952* | 0.666217 | 0.647091 | 0.649767* | 0.672708* |
200 | 0.930612* | 0.811876* | 0.687617 | 0.648362 | 0.652392 | 0.679607* |
250 | 0.954643* | 0.862100* | 0.735325* | 0.649657 | 0.653897 | 0.683936 |
300 | 0.972009 | 0.907257* | 0.804365* | 0.686823 | 0.655332 | 0.686359 |
350 | 0.980617 | 0.934675 | 0.858333* | 0.749851* | 0.656897 | 0.687806 |
400 | 0.986548 | 0.954788 | 0.903460 | 0.825942 | 0.726073 | 0.690048 |
450 | 0.989300 | 0.964332 | 0.927238 | 0.871597 | 0.794555 | 0.691832 |
Here I found it surprising that even though you have a 500 point lead and control of the dice, the optimal strategy requires that you never bank until you've reached the game-winning 500 points. This is surprisingly more aggressive play than on an opening turn where the score is tied. I would have guessed that banking 300 points when faced with rolling 1 or 2 dice would have been better to avoid the high-probability farkle, then allow your opponent a chance to force his way to a 1000 point turn, and assuming he fails, follow up with a high-probability minimum bank turn to win. It turns out if your opponent plays perfectly, he has about a 1/3 chance of reaching 1000 points to steal the win, and so playing a little more aggressively to attempt to end the game and deny him that possible steal is advantageous.
Also note how for many point totals, you are better off with fewer dice. This is often the case when playing turns where your best play is to try to reach 10,000 points and end the game, and also in cases where you are just really far behind an opponent that is fast approaching a game winning score.
Strategy Validation
In this section I share the efforts I've made to sanity check the calculated strategies for reasonableness, and to determine the deviation between the calculated strategy and the truly optimal strategy. This section definitely gets into the weeds and is not for the faint of heart, but may be of interest to skeptics or the particularly nerdy (of which I am both).
Comparison to strategy that maximizes expected turn score
Strategies that maximize expected turn score for farkle variants have been independently calculated with consistent results4,5,6. These strategies do not consider your current banked score, your opponent's banked score or farkle count, or the end of game scoring goal. Still, one would expect that for an opening turn (where the score is tied at zero and where the end-of-game scoring goal is distant) that the game winning strategy documented here and the strategy that maximizes expected turn score would be similar for low turn score states. Furthermore, the two strategies should converge if the game scoring goal is increased from 10,000 points towards infinity. Unfortunately, the 10,000 point goal was already stretching both the memory limits and processing power of my computer. To reduce the size of the state space, I eliminated the three consecutive farkle penalty. This effectively eliminates the f and e state variables, reducing the size of the state space by a factor of 9, and also eliminates the need to model banked score states of less than 0 points, which shrinks things even more.
My Farkle Strategy Generator (FSG)6 produces farkle strategies that maximize expected turn score. The FSG outputs strategy tables that prioritize each turn state with a sequential preference number. Those states associated with higher expected turn scores have a higher preference number. To follow the strategy, you score thrown dice to move to the state with the highest number. This is exactly the same process you use to follow the game winning strategy presented in this document: you score your dice to move to the state with the highest probability of winning. By indexing the states from lowest win probability to highest, the two strategies can easily be compared.
First compare the two strategies for the case of a 10,000 point goal. Table 6W shows the strategy that maximizes your chances of winning the game (produced by value iteration) for an opening turn. Table 6E shows the strategy that maximizes expected turn score (produced by the FSG). Turn scores in both cases are truncated to a maximum of 1500. States with different preference numbers are highlighted. There are numerous slight differences, but they are obviously very similar: the banking thresholds are almost identical, and the state preference numbers deviate only slightly.
|
|
Tables 7W and 7E compare these same two strategies but with the scoring goal increased to 30,000 points. Here the two strategies are seen to be almost identical for these low turn score states. The two strategies do appear to be converging as the scoring goal increases. While this doesn't assure strategy correctness, it offers convincing evidence of correctness in at least this limiting case.
|
|
Effects of the banked score lower bound
The calculated strategy is non-optimal when playing in regions of the state space where either your banked score, or your opponent's banked score is near the lower bound. This discrepancy between the calculated strategy and the truly optimal strategy is further exacerbated as the consecutive farkle count associated with a banked score near the bound increases.
To estimate the extent of the discrepancy, I lowered the bound on banked scores from -2500 to -3000 and regenerated the strategy matrix. One expects the relative deviation between the win probabilities of corresponding states in the two strategies to decrease as the minimum of the player's banked scores increases. To see the rate of convergence, for each possible banking score $B$ ($-2500 <= B < 10,000$), I found the maximum relative deviation between win probabilities (calculated as ${{2|w1-w2|}\over{w1+w2}}$) over all game states satisfying $min(b,d) >= B$. Table 8 lists selected results. Note that each time B increases by 500 points (which just happens to be the triple farkle penalty), the maximum relative deviation consistently decreased by 2 orders of magnitude (with one of those two orders of magnitude coming with the last 50 point increase). Given this exponential convergence rate, for any given $B$, the maximum relative deviation between the strategies for $L=-2500$ and $L=-\infty$ must be only slightly greater (on the order of 1%) than the maximum relative deviation between the strategies for $L=-2500$ and $L=-3000$.
B | Maximum Relative Deviation |
---|---|
-2500 | 0.296463834 |
-2050 | 0.028572943 |
-2000 | 0.003438359 |
-1550 | 0.000249319 |
-1500 | 0.000029516 |
-1050 | 0.000002074 |
-1000 | 0.000000234 |
-550 | 0.000000016 |
-500 | 0.000000001 |
Note that the state win probabilities for two calculated strategies do not have to be identical for the rules about when to bank, and when to roll, and how to score the dice in each situation to be to be identical. So how close do the calculated win probabilities and the optimal win probabilities need to be to make it likely we are playing with a truly optimal strategy? Using state preference charts like those shown in the previous section, I compared complete turn strategies for all combinations of state variables $b$, $d$, $f$, and $e$. If two state preference charts are identical, then the strategies are identical. Once I found a bizarre state preference discrepancy associated with a relative difference in win probability on the same order as the convergence threshold for the entire state matrix. I ignored this and any others like it that might be lurking in the tables. Table 9 shows the banked score and consecutive farkle count thresholds above which no more strategy discrepancies otherwise appeared. So for example if both you and your opponent have at most 1 consecutive farkle, and you both have banked scores of at least -1500 points, then there are no discrepancies between the two strategies.
So although there's no way to pick a lower bound $L$ that will assure all turn strategies will be truly optimal for a given working range of banked scores and consecutive farkle counts, driving the maximum relative deviation down to within about 1 part in 108 seems to assure that the vast majority of turn strategies (and likely all of them) are optimal. Given the convergence rate seen above, this means that as long as both players are about 2000 points above the lower bound on banked scores, all turn strategies are likely to be optimal.
Minimum Banked Score | Maximum Farkle Count | Maximum Relative Deviation |
---|---|---|
-1500 | 0 | 0.000001175 |
-1200 | 1 | 0.000001529 |
-700 | 2 | 0.000000070 |
Conclusions
Extended forms of a Markov Decision Process (MDP) and value iteration were devised that allow more efficient modeling of 2-player Farkle than is possible with a standard MDP. Using this model, the strategy that maximizes win probability using Facebook scoring rules has been determined under the constraint that banked scores are lower bounded to a large negative score. The strategy is shown to be unaffected by this bound in regions where banked scores are at least 2000 points above the bound. The strategy appears to converge (as expected) to the strategy that maximizes expected turn score as the end-of-game scoring goal is increased.
With nearly a half billion game states, and so many different dice rolls and scoring options possible from each of these different states, it wasn't immediately obvious whether the problem was solvable with a home computer. I was quite pleased when I saw the first value iteration update complete, and even more pleased when I saw the matrix was actually converging! It was a nice problem in the sense that it's complexity puts finding the solution at the limits of what can be done with a home computer (or at least at the limits of what I can do with a home computer).
The optimal strategy showed many expected behaviors including more aggressive play when behind, and more conservative play when ahead. But it also showed many behaviors that were unexpected to me, such as playing aggressively even with a sizable lead when nearing the game goal of 10,000 points. Numerous times I was convinced I had discovered a flaw in the strategy (and therefore either a bug in my code or an error in my analysis) only to realize later that the strategy was actually reasonable.
Next Steps
The very limited view I've offered into the game winning strategy is unsatisfying. Time permitting, I intend to develop a web page that will allow you to explore the gargantuan optimal Farkle strategy in detail, and perhaps even play a game of farkle against a perfect computer opponent. I am motivated by your interest, so please leave a comment if you'd like to see it.
A Facebook Farkle game is won by the first player to bank 10,000 or more points, but there are other variations to end-of-game rules. In Zilch, for example, the other player always gets a final go at the dice. Some people like to assure that both players get the same number of turns. In this latter case, the optimal strategies played by the two players are not even the same — with the player going second clearly having a strategic advantage. I think any of these could be modeled, but they are each very different, and not currently supported by my solver. These are ickier problems, and I don't intend to work on them anytime soon.
References
- T Neller and C Presser, "Optimal Play of the Dice Game Pig," The UMAP Journal 25(1) (2004), pp. 25-47.
- R. E. Bellman. Dynamic Programming. Princeton University Press, Princeton, NJ, 1957.
- Cap Khoury, Multinomial Coefficients and Farkle, Jul. 2009.
- M Busche, Maximizing Expected Scores in the Game of Zilch, Matt's Maniacal Musings, Aug. 2010.
- E Farmer, Analysis of Farkle, Possibly Wrong, Apr. 2013.
- Farkle Strategy Generator, Matt's Maniacal Musings.
I would love to see a website with all the statistics available!!
I love delving into these kind of games! 😀
One of the problems is that everyone plays with slightly different rules. It takes several hours of computation time to solve for a single rule set, and with a half billion game states, each solution would eat up 4 gigs of my server’s hard disk too. I guess I could dedicate a couple of terra-byte hard drives to storing (and backing up) solutions to various game rules.
Does the 53.487 percent to win for the first player take into account the second player’s rebuttal after the first passes 10,000 points?
No. The first player to bank 10,000 or more points wins in this version of the game (consistent with Facebook Farkle game rules).
It would be fairly straight-forward, however, to analyze the game rule variant you describe. It’s something I’ve thought about doing for some time. The best way, I think, is to start by precalculating the chance of a “rebuttal” succeeding for all possible point deficits that must be overcome in a single (final) turn. If, for example, the first player banks 10,200 points whilst the second player had 9,000 points, then you are interested in the probability of the second player (given a final turn) to successfully walk the safest path to 1,250 or more points in a single turn. (Incidentally, my FSG can be configured to do this — I describe how in one of my other comments on this website somewhere.) Do this for all possible deficits from 100, to something absurdly large (although a few thousand points would probably be sufficient as it is surely suboptimal to continue rolling with 15,000 points, so higher point game states would never be explored by an optimal player).
Then the win probability matrix would then be extended to include game states at and above 10,000 points for the active player (i.e., the player whose turn it is). And the win probability associated with banking in these states would be fixed with the rebuttal probabilities we just calculated. The win probability for these super-10,000 point game states could be calculated in a single downward pass (since the win probabilities for banking from each them are fixed and known).
So, anyway, I think this would be fairly straight-forward to do; and would require computation time that is negligible compared to what’s already being done. It’s a flexibility I’ve thought about adding to my Farkle analysis tools, but haven’t, as yet, had sufficient motivation to pursue it.
Just a heads up to anyone else who’s confused by this:
There’s an error below “Optimal Strategy Case: b=0,d=0,f=0,e=0”, where the win probabilities are listed as:
W(t=50,n=5)W(t=300,n=5)W(t=350,n=5)=0.511005=0.506680=0.509711
I felt really stupid when I couldn’t figure out how n could be 5 for all three actions, but I looked up the numbers in the table and it’s simply supposed to be:
W(t=50,n=5)W(t=300,n=3)W(t=350,n=2)=0.511005=0.506680=0.509711
Hey Kasi,
Good job finding this cut-n-paste error. I’ve corrected the document.
So in conclusion, I do what?
If what you’re asking is, “how the devil am I supposed to use all this to improve my farkle game play?”, then you’re right: although this article may be technically interesting to some, it is not terribly accessible if you’re interest is improving your farkle play.
Todd Neller and I wrote about a technique where you can closely approximate a maximum scoring strategy. For the rule set we used, the technique actually perfectly reproduces the maximum scoring strategy. Depending on the scoring rules, I suspect our technique may sometimes introduce small deviations from the true maximum scoring strategy, but these will certainly be rare, and can only introduce small reductions in expected score when they occur.
See section 4 of our article on optimal farkle play. Skip down to the paragraph that starts, “We have devised a means by which a human can play the maximum scoring
strategy perfectly…” That table V(n,t) provided is only good for the simple (and rarely used) rules described in our paper. But you can use my farkle strategy generator to generate max-score-strategies for your exact rule set. Then click on the button E(S,N). The table revealed is the same table as V(n,t) described in our paper, except that is tuned for your rule set. To generate the estimation table, simply round each entry UP to the next 50-point score point. So for example if E(S,N) is 139.951, round it up to 150. Replace all banking states with the value 0. Then as you scan down each column, write down the scores where the value changes from one 50 point value to the next. These are the break points you have to memorize. Put this information in the same layout as described in our paper, memorize that table and follow the instructions we give there.
This will easily let you play the max-score strategy (or at least something very close to it). In our paper, we also describe go-for-it regions where you or you’re opponent are so close to going out, that it pays to just keep rolling until you’ve won or you’ve farkled. I currently do not provide a tool to identify these regions for you. I could possibly work on this if there’s significant interest, but my time is currently very limited.
Absolutely wonderful! I’m really happy to read this and to see the page with the strategy generator. Would love to see more things on Farkle. It’s quite fascinating. Thank you for this.
Love it. We’re nearby Denver metro area residents playing this with family remotely during COVID. I suspected some of the strategic optima, but some were quite surprising! I’d love a site for it. I can help with 3-years-newer computer resources than when you first started work on this problem, lol.
Thanks for your appreciation.
I’m not sure exactly what you’re proposing. Something to build strategies for different game rule variations? Or a place to actually play the game, perhaps against an optimal opponent?
I took a real job about a year ago, so I have no time currently to pursue these hobbies of mine.
six dice for three-pair and 1500 points
I think this should be 750.
Awesome job applying this method to the game of Farkle! I recently rediscovered this classic game and was wondering if there was an optimal play strategy. This seems to be the most in-depth. Thanks for taking the time to summarize all your work in great detail and share it openly!