Neller and Presser modeled a simple dice game called pig as a Markov Decision Process (MDP) and used value iteration to find the optimal game winning strategy^{1}. Inspired by their approach, I've constructed a variant of an MDP which can be used to calculate the strategy that maximizes the chances of winning 2player farkle. Due to the three consecutive farkle penalty, an unfortunate or foolish player can farkle repeatedly to achieve an arbitrarily large negative score. For this reason the number of game states is unbounded and a complete MDP model of farkle is not possible. To bound the problem, a limit on the lowest possible banked score is enforced. The calculated strategy is shown to converge exponentially to the optimal strategy as this bound on banked scores is lowered.
Each farkle turn proceeds by iteratively making a preroll banking decision, a (contingent) roll of the dice, and a postroll scoring decision. I modified the classic MDP to include a secondary (postroll) action to fit this turn model. A reward function that incentivizes winning the game is applied. A similarly modified version of valueiteration (that maximizes the value function for both the preroll banking decision, and the postroll scoring decision) is then used to find an optimal farkle strategy.
With a lower bound of 2500 points for banked scores, there are 423,765,000 distinct game states and so it is not convenient to share the entire strategy in printed form. Instead, I provide some general characterizations of the strategy. For example, if both players use this same strategy, the player going first will win 53.487% of the time. I also provide samples of complete singleturn strategies for various initial banked scores. Currently, only the strategy for Facebook Farkle has been calculated, but the strategy for other scoring variants of farkle could easily be deduced using the same software.
Contents
The Rules of Farkle
Farkle rules differ only slightly from the rules of Zilch, but are provided here for completeness.
Farkle is played with two or more players and six sixsided dice. Each player takes turns rolling the dice. The dice in a roll can be worth points either individually or in combination. If any points are available from the roll, the player must set aside some or all of those scoring dice, adding the score from those dice to their point total for the turn. After each roll, a player may either reroll the remaining dice to try for more points or may bank the points accumulated this turn (though you can never bank less than 300 points). When a player banks his points, the player's turn is ended and the dice are passed to the next player.
If no dice in a roll score, then the player loses all points accumulated this turn and their turn is ended. This is called a farkle, a sorrowful event indeed.
If all dice in a roll score, the player gets to continue his turn with all six dice. This is called hot dice and is guaranteed to brighten your day.
A player may continue rolling again and again accumulating ever more points until he either decides to bank those points or loses them all to a farkle.
If a player ends three consecutive turns with a farkle, they not only lose their points from the turn but also lose 500 points from their banked game score. (This is the only way to lose banked points.) After a triple farkle, your consecutive farkle count is reset to zero so you're safe from another triple farkle penalty for at least three more turns.
The game ends when one player has banked a total of 10,000. Unlike zilch, other players do not get a final turn.
Scoring is as follows:
 Each 1 is worth 100 points.
 Each 5 is worth 50 points.
 A set of three 1s is worth 1000 points.
A set of three 2s is worth 200 points.
A set of three 3s is worth 300 points.
A set of three 4s is worth 400 points.
A set of three 5s is worth 500 points.
A set of three 6s is worth 600 points.
Unlike zilch, each extra die in a set increases the value of the set by a like amount. So four 4s are worth 800 points, five 5s are worth 1500, and six 1s are worth 4000.  Three pair is worth 750 points. Unlike Zilch, a set of four 2s and two 4s may not be scored as three pairs.
 A six die straight is worth 1500 points.
 Each die can only be used once when scoring. (If you roll two 1s, two 2s, and two 3s you can either count the two 1s for 200 or use all six dice for threepair and 1500 points — you can't use the ones both ways for 1700 points.)
Markov Decision Processes and Value Iteration
Note: those familiar with MDPs may find the nonstandard variable names I use to present this standard subject distracting. My aim is to use variable names most meaningful in the final valueiteration equation. I beg your indulgence. As a courtesy, I've included mousehover popups where each such nonstandard variable name is introduced (highlighted in blue) explaining the motivation for the change.
A Markov Decision Process (MDP)^{2} is a system having a finite set of states $S$. For each state $s \in S$, there are a set of actions that may be taken $A(s)$. For each action $a \in A(s)$, there is a set of transition probabilities $P_a(s, s')$ defining the probability of transitioning from $s$ to each state $s' \in S$ given that action $a$ was taken while in state $s$. When action $a$ is taken from state $s$, the MDP responds by randomly moving to a new state $s'$ as governed by the transition probabilities $P_a(s, s')$ and then assigning the decision maker a reward $D_a(s, s')$.
The objective is to find a strategy function $G(s)$ that returns the particular action at each state $s \in S$ that will maximize the expected cumulative reward given by:
$$\sum_{k=0}^{\infty} \gamma^k D_{a_k}(s_k, s_{k+1})$$where $k$ is a discreet time variable, $s_k$ is the game state at time $k$, $a_k$ is the player action taken from from state $s_k$, and $\gamma \in [0,1)$ is a constant discount factor for future rewards. The expectation must be taken over all possible state transition paths, and maximized over all possible choices for the actions $a_k$ taken in each state $s_k$. Then $G(s)$ will be defined by the $a_k$ taken from each state $s_k$ that maximizes this expectation.
One technique for solving this problem is value iteration. With this technique, each state $s$ is given a decimal value $W(s)$ which is an estimate of the expected discounted sum of all future rewards gained from state $s$. The estimate for $W(s)$ is iteratively refined for all $s$ by applying this update equation sequentially to all states $s$: $$W_{i+1}(s) := \max_{a \in A(s)} \left[ \sum_{s'} P_a(s, s')(D_a(s, s') + \gamma W_i(s'))\right]$$
Note that at each state $s$, the action which provides maximum total reward is selected. So not only is the estimate of $W(s)$ iteratively refined, but the best action $a$ taken for each state $s$ is also simultaneously improved. Iteration continues until $W_{i+1}(s)$ and $W_i(s)$ converge for all $s$, and $G(s)$ is then the set of actions $a$ selected for each state $s$ in the final iteration.
Extending the MDP to support Farkle
During a farkle turn, a player must iteratively
 decide whether to bank his current points,
 then, if he decides not to bank, roll the dice,
 then decide how to score the dice.
To fit farkle to an MDP model, one need only consider the result of a roll as part of the game state, and then reorder the turn sequence to that of a combined scoringandbank/roll action followed by either an endofturnevent or another randomrollevent. But this increases the number of game states by at least a few orders of magnitude and makes the problem unsolvable without a commensurate increase in computer resources.
Alternatively, the rolldecision could be splintered to include any conceivable combination of scoring instructions, directing the MDP how to score each potential roll before the roll is even made, thereby allowing the state machine to transition without additional input from the player once the roll is made. Aside from being a painful way of thinking about the problem, the number of possible scoring instructions is enormous, and the approach is again not feasible.
Rather than forcing the game to fit the MDP model, I instead define an extended MDP (EMDP) to more naturally model the game. Like an MDP, an EMDP has a finite set of states $S$. For each state $s \in S$, there are a set of primary actions that may be taken $A(s)$. For each primary action $a \in A(s)$, there are a set of sets of secondary actions $R_a(s)$. Once primary action $a$ is taken, one set of secondary actions $r \in R_a(s)$ is selected randomly by the EMDP according to a probability distribution $P_a(s)$. So instead of transitioning, the EMDP responds to action $a$ by offering a randomly chosen set of secondary actions that may be taken. For each secondary action $c \in r$, a deterministic transition state $s'$ is defined by a transition matrix $s' = X_{a,c}(s)$. Selection of action $c$ causes the EMDP to transition to $s'$ and reward $D_{a,c}(s, s')$ is granted.
Because there are two actions to be taken in a turn, the optimal strategy also has two parts: $G_A(s)$ is the optimal primary action in state $s$, and $G_C(s, r)$ is the optimal secondary action to take in state $s$ given that secondary action set $r$ was randomly offered by the EMDP in response to action $a$.
Value iteration processing is also extended to account for the secondary action: $$W_{i+1}(s) := \max_{a \in A(s)} \left[ \sum_{r \in R_a(s)} P_{a,r}(s) \max_{c \in r} \Big[ D_{a,c}(s, s') + \gamma W_i(s') \Big] \right]$$
Neller and Presser chose a reward function to incentivize winning the game^{1}. For any transition from a nonwinning game state $s$ to a winning game state $s_w$, $D_{a,c}(s, s_w) = 1$ (where $s_w$ is any state where the player's banked points plus his turn points meets or exceeds the game goal of 10,000 points, and where his turn points meets or exceeds the minimum banking threshold). For transitions to any other nonwinning state $s_v$, $D_{a,c}(s, s_v) = 0$. Because all game winning states $s_w$ are terminal, all future rewards from such states must be zero, so $W(s_w) = 0$. Because $W$ is known for all game winning states $s_w$, $W(s_w)$ is never updated during value iteration. (I.e., although a game winning state can appear on the right side of the value iteration update equation, it never appears on the left.)
Normally for an MDP, $0 \le \gamma < 1$, but we do not wish to value a game you win in 30 turns, less than a game you win in 10. Following Neller and Presser's approach, I instead set $\gamma = 1$. In general this can prevent value iteration from converging, but it does not cause a problem for farkle. (I think this is because there is no circular state transition path offering unbounded rewards, and because only the player that gets to a game winning state first actually wins, which ensures that the optimal strategy can't be attracted to some infinitely long path to a game winning state.) With $\gamma = 1$, $W(s)$ converges to the probability of winning from any nonterminal state $s$ when using an optimal strategy; the $a$ selected in the $\max_a$ operation converges to the optimal banking strategy from state $s$: $a = G_A(s)$; and the $c$ selected by the $\max_c$ operation converges to the optimal scoring strategy from state $s$ given that roll $r$ was thrown: $c = G_C(s,r)$.
Given that $\gamma = 1$, simplifications can be achieved if you move the reward for transitions to a game winning state out of the reward function, and into the value function for those same game winning states. That is, instead of defining $W(s_w) = 0$, define $W(s_w) = 1$, and set $D_{a,c}(s, s') = 0$ everywhere, allowing the reward function to be completely eliminated from the update equation. This definition also results in $W(s)$ consistently being the probability of winning the game from any state $s$ (including terminal game winning states).
Applying the simplifications of $\gamma = 1$, $D_{a,c}(s, s') = 0$, and $W(s_w) = 1$ reduces the extended value iteration update formula to:
$$W_{i+1}(s) := \max_{a \in A(s)} \left[ \sum_{r \in R_a(s)} P_{a,r}(s) \max_{c \in r} W_i(s') \right]$$This general approach for finding the optimal play strategy for games having both preroll and postroll actions, is further detailed for the specifics of 2player farkle in the sections below.
Game State Characterization
The current game state $s$ is characterized by six component state variables:
$$s = (t, n, b, d, f, e)$$where $t$ is the number of points accumulated only from your current turn. Once $b + t \gt 9950$ (which means you've hit the goal of 10,000 points) and you've met the minimum requirement to bank $t > 250$, the game is over, so for nonterminal game states we have:
$$t \in \{0, 50, 100, ..., \max [250, 9950b]\}\text{.}$$$n$ is the number of dice you have to roll
$$n \in \{1, 2, 3, 4, 5, 6\}\text{,}$$$b$ is your banked score for which I enforce a lower bound $L$
$$b \in \{L, L+50, ..., 100, 50, 0, 50, 100, ..., 9950\}\text{,}$$$d$ is your opponent's banked score which is also lower bounded to $L$
$$d \in \{L, L+50, ..., 100, 50, 0, 50, 100, ..., 9950\}\text{,}$$$f$ is your consecutive farkle count (from previous turns)
$$f \in \{0, 1, 2\}\text{, and}$$$e$ is your opponent's consecutive farkle count
$$e \in \{0, 1, 2\}\text{.}$$The Farkle Value Iteration Equation
In this section we apply the farkle component state variables and rules defined in previous sections to the EMDP value iteration equation:
$$W_{i+1}(s) := \max_{a \in A(s)} \left[ \sum_{r \in R_a(s)} P_{a,r}(s) \max_{c \in r} W_i(s') \right]$$Let's detail $A(s)$ first. For farkle, $A(s)$ is the set of available banking actions having at most two members: BANK and ROLL. For game states where you have so far accumulated less than 300 points, you have only one preroll (primary) action available to you: ROLL the dice. But for all other game states you have two preroll actions available: BANK, or ROLL. This yields:
\begin{equation} W_{i+1}(s) := \left\{ \begin{array}{ll} \sum\limits_{r \in R_{\text{ROLL}}(s)} P_{{\text{ROLL}},r}(s) \max\limits_{c \in r} W_i(s'), & \text{if $t < 300$}.\\ \\ \max \bigg[\sum\limits_{r \in R_{\text{BANK}}(s)} P_{{\text{BANK}},r}(s) \max\limits_{c \in r} W_i(s'), \\ \hspace{30pt} \sum\limits_{r \in R_{\text{ROLL}}(s)} P_{{\text{ROLL}},r}(s) \max\limits_{c \in r} W_i(s') \bigg], & \text{if $t \ge 300$}.\\ \\ \end{array} \right. \end{equation}In the case of a bank action the equation collapses. $R_{\text{BANK}}(s)$ has only one entry: $r_{\text{BANK}}$. There's only one member of the probability distribution: $P_{{\text{BANK}}, r_{\text{BANK}}}(s) = 1$. Also, $r_{\text{BANK}}$ has only one entry: $c_{\text{BANK}}$. So for the case of $t \ge 300$ , the first member of the outer max operation reduces to just $W_i(s')$. All the mathematical machinery in this case is flexible enough to handle the banking case, but is entirely unnecessary. But what exactly is $s'$ after you bank?
Looking back at the game state characterization from the previous section, there is no variable that encodes whose turn it is. Everything I've written so far is from the perspective of the player who controls the dice, and there is no $s'$ expressible in terms of our six component state variables that identifies your state after a banking operation. (This is by design and is consistent with Neller and Presser's approach for optimizing pig game play strategy.) After you bank, it is your opponent's turn who we assume is also playing the optimal strategy, and we can express his state after you bank. If your game state just before you banked was $s = (t, n, b, d, f, e)$, then your opponent's state after you bank will be $o' = (0, 6, d, b + t, e, 0)$. To be clear, after you bank your opponent will have 0 turn points, 6 dice to roll, a banked score of $d$, an opponent's banked score of $b + t$ (which is your new banked score), $e$ consecutive farkles, and his opponent will have $0$ consecutive farkles (your farkle count reverting to zero having just banked). Your opponent's win probability is $W(o')$, which means our win probability after the bank must be: $W_i(s') = 1  W_i(o') = 1  W_i(0, 6, d, b + t, e, 0)$, which yields: \begin{equation} W_{i+1}(s) := \left\{ \begin{array}{ll} \sum\limits_{r \in R_{\text{ROLL}}(s)} P_{{\text{ROLL}},r}(s) \max\limits_{c \in r} W_i(s'), & \text{if $t < 300$}.\\ \\ \max \bigg[\Big(1  W_i(0,6,d,b+t,e,0)\Big), \\ \hspace{30pt} \sum\limits_{r \in R_{\text{ROLL}}(s)} P_{{\text{ROLL}},r}(s) \max\limits_{c \in r} W_i(s') \bigg], & \text{if $t \ge 300$}.\\ \\ \end{array} \right. \end{equation}
In the case of a ROLL action, $R_{\text{ROLL}}(s)$ corresponds to the set of all possible rolls of the dice from state $s$. More precisely, each roll $r$ is defined as a set of possible scoring decisions for some permutation of thrown dice. The $\sum_r$ operation is summing over each possible roll $r \in R_{\text{ROLL}}(s)$. $P_{\text{ROLL},r}(s)$ is the probability of making roll $r$ from game state $s$. And each $c \in r$ is one possible scoring decision given that roll $r$ was thrown. Given scoring decision $c$, the new game state $s'$ is determined. The $\max_c$ operation maximizes the expected win probability $W_i(s')$ given that roll $r$ was thrown over these possible scoring decisions.
First note that the set of potential rolls from game state $s = (t, n, b, d, f, e)$ is only dependent on the number of dice you are rolling, so:
$$R_{\text{ROLL}}(s) = R_{\text{ROLL}}(n)$$Second note that
$$P_{\text{ROLL},r}(s) = {1 \over {6^n}}$$Third, because the expression for $W_i(s')$ is fundamentally different for the case of farkling rolls vs. scoring rolls, it is convenient to partition $R_{\text{ROLL}}(n)$ into two subsets: the subset of all farkling rolls $R_{\text{FARKLE}}(n)$, and the subset of all scoring rolls $R_{\text{SCORE}}(n)$:
$$R_{\text{ROLL}}(n) = R_{\text{FARKLE}}(n) \bigcup R_{\text{SCORE}}(n)$$Fourth, if roll $r$ is a farkle, then $r$ will have only one member: a zero point farkle scoring decision and the $max_c$ operation can be dropped.
Applying these four observations gives:
\begin{equation} \begin{array}{rl} \sum\limits_{r \in R_{\text{ROLL}}(s)} P_{{\text{ROLL}},r}(s) \max\limits_{c \in r} W_i(s') &= \sum\limits_{r \in R_{\text{FARKLE}}(n)} {1 \over {6^n}} W_i(s') + \sum\limits_{r \in R_{\text{SCORE}}(n)} {1 \over {6^n}} \max\limits_{c \in r} W_i(s') \\ &= {1 \over {6^n}} \sum\limits_{r \in R_{\text{FARKLE}}(n)} W_i(s') + {1 \over {6^n}} \sum\limits_{r \in R_{\text{SCORE}}(n)} \max\limits_{c \in r} W_i(s') \end{array} \end{equation}After a farkle, it becomes your opponent's turn and there is again no expression for the new game state. We again instead express $W_i(s')$ in terms of your opponents win probability after your farkle:
$$W_i(s') = 1  W_i(0, 6, d, b', e, f')$$where $b'$ is your new banked score after your farkle, which may have decreased due to the three consecutive farkle penalty, but is enforced to always be at least $L$ to keep the problem tractable:
$$b' = \max [ L, b  Y_f ]\text{;}$$ where $Y_f$ is the number of points you lose from your banked score when you farkle while already having $f$ consecutive farkles $$\begin{align*} Y_0 &= 0 \\ Y_1 &= 0 \\ Y_2 &= 500\text{;} \end{align*}$$and where $f'$ is your new consecutive farkle count which normally just increments, but is reset back to zero if you just had your third consecutive farkle
$$f' = (f+1) \mod 3\text{.}$$Note also that for all farkling rolls the expression inside the sum is the same, so we can replace the sum with a multiplicative factor equaling the number of ways to roll a farkle with $n$ dice. Combining that count with the ${1 \over {6^n}}$ simplifies to the probability of farkling with $n$ dice^{3}:
$$\begin{align*} F_1 &= 2/3 \\ F_2 &= 4/9 \\ F_3 &= 5/18 \\ F_4 &= 17/108 \\ F_5 &= 25/324 \\ F_6 &= 5/216 \\ \end{align*}$$Combining the above observations gives this substitution:
$${1 \over {6^n}} \sum\limits_{r \in R_{\text{FARKLE}}(n)} W_i(s') = F_n (1  W_i(0,6,d,b',e,f'))$$To detail the expression for the case of scoring rolls, first let $C_T(c)$ be the number of points taken with scoring combination $c$, and let $C_N(c)$ be the number of dice used with scoring combination $c$. So after rolling roll $r \in R_{\text{SCORE}}(n)$ and selecting scoring action $c \in r$ the state transitions from $s = (t, n, b, d, f, e)$ to
$$s' = (t', n', b, d, f, e)$$where
$$ \begin{align*} t' &= t+C_T(c) \\ n' &= h(nC_N(c)) \end{align*} $$and where $h(x)$ is a hotdice function for resetting the number of available dice back to $6$ when all dice are successfully scored:
$$h(n)=\begin{cases}6, & \text{for $n = 0$.} \\ n, & \text{otherwise.}\end{cases}$$This gives our final value iteration equation, where (below) I repeat all the supporting equations for convenience:
$$ W_{i+1}(t,n,b,d,f,e) := \left\{ \begin{array}{ll} F_n (1  W_i(0,6,d,bY_f,e,f')) + \\ \hspace{30pt} {1 \over 6^n} \sum\limits_{r \in R_{\text{SCORE}}(n)} \max\limits_{c \in r} \left[ W_i(t',n',b,d,f,e)\right], & \text{if $t < 300$}.\\ \\ \max \bigg[ \Big(1  W_i(0,6,d,b+t,e,0)\Big), \\ \hspace{30pt} \Big( F_n (1  W_i(0,6,d,b',e,f')) + \\ \hspace{60pt} {1 \over 6^n} \sum\limits_{r \in R_{\text{SCORE}}(n)} \max\limits_{c \in r} \left[ W_i(t',n',b,d,f,e) \right] \Big) \bigg], & \text{if $t \ge 300$}. \end{array} \right. $$where
$$ \begin{align*} b' &= \max [ L, b  Y_f ] \\ f' &= (f+1) \mod 3 \\ t' &= t+C_T(c) \\ n' &= h(nC_N(c)) \\ h(n) &=\begin{cases}6, & \text{for $n = 0$.} \\ n, & \text{otherwise.}\end{cases} \\ Y_0 &= 0 \\ Y_1 &= 0 \\ Y_2 &= 500 \\ F_1 &= 2/3 \\ F_2 &= 4/9 \\ F_3 &= 5/18 \\ F_4 &= 17/108 \\ F_5 &= 25/324 \\ F_6 &= 5/216 \\ \end{align*} $$Performance
With a lower bound of $L=2500$ there are 423,765,000 game states. Each state is modeled with a double precision floating point number requiring 8 bytes, so the entire state matrix requires 3.39012 billion bytes of RAM (plus some array overhead). With game goal $g$ the number of states grows as $(gL)^3$. Lowering $L$ from 2500 to, say, 10,000 (to eliminate all reasonable doubt that you'll ever venture into portions of the calculated strategy that are nonoptimal) will increase the number of game states to over 1.7 billion and memory requirements to 14 GB (which is more than I have on any of my computers).
The value iteration software to solve the optimal farkle problem was written in C++. Obvious optimizations were made. For example, I don't actually iterate over possible rolls, but only over a precalculated set of unique score sets (which is orders of magnitude smaller in count than rolls), and weighting each score set by the number of rolls that share that same set. I let the program iterate over all game states until the maximum relative change over all states from one iteration to the next was less than 1 part in a billion. (I.e., iterations continued until no state had a change in value from one iteration to the next of more than 1 part in a billion, so that even those states with extremely small win probabilities were calculated with precision.) With the floor set to 2500 points, it took 62 iterations for the matrix to converge. Using one core of a dualcore Intel I3 4130T, the software performed 1.30 million state updates per second, each pass of the matrix taking 5 minutes 26 seconds, and convergence taking 5 hours 37 minutes.
The Strategy
It is not practical to list the win probabilities and banking rules for all of the half billion game states in two player farkle. Here I provide only a limited view into the complete strategy by means of five examples. Each of the five subsections below provide a 2 dimensional slice of the 6 dimensional strategy matrix, sufficient for optimal play of a single turn for a particular startofturn game state.
Optimal Strategy Case: $b = 0, d = 0, f = 0, e = 0$
Table 1 shows the win probabilities and banking actions needed to play an opening turn optimally. Each cell shows the win probability for different turn scores (starting at $t=0$ in the top row and increasing down the page) and/or different number of dice to roll (starting at $n=6$ in the first column and decreasing as you move to the right). For all entries in this table, the other four component state variables are fixed: your banked score (b) is fixed at 0 points, your opponent's banked score (d) at 0 points, your consecutive farkle count (f) at 0, and your opponent's consecutive farkle count (e) at zero. States shaded green are states from which your optimal banking action is to roll. States shaded red are states from which your optimal banking action is to bank. States with an asterisk are inaccessible and can be ignored (although the software still calculates your win probability in, and optimal play out of these states which would be useful if, say through a disruption in the laws of the universe, you somehow find yourself in such a state).
t  n  

6  5  4  3  2  1  
0  0.534870  0.506721^{*}  0.493622^{*}  0.487801^{*}  0.486163^{*}  0.489950^{*} 
50  0.540099^{*}  0.511005  0.495849^{*}  0.489177^{*}  0.487586^{*}  0.491718^{*} 
100  0.545359^{*}  0.515776  0.499374  0.490798^{*}  0.489034^{*}  0.493525^{*} 
150  0.550709^{*}  0.520616^{*}  0.503815  0.493135  0.490494^{*}  0.495375^{*} 
200  0.556198^{*}  0.525472^{*}  0.508564  0.497158  0.492462  0.497250^{*} 
250  0.561811^{*}  0.530484^{*}  0.513285^{*}  0.502074  0.495439  0.499128 
300  0.567448  0.535760^{*}  0.518054^{*}  0.506680  0.503290  0.503290 
350  0.573082  0.541181  0.523166^{*}  0.511296^{*}  0.509711  0.509711 
400  0.578710  0.546604  0.528579  0.516146  0.516146  0.516146 
450  0.584332  0.552028  0.533996  0.522592  0.522592  0.522592 
500  0.589966  0.557452  0.539418  0.529048  0.529048  0.529048 
550  0.595675  0.562873  0.544842  0.535513  0.535513  0.535513 
600  0.601436  0.568338  0.550268  0.541986  0.541986  0.541986 
650  0.607195  0.573937  0.555693  0.548463  0.548463  0.548463 
700  0.612941  0.579588  0.561115  0.554944  0.554944  0.554944 
750  0.618671  0.585232  0.566534  0.561427  0.561427  0.561427 
800  0.624400  0.590866  0.571948  0.567910  0.567910  0.567910 
850  0.630173  0.596489  0.577354  0.574392  0.574392  0.574392 
900  0.635987  0.602137  0.582753  0.580870  0.580870  0.580870 
950  0.641793  0.607914  0.588142  0.587343  0.587343  0.587343 
1000  0.647615  0.613782  0.593809  0.593809  0.593809  0.593809 
1050  0.653417  0.619633  0.600266  0.600266  0.600266  0.600266 
1100  0.659192  0.625466  0.606713  0.606713  0.606713  0.606713 
1150  0.664939  0.631280  0.613147  0.613147  0.613147  0.613147 
1200  0.670657  0.637072  0.619567  0.619567  0.619567  0.619567 
1250  0.676345  0.642842  0.625970  0.625970  0.625970  0.625970 
1300  0.682002  0.648587  0.632356  0.632356  0.632356  0.632356 
1350  0.687625  0.654306  0.638721  0.638721  0.638721  0.638721 
1400  0.693212  0.659998  0.645065  0.645065  0.645065  0.645065 
1450  0.698762  0.665660  0.651385  0.651385  0.651385  0.651385 
1500  0.704289  0.671292  0.657681  0.657681  0.657681  0.657681 
1550  0.709840  0.676892  0.663950  0.663950  0.663950  0.663950 
1600  0.715350  0.682458  0.670190  0.670190  0.670190  0.670190 
1650  0.720818  0.687989  0.676400  0.676400  0.676400  0.676400 
1700  0.726243  0.693483  0.682579  0.682579  0.682579  0.682579 
1750  0.731622  0.698938  0.688723  0.688723  0.688723  0.688723 
1800  0.736955  0.704354  0.694832  0.694832  0.694832  0.694832 
1850  0.742239  0.709727  0.700904  0.700904  0.700904  0.700904 
1900  0.747472  0.715058  0.706936  0.706936  0.706936  0.706936 
1950  0.752655  0.720345  0.712927  0.712927  0.712927  0.712927 
2000  0.757846  0.725587  0.718875  0.718875  0.718875  0.718875 
2050  0.763028  0.730782  0.724781  0.724781  0.724781  0.724781 
2100  0.768162  0.735929  0.730640  0.730640  0.730640  0.730640 
2150  0.773241  0.741026  0.736453  0.736453  0.736453  0.736453 
2200  0.778262  0.746072  0.742217  0.742217  0.742217  0.742217 
2250  0.783224  0.751066  0.747932  0.747932  0.747932  0.747932 
2300  0.788127  0.756006  0.753594  0.753594  0.753594  0.753594 
2350  0.792969  0.760892  0.759204  0.759204  0.759204  0.759204 
2400  0.797788  0.765722  0.764759  0.764759  0.764759  0.764759 
2450  0.802600  0.770495  0.770258  0.770258  0.770258  0.770258 
2500  0.807367  0.775700  0.775700  0.775700  0.775700  0.775700 
2550  0.812069  0.781084  0.781084  0.781084  0.781084  0.781084 
2600  0.816707  0.786408  0.786408  0.786408  0.786408  0.786408 
2650  0.821278  0.791671  0.791671  0.791671  0.791671  0.791671 
2700  0.825781  0.796871  0.796871  0.796871  0.796871  0.796871 
2750  0.830218  0.802008  0.802008  0.802008  0.802008  0.802008 
2800  0.834586  0.807081  0.807081  0.807081  0.807081  0.807081 
2850  0.838886  0.812087  0.812087  0.812087  0.812087  0.812087 
2900  0.843116  0.817025  0.817025  0.817025  0.817025  0.817025 
2950  0.847277  0.821894  0.821894  0.821894  0.821894  0.821894 
3000  0.851367  0.826696  0.826696  0.826696  0.826696  0.826696 
3050  0.855387  0.831428  0.831428  0.831428  0.831428  0.831428 
3100  0.859336  0.836090  0.836090  0.836090  0.836090  0.836090 
3150  0.863214  0.840682  0.840682  0.840682  0.840682  0.840682 
3200  0.867019  0.845202  0.845202  0.845202  0.845202  0.845202 
3250  0.870753  0.849650  0.849650  0.849650  0.849650  0.849650 
3300  0.874415  0.854025  0.854025  0.854025  0.854025  0.854025 
3350  0.878005  0.858327  0.858327  0.858327  0.858327  0.858327 
3400  0.881522  0.862555  0.862555  0.862555  0.862555  0.862555 
3450  0.884967  0.866709  0.866709  0.866709  0.866709  0.866709 
3500  0.888343  0.870788  0.870788  0.870788  0.870788  0.870788 
3550  0.891652  0.874791  0.874791  0.874791  0.874791  0.874791 
3600  0.894888  0.878719  0.878719  0.878719  0.878719  0.878719 
3650  0.898052  0.882571  0.882571  0.882571  0.882571  0.882571 
3700  0.901145  0.886347  0.886347  0.886347  0.886347  0.886347 
3750  0.904167  0.890047  0.890047  0.890047  0.890047  0.890047 
3800  0.907118  0.893671  0.893671  0.893671  0.893671  0.893671 
3850  0.909999  0.897219  0.897219  0.897219  0.897219  0.897219 
3900  0.912809  0.900691  0.900691  0.900691  0.900691  0.900691 
3950  0.915548  0.904088  0.904088  0.904088  0.904088  0.904088 
4000  0.918219  0.907409  0.907409  0.907409  0.907409  0.907409 
4050  0.920820  0.910655  0.910655  0.910655  0.910655  0.910655 
4100  0.923353  0.913826  0.913826  0.913826  0.913826  0.913826 
4150  0.925819  0.916920  0.916920  0.916920  0.916920  0.916920 
4200  0.928219  0.919939  0.919939  0.919939  0.919939  0.919939 
4250  0.930560  0.922884  0.922884  0.922884  0.922884  0.922884 
4300  0.932847  0.925756  0.925756  0.925756  0.925756  0.925756 
4350  0.935068  0.928555  0.928555  0.928555  0.928555  0.928555 
4400  0.937226  0.931282  0.931282  0.931282  0.931282  0.931282 
4450  0.939321  0.933937  0.933937  0.933937  0.933937  0.933937 
4500  0.941354  0.936521  0.936521  0.936521  0.936521  0.936521 
4550  0.943326  0.939035  0.939035  0.939035  0.939035  0.939035 
4600  0.945236  0.941479  0.941479  0.941479  0.941479  0.941479 
4650  0.947087  0.943854  0.943854  0.943854  0.943854  0.943854 
4700  0.948879  0.946160  0.946160  0.946160  0.946160  0.946160 
4750  0.950612  0.948400  0.948400  0.948400  0.948400  0.948400 
4800  0.952287  0.950573  0.950573  0.950573  0.950573  0.950573 
4850  0.953905  0.952679  0.952679  0.952679  0.952679  0.952679 
4900  0.955468  0.954721  0.954721  0.954721  0.954721  0.954721 
4950  0.956977  0.956699  0.956699  0.956699  0.956699  0.956699 
5000  0.958614  0.958614  0.958614  0.958614  0.958614  0.958614 
The game begins in the upper left cell at $t=0$ and $n=6$. The win probability listed here means that if both players play optimally, the player going first wins 53.487% of the time. To play a turn, first you must choose your banking action. Since cell (t=0, n=6) is green, the optimal banking action is to roll.
Second, you roll the dice. Let's say for your opening roll, you throw:
$$6, 5, 3, 3, 3, 2$$Third, you must choose how to score the dice. Here you have three choices:
 take the 5 for 50 points,
 take the three 3 for 300 points, or
 take the 5 and the three 3 for 350 points.
To determine your optimal scoring choice, you must look at the win probabilities from the three states corresponding to each scoring option:
$$ \begin{align*} W(t=50, n=5) &= 0.511005\\ W(t=300, n=3) &= 0.506680\\ W(t=350, n=2) &= 0.509711 \end{align*} $$So in this case, you should score only the 5 since it moves you to the state with the best probability of ultimately winning the game. This ends one primaryaction, randomresponse, secondaryaction sequence effecting one state change in the EMDP. This process is then simply repeated (starting with another banking decision from the new state (t=50, n=5) until you either roll a farkle, or end in a state where you're supposed to bank.
It is interesting to see how the optimal play strategy for the opening turn differs from the strategy that simply maximizes your expected farkle turn score. To maximize expected score you wouldn't bank with 6 dice available to roll unless you had 16,400 or more points on the turn. In contrast, to optimize your chances of winning a game, you would bank 5000 or more points on your opening turn even when you have six dice available to roll. The banking threshold for 5 dice is similarly reduced from 3050 to a more conservative 2500. The banking threshold for other available dice counts do not differ.
Note that if you manage to bank 1000 points on your opening turn, your chances of winning increase to almost 60%; banking 1850 points on your first turn leaves you with better than 70% chance of ultimately winning; bank 2750 points and your chances of winning are above 80%; and manage to put up 3900 points on the opening turn and you're win probability is over 90%.
Optimal Strategy Case: $b = 6000, d = 8000, f = 0, e = 0$
As a second example, consider the case of Table 2 which shows how to optimally play a turn where you start with 6000 points and are 2000 points behind to your opponent who has 8000 points.
t  n  

6  5  4  3  2  1  
0  0.162365  0.135452^{*}  0.124658^{*}  0.118916^{*}  0.116823^{*}  0.120172^{*} 
50  0.167053^{*}  0.138098  0.126521^{*}  0.120462^{*}  0.118340^{*}  0.121980^{*} 
100  0.172282^{*}  0.141184  0.128485  0.122068^{*}  0.119912^{*}  0.123894^{*} 
150  0.177898^{*}  0.144859^{*}  0.130513  0.123738  0.121540^{*}  0.125881^{*} 
200  0.183766^{*}  0.149237^{*}  0.133183  0.125473  0.123237  0.127935^{*} 
250  0.189820^{*}  0.153963^{*}  0.136653^{*}  0.127264  0.125005  0.130069 
300  0.196092  0.158915^{*}  0.140895^{*}  0.130017  0.126838  0.132298 
350  0.202623  0.164030  0.145391^{*}  0.133939^{*}  0.128728  0.134616 
400  0.209467  0.169363  0.150024  0.138163  0.133991  0.137010 
450  0.216533  0.174975  0.154798  0.142503  0.139457  0.139475 
500  0.223831  0.180897  0.159828  0.146975  0.145115  0.145115 
550  0.231322  0.187028  0.165248  0.151603  0.150876  0.150876 
600  0.239035  0.193362  0.171012  0.156843  0.156843  0.156843 
650  0.246918  0.199976  0.176975  0.163015  0.163015  0.163015 
700  0.255037  0.206817  0.183154  0.169453  0.169453  0.169453 
750  0.263357  0.213830  0.189533  0.176109  0.176109  0.176109 
800  0.271945  0.221070  0.196123  0.183059  0.183059  0.183059 
850  0.280741  0.228460  0.202874  0.190242  0.190242  0.190242 
900  0.289790  0.236097  0.209824  0.197712  0.197712  0.197712 
950  0.298954  0.243918  0.216938  0.205281  0.205281  0.205281 
1000  0.308358  0.251943  0.224242  0.213245  0.213245  0.213245 
1050  0.317895  0.260087  0.231740  0.221146  0.221146  0.221146 
1100  0.327785  0.268509  0.239462  0.229404  0.229404  0.229404 
1150  0.337937  0.277274  0.247380  0.237923  0.237923  0.237923 
1200  0.348393  0.286386  0.255529  0.246764  0.246764  0.246764 
1250  0.358985  0.295733  0.263897  0.255751  0.255751  0.255751 
1300  0.369743  0.305337  0.272544  0.264989  0.264989  0.264989 
1350  0.380541  0.315057  0.281403  0.274404  0.274404  0.274404 
1400  0.391642  0.324961  0.290466  0.284353  0.284353  0.284353 
1450  0.402933  0.334964  0.299633  0.294503  0.294503  0.294503 
1500  0.414660  0.345257  0.308958  0.305091  0.305091  0.305091 
1550  0.426484  0.355749  0.318423  0.315621  0.315621  0.315621 
1600  0.438586  0.366590  0.328154  0.326360  0.326360  0.326360 
1650  0.450706  0.377696  0.338163  0.337112  0.337112  0.337112 
1700  0.463015  0.389278  0.348576  0.348173  0.348173  0.348173 
1750  0.475329  0.401117  0.359573  0.359573  0.359573  0.359573 
1800  0.488007  0.413244  0.371759  0.371759  0.371759  0.371759 
1850  0.500848  0.425457  0.384134  0.384134  0.384134  0.384134 
1900  0.514151  0.437907  0.397273  0.397273  0.397273  0.397273 
1950  0.527546  0.450291  0.410346  0.410346  0.410346  0.410346 
2000  0.541169  0.463032  0.423910  0.423910  0.423910  0.423910 
2050  0.553892  0.475650  0.435848  0.435848  0.435848  0.435848 
2100  0.566687  0.488532  0.448671  0.448671  0.448671  0.448671 
2150  0.579544  0.501372  0.462112  0.462112  0.462112  0.462112 
2200  0.593075  0.514412  0.476647  0.476647  0.476647  0.476647 
2250  0.606845  0.527568  0.490860  0.490860  0.490860  0.490860 
2300  0.620874  0.541235  0.505044  0.505044  0.505044  0.505044 
2350  0.634230  0.554962  0.518525  0.518525  0.518525  0.518525 
2400  0.647380  0.568960  0.532648  0.532648  0.532648  0.532648 
2450  0.659830  0.582404  0.547160  0.547160  0.547160  0.547160 
2500  0.672922  0.595574  0.563571  0.563571  0.563571  0.563571 
2550  0.685577  0.608065  0.579471  0.579471  0.579471  0.579471 
2600  0.699046  0.620640  0.594828  0.594828  0.594828  0.594828 
2650  0.712677  0.633358  0.608315  0.608315  0.608315  0.608315 
2700  0.727273  0.647372  0.621304  0.621304  0.621304  0.621304 
2750  0.740566  0.661998  0.632781  0.632781  0.632781  0.632781 
2800  0.752872  0.677539  0.646547  0.646547  0.646547  0.646547 
2850  0.763740  0.691731  0.661928  0.661928  0.661928  0.661928 
2900  0.775267  0.704645  0.683575  0.683575  0.683575  0.683575 
2950  0.787428  0.716595  0.704022  0.704022  0.704022  0.704022 
3000  0.801293  0.729617  0.722026  0.722026  0.722026  0.722026 
3050  0.812468  0.740574  0.732915  0.732915  0.732915  0.732915 
3100  0.824009  0.752452  0.744321  0.744321  0.744321  0.744321 
3150  0.834312  0.764049  0.754637  0.754637  0.754637  0.754637 
3200  0.844465  0.775780  0.768552  0.768552  0.768552  0.768552 
3250  0.854392  0.785990  0.782612  0.782612  0.782612  0.782612 
3300  0.865209  0.800609  0.800609  0.800609  0.800609  0.800609 
3350  0.875145  0.811712  0.811712  0.811712  0.811712  0.811712 
3400  0.887417  0.826357  0.826357  0.826357  0.826357  0.826357 
3450  0.896088  0.833481  0.833481  0.833481  0.833481  0.833481 
3500  0.907011  0.847263  0.847263  0.847263  0.847263  0.847263 
3550  0.912996  0.864340  0.864340  0.864340  0.864340  0.864340 
3600  0.918091  0.883970  0.883970  0.883970  0.883970  0.883970 
3650  0.920604  0.894832  0.894832  0.894832  0.894832  0.894832 
3700  0.925351  0.903017  0.903017  0.903017  0.903017  0.903017 
3750  0.933956  0.903045  0.903045  0.903045  0.903045  0.903045 
3800  0.950436  0.903078  0.903078  0.903078  0.903078  0.903078 
3850  0.962853  0.903098  0.903098  0.903098  0.903098  0.903098 
3900  0.973675  0.917497  0.903124  0.903124  0.903124  0.903124 
3950  0.979061  0.930203  0.903136  0.903136  0.903136  0.903136 
Given your 2000 point defecit, it is interesting to see how much more agressively this turn should be played compared to an opening turn: the banking thresholds here for 6, 5, 4, 3, 2 and 1 dice are, respectively, 4000, 3350, 1750, 600, 400, and 500 points. Compare that to the banking thresholds for an opening turn at 5000, 2500, 1000, 350, 300, and 300 points.
Another interesting observation about this strategy is that although you should bank between 3300 and 3850 points when you have 5 dice to roll, you should instead roll if you have 3900 to 3950 points (because you have such a high probability of closing out the game).
Optimal Strategy Case: $b = 8000, d = 6000, f = 0, e = 0$
Now consider the reverse of the situation in the previous example. Table 3 shows how to optimally play a turn where you start with 8000 points and are 2000 points ahead of your opponent who has 6000 points.
t  n  

6  5  4  3  2  1  
0  0.903422  0.878090^{*}  0.862982^{*}  0.857184^{*}  0.856210^{*}  0.860061^{*} 
50  0.907970^{*}  0.883520  0.866464^{*}  0.858208^{*}  0.857345^{*}  0.861539^{*} 
100  0.912448^{*}  0.888520  0.872004  0.860637^{*}  0.858466^{*}  0.863007^{*} 
150  0.916837^{*}  0.893230^{*}  0.877734  0.864994  0.859560^{*}  0.864460^{*} 
200  0.921258^{*}  0.898001^{*}  0.882899  0.871213  0.863650  0.865889^{*} 
250  0.925550^{*}  0.902725^{*}  0.887727^{*}  0.877180  0.868829  0.867273 
300  0.929835  0.907504^{*}  0.892777^{*}  0.882276  0.882276  0.882276 
350  0.933851  0.912127  0.897734^{*}  0.888668^{*}  0.888668  0.888668 
400  0.937676  0.916796  0.902726  0.894905  0.894905  0.894905 
450  0.941287  0.921250  0.907616  0.900824  0.900824  0.900824 
500  0.944922  0.925413  0.912287  0.907268  0.907268  0.907268 
550  0.948355  0.929370  0.916532  0.913595  0.913595  0.913595 
600  0.951784  0.933391  0.920430  0.919922  0.919922  0.919922 
650  0.955073  0.937177  0.925407  0.925407  0.925407  0.925407 
700  0.958290  0.940961  0.930450  0.930450  0.930450  0.930450 
750  0.961368  0.944716  0.934554  0.934554  0.934554  0.934554 
800  0.964239  0.948516  0.938914  0.938914  0.938914  0.938914 
850  0.966594  0.952008  0.943151  0.943151  0.943151  0.943151 
900  0.968986  0.955211  0.948312  0.948312  0.948312  0.948312 
950  0.971335  0.958000  0.953239  0.953239  0.953239  0.953239 
1000  0.973780  0.960716  0.958124  0.958124  0.958124  0.958124 
1050  0.975945  0.963165  0.961202  0.961202  0.961202  0.961202 
1100  0.978076  0.965682  0.964291  0.964291  0.964291  0.964291 
1150  0.979977  0.968029  0.966812  0.966812  0.966812  0.966812 
1200  0.981763  0.970175  0.970114  0.970114  0.970114  0.970114 
1250  0.983328  0.973408  0.973408  0.973408  0.973408  0.973408 
1300  0.984740  0.977098  0.977098  0.977098  0.977098  0.977098 
1350  0.986091  0.979125  0.979125  0.979125  0.979125  0.979125 
1400  0.987692  0.981344  0.981344  0.981344  0.981344  0.981344 
1450  0.988993  0.982226  0.982226  0.982226  0.982226  0.982226 
1500  0.990373  0.984029  0.984029  0.984029  0.984029  0.984029 
1550  0.991218  0.985999  0.985999  0.985999  0.985999  0.985999 
1600  0.991737  0.989914  0.989914  0.989914  0.989914  0.989914 
1650  0.992073  0.992073  0.992073  0.992073  0.992073  0.992073 
1700  0.992947  0.992947  0.992947  0.992947  0.992947  0.992947 
1750  0.992964  0.992964  0.992964  0.992964  0.992964  0.992964 
1800  0.993911  0.992981  0.992981  0.992981  0.992981  0.992981 
1850  0.994722  0.992990  0.992990  0.992990  0.992990  0.992990 
1900  0.995640  0.992998  0.992998  0.992998  0.992998  0.992998 
1950  0.996180  0.993000  0.993000  0.993000  0.993000  0.993000 
In this example, you have a 2000 point lead and are yourself only 2000 points from winning the game. It is interesting to see how much more conservatively this turn should be played compared to an opening turn: the banking thresholds here for the different number of dice to throw are 1650, 1250, 650, 300, 300, and 300 points. Compare that to the banking thresholds for an opening turn at 5000, 2500, 1000, 350, 300, and 300 points.
As in the previous example, there is another banking rule inversion in this table, but this time for the case of 6 dice, where at point values above 1750 it again becomes advantageous to roll to try to end the game.
Optimal Strategy Case: $b = 9000, d = 9500, f = 0, e = 0$
As a fourth example, consider the case of Table 4 which shows how to optimally play a turn where you start with 9000 points and are 500 points behind your opponent who has 9500 points.
t  n  

6  5  4  3  2  1  
0  0.454366  0.351125^{*}  0.303308^{*}  0.277271^{*}  0.270069^{*}  0.285659^{*} 
50  0.463700^{*}  0.358576  0.309820^{*}  0.283131^{*}  0.274544^{*}  0.289393^{*} 
100  0.475303^{*}  0.366847  0.316369  0.289394^{*}  0.280312^{*}  0.294362^{*} 
150  0.486102^{*}  0.376478^{*}  0.323286  0.295374  0.286679^{*}  0.300901^{*} 
200  0.505119^{*}  0.388392^{*}  0.331770  0.301490  0.292906  0.309194^{*} 
250  0.525335^{*}  0.399404^{*}  0.342403^{*}  0.308335  0.298948  0.317272 
300  0.554879  0.417884^{*}  0.354687^{*}  0.317757  0.305197  0.325207 
350  0.573800  0.432225  0.366843^{*}  0.329936^{*}  0.313516  0.332809 
400  0.602493  0.454655  0.382838  0.343746  0.325562  0.341277 
450  0.619405  0.462471  0.391661  0.356792  0.340660  0.354200 
500  0.653307  0.481158  0.404518  0.367897  0.356345  0.372251 
550  0.696939  0.489452  0.408392  0.376025  0.370389  0.393025 
600  0.761617  0.519126  0.416628  0.381381  0.381505  0.412584 
650  0.821580  0.577904  0.419050  0.384455  0.389121  0.429136 
700  0.878974  0.671873  0.455551  0.391137  0.393701  0.441170 
750  0.920889  0.759474  0.538352  0.391141  0.396326  0.448721 
800  0.951179  0.838237  0.658773  0.453755  0.398829  0.452947 
850  0.966192  0.886061  0.752905  0.563689  0.401559  0.455471 
900  0.976536  0.921141  0.831614  0.696407  0.522215  0.459381 
950  0.981336  0.937788  0.873088  0.776038  0.641661  0.462492 
I find it interesting that the only time you should bank during such a turn is if you have exactly 3 dice to roll and either 700 or 750 points on the turn. Why does it make sense to roll just 1 or 2 dice with those same number of points? And why does it make sense to roll three dice when you have 800 or more points?
To answer the first question, observe that it's easier to ultimately hotdice (score all dice thrown) when you are rolling only 1 or 2 dice than when you are rolling 3. If you can manage to get back to 6 dice, your chances of ending the game this turn increase dramatically. Furthermore, any additional game points you gain by rolling 3 dice are meaningless unless you hotdice. If only 1 or 2 of the 3 thrown dice score, you will still find yourself below the game goal and (if you then decide to bank these few extra points) will not have even reduced the number of points you have to accumulate on your subsequent turn due to the minimum banking threshold.
To answer the second question, note that it's possible to reach 10,000 points without hotdice: in particular there's a 7% chance of rolling two ones and a third nonscoring die. These rolls end the game if you have 9800 points.
Optimal Strategy Case: $b = 9500, d = 9000, f = 0, e = 0$
Now consider the reverse of the situation in the previous example. Table 3 shows how to optimally play a turn where you start with 9500 points and are 500 points ahead of your opponent who has 9000 points.
t  n  

6  5  4  3  2  1  
0  0.801016  0.702303^{*}  0.658594^{*}  0.637598^{*}  0.630975^{*}  0.640081^{*} 
50  0.826173^{*}  0.706846  0.660815^{*}  0.642258^{*}  0.639026^{*}  0.652003^{*} 
100  0.863321^{*}  0.724040  0.665063  0.645329^{*}  0.645400^{*}  0.663218^{*} 
150  0.897707^{*}  0.757952^{*}  0.666217  0.647091  0.649767^{*}  0.672708^{*} 
200  0.930612^{*}  0.811876^{*}  0.687617  0.648362  0.652392  0.679607^{*} 
250  0.954643^{*}  0.862100^{*}  0.735325^{*}  0.649657  0.653897  0.683936 
300  0.972009  0.907257^{*}  0.804365^{*}  0.686823  0.655332  0.686359 
350  0.980617  0.934675  0.858333^{*}  0.749851^{*}  0.656897  0.687806 
400  0.986548  0.954788  0.903460  0.825942  0.726073  0.690048 
450  0.989300  0.964332  0.927238  0.871597  0.794555  0.691832 
Here I found it surprising that even though you have a 500 point lead and control of the dice, the optimal strategy requires that you never bank until you've reached the gamewinning 500 points. This is surprisingly more aggressive play than on an opening turn where the score is tied. I would have guessed that banking 300 points when faced with rolling 1 or 2 dice would have been better to avoid the highprobability farkle, then allow your opponent a chance to force his way to a 1000 point turn, and assuming he fails, follow up with a highprobability minimum bank turn to win. It turns out if your opponent plays perfectly, he has about a 1/3 chance of reaching 1000 points to steal the win, and so playing a little more aggressively to attempt to end the game and deny him that possible steal is advantageous.
Also note how for many point totals, you are better off with fewer dice. This is often the case when playing turns where your best play is to try to reach 10,000 points and end the game, and also in cases where you are just really far behind an opponent that is fast approaching a game winning score.
Strategy Validation
In this section I share the efforts I've made to sanity check the calculated strategies for reasonableness, and to determine the deviation between the calculated strategy and the truly optimal strategy. This section definitely gets into the weeds and is not for the faint of heart, but may be of interest to skeptics or the particularly nerdy (of which I am both).
Comparison to strategy that maximizes expected turn score
Strategies that maximize expected turn score for farkle variants have been independently calculated with consistent results^{4,5,6}. These strategies do not consider your current banked score, your opponent's banked score or farkle count, or the end of game scoring goal. Still, one would expect that for an opening turn (where the score is tied at zero and where the endofgame scoring goal is distant) that the game winning strategy documented here and the strategy that maximizes expected turn score would be similar for low turn score states. Furthermore, the two strategies should converge if the game scoring goal is increased from 10,000 points towards infinity. Unfortunately, the 10,000 point goal was already stretching both the memory limits and processing power of my computer. To reduce the size of the state space, I eliminated the three consecutive farkle penalty. This effectively eliminates the f and e state variables, reducing the size of the state space by a factor of 9, and also eliminates the need to model banked score states of less than 0 points, which shrinks things even more.
My Farkle Strategy Generator (FSG)^{6} produces farkle strategies that maximize expected turn score. The FSG outputs strategy tables that prioritize each turn state with a sequential preference number. Those states associated with higher expected turn scores have a higher preference number. To follow the strategy, you score thrown dice to move to the state with the highest number. This is exactly the same process you use to follow the game winning strategy presented in this document: you score your dice to move to the state with the highest probability of winning. By indexing the states from lowest win probability to highest, the two strategies can easily be compared.
First compare the two strategies for the case of a 10,000 point goal. Table 6W shows the strategy that maximizes your chances of winning the game (produced by value iteration) for an opening turn. Table 6E shows the strategy that maximizes expected turn score (produced by the FSG). Turn scores in both cases are truncated to a maximum of 1500. States with different preference numbers are highlighted. There are numerous slight differences, but they are obviously very similar: the banking thresholds are almost identical, and the state preference numbers deviate only slightly.


Tables 7W and 7E compare these same two strategies but with the scoring goal increased to 30,000 points. Here the two strategies are seen to be almost identical for these low turn score states. The two strategies do appear to be converging as the scoring goal increases. While this doesn't assure strategy correctness, it offers convincing evidence of correctness in at least this limiting case.


Effects of the banked score lower bound
The calculated strategy is nonoptimal when playing in regions of the state space where either your banked score, or your opponent's banked score is near the lower bound. This discrepancy between the calculated strategy and the truly optimal strategy is further exacerbated as the consecutive farkle count associated with a banked score near the bound increases.
To estimate the extent of the discrepancy, I lowered the bound on banked scores from 2500 to 3000 and regenerated the strategy matrix. One expects the relative deviation between the win probabilities of corresponding states in the two strategies to decrease as the minimum of the player's banked scores increases. To see the rate of convergence, for each possible banking score $B$ ($2500 <= B < 10,000$), I found the maximum relative deviation between win probabilities (calculated as ${{2w1w2}\over{w1+w2}}$) over all game states satisfying $min(b,d) >= B$. Table 8 lists selected results. Note that each time B increases by 500 points (which just happens to be the triple farkle penalty), the maximum relative deviation consistently decreased by 2 orders of magnitude (with one of those two orders of magnitude coming with the last 50 point increase). Given this exponential convergence rate, for any given $B$, the maximum relative deviation between the strategies for $L=2500$ and $L=\infty$ must be only slightly greater (on the order of 1%) than the maximum relative deviation between the strategies for $L=2500$ and $L=3000$.
B  Maximum Relative Deviation 

2500  0.296463834 
2050  0.028572943 
2000  0.003438359 
1550  0.000249319 
1500  0.000029516 
1050  0.000002074 
1000  0.000000234 
550  0.000000016 
500  0.000000001 
Note that the state win probabilities for two calculated strategies do not have to be identical for the rules about when to bank, and when to roll, and how to score the dice in each situation to be to be identical. So how close do the calculated win probabilities and the optimal win probabilities need to be to make it likely we are playing with a truly optimal strategy? Using state preference charts like those shown in the previous section, I compared complete turn strategies for all combinations of state variables $b$, $d$, $f$, and $e$. If two state preference charts are identical, then the strategies are identical. Once I found a bizarre state preference discrepancy associated with a relative difference in win probability on the same order as the convergence threshold for the entire state matrix. I ignored this and any others like it that might be lurking in the tables. Table 9 shows the banked score and consecutive farkle count thresholds above which no more strategy discrepancies otherwise appeared. So for example if both you and your opponent have at most 1 consecutive farkle, and you both have banked scores of at least 1500 points, then there are no discrepancies between the two strategies.
So although there's no way to pick a lower bound $L$ that will assure all turn strategies will be truly optimal for a given working range of banked scores and consecutive farkle counts, driving the maximum relative deviation down to within about 1 part in 10^{8} seems to assure that the vast majority of turn strategies (and likely all of them) are optimal. Given the convergence rate seen above, this means that as long as both players are about 2000 points above the lower bound on banked scores, all turn strategies are likely to be optimal.
Minimum Banked Score  Maximum Farkle Count  Maximum Relative Deviation 

1500  0  0.000001175 
1200  1  0.000001529 
700  2  0.000000070 
Conclusions
Extended forms of a Markov Decision Process (MDP) and value iteration were devised that allow more efficient modeling of 2player Farkle than is possible with a standard MDP. Using this model, the strategy that maximizes win probability using Facebook scoring rules has been determined under the constraint that banked scores are lower bounded to a large negative score. The strategy is shown to be unaffected by this bound in regions where banked scores are at least 2000 points above the bound. The strategy appears to converge (as expected) to the strategy that maximizes expected turn score as the endofgame scoring goal is increased.
With nearly a half billion game states, and so many different dice rolls and scoring options possible from each of these different states, it wasn't immediately obvious whether the problem was solvable with a home computer. I was quite pleased when I saw the first value iteration update complete, and even more pleased when I saw the matrix was actually converging! It was a nice problem in the sense that it's complexity puts finding the solution at the limits of what can be done with a home computer (or at least at the limits of what I can do with a home computer).
The optimal strategy showed many expected behaviors including more aggressive play when behind, and more conservative play when ahead. But it also showed many behaviors that were unexpected to me, such as playing aggressively even with a sizable lead when nearing the game goal of 10,000 points. Numerous times I was convinced I had discovered a flaw in the strategy (and therefore either a bug in my code or an error in my analysis) only to realize later that the strategy was actually reasonable.
Next Steps
The very limited view I've offered into the game winning strategy is unsatisfying. Time permitting, I intend to develop a web page that will allow you to explore the gargantuan optimal Farkle strategy in detail, and perhaps even play a game of farkle against a perfect computer opponent. I am motivated by your interest, so please leave a comment if you'd like to see it.
A Facebook Farkle game is won by the first player to bank 10,000 or more points, but there are other variations to endofgame rules. In Zilch, for example, the other player always gets a final go at the dice. Some people like to assure that both players get the same number of turns. In this latter case, the optimal strategies played by the two players are not even the same — with the player going second clearly having a strategic advantage. I think any of these could be modeled, but they are each very different, and not currently supported by my solver. These are ickier problems, and I don't intend to work on them anytime soon.
References
 T Neller and C Presser, "Optimal Play of the Dice Game Pig," The UMAP Journal 25(1) (2004), pp. 2547.
 R. E. Bellman. Dynamic Programming. Princeton University Press, Princeton, NJ, 1957.
 Cap Khoury, Multinomial Coefficients and Farkle, Jul. 2009.
 M Busche, Maximizing Expected Scores in the Game of Zilch, Matt's Maniacal Musings, Aug. 2010.
 E Farmer, Analysis of Farkle, Possibly Wrong, Apr. 2013.
 Farkle Strategy Generator, Matt's Maniacal Musings.
I would love to see a website with all the statistics available!!
I love delving into these kind of games! ðŸ˜€
One of the problems is that everyone plays with slightly different rules. It takes several hours of computation time to solve for a single rule set, and with a half billion game states, each solution would eat up 4 gigs of my server’s hard disk too. I guess I could dedicate a couple of terrabyte hard drives to storing (and backing up) solutions to various game rules.
Does the 53.487 percent to win for the first player take into account the second player’s rebuttal after the first passes 10,000 points?
No. The first player to bank 10,000 or more points wins in this version of the game (consistent with Facebook Farkle game rules).
It would be fairly straightforward, however, to analyze the game rule variant you describe. It’s something I’ve thought about doing for some time. The best way, I think, is to start by precalculating the chance of a “rebuttal” succeeding for all possible point deficits that must be overcome in a single (final) turn. If, for example, the first player banks 10,200 points whilst the second player had 9,000 points, then you are interested in the probability of the second player (given a final turn) to successfully walk the safest path to 1,250 or more points in a single turn. (Incidentally, my FSG can be configured to do this — I describe how in one of my other comments on this website somewhere.) Do this for all possible deficits from 100, to something absurdly large (although a few thousand points would probably be sufficient as it is surely suboptimal to continue rolling with 15,000 points, so higher point game states would never be explored by an optimal player).
Then the win probability matrix would then be extended to include game states at and above 10,000 points for the active player (i.e., the player whose turn it is). And the win probability associated with banking in these states would be fixed with the rebuttal probabilities we just calculated. The win probability for these super10,000 point game states could be calculated in a single downward pass (since the win probabilities for banking from each them are fixed and known).
So, anyway, I think this would be fairly straightforward to do; and would require computation time that is negligible compared to what’s already being done. It’s a flexibility I’ve thought about adding to my Farkle analysis tools, but haven’t, as yet, had sufficient motivation to pursue it.
Just a heads up to anyone else who’s confused by this:
There’s an error below “Optimal Strategy Case: b=0,d=0,f=0,e=0”, where the win probabilities are listed as:
W(t=50,n=5)W(t=300,n=5)W(t=350,n=5)=0.511005=0.506680=0.509711
I felt really stupid when I couldn’t figure out how n could be 5 for all three actions, but I looked up the numbers in the table and it’s simply supposed to be:
W(t=50,n=5)W(t=300,n=3)W(t=350,n=2)=0.511005=0.506680=0.509711
Hey Kasi,
Good job finding this cutnpaste error. I’ve corrected the document.
So in conclusion, I do what?
If what you’re asking is, “how the devil am I supposed to use all this to improve my farkle game play?”, then you’re right: although this article may be technically interesting to some, it is not terribly accessible if you’re interest is improving your farkle play.
Todd Neller and I wrote about a technique where you can closely approximate a maximum scoring strategy. For the rule set we used, the technique actually perfectly reproduces the maximum scoring strategy. Depending on the scoring rules, I suspect our technique may sometimes introduce small deviations from the true maximum scoring strategy, but these will certainly be rare, and can only introduce small reductions in expected score when they occur.
See section 4 of our article on optimal farkle play. Skip down to the paragraph that starts, “We have devised a means by which a human can play the maximum scoring
strategy perfectly…” That table V(n,t) provided is only good for the simple (and rarely used) rules described in our paper. But you can use my farkle strategy generator to generate maxscorestrategies for your exact rule set. Then click on the button E(S,N). The table revealed is the same table as V(n,t) described in our paper, except that is tuned for your rule set. To generate the estimation table, simply round each entry UP to the next 50point score point. So for example if E(S,N) is 139.951, round it up to 150. Replace all banking states with the value 0. Then as you scan down each column, write down the scores where the value changes from one 50 point value to the next. These are the break points you have to memorize. Put this information in the same layout as described in our paper, memorize that table and follow the instructions we give there.
This will easily let you play the maxscore strategy (or at least something very close to it). In our paper, we also describe goforit regions where you or you’re opponent are so close to going out, that it pays to just keep rolling until you’ve won or you’ve farkled. I currently do not provide a tool to identify these regions for you. I could possibly work on this if there’s significant interest, but my time is currently very limited.
Absolutely wonderful! I’m really happy to read this and to see the page with the strategy generator. Would love to see more things on Farkle. It’s quite fascinating. Thank you for this.