$26.99

## Description

1 Introduction

In this homework you will write a game and a game player using the techniques you have been learning in class. There are two major parts: writing the representation of the game, and writing a parallel version of the popular alpha-beta pruning algorithm. This will also continue to test your ability to use and understand the module system.

1.1 Getting The Homework Assignment

The starter files for the homework assignment have been distributed through our git repos- itory as usual.

1.2 Submitting The Homework Assignment

Submissions will be handled through Autolab, at

https://autolab.cs.cmu.edu

To submit your solutions, run make from the hw/11 directory (which should contain a code folder). This should produce a file hw11.tar, containing the files that should be handed in for this homework assignment. Open the Autolab web site, find the page for this assignment, and submit your hw11.tar file via the “Handin your work” link.

The Autolab handin script does some basic checks on your submission: making sure that the file names are correct; making sure that no files are missing; making sure that your PDF is valid; making sure that your code compiles cleanly. Note that the handin script is *not *a grading script—a timely submission that passes the handin script will be graded, but will not necessarily receive full credit. You can view the results of the handin script by clicking the number (usually either 0.0 or 1.0) corresponding to the “check” section of your latest handin on the “Handin History” page. If this number is 0.0, your submission failed the check script; if it is 1.0, it passed.

Your graphseq.sml, riskless.sml, alphabeta.sml, and jamboree.sml files must con- tain all the code that you want to have graded for this assignment, and must compile cleanly. If you have a function that happens to be named the same as one of the required functions but does not have the required type, it will not be graded.

1.3 Due Date

1.4 Methodology

You must use the five step methodology discussed in class for writing functions, for every

function you write in this assignment. Recall the five step methodology:

- 1. In the first line of comments, write the name and type of the function.

- 2. In the second line of comments, specify via a REQUIRES clause any assumptions about the arguments passed to the function.

- 3. In the third line of comments, specify via an ENSURES clause what the function com- putes (what it returns).

- 4. Implement the function.

- 5. Provide testcases, generally in the format

val <return value> = <function> <argument value>

For example, for the factorial function presented in lecture:

(* fact : int -> int

* REQUIRES: n >= 0

* ENSURES: fact(n) ==> n!

*)

fun fact (0 : int) : int = 1

| fact (n : int) : int = n * fact(n-1) (* Tests: *)

val 1 = fact 0

val 720 = fact 6

2 Views

2.1 Introduction to Views

As we have discussed many times, lists operations have bad parallel complexity, but the corresponding sequence operations are much better.

However, sometimes you want to write a sequential algorithm (e.g. because the inputs

aren’t very big, or because no good parallel algorithms are known for the problem). Given the sequence interface so far, it is difficult to decompose a sequence as “either empty, or a cons with a head and a tail.” To implement this using the sequence operations we have provided, you have to write code that would lose style points, such as:

case Seq.length s of

0 =>

| _ => … uses (Seq.hd s) and (Seq.tl s) …

We can solve this problem using a *view*. This means we put an appropriate datatype in the signature, along with functions converting sequences to and from this datatype. This allows us to pattern-match on an abstract type, while keeping the actual representation abstract. For this assignment, we have extended the SEQUENCE signature with the following components to enable viewing a sequence like a list:

datatype ’a lview = Nil | Cons of ’a * ’a seq val showl : ’a seq -> ’a lview

val hidel : ’a lview -> ’a seq

(* invariant: showl (hidel v) ==> v *)

Because the datatype definition is in the signature, the constructors Nil and Cons can be used outside the abstraction boundary. The showl and hidel functions convert between sequences and list views. The following is an example of using this view to perform list-like pattern matching:

case Seq.showl s of

Seq.Nil => … (* Nil case *)

| Seq.Cons (x, s’) => … uses x and s’ … (* Cons case *)

Note that the second argument to Cons is another ’a seq, *not *an lview. Thus, showl lets you do one level of pattern matching at a time: you can write patterns like Seq.Cons(x,xs) but not Seq.Cons(x,Seq.Nil) (to match a sequence with exactly one element). We have also provided hidel, which converts a view back to a sequence—Seq.hidel (Seq.Cons(x,xs)) is equivalent to Seq.cons(x,xs) and Seq.hidel Seq.Nil is equivalent to Seq.empty().

Task 2.1 (2%). In graphseq.sml, write a function:

to list : ’a Seq.seq -> ’a list

that uses listviews to convert a sequence into a list. Your algorithm should run in O(n) work and span.

3 Graphs

3.1 Introduction to Graphs

Graphs are a powerful data structure used to represent relationships between pairs of items. Many real-world problems can be reduced to graphs; social networks, links between docu- ments, and transportation routes can all be represented by graphs.

On this homework, we will be using graphs to represent territories and the paths between them. Our maps will be represented as simple, undirected, and connected graphs.

Consider the picture below, which we can think of as representing the routes of a relatively dysfunctional airline. In this picture, the *vertices *(or *nodes *) in this graph represent cities, and the *edges *between the vertices represent flight paths between these cities.

We will start by numbering the vertices from 0 to n − 1, where n is the number of vertices

in the graph.

Figure 1: Example graph of flight paths in the USA

We call a connection between two vertices an edge. We will represent an edge as a pair of vertices; the first vertex in the pair represents the source, and the second vertex in the pair represents the destination. For given vertices u and v, (u, v) represents a *directed *edge from u to v.

We will be representing our undirected graphs as directed graphs, such that ∀ edges

(u, v), there is also an edge (v, u). For example, in the figure above, the graph is undirected if for every flight path from u to v, there exists a flight path from v to u.

Since we know the nodes are numbered 0 through n − 1, the graph could be completely

described by a sequence of edges, and the number of vertices in the graph.

3.2 Adjacency List Representation of Graphs

In this assignment, we will be representing graphs as adjacency lists; that is, for each vertex, we maintain a list of all of the vertices that our vertex is adjacent to.

For the purposes of this section, we will represent the graph using the following type:

type vertex = int

type graph = vertex seq seq

In this representation, each index in the outer sequence of some graph G corresponds to the source vertex, and the *vertex sequence *at that index represents the list of destination vertices. Finally, we know that the number of vertices in the graph is length G.

The following is an example of the graph in Figure 1, represented as an adjacency list.

The graph in Figure 1, as a mapping from source to destination:

0 -> 1

1 -> 2

2 -> 1, 3

3 -> 1, 2

The graph in Figure 1, as a vertex seq seq:

<<1>, <2>, <1, 3>, <1, 2>>

3.3 Graph Warmup

For the purposes of this assignment, assume all graphs are simple and connected. “Simple” means that for all vertices u and v (where u = v), there is at most one edge (u, v). In addition, for any vertex u, there is no edge (u, u) (or self-loop), where a single vertex is both the source and the destination.

In an undirected graph, “connected” means that there is a path from each vertex in the graph to every other vertex in the graph. We will consider a directed graph “weakly connected” if, when we replace every directed edge with an undirected edge, the resulting graph is connected.

In the context of this assignment, by guaranteeing connectedness, we are ensuring that

there are no unreachable, unconquerable “islands” in our graph.

Task 3.1 (3%). In graphseq.sml, write a function

get neighbors: vertex seq seq -> vertex -> vertex seq

such that get_neighbors G source evaluates to a sequence containing only the neighbors of vertex source in an undirected, connected graph G (where a neighbor is any vertex in an edge with source). If the requested vertex is not in the graph, raise not in graph.

For example, get_neighbors <<1>, <0,3>, <3>, <1,2>> 1 should evaluate to <0,3>.

Task 3.2 (5%). Write a function

from edge seq : (vertex * vertex) seq -> int -> vertex seq seq

where from edge seq E n, given the n, number of vertices in the graph, and E, a list of valid edges for the graph of n vertices, constructs the adjacency list representation of the graph. Note that your algorithm must have O(|E|2) work, and O(|E|) span (but it can be faster).

For example, from_edge_seq <(0,1), (1,0), (1,2), (2,1)> 3 should evaluate to

<<1>, <0,2>, <1>>.

Task 3.3 (5%). Write a function

is undirected : vertex seq seq -> bool

that returns true if all edges in a graph G are undirected (i.e. for each edge (u, v) in G, (v, u)

is also in G), and false otherwise.

3.4 Weighted Nodes

Suppose that in addition to edges, you want to store information on each vertex. To do so, we consider an implementation using *weighted nodes *:

type weight = ’a

type graph = (weight * vertex seq) seq

In this representation, we store both the “weight” of each vertex i and its neighbors (adjacency sequence) at the ith index. As you will see in the next section, a weight can represent anything from an integer label to a datatype storing data about the vertex.

4 Riskless

4.1 Rules and Strategy

4.1.1 Description

Riskless is a strategic game of military might and territorial domination.1 Two players– black and white, or Maxie and Minnie–take turns mobilizing armies to empty or occupied territories. The first player to mobilize at least X armies (for some X ∈ N determined by the map), or the last player standing, wins.

In our implementation, we will model the game board (a political map of Earth) as a simple, connected, undirected graph.2 Each node will represent a territory (empty or occupied), and each edge a road connecting two territories. The players begin with their armies distributed on the occupied territories (Minnie with one extra army), and gameplay commences turn-by-turn (Maxie going first):

- 1. On a player’s turn, he or she mobilizes all the armies on any occupied territory to an adjacent territory.

(a) The (now vacated) source territory remains occupied by the same player with 0 units.

(b) The target territory updates as follows:

- If the target is empty, the player occupies the territory, and all armies on the source are moved to the target.
- If the target is occupied by the same player, all armies on the source are moved (added) to the target territory.

iii. If the target is occupied by the opponent, the player with more armies wins the battle and reoccupies the territory with the (absolute) difference of armies remaining. In the case of a tie, the attacking player wins.

(c) A player cannot abstain from moving.

- 2. At the end of every move, both players train one additional army on each of their occupied territories up to a set cap (determined by the map). Any territories that exceed the cap are reduced to the limit.

See Figure 2 below for an example game on a graph of four vertices; an arrow represents a move, and “x” represents a battle, as mentioned above. Each graph shows the state after a move and after each territory has trained an additional unit.

1 A deterministic variation of its namesake Risk, the game has been simplified for ease of coding. More information on the board game can be found at http://en.wikipedia.org/wiki/Risk_(game)

2 Remember that a simple graph has no self-loops and at most one edge between any two nodes.

Figure 2: Example game of Riskless; red wins!

4.1.2 Strategy

There are a number of useful strategies to ensure success on your journey to world domina- tion. A good tactician should consider, among other factors, the following:

- 1. The more territories a player controls, the more armies the player can train in one turn.

- 2. Players should watch borders (nodes adjacent to opposing territories) for a build up of armies: this could signal a potential attack!

- 3. Players should amass armies on their borders for better defense.

4.1.3 Evaluation

To implement Riskless for a computer player, we need to evaluate different game states. A very na¨ıve static evaluation algorithm would be to award points for the total number of armies on the map. For instance, one could sum the number of armies, with positive points for the “Max” player and negative points for the “Min” player.

More sophisticated static evaluations might take other factors into account, such as:

- the number of territories the player has conquered

- the importance of amassing larger armies (as opposed to stretching out one’s troops)

- the value of defending borders

- the value of safe territories (nodes surrounded by the player’s own territories)

4.2 Implementation

Part of an implementation of the GAME signature for Riskless is provided in riskless.sml. It uses the following abstract representation of game states:

datatype position = Min of int | Max of int | Empty datatype territory = T of string * position

type vertex = int

datatype gstate = B of (territory * vertex seq) seq * player type state = gstate

type move = vertex * vertex

Here, we have provided the Sequence library. We represent the game board as a weighted graph, where each vertex in the sequence contains a territory and its adjacency sequence. Each territory has a label (name) and a position, indicating whether the node is occupied and how many armies are currently occupying the territory. Each game state is represented by a board, as well as the current player making the move. Finally, a move is a tuple of the source and destination vertex.

Recall the GAME signature we discussed in class. We have extended it to include the following:

datatype player = Minnie | Maxie

datatype outcome = Winner of player | Draw datatype status = Over of outcome | In play

Each game state will store both the current player and have a status, represented as Over with an outcome or In play.

structure Est : EST

The EST signature contains estimated values and functions, notably minnie wins

and maxie wins. See estimate.sig for more information.

val status : state -> status

Given a game state s, status s evaluates to the current status of the game. Re- member that a game of Riskless ends when either one player is out of territories, or the other has reached the target number of units, represented by Dim.max_units. A Draw state occurs when both players have exceeded the target number of units.

val moves : state -> move Seq.seq

If a state s is still In_play, moves s evaluates to a non-empty sequence of valid moves for s. Otherwise, moves s returns an empty sequence.

val player : state -> player

Returns the current player making the move in the current state.

val make_move : (state * move) -> state

Assuming a valid move for the current state, evaluates to the state after the move has been applied. Remember to update the number of armies, and to account for the maximum cap per territory (defined by Dim.max_terri).

val estimate : state -> Est.est

Returns an estimated score for a given state. You will receive full credit as long as estimate performs no worse than the na¨ıve version above. Of course, a more refined evaluation algorithm will yield a more competent and challenging computer opponent! Your implementation of estimate should work on all valid game states, whether it is in play or it is a terminal state.

Please describe your implementation and the strategies you considered in a few sentences in a comment.

Task 4.1 (40%). Complete the implementation of the Riskless functor in riskless.sml. This takes a structure ascribing to the MAP signature specifying the layout of the board.

4.3 Running Your Game

The file runriskless.sml uses the Referee to construct a Riskless game, with Maxie as a human player and Minnie as Minimax. Thus, you can run your game with:

– CM.make “sources.cm”;

– Risk HvMM.go();

You can write other referee applications to test different variations on the game: different graph sizes, different search depths, different game tree search algorithms (such as Jamboree below. . . ).

4.4 Testing Your Game

Task 4.2 (5%). The test structure TestRiskless has been provided in riskless.sml. Please fill out the test structure with appropriate tests.

5 Alpha-Beta Pruning

In lecture, we implemented the minimax game tree search algorithm: each player chooses the move that, assuming optimal play of both players, gives them the best possible score (highest for Maxie, lowest for Minnie). This results in a relatively simple recursive algorithm, which searches the entire game tree up to a fixed depth. However, it is possible to do better than this!

Consider the following game tree:

a

b e

c[3] d[5] f [2] g

Maxie nodes are drawn as an upward-pointing triangle; Minnie nodes are drawn as a downward-pointing triangle. Each node is labelled with a letter, for reference below. The leaves, drawn as squares, are Maxie nodes and are also labelled with their values (e.g. given by the estimator).

Let’s search for the best move from left to right, starting from the root a. If Maxie takes

the left move to b, then Maxie can achieve a value of 3 (because Minnie has only two choices, node c with value 3 and node d with value 5). If Maxie takes the right move to e, then Minnie can take the her left move to f , yielding a value of 2. But this is worse than what Maxie can already achieve by going left at the top! So Maxie already knows to go left to b rather than right to e. So, there is no reason to explore the tree g (which might be a big tree and take a while to explore) to find out what Minnie’s value for e would actually be: we already know that the value, whatever it is, will be less than or equal to 2, and this is enough for Maxie not to take this path.

αβ*-pruning *is an improved search algorithm based on this observation. In αβ-pruning, the tree g is *pruned *(not explored). This lets you explore more of the relevant parts of the game tree in the same amount of time.

5.1 Setup

In αβ-pruning, we keep track of two values, representing the best (that is, highest for Maxie, and lowest for Minnie) scores that can be guaranteed for each player based on the parts of the tree that have been explored so far. α is the highest guaranteed score for Maxie, and β is the lowest guaranteed score for Minnie.

For each node, αβ-pruning computes a *candidate value *which represents the best score (in particular, Game.estimate) of that node. The values are labeled according to the following spec:

Spec for αβ-pruning: Fix a search depth and an estimation function, so that both minimax and αβ-pruning are exploring a tree with the same leaves. Fix bounds α and β such that such that α < β. Let M M be the minimax value of a node s. Then the αβ-pruning result (i.e., the candidate value) for that node, AB, satisfies the following:

- If s is a Maxie node and M M ≤ α, then stop evaluating any siblings (the parent of s should be pruned).
- If s is a Minnie node and β ≤ M M , then stop evaluating any siblings (the parent of s should be pruned).
- Otherwise, AB = M M .

In other words, α is what we know Maxie can achieve, and β is what we know Minnie can achieve. If the true minimax value of a node is between these bounds, then αβ-pruning computes that value. If it is outside these bounds, then αβ-pruning signals this in one of two ways, depending on which side the actual value falls on, and whose turn it is. Suppose that it’s Maxie’s turn:

If the actual minimax value is less than α, the subtrees rooted at this node’s parent can be pruned. The reason: this Maxie node’s value is worse for Maxie than what we already know Maxie can achieve, so it gives the enclosing Minnie node an option that Maxie doesn’t want it to have. So the Maxie grand-parent should ignore this branch, independently of what the other siblings are. Node f in the above tree is an example.

A symmetric argument can be applied to Minnie.

Sometimes, no bounds on α and β are known (e.g., at the initial call, when you have not yet explored any of the tree). To account for this, we typically set the initial α and β to the smallest and largest possible estimated values, such that any minimax score is equal or better for either player. This can be modeled with −∞ and +∞, or in our implementation, Game.Est.minnie_wins and Game.Est.maxie_wins, the estimated values for Minnie’s vic- tory and Maxie’s victory. In other words, we assume the worst, and only update the bounds when we find a state with a better score.

5.2 The Algorithm

The above spec doesn’t entirely determine the algorithm (a trivial algorithm would be to run minimax and then adjust the label at the end). The extra ingredient is how the values of α and β are adjusted as the algorithm explores the tree. The key idea here is *horizontal propagation*.

Consider the case where we want to compute the value of a *parent *node (so it is not a leaf ), given search bounds α and β based on the portion of the tree seen so far.

- To calculate the result for the parent node, you scan across the children of the node, recursively calculating the result of each child from left to right, using the αβ for the parent as the initial αβ for the first child. Suppose that after considering a child, we find it has the estimated value v:

– Update the appropriate bounds depending the player. If the parent node is a Maxie node, update αnew = max(αold , v) (since the new value can only be better). Dually, if the parent is a Minnie node, update βnew = min(βold, v).

– Next, we check if the new values are within the bounds.

∗ If αnew ≥ β for Maxie or βnew ≤ α for Minnie, we have found a value that will not be considered (since the opposing player will simply choose the tighter value). In this case, we prune the subtree by ignoring the remaining children.

∗ Otherwise, we continue processing the remaining children using the updated bounds.

*N**ote : Most labs presented checking the bounds, and then doing the update. This is equivalent to doing the update, and then checking the bounds. The first is a little easier to explain why it is correct, but the latter results in slightly cleaner code.*

- Once all children have been processed, the parent node is labeled using the final up- dated α (for Maxie) or β (for Minnie) as a
*candidate value*. Realize that this value is equivalent to the minimax value even if we were to have processed the entire subtree.

Note that, except via the returned value of a node, the updates to α and β made in children do not affect the α and β for the parent node—they are based on different information.

Once we reach a terminal node, we label it using the estimated or actual value as a candidate value. This base case can be reached in two ways: either we have reached a leaf node with no children, or we have reached the maximum search depth (d = 0).

5.3 Extended Example

Figure 4 shows a trace of this algorithm running on the tree in Figure 3. As explained above, we use the values minnie wins and maxie wins in the initial call to represent “no bound for α/β.” The c − d − g subtree is equivalent to the above example tree, and shows how the algorithm runs on it.

a

b n

c j o s

d g k p t w

e[3] f[5] h[2] i[7] l[10] m[4] q[2] r[7] u[8] v[2] x[4] y[6]

Figure 3: Extended αβ-pruning example

We start evaluating c with no bounds, so we recursively look at d, and then e, still with no bounds. The estimate for e is 3, which is trivially within the bounds, so the result of e is

- 3. Because d is a Minnie node, we update β to be min(maxie wins,3), which is defined to be 3, for a recursive call on f . Since the estimate of f is 5, which is greater than the given β (which is 3), and it is a Maxie node, we take the min(3, 5) and update d to 3. Since c is a Maxie node, we update α to be max(minnie wins,3), which is also 3. These bounds are passed down to g and then Since the actual value of h is 2 < α, and it is a Maxie node, we prune the remaining children: we know Maxie can get 3 in the left tree, and this branch alone gives Minnie the ability to get 2 here, so Maxie doesn’t want to take it. Thus, we simply set g to β = 3,
*without even looking at*i. Then we update α to be min(3, 3), which is still 3. Since there are no other children, and since 3 is in the original bounds for c, it is the value of c.

F: state (Maxie,a) with ab=(Leaf(Minnie wins!),Leaf(Maxie wins!)) G: state (Minnie,b) with ab=(Leaf(Minnie wins!),Leaf(Maxie wins!)) F: state (Maxie,c) with ab=(Leaf(Minnie wins!),Leaf(Maxie wins!)) G: state (Minnie,d) with ab=(Leaf(Minnie wins!),Leaf(Maxie wins!)) F: state (Maxie,e) with ab=(Leaf(Minnie wins!),Leaf(Maxie wins!)) Estimating state e[3]

Result of (Maxie,e) is Leaf(Guess: 3)

F: state (Maxie,f) with ab=(Leaf(Minnie wins!),Step(0, Guess: 3)) Estimating state f[5]

Result of (Maxie,f) is Leaf(Guess: 5) Result of (Minnie,d) is Step(0, Guess: 3)

G: state (Minnie,g) with ab=(Step(0, Guess: 3),Leaf(Maxie wins!)) F: state (Maxie,h) with ab=(Step(0, Guess: 3),Leaf(Maxie wins!)) Estimating state h[2]

Result of (Maxie,h) is Leaf(Guess: 2)

Result of (Minnie,g) is Step(0, Guess: 2) (* Pruned *) Result of (Maxie,c) is Step(0, Guess: 3)

F: state (Maxie,j) with ab=(Leaf(Minnie wins!),Step(0, Guess: 3)) G: state (Minnie,k) with ab=(Leaf(Minnie wins!),Step(0, Guess: 3)) F: state (Maxie,l) with ab=(Leaf(Minnie wins!),Step(0, Guess: 3)) Estimating state l[10]

Result of (Maxie,l) is Leaf(Guess: 10)

F: state (Maxie,m) with ab=(Leaf(Minnie wins!),Step(0, Guess: 3)) Estimating state m[4]

Result of (Maxie,m) is Leaf(Guess: 4) Result of (Minnie,k) is Step(0, Guess: 3)

Result of (Maxie,j) is Step(0, Guess: 3) (* Pruned *)

Result of (Minnie,b) is Step(0, Guess: 3)

G: state (Minnie,n) with ab=(Step(0, Guess: 3),Leaf(Maxie wins!)) F: state (Maxie,o) with ab=(Step(0, Guess: 3),Leaf(Maxie wins!)) G: state (Minnie,p) with ab=(Step(0, Guess: 3),Leaf(Maxie wins!)) F: state (Maxie,q) with ab=(Step(0, Guess: 3),Leaf(Maxie wins!)) Estimating state q[2]

Result of (Maxie,q) is Leaf(Guess: 2)

Result of (Minnie,p) is Step(0, Guess: 2) (* Pruned *) Result of (Maxie,o) is Step(0, Guess: 3)

Result of (Minnie,n) is Step(0, Guess: 3) (* Pruned *)

Result of (Maxie,a) is Step(0, Guess: 3)

Therefore value of (Maxie,a) is Guess: 3, and best move is 0. Terminals visited: 6

Figure 4: AB-Pruning trace for the above tree

5.4 Tasks

We have provided starter code in alphabeta.sml. As in minimax, we need to return not just the value of a node, but the move that achieves that value, so that at the top we can select the best move:

type edge = Game.move * Game.est

While it would be simple to let our αβ type value = edge, we must remember to consider the special cases when a node is already at a terminal state (i.e. a leaf node, or one at search depth = 0). To account for these, we represent a terminal node as an Leaf, and all others as a Step:

datatype value = Step of edge

| Leaf of Game.est

This way, we can represent α and β by a value consisting of the current best estimated candidate value, and the current best move (if there is one).

Note that values are compared via their candidate values, where minnie_wins and

maxie_wins represents the smallest and greatest values. The presence of a move does not affect the ordering, except in the case of equal values, where we must choose a Step over an Leaf to ensure we pass back a valid move.

We have provided three functions:

valuecmp : value * value -> order maxalpha : value * value -> value minbeta : value * value -> value

that will be helpful when implementing these comparisons.

To extract the candidate value or move from α or β, you may use

valueOf : value -> Game.est moveOf : value -> Game.move

Finally, we abbreviate

(* invariant: (alpha, beta) with alpha < beta *)

type alphabeta = value * value

Task 5.1 (1%). Fill in the value

val initialAB : alphabeta = …

for the initial α/β bounds to pass in to your implementation.

Task 5.2 (20%). Define the mutually recursive functions:

fun F (d : int) (s : Game.state) (ab : alphabeta) : value = … and G (d : int) (s : Game.state) (ab : alphabeta) : value = …

which compute the best possible result (move and candidate value) using the αβ algorithm, given a game state s at a depth d, for Maxie (F) and Minnie (G).

You may assume that the depth is non-zero, the status of s is In_play, and that

Game.moves s evaluates to a non-empty sequence of valid moves.

Hint 1: Recall Seq.showl, which takes an ’a Seq.seq and evaluates to a list view (Seq.Nil or Seq.Cons(’a * ’a Seq.seq)). This and a recursive helper function may be useful while sequentially analyzing a parent node’s children. You may also find the utility function SeqUtils.null helpful.

Hint 2: Remember to update the move after processing each child node! If your code is evaluating to invalid moves, or returning only Leafs, check for this bug.

Task 5.3 (4%). Define next_move : Game.state -> Game.move, such that given a state

s, next_move s evaluates to the best move the current player can choose.

Task 5.4 (0%). In runriskless.sml, use the Referee to define a structure Risk HvAB that allows you to play Riskless against your αβ-pruning implementation.

For fun, you can also try pitting MiniMax and AlphaBeta against each other. A search depth of 4-6 should run relatively quickly.

Testing We have provided a functor ExplicitGame that makes a game from a given game tree, along with two explicit games, HandoutSmall (Figure 5) and HandoutBig (Figure 4). These can be used for testing as follows:

– structure TestBig =

AlphaBeta(struct structure G = HandoutBig val search_depth = 4 end);

– TestBig.next_move HandoutBig.start; Estimating state e[3]

Estimating state f[5]

Estimating state h[2] Estimating state l[10] Estimating state m[4] Estimating state q[2]

val it = 0 : HandoutBig.move

structure TestSmall =

AlphaBeta(struct structure G = HandoutSmall val search_depth = 2 end);

– TestSmall.next_move HandoutSmall.start; Estimating state c[3]

Estimating state d[6]

Estimating state e[~2] Estimating state g[6] Estimating state h[4] Estimating state i[10] Estimating state k[1]

val it = 1 : HandoutSmall.move

The search depths of 2 and 4 here are important, because an explicit game causes errors if it tries to estimate in the wrong place.

For these explicit games, estimate prints the states it visits, so you can see what ter- minals are visited, as indicated above. You may additionally wish to annotate your code so that it prints out traces, as above.

6 Jamboree

αβ-pruning is entirely sequential, because you update α/β as you search across the children of a node, which creates a dependency between children. On the other hand, MiniMax is entirely parallel: you can evaluate each child in parallel, because there are no dependencies between them. This is an example of a *work-span tradeoff *: MiniMax does more work, but has a better span, whereas αβ-pruning does less work, but has a worse span.

The *Jamboree *3 algorithm manages this tradeoff by evaluating *some *of the children se-

quentially, updating αβ, and then the remainder in parallel, using the updated information. Depending on the parallelism available in your execution environment, you can choose how many children to evaluate sequentially to prioritize work or span.

In the file jamboree.sml, you will implement a functor

functor Jamboree (Settings : sig

structure G : GAME

val search_depth : int

val prune_percentage : real end) : PLAYER

prune_percentage is assumed to be a number between 0 and 1. For each node, 100.0 * prune_percentage percent of the children are evaluated sequentially, updating αβ, and the remaining children are evaluated in parallel. For example, with prune_percentage = 0, Jamboree performs just like Minimax, and with prune_percentage = 1, Jamboree performs like completely sequential αβ-pruning.

Suppose prune_percentage is 0.5. For the tree in Figure 3, Jamboree will explore the tree in the same way as in Figure 4: no node has more than 2 children, so the restriction on pruning never comes into play. However, on the following tree, the traces differ:

3 A parallelized version of the original Scout αβ algorithm in 1980, Jamboree is so named for running multiple “scouts” in parallel.

3 6 −2 6 4 10 1 30 9

See Figure 5 for the traces.

6.1 Tasks

Task 6.1 Copy initialAB from your αβ-pruning implementation, as this will be unchanged.

Task 6.2 (3%). Define the function splitMoves : Game.state -> (Game.move seq * Game.move seq) which, given a game state, splits the possible moves into two sequences (abmoves, mmmoves) depending on the prune_percentage. In particular, if s has n moves, abmoves should contain the first (Int.floor (prune_percentage * n)) moves, and mmmoves whatever is left over.

Task 6.3 (10%). Define the functions:

fun F (d : int) (s : Game.state) (ab : alphabeta) : value = … and G (d : int) (s : Game.state) (ab : alphabeta) : value = …

using the Jamboree algorithm.

The specs for F and G are as above, except that during the processing of each parent’s child nodes, it should divide up the moves from s into abmoves and mmmoves.

The algorithm should process abmoves sequentially, updating α/β as you go, as in your αβ-pruning implementation. When there are no more abmoves, you should then process mmmoves in parallel, and combine the max/min of their values with the incoming α/β to produce the appropriate value for s.

(Again, you may find the helper functions in SeqUtils useful.)

Task 6.4 (2%). Define next_move.

Jamboree with prune percentage = 0.5, so αβ are updated only after the first child (we round down):

F: state (Maxie,a) with ab=(Leaf(Minnie wins!),Leaf(Maxie wins!)) G: state (Minnie,b) with ab=(Leaf(Minnie wins!),Leaf(Maxie wins!)) F: state (Maxie,c) with ab=(Leaf(Minnie wins!),Leaf(Maxie wins!)) Result of (Maxie,c) is Leaf(Guess: 3)

F: state (Maxie,d) with ab=(Leaf(Minnie wins!),Step(0, Guess: 3))

Result of (Maxie,d) is Leaf(Guess: 6)

F: state (Maxie,e) with ab=(Leaf(Minnie wins!),Step(0, Guess: 3)) Result of (Maxie,e) is Leaf(Guess: ~2)

Result of (Minnie,b) is Step(2, Guess: ~2)

G: state (Minnie,f) with ab=(Step(0, Guess: ~2),Leaf(Maxie wins!)) F: state (Maxie,g) with ab=(Step(0, Guess: ~2),Leaf(Maxie wins!)) Result of (Maxie,g) is Leaf(Guess: 6)

F: state (Maxie,h) with ab=(Step(0, Guess: ~2),Step(0, Guess: 6))

Result of (Maxie,h) is Leaf(Guess: 4)

F: state (Maxie,i) with ab=(Step(0, Guess: ~2),Step(0, Guess: 6)) Result of (Maxie,i) is Leaf(Guess: 10)

Result of (Minnie,f) is Step(1, Guess: 4)

G: state (Minnie,j) with ab=(Step(0, Guess: ~2),Leaf(Maxie wins!)) F: state (Maxie,k) with ab=(Step(0, Guess: ~2),Leaf(Maxie wins!)) Result of (Maxie,k) is Leaf(Guess: 1)

F: state (Maxie,l) with ab=(Step(0, Guess: ~2),Step(0, Guess: 1))

Result of (Maxie,l) is Leaf(Guess: 30)

F: state (Maxie,m) with ab=(Step(0, Guess: ~2),Step(0, Guess: 1)) Result of (Maxie,m) is Leaf(Guess: 9)

Result of (Minnie,j) is Step(0, Guess: 1) Result of (Maxie,a) is Step(1, Guess: 4) Terminals visited: 9

Overall choice: move 1[middle]

αβ-pruning:

F: state (Maxie,a) with ab=(Leaf(Minnie wins!),Leaf(Maxie wins!)) G: state (Minnie,b) with ab=(Leaf(Minnie wins!),Leaf(Maxie wins!)) F: state (Maxie,c) with ab=(Leaf(Minnie wins!),Leaf(Maxie wins!)) Result of (Maxie,c) is Leaf(Guess: 3)

F: state (Maxie,d) with ab=(Leaf(Minnie wins!),Step(0, Guess: 3)) Result of (Maxie,d) is Leaf(Guess: 6)

F: state (Maxie,e) with ab=(Leaf(Minnie wins!),Step(0, Guess: 3)) Result of (Maxie,e) is Leaf(Guess: ~2)

Result of (Minnie,b) is Step(2, Guess: ~2)

G: state (Minnie,f) with ab=(Step(0, Guess: ~2),Leaf(Maxie wins!)) F: state (Maxie,g) with ab=(Step(0, Guess: ~2),Leaf(Maxie wins!)) Result of (Maxie,g) is Leaf(Guess: 6)

F: state (Maxie,h) with ab=(Step(0, Guess: ~2),Step(0, Guess: 6)) Result of (Maxie,h) is Leaf(Guess: 4)

F: state (Maxie,i) with ab=(Step(0, Guess: ~2),Step(1, Guess: 4)) Result of (Maxie,i) is Leaf(Guess: 10)

Result of (Minnie,f) is Step(1, Guess: 4)

G: state (Minnie,j) with ab=(Step(1, Guess: 4),Leaf(Maxie wins!)) F: state (Maxie,k) with ab=(Step(1, Guess: 4),Leaf(Maxie wins!)) Result of (Maxie,k) is Leaf(Guess: 1)

Result of (Minnie,j) is Step(0, Guess: 1) (* Pruned *)

Result of (Maxie,a) is Step(1, Guess: 4) 20

Terminals visited: 7

Overall choice: move 1[middle]

Figure 5: Traces for Tree 2