Tetris: Intelligent Agent
Link to project here.
In artificial intelligence, an intelligent agent (IA) refers to an autonomous entity which acts, directing its activity towards achieving goals (i.e. it is an agent), upon an environment using observation through sensors and consequent actuators.
In this blog I'll talk about the rule based agent that I implemented for the game of Tetris. To quickly describe it, it just takes information such as what the next/held pieces are and the state of the Stack, generates all possible outcomes and then scores each outcome. It then takes steps to work toward the best outcome.
Table Of Contents
Generating Outcomes
In order for the agent to figure out what to do with the pieces it has, it needs to be able to know the possible outcomes of for each piece on a given Stack. These outcomes can then be scored to indicate their relative value to the agent.
These outcomes are generated simply by looping through all rotations of a given piece in all possible end positions in the Stack. There are some endpoints that are possible via complex maneuvers such as T-spins, but for the sake of efficiency and simplicity, I chose to only compute positions accessible by dropping straight down.
const getEndPoints = (piece, stack) => {
let endPoints = []
for(let r=0; r<piece.rotations.length; r++) {
// -2 to account for inconsistent column spacing in base Tetrominoes
for(let i=-2; i<stack[0].length; i++) {
let p = piece.cons(i, 0, r)
let ghost = Game.getGhost(p, stack)
if (ghost != null) endPoints.push(ghost)
}
}
return endPoints
}
Scoring The Stack
My approach to the scoring is basically to count everything wrong with the Stack and then produce a score based on a given multiplier or constant, thus the Stack with the highest score is the worst one to choose.
The piece is placed into the Stack at the end points generated in the previous section and then scored.
Number of Holes
Pretty obvious criteria, the more holes there are in the Stack, the harder it is to shrink it. However, this isn't as simple as counting empty spots as holes can take many forms and can be large or long.
So I generate a set of positions that are accessible from the top of the play field. This is done via Breadth First Search. If there is a block already in place in a location, it blocks the search from going further. If the position is empty, it is added to the set.
const getEmptyPostions = (i, j, stack) => {
const origin = Tetromino.Node(i, j)
const open = [[null, origin]]
const closed = Scoring.PosSet()
const directions = [Tetromino.Node(0,-1), Tetromino.Node(0,1), Tetromino.Node(1,0), Tetromino.Node(-1,0)]
let current = null
while (open.length>0) {
current = open.shift()
directions.map(d => current[1].sum(d))
.filter(n => n.inBounds(stack[0].length, stack.length) &&
!closed.has(n.i, n.j) &&
stack[n.j][n.i]==null
)
.forEach(n => {
open.push([current, n])
closed.add(n.i, n.j)
})
}
return closed
}
Each position is simply encoded as strings to be usable with a builtin Javascript Set.
const PosSet = () => {
const cs = {}
cs.set = new Set()
cs.encIJ = (i, j) => `${i},${j}`
cs.add = (i, j) => cs.set.add(cs.encIJ(i,j))
cs.has = (i, j) => cs.set.has(cs.encIJ(i,j))
return cs
}
With this set, I can loop from from the bottom up, counting empty spaces that are not in it.
Empty Columns and Blocked Spaces
Because I am limiting the agent to only dropping pieces straight down, it would be impossible to fill spaces that are not holes but have been blocked by other pieces. A position is checked for this by essentially checking if it has a line of sight of the very top space in its column.
Additionally, while single width tall columns are useful and satisfying to watch when an I Tetromino is placed into it, letting them grow to beyond a certain height increases to chances of the agent being forced to use a piece other than an I Tetromino, creating a tall unfillable hole. Having more than one of these columns compounds the problem even further, so these are traits that will be discouraged as well.
The checks for Empty Columns and Blocked Spaces loop through the column counting their own things, so they are bundled together in a function to prevent looping through a column multiple times.
checkColumn = (i, j, stack) => {
let isBlocked = false
let tallEmptyColumnSize = 0
for(let j2=j-1; j2>=0; j2--) {
if(stack[j2][i]!=null) {
isBlocked = true
break
}
if((i==0 || stack[j2][i-1]!=null) && (i==stack[j2].length-1 || stack[j2][i+1]!=null)) {
tallEmptyColumnSize += 1
}
}
return [isBlocked, tallEmptyColumnSize>=3]
}
Stack Height
While stack height is not necessarily an bad thing if managed well, beyond a certain point it can be easy for bad luck with Tetromino order to simply kill a run. Thus scoring for lines beyond 1/2 and 3/4 height of the play field were added to help control how the agent prioritizes stack height against the other variables.
Bringing It All Together
In the following monolith of a function, I fit all the counting together to avoid looping through the Stack multiple times. At the very end then the score is computed based on the given weights and returned.
const score = (piece, stack, weights) => {
Game.addPiece(piece, stack)
let emptySet = Scoring.getEmptyPostions(Game.SPAWN_LOC[0], Game.SPAWN_LOC[1], stack)
let stackHeight = 0
let nHoles = 0
let nBlocked = 0
let nTallEmptyColumns = 0
let nFilledRows = 0
for(let j=stack.length-1; j>=0; j--) {
let nFilled = 0
for(let i=stack[j].length-1; i>=0; i--) {
// nFilled
if (stack[j][i]!=null) nFilled += 1
// Holes
else if (stack[j][i]==null && !emptySet.has(i, j)) nHoles += 1
// Blocked & tallEmptyColumns
else if (stack[j][i]==null) {
let res = Scoring.checkColumn(i, j, stack)
if (res[0]) nBlocked += 1
if (res[1]) nTallEmptyColumns += 1
}
}
// Early Exit
if (nFilled == 0) break
// Track rows that will be removed
else if (nFilled == stack[j].length) nFilledRows += 1
// Track height
stackHeight += 1
}
Game.removePiece(piece, stack)
let score = 0
score += Math.max(0, (nHoles + weights.nHolesConst())*weights.nHolesMult())
score += Math.max(0, (nBlocked + weights.nBlockedConst())*weights.nBlockMult())
score += Math.max(0, (nTallEmptyColumns + weights.nTallEmptyColsConst())*weights.nTallEmptyColsMult())
let halfHeightPenalty = Math.max(0, stackHeight - Math.floor(stack.length/2) - nFilledRows)*weights.halfHeightRowMult()
let threeQuarterHeightPenalty = Math.max(0, stackHeight - Math.floor(3*stack.length/4) - nFilledRows)*weights.threeQuarterHeightRowMult()
score += stackHeight - nFilledRows + halfHeightPenalty + threeQuarterHeightPenalty
return score
}
Agent
Ultimately because the scoring function basically handles everything important, all the agent has to do is to figure out how to get to the best position as directed by the scoring function.
In order to get a move from the agent, the current state of the game is given to it, and then the agent runs the scoring function and chooses a target endpoint via agent.updateTarget(state)
. Afterward it just decides what it needs to do in each update cycle rather than doing everything at once.
agent.move = state => {
if (state.current==null) return null
agent.updateTarget(state)
// Dont do everything at once, just space out enough to see what is going on
if (agent.target==null) {
Game.updateCurrent(state.current.next(0,1), state)
} else if (agent.holdCurrent) {
Game.holdPiece(state)
} else if (state.current.i != agent.target.i || state.current.rot != agent.target.rot) {
// Fix column location and rotation
let p = state.current.next(agent.target.i - state.current.i, state.current.cons==Tetromino.I ? 1 : 0, agent.target.rot)
Game.updateCurrent(p, state)
} else {
// Lock in at ghost
Game.updateCurrent(state.ghost, state)
}
}
The agent.updateTarget()
method gets the best end points of both the current piece and the held piece and chooses between the two using the outcome of the scoring function.
agent.updateTarget = state => {
// Get all valid endpoints using the current piece and the held piece
let endPoints = Scoring.getEndPoints(state.current, state.stack)
let altEndpoints = null
if (state.hold==null) altEndpoints = Scoring.getEndPoints(state.nextPieces[0], state.stack)
else altEndpoints = Scoring.getEndPoints(state.hold, state.stack)
// Score all endpoints
let mainRes = Scoring.getBestEndpoint(endPoints, state.stack, agent.weights)
let altRes = Scoring.getBestEndpoint(altEndpoints, state.stack, agent.weights)
// Choose piece + location with the lowest score (least penalties)
if (altRes[0]<mainRes[0]) {
agent.holdCurrent = true
agent.target = altRes[1]
} else {
agent.holdCurrent = false
agent.target = mainRes[1]
}
}