Can a Bayesian spam filter play chess?
by Laird A. Breyer
Introduction
Many people these days depend on Bayesian filters to
protect them from the ever present email scourge that is spam.
Unlike older technologies, these programs' claim to fame is that
they learn the spam patterns automatically, and more importantly,
learn personalized spam (bad) and ham (good) email patterns.
Like many others, I wrote a Bayesian filter to protect me from unwanted
email, which I called dbacl.
My implementation functions as a Unix command line text classifier,
with special email support, and can be used with procmail.
People are often astonished at how well statistical mail filtering works
after they first try it, and it's tempting to imagine that such programs
actually understand the emails being delivered, rather than merely matching
patterns.
Now chess
has always been a popular gauge of intelligence that everyone
can understand, so if we put all these ideas together,
then the question "Can a Bayesian spam filter play chess?"
seems like a fun experiment with a lot of appeal.
Let's put down some ground rules: This experiment will test a real spam filter, not a specially designed chess program. It won't aim to beat
Deep Thought
(I wouldn't know where to start, and I have a feeling this could be difficult anyway ;-), but it will aim to show signs of "intelligence", or we won't claim
success.
Finally, since dry tables and graphs are no fun, a theoretical proof of concept is not enough:
the spam filter must really play chess
in a way that everyone can see, and try out at home.
The account below is designed so that you can follow and duplicate
it by yourself. All
you need is a Unix compatible computer. You'll have to open a terminal
and be ready to type shell commands. All shell commands below are preceded
by % to indicate the prompt, but you don't type the '%'.
Instructions are fairly detailed, and
various scripts can be downloaded when needed, but it helps if you're
familiar with the shell. Ask a friend if you need help.
Important: You must follow these instructions if you want to actually play chess
against the spam filter. You must also download some training games and
teach the filter beforehand. Running the scripts alone is not enough.
The instructions below have been tested to work with the GNU utilities and
bash.
Start by making a directory to keep all the
workings in one place.
% mkdir chess
% cd chess
Setting up the game(s)
The first thing we have to do is obtain a (preferably large) collection
of chess games that we can learn.
Not being an expert, I started off by
browsing the web for likely keywords. It became soon apparent that a large
collection of free games is available electronically in something called the
PGN format. So I ended up downloading all the files available from
Chessopolis and placing them into a subdirectory.
% mkdir zipfiles
% cd zipfiles
% ls
100-pg.zip can92-pg.zip gmcom3pg.zip krp-pg.zip swede2pg.zip
4queenpg.zip cp8687pg.zip irish-pg.zip lonepine.zip ten-pg.zip
acbul-pg.zip cp8891pg.zip italy-pg.zip maxg-pgn.zip trap-pg.zip
bril-pg.zip croat-pg.zip kbp-pg.zip minis-pg.zip wchamppg.zip
brit60pg.zip denm-pg.zip knp-pg.zip pca9395.zip
brit70pg.zip gmcom1pg.zip kp-kp-pg.zip sams-pg.zip
cal-pg.zip gmcom2pg.zip kqp-pg.zip storm-pg.zip
Now that we have a collection, let's look at a typical game, say from
sams-pg.zip:
% zcat sams-pg.zip | head -15
[Event "?"]
[Site "Active Chess Championship, Kuala Lumpur (Malays"]
[Date "1989.??.??"]
[Round "?"]
[White "Anand Viswanathan (IND)"]
[Black "Sloan Sam"]
[Result "1-0"]
[ECO "C57"]
1. e4 e5 2. Nf3 Nc6 3. Bc4 Nf6 4. Ng5 Nxe4 5. Bxf7+ Ke7 6. d3 Nf6 7.
Bb3 d5 8. Nc3 Bg4 9. f3 Bf5 10. f4 Bg4 11. Qd2 h6 12. fxe5 Nxe5 13. Qe3
Kd6 14. d4 Nd3+ 15. Qxd3 Qe7+ 16. Be3 Re8 17. Nf7+ Qxf7 18. O-O c6 19.
Bf4+ Kd7 20. Be5 Be7 21. Rae1 Rhf8 22. Nxd5 cxd5 23. Ba4+ Kd8 24. Qc3
Bb4 25. Qxb4 Re6 26. c4 Rb6 27. Qa5 Bc8 28. c5 1-0
The trouble with data collections is that they are never exactly in the format
we want. The chess game is obviously the bit at the bottom, while the
text in square brackets looks quite useless to teach our filter.
Looking at the game itself, the numbers obviously count the moves,
while the actual symbols that follow just seem like noise. But look
more closely, and each move is actually followed by two expressions,
one for each player.
In chess, the White player always starts first,
and if you know that a chess board's columns are marked by letters,
and the rows are marked by numbers, then e4 is a square on the board.
The capital letters such as B, N, Q, K probably stand for Bishop,
kNight, Queen and King. Of course, if you get stuck, you might just
want to read the PGN format specification instead of guessing.
So now we know that each player's moves are separated by spaces, and
that the numbers ending in a dot are just there to help people read the
moves, and can be ignored just like the text in brackets with the names
of the players etc. The real game information could be simply written like this:
e4 e5 Nf3 Nc6 Bc4 Nf6 Ng5 Nxe4 Bxf7+ Ke7 d3 Nf6
Bb3 d5 Nc3 Bg4 f3 Bf5 f4 Bg4 Qd2 h6 fxe5 Nxe5 Qe3
Kd6 d4 Nd3+ Qxd3 Qe7+ Be3 Re8 Nf7+ Qxf7 O-O c6
Bf4+ Kd7 Be5 Be7 Rae1 Rhf8 Nxd5 cxd5 Ba4+ Kd8 Qc3
Bb4 Qxb4 Re6 c4 Rb6 Qa5 Bc8 c5 1-0
The engine's brain
dbacl is a text classifier. It reads words and builds a probability model
from their frequencies. It can also look at pairs of words, triples, etc. up
to 7. The tokens it looks at can be up to 30 characters long. These limitations
are technical and would take too long to explain, but are useful to know up front.
What happens when dbacl classifies a text is that it computes the
probability of seeing the text naturally occurring under the
model. This is how spam filtering works: we build two models, one for
spam and one for good emails, and then dbacl checks which model
probability is higher for every incoming email.
Before we go on,
make sure that dbacl is available. You'll need version 1.11 at least, so
the easiest way is to download the program from its homepage. Place the
dbacl-1.11.tar.gz file in your chess directory.
% cd ..
% tar xfz dbacl-1.11.tar.gz
% mv dbacl-1.11 dbacl
% cd dbacl
% ./configure && make && make check
% cd ..
For chess, we first want to learn a model from many, many games. But
then what? If we play an actual game, we'll end up with another piece
of text that looks like the game above, but it will be incomplete. And
we want dbacl to choose each move for the side it plays.
Now dbacl doesn't know the rules of chess, in fact anybody would be
hard pressed to recognize that the text above is actually a game
played on a board with wooden or plastic pieces. Anybody who has never
heard of PGN of course.
What dbacl can do is choose. So here's what we will do: we
take an incomplete game, and we add one extra (half) move at the end ourselves.
We repeat this for all possible legal moves at that point. We'll get
nearly identical partial games whose only difference is the last expression.
We'll ask dbacl to work out the probabilities of each partial game under
its model. And then we'll pick the most likely partial game.
Figuring out what to learn
We know that dbacl can tell us if a piece of text is typical for
a model, and in turn a model is learned by reading many examples together.
So if a pattern occurs very often in the games being learned, then such
a typical pattern will be recommended by dbacl. And if the pattern is rare
then dbacl will recommend it rarely.
But we want dbacl to win. So we want it to recommend the kind of things
winners often do. So when dbacl plays White, it must learn a model
from games where White wins, and if dbacl plays Black, then its model must
be from games where Black wins.
At least that's a good first assumption. Sometimes, strong players lose a game
against weaker players, and if dbacl learns this type of game, then it
will pick up bad habits from the weaker player. But we'll assume that most
games the better player wins. Also, if we learn to play by studying games
from terrible players, then we'll pick up bad habits no matter what.
But this is for later, or we'll never get anywhere.
Unfortunately, we now have work to do. We must split our thousands of
sample games into White-Win (1-0) and Black-Win (0-1). And what to do
about draws (1/2-1/2)? We can put them in both categories or just ignore them.
The files I downloaded are zipped MS-DOS files called *.PGN whose lines end in "\r\n"
instead of ending in "\n", which is the Unix convention.
% cd zipfiles
% for f in *.zip; do unzip $f; done
% mkdir ../gamefiles
% mv *.PGN ../gamefiles
% cd ..
After inspecting a few *.PGN files, it's clear that
a typical game takes several lines to write out fully,
but the lines in between are either empty or contain all sorts of comments
and useless information which must be scrubbed. We can do this by recombining
the lines of a game into a single long line, and since all games start
with a "1.", any lines left that don't start this way can be thrown away.
% cd gamefiles
% cat *.PGN | sed -e 's/\r//g' \
| sed -e :a -e '$!N;s/\n\([a-hKQNBRO0-9]\)/ \1/;ta' -e 'P;D' \
| sed -e 's/^ *//' \
| grep '^1\.' \
> allgames.txt
What we now have is a big file allgames.txt which contains very long lines
where each line is a single game. In the PGN format, the end result is
marked at the end of the game, so it is easy for us to sort the games
by throwing away the lines which either contain 1-0 (White wins) or 0-1 (Black wins). We also remove the move numbers which we don't need anymore.
% cat allgames.txt | grep -v '0-1' | sed 's/[0-9]*\.[ ]*//g' > WhiteWinDraw.txt
% cat allgames.txt | grep -v '1-0' | sed 's/[0-9]*\.[ ]*//g' > BlackWinDraw.txt
% cat allgames.txt | grep '1-0' | sed 's/[0-9]*\.[ ]*//g' > WhiteWin.txt
% cat allgames.txt | grep '0-1' | sed 's/[0-9]*\.[ ]*//g' > BlackWin.txt
Let's see how many games we've got:
% wc -l *.txt
46245 allgames.txt
26809 BlackWinDraw.txt
14913 BlackWin.txt
31332 WhiteWinDraw.txt
19436 WhiteWin.txt
138735 total
All right, around 15-20 thousand winning games of each type.
That should give dbacl something to read!
Remember, each game is on its own line. I'm going to leave the final
scores at the end of each line where they are, as they are harmless
(they can't occur in the middle of a game in progress). Let's learn
the models:
% cd ..
% ./dbacl/src/dbacl -T text -l ./WhiteWinDraw -e alnum -L uniform -j -w 2 -H 20 ./gamefiles/WhiteWinDraw.txt
% ./dbacl/src/dbacl -T text -l ./BlackWinDraw -e alnum -L uniform -j -w 2 -H 20 ./gamefiles/BlackWinDraw.txt
The most important option here is "-w 2", which tells dbacl that it
must pick up single words as well as word pairs. We'll see later if that's a
good idea. If all went well, then you should have two files in your chess directory.
% ls -lh *Win*
-rw-r----- 1 laird laird 3.2M 2005-06-24 17:16 BlackWinDraw
-rw-r----- 1 laird laird 3.2M 2005-06-24 17:15 WhiteWinDraw
Building the engine
Now let's build the engine. This is where we'll cheat
take advantage of the wonderful world of open source.
Remember that what we want to do is take a PGN game that's in progress,
and let dbacl complete the next move by picking all possible legal next
moves and adding them in turn to the current game, then we let dbacl
compute the most probable game under its model. We don't really know
if dbacl's model is a good one for chess, and let's not even think
about competing with Deep Thought here,
but this procedure will at least give us a
way to pick the move.
Unfortunately, computing legal moves is straightforward but tedious,
so I looked around the internet for something useful, and found the
SAN Toolkit. This is an old
freeware toolkit for chess which does everything we want, even though
it was last updated in 1993 and probably no longer state of the art.
It's written in C and we have to compile it, but most importantly we
can use it directly in a Unix shell environment. Download it and place
it in the chess directory.
% untar xfz san.tgz
% cd SAN
% cp SAN_DOC/Makefile SAN_SRC
% cd SAN_SRC
% make
% cd ../..
The documentation for this program is a little wanting, but it's not
too hard to figure out, and the source code helps.
Everything is calculated by the program san,
which accepts commands. We'll write a script which accepts a PGN
partial game and one of the dbacl categories we learned, and outputs
the "best" move, meaning what dbacl thinks is closest to the category
that was learned.
Let's try those ideas out before we proceed. We'll need a test PGN file.
% cat > test.pgn
1. e4 c5 2. Nf3 e6 3. d3 Nc6
The cat commands waits until you press CTRL-D to finish. Let's first
ask the san program for a list of legal moves.
% echo -ne "svop verb namd nabd napr\nlfer test.pgn\nenum 1\n" \
| ./SAN/SAN_SRC/san > test.legal 2>&1
% head -10 test.legal
: : : Welcome to the SAN Kit.
Revised: 1993.05.16
Bd2 : 1
Be2 : 1
Be3 : 1
Bf4 : 1
Bg5 : 1
Bh6 : 1
Kd2 : 1
Okay, this looks promising! There are some pieces of text we'll have to
remove, but the possible first moves are all there (the listing is cut after
10 lines). Let's build the game
line.
% cat test.pgn | sed 's/[0-9]*\.[ ]*//g' > test.gameline
% cat test.gameline
e4 c5 Nf3 e6 d3 Nc6
Next, we complete this gameline with each possible move:
% cat test.legal | grep '^.* : 1$' | cut -f 1 -d ' ' | \
while read move; do
echo `cat test.gameline` $move
done > test.complete
% head -10 test.complete
e4 c5 Nf3 e6 d3 Nc6 Bd2
e4 c5 Nf3 e6 d3 Nc6 Be2
e4 c5 Nf3 e6 d3 Nc6 Be3
e4 c5 Nf3 e6 d3 Nc6 Bf4
e4 c5 Nf3 e6 d3 Nc6 Bg5
e4 c5 Nf3 e6 d3 Nc6 Bh6
e4 c5 Nf3 e6 d3 Nc6 Kd2
e4 c5 Nf3 e6 d3 Nc6 Ke2
e4 c5 Nf3 e6 d3 Nc6 Na3
e4 c5 Nf3 e6 d3 Nc6 Nbd2
And of course, let's check what dbacl thinks of this:
% cat test.complete | ./dbacl/src/dbacl -n -c ./WhiteWinDraw -f 1 > test.scores
% cat test.scores
WhiteWinDraw 53.88 e4 c5 Nf3 e6 d3 Nc6 Bd2
WhiteWinDraw 52.93 e4 c5 Nf3 e6 d3 Nc6 Be2
WhiteWinDraw 52.16 e4 c5 Nf3 e6 d3 Nc6 Be3
WhiteWinDraw 53.11 e4 c5 Nf3 e6 d3 Nc6 Bf4
WhiteWinDraw 52.88 e4 c5 Nf3 e6 d3 Nc6 Bg5
WhiteWinDraw 54.64 e4 c5 Nf3 e6 d3 Nc6 Bh6
WhiteWinDraw 55.32 e4 c5 Nf3 e6 d3 Nc6 Kd2
WhiteWinDraw 54.82 e4 c5 Nf3 e6 d3 Nc6 Ke2
WhiteWinDraw 54.55 e4 c5 Nf3 e6 d3 Nc6 Na3
WhiteWinDraw 57.48 e4 c5 Nf3 e6 d3 Nc6 Nbd2
WhiteWinDraw 52.07 e4 c5 Nf3 e6 d3 Nc6 Nc3
WhiteWinDraw 54.69 e4 c5 Nf3 e6 d3 Nc6 Nd4
WhiteWinDraw 53.83 e4 c5 Nf3 e6 d3 Nc6 Ne5
WhiteWinDraw 60.95 e4 c5 Nf3 e6 d3 Nc6 Nfd2
WhiteWinDraw 56.18 e4 c5 Nf3 e6 d3 Nc6 Ng1
WhiteWinDraw 54.51 e4 c5 Nf3 e6 d3 Nc6 Ng5
WhiteWinDraw 55.05 e4 c5 Nf3 e6 d3 Nc6 Nh4
WhiteWinDraw 52.88 e4 c5 Nf3 e6 d3 Nc6 Qd2
WhiteWinDraw 53.56 e4 c5 Nf3 e6 d3 Nc6 Qe2
WhiteWinDraw 53.83 e4 c5 Nf3 e6 d3 Nc6 Rg1
WhiteWinDraw 48.60 e4 c5 Nf3 e6 d3 Nc6 a3
WhiteWinDraw 49.86 e4 c5 Nf3 e6 d3 Nc6 a4
WhiteWinDraw 49.73 e4 c5 Nf3 e6 d3 Nc6 b3
WhiteWinDraw 49.77 e4 c5 Nf3 e6 d3 Nc6 b4
WhiteWinDraw 48.51 e4 c5 Nf3 e6 d3 Nc6 c3
WhiteWinDraw 49.23 e4 c5 Nf3 e6 d3 Nc6 c4
WhiteWinDraw 47.56 e4 c5 Nf3 e6 d3 Nc6 d4
WhiteWinDraw 49.77 e4 c5 Nf3 e6 d3 Nc6 e5
WhiteWinDraw 48.33 e4 c5 Nf3 e6 d3 Nc6 g3
WhiteWinDraw 50.00 e4 c5 Nf3 e6 d3 Nc6 g4
WhiteWinDraw 49.23 e4 c5 Nf3 e6 d3 Nc6 h3
WhiteWinDraw 49.86 e4 c5 Nf3 e6 d3 Nc6 h4
First, you'll note that each line contains the current game, but ends
with one of the legal moves. Just before each game sequence, there
is a score for that sequence, and since the sequences are nearly identical,
the scores are nearly identical too.
In the world of dbacl, these scores are the negative logarithm (base 2) of the
probability of the sequence, based on the model WhiteWinDraw. It's best to think
of these scores as a distance away from the category model, so the line with
the smallest score is the most likely. If you want to know what probability
each sequence has, it's 1/(2^54.73) etc., which is pretty close to zero! But that's
normal with these kinds of models.
So what's the best move? Simply sort the lines in increasing order by score
and print out the first line:
% cat test.scores | sort -k 2 -n | head -1
WhiteWinDraw 47.56 e4 c5 Nf3 e6 d3 Nc6 d4
Remember, dbacl recommends what it thinks most of the
games it has learned would do. We can take a peek at the effect of each half move on the
score for the first line of test.complete by using dbacl's debugging switch:
% head -1 test.complete
e4 c5 Nf3 e6 d3 Nc6 Bd2
% head -1 test.complete | ./dbacl/src/dbacl -nv -c ./WhiteWinDraw -f 1 -d
# categories: WhiteWinDraw
# format: avg_score * complexity
20.25 * 0.5 []e4[](1)
2.91 * 1.0 []e4[]c5[](1)
8.82 * 1.5 []c5[](1)
4.55 * 2.0 []c5[]Nf3[](1)
7.65 * 2.5 []Nf3[](1)
3.15 * 3.0 []Nf3[]e6[](1)
5.71 * 3.5 []e6[](1)
3.21 * 4.0 []e6[]d3[](1)
5.48 * 4.5 []d3[](1)
4.34 * 5.0 []d3[]Nc6[](1)
5.83 * 5.5 []Nc6[](1)
4.31 * 6.0 []Nc6[]Bd2[](1)
5.75 * 6.5 []Bd2[](1)
The scores we saw earlier are obtained by multiplying the average by the
complexity, but dbacl internally works with nats, not bits,
so 5.75 * 6.5 / ln(2) = 53.9. Since here we just want to see the tokens that are being
used, these values aren't important. The most important thing to see is that
there are contributions from single half moves as well as pairs of half moves,
and the score balances them all.
Playing against people
Chess without a board is not as much fun as chess with an actual board.
If we really want to claim that dbacl can play chess, then we need to
make it work with something like
GNU XBoard so it
can play against people. Before we go on, you'll have to make sure
this program is installed. It comes standard with most GNU/Linux distributions,
for example on Debian you can type
as root:
% apt-get install xboard
A real chess engine that works with XBoard must implement the
Chess Engine
Communication Protocol. This is a bit tedious, so I prepared one earlier.
Save it as a file named dce-basic.sh in your chess
directory, or type it yourself from the listing below.
Note that I've just created enough code to play a simple game
without undo, force, switch sides or board setup. In other words, the only
thing you can do is start a new game, and play the moves, and dbacl must play Black.
% cat > dce-basic.sh
#!/bin/bash
# This script functions as an incomplete chess engine for XBoard.
DBACL=./dbacl/src/dbacl
SAN=./SAN/SAN_SRC/san
TMP=.
DCE=dce-basic
SANOK="no"
PGN="$TMP/$DCE.current.pgn"
GAMELINE="$TMP/$DCE.current.gameline"
SCORES="$TMP/$DCE.current.scores"
ENGINEMOVE="$TMP/$DCE.current.emove"
SANOUT="$TMP/$DCE.current.stdout"
SANERR="$TMP/$DCE.current.stderr"
SIDE="black"
MOVENOW="no"
CATFILE="./BlackWinDraw"
trap "" SIGTERM
trap "" SIGINT
function exec_san() {
rm -rf $PGN.new $SANOUT $SANERR
echo -ne "svop namd nabd napr\nlfer $PGN\n$1\nsfer $PGN.new" \
| "$SAN" > "$SANOUT" 2> "$SANERR"
if grep Error "$SANOUT" > /dev/null; then
echo "Error (illegal move): $cmd"
return 1
else
mv $PGN.new $PGN
fi
return 0
}
function do_engine_move() {
if exec_san "enum 1"; then
# legal moves are in $SANOUT
cat "$PGN" \
| sed -e 's/\r//g' \
| sed -e :a -e '$!N;s/\n\([a-hKQNBRO0-9]\)/ \1/;ta' -e 'P;D' \
| sed -e 's/^ *//' \
| grep '^1\.' \
| sed 's/[0-9]*\.[ ]*//g' \
| sed 's/ \*.*$//' \
> "$GAMELINE"
# make all completions and let dbacl decide
cat "$SANOUT" \
| grep '.* : 1$' \
| cut -f 1 -d ' ' \
| while read move; do
echo "`cat $GAMELINE` $move"
done \
| "$DBACL" -n -c "$CATFILE" -f 1 \
> "$SCORES"
if [ `cat "$SCORES"| wc -l` = "0" ]; then
# no moves left, game over!
# the gameline contains the result
echo "`cat $GAMELINE | sed 's/^.* //'` {Play again?}"
return 0
else
# pick best scoring move
cat "$SCORES" \
| sort -k 2 -n \
| head -1 \
| sed -e 's/^.* //' \
> "$ENGINEMOVE"
if exec_san "`cat $ENGINEMOVE`"; then
echo "move `cat $ENGINEMOVE`"
return 0
fi
fi
fi
return 1
}
function do_reset() {
rm -f $PGN
touch $PGN
exec_san ""
}
function do_move() {
if [ "$SANOK" != "yes" ]; then
echo "Illegal move (you must use SAN): $1"
echo "Error (cannot play): $1"
fi
if exec_san "$1"; then
MOVENOW="yes"
fi
}
while read cmd; do
case "$cmd" in
xboard)
echo "feature san=1 done=1"
;;
"accepted san")
SANOK="yes"
;;
quit)
exit 0
;;
new)
do_reset
;;
variant*)
echo "Error (only standard chess please): $cmd"
;;
[a-h][x1-8]*)
do_move "$cmd"
;;
[KQBNRP][xa-h]*)
do_move "$cmd"
;;
O-O*)
do_move "$cmd"
;;
analyze)
echo "Error (unknown command): $cmd"
;;
*)
# ignore other commands
echo "Error (command not implemented): $cmd"
;;
esac
if [ "$MOVENOW" = "yes" ]; then
if do_engine_move; then
MOVENOW="no"
else
echo "Error (engine blew up): kaboom"
exit 1
fi
fi
done
The source code above is not very complex. The last part reads commands one
line at a time and calls various functions depending on the command.
Then, if it is Black's turn, it calls the do_engine_move function.
The current state of the game is constantly updated in the dce-basic.current.pgn file which is created in your chess directory, the list
of scores computed by dbacl is in the file dce-basic.current.scores, and the final best engine move is saved in the file dce-basic.current.emove.
Once you have this engine script in your chess directory, make sure it has
execute permissions, and then you can try it out with XBoard.
% chmod +x ./dce-basic.sh
% xboard -fcp ./dce-basic.sh
It's much more fun to read this account if you actually try this out
with XBoard yourself, so I won't say what happens next or how well
dbacl does as a chess player. The discussion with answers continues
in the next section so consider this a spoiler warning. Don't peek!
When playing chess on XBoard, don't forget that dce-basic.sh is
incomplete. All you can do is play White, and restart the game whenever
you want. No fancy menu choices. However, you can follow what's going
on at any time by looking at the temporary files named
dce-basic.current.* which are placed in your chess directory.
First steps
Okay, you've tried your first game against dbacl, and it wasn't very
impressive. In fact, it was practically random play! Note that it
isn't entirely random - dbacl doesn't pick for example a random pawn
on first move but slightly prefers positions towards the middle of the board. What's
going on?
There can be only one culprit, and it's the model we used for learning,
namely the "-w 2" switch. To understand this, let's look at the game line
again:
% cat test.gameline
e4 c5 Nf3 e6 d3 Nc6
Now dbacl learns by reading single words (ie half moves) and consecutive pairs
of words, then building a model out of these frequencies. This means that there is nothing in the model which makes for example the third word Nf3 depend on
the first one e4 except through the second one c5, which could well be anything.
In other words, consecutive moves by one colour are largely independent,
which is why it feels that dbacl has no strategy.
We can force dbacl instead to learn 3, or better yet 4 consecutive words
as a pattern. That should link e4 with Nf3 and even e6. Let's try it (this could take
a few minutes to run):
% ./dbacl/src/dbacl -T text -l ./BlackWinDraw -e alnum -L uniform -j -w 4 -H 20 ./gamefiles/BlackWinDraw.txt
dbacl:warning: table full, some tokens ignored - try with option -h 21
Oh, oh. Here's a problem that we might as well get out of the way now.
Looking at pairs, triples, etc. (these are called n-grams) uses a lot of memory
because of all the possible combinations. That's one reason why long n-grams aren't that
popular in machine classification.
Here dbacl ran out of space and told us so, but
still tried to learn something. But we don't want a skewed model, so we'll
increase the allowable space and relearn. You will have to be
mindful of this when you experiment.
% ./dbacl/src/dbacl -T text -l ./BlackWinDraw -e alnum -L uniform -j -w 4 -H 22 ./gamefiles/BlackWinDraw.txt
% xboard -fcp ./dce-basic.sh
Several steps
It's hard to see a big improvement with "-w 4", which is not very surprising if you think about it.
Let's try with "-w 7", which for technical reasons is the maximum dbacl can
handle. But before we do this, I want to briefly mention the work of
C.E. Shannon.
One of Shannon's famous experiments concerns the approximation of English. He
asked what would sentences look like if letters or words were picked randomly from
a book. Here is what he found:
(Single letters) OCRO HLI RGWR NMIELWIS EU LL NBNESEBYA TH EEI ALHENHTTPA OOBTTVA NAH BRL.
(Pairs of letters) ON IE ANTSOUTINYS ARE T INCTORE ST BE S DEAMY ACHIN D ILONASIVE TUCOOWE AT TEASONARE FUSO TIZIN ANDY TOBE SEACE CTISBE.
(Triples of letters) IN NO IST LAT WHEY CRATICT FROURE BIRS GROCID PONDENOME OF DEMONSTURES OF THE REPTAGIN IS REGOACTIONA OF CRE.
(Single words) REPRESENTING AND SPEEDILY IS AN GOOD APT OR COME CAN DIFFERENT NATURAL HERE HE THE A IN CAME THE TO OF TO EXPERT GRAY COME TO FURNISHES
THE LINE MESSAGE HAD BE THESE.
(Pairs of words) THE HEAD AND IN FRONTAL ATTACK ON AN ENGLISH WRITER THAT THE CHARACTER OF THIS POINT IS THEREFORE ANOTHER METHOD FOR THE LETTERS THAT
THE TIME OF WHO EVER TOLD THE PROBLEM FOR AN UNEXPECTED.
So can we expect this also with chess moves? Let's try it. Because of
the heavy memory requirements when "-w 7" is used (and the long time it takes),
we'll learn a smaller
game collection BlackWin.txt
(but we keep the BlackWinDraw category name so we don't have to modify dce-basic.sh).
% ./dbacl/src/dbacl -T text -l ./BlackWinDraw -e alnum -L uniform -j -w 7 -H 23 ./gamefiles/BlackWin.txt
Before we start playing, how can we be sure that dbacl will match patterns in the way we expect?
Let's try the debug switch like we did before on our test gameline:
% head -1 test.complete | ./dbacl/src/dbacl -nv -c ./BlackWinDraw -f 1 -d
# categories: BlackWinDraw
# format: avg_score * complexity
63.73 * 0.1 []e4[](1)
-9.44 * 0.3 []e4[]c5[](1)
15.46 * 0.4 []c5[](1)
6.91 * 0.6 []e4[]c5[]Nf3[](1)
-6.53 * 0.7 []c5[]Nf3[](1)
5.07 * 0.9 []Nf3[](1)
4.17 * 1.0 []e4[]c5[]Nf3[]e6[](1)
1.31 * 1.1 []c5[]Nf3[]e6[](1)
-9.96 * 1.3 []Nf3[]e6[](1)
-2.22 * 1.4 []e6[](1)
-0.86 * 1.6 []e4[]c5[]Nf3[]e6[]d3[](1)
-1.00 * 1.7 []c5[]Nf3[]e6[]d3[](1)
-3.88 * 1.9 []Nf3[]e6[]d3[](1)
-9.31 * 2.0 []e6[]d3[](1)
-3.80 * 2.1 []d3[](1)
-2.59 * 2.3 []e4[]c5[]Nf3[]e6[]d3[]Nc6[](1)
-1.58 * 2.4 []c5[]Nf3[]e6[]d3[]Nc6[](1)
-2.16 * 2.6 []Nf3[]e6[]d3[]Nc6[](1)
-2.99 * 2.7 []e6[]d3[]Nc6[](1)
-5.49 * 2.9 []d3[]Nc6[](1)
-2.12 * 3.0 []Nc6[](1)
0.51 * 3.1 []e4[]c5[]Nf3[]e6[]d3[]Nc6[]Bd2[](1)
2.92 * 3.3 []c5[]Nf3[]e6[]d3[]Nc6[]Bd2[](1)
5.13 * 3.4 []Nf3[]e6[]d3[]Nc6[]Bd2[](1)
7.16 * 3.6 []e6[]d3[]Nc6[]Bd2[](1)
6.37 * 3.7 []d3[]Nc6[]Bd2[](1)
3.35 * 3.9 []Nc6[]Bd2[](1)
5.84 * 4.0 []Bd2[](1)
Perfect! Clearly dbacl is picking up sequences up to seven long. Now
let's make a proper check:
% xboard -fcp ./dce-basic.sh
Another improvement
If you've tried the case "-w 7", then you may have noticed a true improvement
over the previous attempts. But while dbacl no longer plays quite so randomly, the overall
game seems touch and go. Openings are sometimes recognized, but there's no
strategy and frequently dbacl seems to forget what it was doing. Also,
there aren't many attempts to protect the 8th row from even direct attacks.
We can explain this type of behaviour with what we already know about the
dbacl model. Since one chess move needs two words in PGN notation,
then even with "-w 7", the longest connected word sequences
are just over 3 chess moves long. These sequences aren't under dbacl's control
since you play White, so when they break, this causes confusion and another
potential sequence is followed. That's why dbacl seems to lose interest.
Another problem is that dbacl's model has no concept of opening, middle and
endgame. If row 8 is attacked, it has no way of knowing if there are pawns
which protect the piece, and whether there is room to move away, because
the training patterns are averaged over many games. We'll come
back to this observation later.
So far we've treated half-moves as fundamental, but perhaps it makes more sense
to base our models on full moves? Since our engine always plays
Black, then the completed gamelines will always have an even number of
words. Moreover, the '-w 7" model shall truly be seven chess moves long, not three
and a half. Let's try it. We can combine pairs of moves by replacing the
space between them with underscores as follows (combine_half_moves.sh).
% cat test.gameline
e4 c5 Nf3 e6 d3 Nc6
% cat > combine_half_moves.sh
#!/bin/sh
sed -e 's/$/ Z/' -e 's/ \([^ ]* *[^ ]\)/_\1/g' -e 's/[ ]*Z$//'
% chmod +x combine_half_moves.sh
% cat test.gameline | ./combine_half_moves.sh
e4_c5 Nf3_e6 d3_Nc6
Now we have to adapt the training sets, and change the code of dce-basic.sh
slightly. I've called this modification dce-1.sh
% cd gamefiles
% cat BlackWinDraw.txt | ../combine_half_moves.sh > BlackWinDraw-1.txt
% cat BlackWin.txt | ../combine_half_moves.sh > BlackWin-1.txt
% cd ..
Naturally, we must learn the new dataset. You might want to make a cup of coffee while you wait.
% ./dbacl/src/dbacl -T text -l ./BlackWin-1 -e graph -L uniform -j -w 7 -H 23 ./gamefiles/BlackWin-1.txt
One last safety check and we'll be ready:
% cat test.gameline | ./combine_half_moves.sh | ./dbacl/src/dbacl -nv -c ./BlackWin-1 -f 1 -d
# categories: BlackWin-1
# format: avg_score * complexity
128.93 * 0.1 []e4_c5[](1)
-11.83 * 0.3 []e4_c5[]Nf3_e6[](1)
37.71 * 0.4 []Nf3_e6[](1)
17.15 * 0.6 []e4_c5[]Nf3_e6[]d3_Nc6[](1)
-20.69 * 0.7 []Nf3_e6[]d3_Nc6[](1)
7.60 * 0.9 []d3_Nc6[](1)
Okay, it seems to be working.
% chmod +x dce-1.sh
% xboard -fcp ./dce-1.sh
Aimless behaviour
Compared to the earlier attempts, this change seems to have improved
dbacl's tactics. However, we
still have much aimless behaviour in the middle and end games. How do
we address this?
Our biggest problem is possibly that dbacl is blind. It simply doesn't know or care about
the true configuration of the board pieces, like a real chess engine would.
Instead, everything it knows are the likely sequences of moves that it found
in the training games.
Computer chess is an area which has been studied extensively for a long time,
and while we could try to apply these methods to our little engine,
we wouldn't learn anything new about chess or about dbacl. So instead,
I am only going to try the kind of things that would not be out of place in spam filtering.
Deep in the dawn of spam filtering, people devised keyword rules to
send unwanted email to the trash. Even today, this is a popular method
to detect, for example, those messages which contain "VIAGRA" in the
subject line, and automatically pick an action to take.
Maybe we can detect some fixed text pattern in the gameline and use this to override
the normal dbacl scores? Let's look at a typical game.
% head -1 ./gamefiles/BlackWin.txt | fmt
d4 Nf6 c4 e6 Nf3 b6 g3 Ba6 Qc2 c5 Bg2 Bb7 O-O Be7 Nc3 cxd4 Nxd4 Bxg2
Kxg2 Qc7 Qd3 O-O e4 d6 f3 Nbd7 b3 a6 Be3 Qb7 Rfd1 Rfe8 Bf2 Bf8 Nc2 Rec8
Ne3 Rab8 a4 Ne5 Qd2 Rc7 Rac1 Rbc8 Qe2 Nc6 Be1 Nd7 g4 Nc5 Qc2 Nb4 Qb1
Be7 Ne2 Nc6 Bc3 Ne5 Ng3 Bg5 Ngf1 Bf4 Bxe5 Bxe5 Rd2 Bf4 Rcd1 b5 axb5
axb5 Ra2 Nd7 Qd3 Nc5 Qc2 h6 Ra5 bxc4 bxc4 Nd7 Qe2 Ne5 Rb5 Qa6 Rb2
Nxc4 Nxc4 Qxc4 Rd3 Qc5 Ne3 Qg5 Qf2 Rc3 Rxc3 Rxc3 Nf1 Kh7 Rc2 Qe5 Ng3
Be3 Qe2 g6 Nf1 Bf4 Ng3 Kg7 Qf2 Be3 Qe2 Qd4 Nf1 Bf4 Ng3 Rxc2 Qxc2 Qd2+
Qxd2 Bxd2 Ne2 g5 Nd4 Kf6 Nb3 Bc3 Kf2 Ke5 Ke3 Bb4 Nc1 Bc5+ Ke2 Bg1 Nd3+
Kd4 h3 Kc3 Nc1 Bh2 Nd3 Bf4 Nf2 d5 exd5 exd5 Nd3 Be5 Nc1 Kd4 Nd3 Bf4 Nb4
Ke5 Nd3+ Kd6 Nb4 Ke6 Nd3 Bd6 Nb2 Ke5 Nd3+ Kf6 Nb2 Ke6 Nd3 h5 Nf2 f5 Nd1
fxg4 fxg4 hxg4 hxg4 Kf6 Nb2 Ke5 Kf3 Kd4 Nd1 Kd3 Nf2+ Kd2 Nh3 Be7 Nf2
Bc5 Nh1 Bd6 Nf2 Bf4 Nh1 Be3 Ng3 Kd3 Nh1 Bg1 Ng3 Kd2 Nf5 d4 Ng3 d3 0-1
Besides the normal moves that represent a change in position, the most
obvious feature is that some of
the moves above also contain an 'x', which means that this is a capturing move.
Interesting! One of the problems with the dbacl engine so far is that often,
an opportunity for capturing White's pieces is simply ignored. Can we devise
a keyword rule which triggers on 'x' to force dbacl to capture an available
piece instead of ignoring it?
Let's try it: first we'll need a gameline which includes an 'x' type
move, since our previous test.gameline file doesn't. Note that
we'll forget temporarily the underscore trick we used in the previous
section, just to keep things simple at first.
If you look at the full game listed two paragraphs ago, you'll see that
the last full move on the first line is "Nxd4 Bxg2", so if we use this line
and delete the last full move, then the possible completions will contain at least "Nxd4" and we have a suitable test case.
% cat > test2.pgn
1. d4 Nf6 2. c4 e6 3. Nf3 b6 4. g3 Ba6 5. Qc2 c5 6. Bg2 Bb7 7. O-O Be7 8. Nc3 cxd4
% echo -ne "svop verb namd nabd napr\nlfer test2.pgn\nenum 1\n" \
| ./SAN/SAN_SRC/san > test2.legal 2>&1
% cat test2.legal | grep '^.* : 1$' | cut -f 1 -d ' ' | \
while read move; do
echo `cat test2.gameline` $move
done > test2.complete
% cat test2.complete | ./dbacl/src/dbacl -n -c ./WhiteWinDraw -f 1 > test2.scores
There are 48 different potential moves in the test2.scores file, and
what we are interested in is the score (column 2) and the potential half move
(column 21).
% cat test2.scores | sort -k 2 -n | cut -f 2,21 -d ' ' | head -25
129.08 h3
129.35 a4
129.89 a3
129.89 b3
129.89 e3
133.09 Qg6
133.36 Bf4
133.36 Bg5
133.36 Qa4
133.36 Rb1
133.36 Rd1
133.86 Be3
133.86 Bh3
133.86 Nb5
133.86 Ne4
133.86 Nh4
133.86 Qb3
133.86 Qd3
133.86 Qe4
137.87 Nxd4
154.68 e4
154.96 c5
155.77 b4
155.81 h4
155.95 g4
I've only listed the first 25 moves by score but it's clear that dbacl's
model puts "h3" as the most likely move, and "Nxd4" way down in 20th position!
Let's extract the capturing moves.
% cat test2.scores | grep 'x[^ _]*$' | sort -k 2 -n
WhiteWinDraw 138.73 d4 Nf6 c4 e6 Nf3 b6 g3 Ba6 Qc2 c5 Bg2 Bb7 O-O Be7 Nc3 cxd4 Nxd4 Bxg2 Nxd4
WhiteWinDraw 162.90 d4 Nf6 c4 e6 Nf3 b6 g3 Ba6 Qc2 c5 Bg2 Bb7 O-O Be7 Nc3 cxd4 Nxd4 Bxg2 Qxh7
Since there is more than one possible capturing move, the engine has to decide what it wants
to play.
By sorting the capturing moves by their scores, we let dbacl tell us which move it prefers.
Obviously, "Nxd4" is this preferred candidate.
It's also possible that for some gamelines, there is no legal capturing move; we'll have
to use the full list of scores as before in that case.
Finally, we must decide how we are going to integrate this special handling of capturing moves
into our chess engine.
In spam filters, a keyword rule typically stops all other tests from happening afterwards, because the rule is trusted to override other decisions. Here, this means that we first score
all the capturing moves if there are any, and use the best one regardless of other options.
It's only if there are no capturing moves that we look at the remaining possibilities.
By using the explanations above, you can modify dce-basic.sh yourself, or try
out dce-2.sh, which implements both the 'x' and "underscore" tricks together.
% chmod +x dce-2.sh
% xboard -fcp ./dce-2.sh
Tactical adjustments
Quite a change in behaviour! The engine no longer ignores capture opportunities,
but our method has
made it greedy. dbacl is now so greedy that there is no tension left in the game, and
since it doesn't know the value of each piece, it often makes bad bargains.
Unfortunately, there isn't enough data in the PGN format to let dbacl
read off easily the value of an exchange. If you look at the full training
game listed earlier, you'll see many moves marked with an 'x', such as
"cxd4" "Nxd4" "Bxg2" "Kxg2", but none of these moves identifies the
type of the captured piece, only the piece doing the capturing. It's certainly possible to deduce
the relevant piece by replaying all the moves on an imaginary board,
but then our chess engine would no longer act like a spam
filter. Keeping an imaginary board is roughly the equivalent of
understanding the actual meaning of an email.
So dbacl, as a chess-playing-spam-filter, is limited to two things:
it can limit the risk of an exchange, and limit the frequency of exchanges.
To limit risk, if there are several capture scenarios available, it can
always prefer to capture with the least valued piece (since the type of the piece
doing the capturing is known by looking at the move), but note that this doesn't help
when there is exactly one capture move possible.
Limiting the frequency of exchanges would make dbacl less greedy and
refuse to capture pieces all the time. As we noted earlier, dbacl doesn't naturally
tend to capture pieces, so we must find a balance.
We can force a capture when there are two or more capture opportunities. Together
with the risk limitation idea, these types of captures are then performed by
lower value pieces in relative terms.
Clearly, there are endless other things we can try, but there is a price to pay.
As capture rules become heavier and more complex, the original patterns learned by
dbacl from the training games lose their importance. The dbacl chess engine becomes
a hybrid, part ordinary chess engine and part Bayesian text classifier.
I've implemented the two rules above in dce-3.sh. Try it out now.
% chmod +x dce-3.sh
% xboard -fcp ./dce-3.sh
Randomized play
We've come a long way, so let's go back to the initial question.
"Can a spam filter play chess?". To answer this question affirmatively,
we need at least one example which can't be explained by random play.
While dbacl undoubtedly makes some strange chess decisions sometimes,
let's look at the following game fragment played against
dce-3.sh:
Clearly, dbacl picked up the pawn defensive moves from the games archive,
as this succession of moves is very unlikely to be random (all moves are on
the same side of the board, and none of the moves are capturing moves, so
cannot be explained by capture heuristics which partially override dbacl's
natural choices).
So dbacl has definitely learned something about chess, at
least in some tactical situations,
and can probably hold
its own against an average three year old. Mission accomplished!
Well, the title of this section is "randomized play", and there is
one more thing to do. So far, the dbacl chess engine is deterministic:
when faced with identical White moves, it will always play the same way.
This gets boring very fast. What we would like is more randomized
play, but which still uses the dbacl scoring system.
For randomized play, we can think of the scores as equivalent
probabilities. Then we
can pick not just the highest probability (lowest score) move as we've done so far, but instead any move according
to its probability. Let's look again at the file test.scores
that we created earlier:
% head -5 test.scores
WhiteWinDraw 53.88 e4 c5 Nf3 e6 d3 Nc6 Bd2
WhiteWinDraw 52.93 e4 c5 Nf3 e6 d3 Nc6 Be2
WhiteWinDraw 52.16 e4 c5 Nf3 e6 d3 Nc6 Be3
WhiteWinDraw 53.11 e4 c5 Nf3 e6 d3 Nc6 Bf4
WhiteWinDraw 52.88 e4 c5 Nf3 e6 d3 Nc6 Bg5
As I already mentioned, the dbacl model probabilities are related to
these scores as follows: Prob[Bd2] = 2^(-53.88), Prob[Be2] = 2^(-52.93), etc.
Unfortunately, these are all very small probabilities and we can't
easily represent them as floating point numbers once the gameline becomes
much longer, let alone pick them at random. So first, let's subtract the biggest common factor: If we sort the scores, then the top score is the factor we want:
% cat test.scores | sort -k 2 -n > test.scores.prob
% head -1 test.scores.prob
WhiteWinDraw 47.56 e4 c5 Nf3 e6 d3 Nc6 d4
We'll write an awk program to do the randomizing, so here's the first part:
% cat > renorm.awk
#!/usr/bin/awk -f
{
if( cf == 0 ) {
cf = $2
}
print $2 - cf, $0
}
% cat test.scores.prob | awk -f ./renorm.awk | head -5
0 WhiteWinDraw 47.56 e4 c5 Nf3 e6 d3 Nc6 d4
0.77 WhiteWinDraw 48.33 e4 c5 Nf3 e6 d3 Nc6 g3
0.95 WhiteWinDraw 48.51 e4 c5 Nf3 e6 d3 Nc6 c3
1.04 WhiteWinDraw 48.60 e4 c5 Nf3 e6 d3 Nc6 a3
1.67 WhiteWinDraw 49.23 e4 c5 Nf3 e6 d3 Nc6 c4
Okay, the first column gives the modified score and the rest is the original
line. Since these are binary exponents rather than actual probabilities,
we'll use a rejection sampler to pick the correct line and print it out.
Here's the full randomizer:
% cat > randomizer.awk
#!/usr/bin/awk -f
{
if( cf == 0 ) {
cf = $2;
}
score[NR] = $2 - cf;
line[NR] = $0;
}
END{
# randomizer seeded by time of day
# don't use more often than once per second.
srand();
while(1) {
x = int(rand() * NR) + 1;
t = -log(rand());
if( log(2) * score[x] < t ) {
print line[x];
break;
}
}
}
% cat test.scores.prob | ./randomizer.awk
WhiteWinDraw 48.33 e4 c5 Nf3 e6 d3 Nc6 g3
% cat test.scores.prob | ./randomizer.awk
WhiteWinDraw 47.56 e4 c5 Nf3 e6 d3 Nc6 d4
Awk's randomizer is seeded by the time of day, so if you repeatedly run the randomizer.awk script in a fast loop, it will always output the same result. But for chess against people, we don't really care about the quality of the randomness involved.
I've added the randomizer to the final version of the chess engine, dce.sh, which also contains a few other improvements. If you feel
like experimenting further with the dbacl chess engine, that script is probably
the best starting point. Enjoy!
% chmod +x ./dce.sh
% xboard -fcp ./dce.sh
Wrapping up
It's time to conclude this investigation and see what we've learned. The
original question was "Can a spam filter play chess?". Clearly, the answer to
this is yes, but making it play well is not so easy.
A crucial aspect I haven't touched on here are chess tournaments. The only way to
reliably judge potential improvements in an engine is to make it play other engines
with known strengths. Not all types of tournaments are appropriate for the dbacl
chess engine - for example randomized initial positions and endgame puzzles are meaningless,
because dbacl doesn't think ahead more than one half move, and it needs the full game
history for matching patterns. Moreover, the fundamental behaviour is learned entirely
from training sets. Thus dbacl's strength as a chess player is meaningless without reference
to its training archive. However, if this archive is fixed, then incremental improvements to the
algorithms can be evaluated in that context.
dbacl is able to learn some tactics simply by reading large collections of
games. However, it seems that strategy is beyond its capabilities. Moreover,
there are some fundamental limitations in treating a PGN format game like a text
document: some information such as the values of exchanged pieces aren't easy to read off
without keeping an imaginary board for replaying moves.
There are many ways to change the characteristics of the basic chess engine we've
built here. Besides more complex capture heuristics, one can try to account for
the length of the game (opening/middle/endgame), the difference in the number of pieces
captured by each side, etc. Beyond that, the PGN gameline representation which we used
here can be replaced with more informative symbol sequences. In principle, one could replace
each move with a pictorial representation of the board such as
FEN, but besides the added implementational complications, this causes difficulties because of dbacl's limit of 30 character tokens, and possibly statistical issues in recognizing similar but not identical board configurations.
Perhaps most interestingly overall, it should be remembered that dbacl
doesn't think ahead like most chess engines do. Its successes and
failures are almost entirely based on the historical record of the
game as it develops and mimicry of training games, not at all on calculating moves and
countermoves in the future.
Download the latest dbacl chess engine: dce.sh
Post Scriptum
After this essay was written, it appeared on slashdot. This resulted in some interesting comments and emails. I've selected some of them
below and added some responses.
-
aug24 writes on slashdot
"For example, it currently uses entire games to
compare. So if it comes across an unusual opening,
even one close to a standard one, it's not able to
decide effectively. Perhaps something using game
fragments would be possible, then it might
reproduce structured plays even when the previous
game play has been unusual."
-
It's true that in the essay, the full game is scored from the beginning,
but this is only for convenience. For the actual decisions, solely the
last few moves matter (how many depends on the -w switch), and the extra
score contributions from the beginning are identical for all possible
choices, so cancel out in the final decision. The beginnings of the games
could be cut away before scoring, but this would be more work and would
make no difference in the way dbacl plays chess.
-
N3Roaster writes on slashdot
"I've got one. I read somewhere that high level
chess players don't view the current game
state so much in terms of exactly where each
piece is, but in terms of piece groupings. I'm
more of a go player myself, but it seems like
that would probably be right. Perhaps a richer
yet more abstract structure representing game
state would be better. In this way, situations
where the structure of play dynamics are the
same even while the key pieces are in
different positions could be picked up on.
This combined with your game fragment notion
would probably allow the chess filter to learn
much faster."
-
Other representations besides PGN are certainly possible and worth
trying. Since dbacl learns patterns directly from input text, it
is first necessary to convert the training games into such another
representation, then the current game must also be available in this
representation during play.
-
pclminion writes on slashdot
"I'm sure people have thought of it before, but
they haven't done it because it's clear that
it can't work unless it's given an infinite
context of moves. Otherwise, you can set the
computer up to fail by creating a situation
near the beginning of the midgame and then
playing out a line that traps the computer as
soon as it runs out of context.
It is also limited to what it learns by
"observing" other games. So it can never move
creatively or unexpectedly."
-
dbacl does not operate this way. It is not limited to
matching known sequences which were seen in the training
games, but instead predicts likelihoods for never before seen sequences
from its model by generalization. There is no need for an infinite
context of moves, the model is in fact learned from a limited number of
training games, and once learned, it can handle any game sequence
"creatively", since such a sequence is decomposed into known and unknown
fragments and recombined through the rules of probability.
The dbacl chess engine can also move "unexpectedly", either because
the model differs from a player's expectation, or becasue of the
randomization step explained at the end of the essay.
-
mrthoughtful writes on slashdot
"It would never get to beat modern computer chess
programs, as it depends upon a database of previous
games that are similar to the current game played, and
has no scope for examining possible futures."
-
While it is true that in the essay, dbacl predicts the next move
by examining the immediate past, there is nothing inherently
impossible about predicting several moves ahead. With some work,
the engine can be modified to generate all the game sequences which
are two or more moves into the future, and dbacl will happily score
each such sequence as well.
However, the fundamental difficulty with
chess is that the number of game sequences grows exponentially with
the number of predicted moves, so the computational cost of predicting
several moves into the future grows very quickly.
-
Steinfiend writes on slashdot
"So a
Chess playing Bayesian filter isn't necessarily 100%
useful now, but what they learned from doing it might
be able to be applied in some other situation. Maybe
they can reapply whatever they learned back into spam
filtering and improve all of our in-boxes."
-
Perhaps a different way of saying this is that both spam
filtering and this chess playing experiment exhibit common
weaknesses. For example, currently the dbacl chess engine
has difficulty adapting to the middle and end games. This is
difficult because there is no simple boundary between the phases
of a chess game. Similarly, general emails have a complicated structure
but it can be hard to distinguish its semantic parts. A message
can switch from being professional to personal etc. This weakness is
hard to observe in spam filtering, but much easier to see in the
chess experiment.
-
Chris_Mir writes on slashdot
"I can imagine, depending on how many games are played and
the available memory space, that it will develop to have a
decent opening. But nothing more then that." and
aug24 responds "Absolutely, because it is effectively playing with
complete games in order to do well, which is
equivalent to making a tree of every possible chess
move from the start - a rather big data set.
-
dbacl doesn't remember full trees, it picks up game fragments of
a certain number of moves and uses these to analyze unknown and longer
sequences using Bayesian theory. The openings are learned more quickly
because everyone starts with the same initial board setup, and there
are only a handful of very popular openings in training games.
-
Flyboy Connor writes on slashdot
"The premise of a Bayesian filter is that is learns
sequences of words, or characters, or whatever. Spam-chess
learns sequences of moves. This premise is wrong, since
good moves are related to complete board positions, not to
what was done in the previous few moves.
Of course, the longer your string of moves is, the better
it will represent the board position, especially during
the opening phase of the game. And the example the article
provides of reasonable play of spam-chess, is actually
from the game's opening, where the learned sequences
indeed represent the complete game.
For the middle game, however, spam chess will perform
badly, always."
-
This is a good comment. It's true that chess requires an understanding
of the board positions, and the PGN format is very limited in
this respect (however, the PGN sequence is indirectly equivalent
to a full board representation, since it is possible to reconstruct the
board from such a sequence). However, it does not follow that
this approach must always perform badly in the middle game. Chess is
a combination of both strategy and tactics, and tactical thinking is
confined to a small number of consecutive moves, precisely what dbacl
tends to pick up. A good tactician can play well in the middle game,
even if he is a bad strategist.
-
m50d writes on slashdot
"How about applying it differently, getting it to learn
to pick a move for each position? Obviously that's a
little harder than a simple spam/not spam judgement,
but I'd have thought you could get it to recognise
that if there are 3 pawns in front of the king and you
have a rook available you can do a back rank mate,
etc."
-
This can be done by using an appropriate choice of representation.
The PGN format is not the only possible format for representing chess.
The FEN format represents the full board, and a hybrid representation
could include any features of interest. One could represent each
game as a sequence of FEN positions. dbacl has a limit on the length
of words which makes using straight FEN impractical, but the positions
could just as well be compressed by using a hash function.
-
barawn writes on slashdot
"I think the main limitation - not being able to
understand values of exchanges - is probably the
biggest problem. It's also the easiest to fix, as you
can just write a program to add the piece that's being
taken (i.e. instead of Bxc4, you'd have BxBc4). The
relative value of each piece can be determined just
from the number of times an exchange happens for
winners."
-
Nice idea. Does anyone want to try?
-
aridg writes on slashdot
"Reading this article, I was reminded of the old children's
story about "stone soup". You remember that one -- someone
advertises that he can make soup from a stone, and various
others gather around to watch this amazing feat. Well, the
soup needs a little extra seasoning, so he gets someone to
put in some carrots while the stone cooks, then he adds
some onions, etc, etc... I think you can see where this is
going.
Sure you can make a chess playing program from a spam
filter.
You just need to throw in a legal move generator, and a
game database, and some capture heuristics, and position
displayer, etc, etc..."
-
This is an interesting criticism, but is problematic on inspection.
What does it take to play chess? Since a game is defined as a sequence
of legal moves, there is nothing optional about a legal move generator.
Imagine if two people are playing chess, and one tries an illegal
move. The opponent will point out the mistake and refuse to allow it.
The first person will try again until a legal move is found.
Statistically, dbacl could learn to distinguish legal from illegal moves
in any one board position by observing enough training games,
ie the illegal moves can be learned
to have arbitrarily small probabilities. However, in a real game the
opponent won't allow illegal moves, so dbacl would continue to generate
moves until it obtains a legal move. Mathematically, this is equivalent
to using a move generator like SAN, but using SAN is faster. Similar
comments apply to the other criticisms.
-
lawrenqj writes on slashdot
"Well actually Bayesian filters aught to be able to
discard inserted "noise". I'm not sure how the package
the article pointed out works, but usually the attempt
at pattern recognition accounts for either noise or
randomization and finds appropriate matches despite
small changes. So a bad move on the human's part
should be ignored by the filter. I'm not sure it would
be able to take advantage of the situation though."
-
This is correct. dbacl builds a statistical model from the training
games. The model handles noise as appropriate via the rules
of probability and the model assumptions. In other words, when
a sequence of moves is analysed, it is decomposed into known
subsequences and unknown subsequences, which are combined through
a Bayesian weighting mechanism which depends on the model. In this
way, all relevant training sequences, even if they only overlap
partially, are used to predict the scores of a game.
-
blacksky writes on slashdot
"Someone alluded to the Wargames movie/book. Which begs the
question: would you see an improvement if you set the
bayesian filter against itself, and fed the resulting
games back into its knowledge base? Would favouring the
shorter games in this feedback loop improve it further?"
-
This could be an interesting experiment. The chess engine presented in
the essay is incomplete, it only plays Black, but with some work it
can be completed. I don't know if playing against itself is a good
idea. It might help to extract the likely moves, but it might also
cause degeneration similar to incest.
-
Chuckstar writes on slashdot
"What if you ignore the 'x' altogether.
Specifically, delete all the 'x's before running anything
through dbacl. Put back the 'x' where necessary in
returning the move to the chess program.
That way Nxd5 = Nd5
The idea being that whether or not you make a specific
move is not always dependent on whether you make a
capture. Sometimes you just want the piece on that spot on
the board. The program currently treats those two
instances (moving a piece for a to b, and moving a piece
from a while capturing the piece at b) as different moves,
so that the presence of a piece at b alters the outcome of
the decision process."
-
This is an idea worth trying. However, if dbacl never sees
an 'x' in the training games, then it will assign any move containing
an 'x' during play a low probability, identical for all capturing moves.
This would make it even more reluctant to capture pieces naturally,
and give no way of choosing which pieces should be captured. A simple
way to make things work correctly would be to remove all the 'x'
from the training
games and train on both the old games containing 'x', and the same games
where the 'x' is missing.
-
Eric Towers writes in an email
"[I] wondered if dbacl would play better if instead of learning
multi-move sequences, it learned {board state, next move} pairs."
-
It's possible. I only tried the simplest things in the essay, but that
approach is very promising. You don't actually have to represent the
full board state, for example you might also represent a slightly
higher level state which tells things like the number and values of
important pieces etc.
-
John Kipling Lewis writes in an email
"Also, as end games are probably going to be a bit of a bother, it
would be nice to select a sub-section of possible moves when in the
middle game and then compare their LAST 7 moves as an additional
modifier."
-
While I didn't do it in the writeup, it is certainly possible to
generate more than a single move ahead and apply the same scoring
method. In principle, up to 7 moves could be generated, although the
exponential branching will always pose a practical problem just as in
real chess engines.
It's an interesting question however because it isn't obvious I
think if this might help or not. It might help because it's a
form of educated guessing where the opponent's moves are filled
in from experience with a large collection of games, but it
might not help because it would amount to wishful thinking, that
the opponent behaves like in the training games.
In the best case, it would amount to an effective doubling of the
sequence
length from 7 to about 14, since the calculations would involve all the
overlapping 7-move (or less) sequences straddling the current move.
-
-
-
-
-
-
-
-
-
|