Following up on my previous post about optimal hangman strategy, I’ve fixed up my script and run some more experiments, with some interesting results.
First of all, I tested the script in a game of hangman against every single word in a 70k-word dictionary. I allowed 10 lives before losing, which is probably on the conservative side; 12 seems to be a common figure. The script lost on 469 words, of which 99 were 3 letters long. Of the losing words, 20 were 7 letters long, all of which ended in ‘ing’. The script never lost on a word of 8 letters or longer.
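For anyone who wants to try a similar experiment, here is a minimal sketch of the test loop. It is not the actual test harness: the names `play` and `losing_words` are made up for illustration, the guesser simply picks the letter that appears in the most remaining candidate words, and ties are broken alphabetically, which is an assumption on my part.

```python
from collections import Counter


def play(secret, words, lives=10):
    """Return True if the guesser uncovers `secret` before running out of lives.

    Hypothetical helper: the guessing rule and alphabetical tie-break are
    assumptions, not necessarily what pyngman does.
    """
    candidates = [w for w in words if len(w) == len(secret)]
    called = set()
    while lives > 0:
        if all(ch in called for ch in secret):
            return True
        # Count, for each uncalled letter, how many candidate words contain it.
        counts = Counter()
        for w in candidates:
            counts.update(set(w) - called)
        if not counts:
            return False  # nothing left to guess (secret not in the dictionary)
        guess = max(sorted(counts), key=counts.get)
        called.add(guess)
        if guess in secret:
            # Keep only words with the guessed letter in exactly the same positions.
            positions = [i for i, ch in enumerate(secret) if ch == guess]
            candidates = [
                w for w in candidates
                if [i for i, ch in enumerate(w) if ch == guess] == positions
            ]
        else:
            lives -= 1
            candidates = [w for w in candidates if guess not in w]
    return False


def losing_words(words, lives=10):
    """Play one game per dictionary word and collect the words the guesser loses on."""
    return [w for w in words if not play(w, words, lives)]
```

With a toy list of rhyming words such as `["bat", "cat", "hat", "mat", "rat", "sat"]` and only 2 lives, this sketch loses on `hat`, `mat`, `rat` and `sat`: once the shared letters are found, the remaining letter has to be guessed one miss at a time.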
A post on DataGenetics did the rounds last week, applying the might of statistical analysis to the game Hangman to try and guess what an optimal strategy might be. Many techniques were leveled at the problem, from basic analysis of letter frequencies to conditional probability, all in order to work out the best sequence in which to call the letters.
Having read it, I was slightly perplexed: it seemed like massive overkill for something that can be calculated fairly simply. So I created pyngman, a Python script that generates optimal next guesses for Hangman. Input the state of the game and the letters you’ve called, and it will tell you which letter to call next.
You supply the information as a state, such as ..e.., where .’s are unknown letters, followed by a list of letters you’ve tried:
$ pyngman -state ..e.. est
> Your best next guess is: a
It does this by taking a dictionary (you must supply it yourself, so the results will change depending on what you supply!), finding every word in it that could still be the solution, and working out which letter has the highest probability of being present. So far I have been unable to find a word which causes the program to lose a game of hangman!
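The idea fits in a few lines of Python. What follows is a rough sketch of the approach described above, not pyngman's actual source: the function name `best_guess`, the tiny word list and the alphabetical tie-break are all assumptions for illustration.

```python
from collections import Counter


def best_guess(state, tried, words):
    """Pick the uncalled letter present in the most words that fit `state`.

    `state` uses '.' for unknown letters, e.g. "..e..". `tried` holds the
    letters already called. Hypothetical sketch; ties are broken
    alphabetically, which is an assumption.
    """
    called = set(tried) | (set(state) - {"."})
    counts = Counter()
    for word in words:
        if len(word) != len(state):
            continue
        # A revealed position must match exactly; a hidden position cannot hold
        # a called letter, since every occurrence of it would have been revealed.
        fits = all(
            (s == w) if s != "." else (w not in called)
            for s, w in zip(state, word)
        )
        if fits:
            # Count each letter once per word: we want the chance it is present.
            counts.update(set(word) - called)
    if not counts:
        return None  # no dictionary word fits the current state
    return max(sorted(counts), key=counts.get)


# Toy dictionary for demonstration only; real results depend on the word list.
WORDS = ["bread", "cheat", "clean", "dream", "great", "ideal", "ocean", "opera"]
print(best_guess("..e..", "est", WORDS))  # prints "a" for this toy list
```

Counting each letter once per word, rather than counting every occurrence, is what makes this the probability of the letter being present rather than a plain letter frequency.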
Grab pyngman from GitHub and have a go yourself.