A post on DataGenetics did the rounds last week, applying the might of statistical analysis to the game Hangman to try and guess what an optimal strategy might be. Many techniques were leveled at the problem, from basic analysis of letter frequencies to conditional probability, all in order to try and generate the best sequence you should call the letters.

Having read it I was slightly perplexed, it seemed like massive overkill for something that can be calculated fairly simply, so I created pyngman, a python script that generates optimal next guesses for Hangman. Input the state of the game and the letters you’ve called and it will tell you what letter to call next.

You supply the information as a state, such as ..e.., where .’s are unknown letters, followed by a list of letters you’ve tried:

$ pyngman -state ..e.. est
> Your best next guess is: a

It does this by using a dictionary (you must supply the dictionary, so the results will change depending on what you supply!), and looking at all possible words that could be the solution, and working out the letter with the highest probability of being present. So far I have been unable to find a word which causes the program to lose a game of hangman!

Grab pyngman from github and have a go yourself

(more…)