Today I bring you something I’ve been doing lately. The idea is to play games automatically to test different artificial intelligence algorithms. I wanted to start simple so I used this game that came with my Linux distro. The game is called Hitori and here is a demo of my bot running and solving the game:

I know, it is kind of hard to understand what is happening, but bear with me. There are three main things happening here:

  • The program detects the game’s window and reads it using OpenCV and OCR. The result of this can be seen in the first matrix from the output
  • Then, having all the data it needs, it solves the game. It’ll became clear after I explain the actual rules for this game.
  • Finally, it simulates mouse movements and the needed clicks to solve the game. It also clicks the “play again” button and loops ☺

The only moment that I’m controlling the mouse is when I change the board size from 5 x 5 to 6 x 6. I could leave the program playing by itself indefinitely if I let it.


Game Rules

Quoting the game help screen:

Hitori is a small logic puzzle in a similar vein to the more popular Sudoku. In the game, the player starts with a square board of numbers, and has to paint out cells until there are no duplicate numbers in each row and column. The following rules apply:

  • [RULE #1] There must only be one of each number in the unpainted cells in each row and column.
  • [RULE #2] No painted cell may be adjacent to another, vertically or horizontally.
  • [RULE #3] All the unpainted cells must be joined together vertically and horizontally in one group.

Examples of these rules:

RULE #1: No repeated numbers in rows and columns.

RULE #2: No painted cells adjacent to each other.

RULE #3: Unpainted cells must be joined in only one group.

If you don’t have the game and want to try it

$ sudo apt install hitori


OpenCV Detection and OCR

To be able to solve this game programatically I must read the numbers in the screen somehow. I’ve decided to start by taking a screenshot of the board and process it using OpenCV. For the screenshots I used a library called Python MSS. This library seems to be pretty fast, I’ve measured a steady 25 frames per second. I only need one screenshot this time but it could became handy for next projects.

Using OpenCV I want to isolate the numbers to leave an easier job for the OCR engine. In this step, more or less what I do is to apply a threshold to the image and delete the borders between numbers.

Now it should be easy for the OCR to process this. The library used for OCR is Google’s tesseract . It is a very known and standard library for this cases and works well enough. The command run is:

$ tesseract tmp_img.jpg tmp_result -psm 11 -c tessedit_char_whitelist=0123456789

This reads the image tmp_img.jpg and stores the result in tmp_result.txt. The parameter psm refers to a “page segmentation mode” that tells tesseract how to treat the image. From the documentation psm=11 corresponds to “Sparse text. Find as much text as possible in no particular order”. I also set a whitelist of possible characters to be found in the input text (in our case, the image contains only digits).


Solving the game

Now I have everything I need to solve the game. My game board in Python looks something like this:

Python
  game_board = [['6', '2', '5', '6', '5'],
                ['1', '3', '5', '2', '6'],
                ['5', '6', '2', '3', '6'],
                ['4', '1', '3', '5', '2'],
                ['3', '4', '1', '4', '3']]

From this point on, the problem is easy. I solve the game without overthinking it too much using only backtracking. For every cell I check if it is possible to paint it (is the cell nullable?):

Python
    def is_nullable(self, i, j):
        return self.board[i][j] != '*' and
               neighbour_not_nulled and (self.repeated_in_row(i, j) or
                                         self.repeated_in_col(i, j))

First, if the cell is already painted, I skip it. Then, I check for RULE #1 (it is nullable only if the current number is repeated in the same row or column) and for RULE #2 (it doesn’t have a neighbour cell that is also currently painted). If those conditions hold, I replace the number with a “*” and check if I already solved the game.

Python
    def is_solved(self):
        solved = self.still_connected()
        for i in range(len(self.board)):
            for j in range(len(self.board)):
                solved &= self.board[i][j] == "*" or
                          (not self.repeated_in_col(i, j) and
                           not self.repeated_in_row(i, j))
        return solved

Here with still_connected(...) I’m checking for RULE #3 (is the board still connected in only one group?). In that method I run a DFS from any cell to know if I can get from there to all other cells. After that method, I check if every other cell is valid. This is, every cell is either a “*” or a number that doesn’t break RULE #1.

If I still haven’t solve the game, I try all the process again starting from another cell recursibly or backtracking if necessary. Eventually I’ll reach a solution and know exactly which cells I have to click.


Clicking

I could leave everything like it is at this point and calling it done. But it would be great for the program to play by itself without human intervention. It turns out that it is easy to make this happen. There is a library called Autopy that’s simple to use and does what I need. It has methods to click, move the mouse and press keys, all these things could be useful in the future for other games. From the solution I generated and the output of OpenCV that has the coordenates of each number in the board, I can get all the regions to be clicked. Then I just:

Python
  def click_regions(left, top, regions):
      for x, y, h, w in regions:
          autopy.mouse.smooth_move(x + w / 2 + left, y + h / 2 + top)
          autopy.mouse.click()

With left and top being the position of the window in the screen and regions the regions I got earlier. The smooth_move thing is a nice feature, makes the movement visible and adds to the “robotic” feel of the program.

This is all for now, I will surely follow this up with other more complex games.

Thanks for reading!

Source code: Github