MiniMax Algorithm Implementation for Tic-Tac-Toe Shows Only Offensive Play

I’m working on understanding the MiniMax algorithm by building a Tic-Tac-Toe game where two AI players compete against each other. However, I’m running into an issue where the first player (X) wins every single game.

I think the main problem is that my AI implementation isn’t considering defensive strategies properly. The players seem to focus only on winning moves rather than blocking their opponent’s winning opportunities.

Here’s my current implementation:

import java.util.*;

public class TicTacToeAI {

    BoardState currentState;
    int nodesExplored;
    int currentLevelNodes;

    public TicTacToeAI() {
        currentState = new BoardState();
        ArrayList<Position> availableMoves = new ArrayList<Position>();
        for (int row = 0; row < 3; row++) {
            for (int col = 0; col < 3; col++) {
                availableMoves.add(new Position(row, col));
            }
        }
        currentState.currentPlayer = "X";
        currentState.availableMoves = availableMoves;
        nodesExplored = 0;
        currentLevelNodes = 0;
    }

    public ArrayList<BoardState> generateNextStates(BoardState state) {
        ArrayList<BoardState> nextStates = new ArrayList<BoardState>();
        for (Position pos : state.availableMoves) {
            BoardState newState = executeMove(state, pos);
            newState.lastMove = pos;
            nextStates.add(newState);
        }
        return nextStates;
    }

    public BoardState executeMove(BoardState state, Position pos) {
        BoardState newState = null;
        ArrayList<Position> updatedMoves = new ArrayList<Position>(state.availableMoves);
        if (updatedMoves.contains(pos)) {
            updatedMoves.remove(pos);
            newState = new BoardState();
            newState.availableMoves = updatedMoves;
            GameBoard newBoard = (GameBoard) state.gameBoard.clone();
            if (state.currentPlayer.equals("X")) {
                newBoard.placeX(pos.row, pos.col);
                newState.currentPlayer = "O";
            } else {
                newBoard.placeO(pos.row, pos.col);
                newState.currentPlayer = "X";
            }
            newState.gameBoard = newBoard;
        }
        return newState;
    }

    public void executeMinimaxMove() {
        nodesExplored += currentLevelNodes;
        currentLevelNodes = 0;
        BoardState state = currentState;
        ArrayList<BoardState> possibleStates = generateNextStates(state);
        int optimalValue = 0;
        int evaluatedValue = 0;
        for (BoardState nextState : possibleStates) {
            if (state.currentPlayer.equals("X")) {
                optimalValue = -1;
                evaluatedValue = minimizeValue(nextState);
                if (evaluatedValue > optimalValue) {
                    optimalValue = evaluatedValue;
                    currentState.nextState = nextState;
                }
                if (optimalValue == 1)
                    break;
            } else {
                optimalValue = 1;
                evaluatedValue = maximizeValue(nextState);
                if (evaluatedValue < optimalValue) {
                    optimalValue = evaluatedValue;
                    currentState.nextState = nextState;
                }
                if (optimalValue == -1)
                    break;
            }
        }
        currentState = executeMove(currentState, currentState.nextState.lastMove);
    }

    private int maximizeValue(BoardState state) {
        if (state.gameBoard.hasWinner() && state.currentPlayer.equals("X"))
            return 1;
        else if (state.gameBoard.isFull())
            return 0;
        else if (state.gameBoard.hasWinner() && !state.currentPlayer.equals("X"))
            return -1;

        currentLevelNodes++;
        ArrayList<BoardState> states = generateNextStates(state);
        int bestValue = -1;
        for (BoardState s : states) {
            int value = minimizeValue(s);
            if (value > bestValue) {
                bestValue = value;
            }
            if (bestValue == 1)
                break;
        }
        return bestValue;
    }

    private int minimizeValue(BoardState state) {
        if (state.gameBoard.hasWinner() && state.currentPlayer.equals("O"))
            return -1;
        else if (state.gameBoard.isFull())
            return 0;
        else if (state.gameBoard.hasWinner() && !state.currentPlayer.equals("O"))
            return 1;

        currentLevelNodes++;
        ArrayList<BoardState> states = generateNextStates(state);
        int bestValue = 1;
        for (BoardState s : states) {
            int value = maximizeValue(s);
            if (value < bestValue) {
                bestValue = value;
            }
            if (bestValue == -1)
                break;
        }
        return bestValue;
    }
}

Can anyone help me figure out why my AI isn’t playing defensively? I suspect there might be an issue with how I’m evaluating the game states or handling the minimax logic.

Your terminal condition logic is backwards in the maximize/minimize functions. When you check state.currentPlayer.equals("X") after a winning move, you’re testing the player who is about to move next, not the one who just won. Since executeMove switches currentPlayer, you’re evaluating the wrong player’s victory: a win detected when currentPlayer is "X" was actually made by O, and vice versa.
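To illustrate the principle, here’s a minimal standalone sketch (the class name, the `char[]` board representation, and all method names are mine, not from the post): when scoring a terminal state, ask which mark owns the winning line, which belongs to the player who just moved — the opposite of the side now to move.

```java
// Hypothetical illustration of the terminal-condition fix: score the mark
// that completed a line, i.e. the player who moved *before* the turn flipped.
public class TerminalCheck {
    static final int[][] LINES = {
        {0,1,2},{3,4,5},{6,7,8},{0,3,6},{1,4,7},{2,5,8},{0,4,8},{2,4,6}
    };

    // True if `mark` ('X' or 'O') owns a complete line on the 3x3 board.
    static boolean wins(char[] b, char mark) {
        for (int[] l : LINES)
            if (b[l[0]] == mark && b[l[1]] == mark && b[l[2]] == mark)
                return true;
        return false;
    }

    // Score a state where `toMove` is about to play. If there is a winner,
    // it must be the player who just moved, not `toMove`.
    static int scoreTerminal(char[] b, char toMove) {
        char justMoved = (toMove == 'X') ? 'O' : 'X';
        if (wins(b, justMoved))
            return (justMoved == 'X') ? 1 : -1;
        return 0; // no winner here; the caller handles draw vs. non-terminal
    }

    public static void main(String[] args) {
        // X just completed the top row, so it is now O's turn to move.
        char[] b = {'X','X','X', 'O','O',' ', ' ',' ',' '};
        System.out.println(TerminalCheck.scoreTerminal(b, 'O')); // prints 1
    }
}
```

The point is that `scoreTerminal` never compares the winner against the side to move; it derives the mover from the flipped turn, which is exactly what the question's `maximizeValue`/`minimizeValue` checks get backwards.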

Yeah Mandy’s right about the terminal condition being broken, but there’s another problem.

Your executeMinimaxMove() method is also broken. You reset optimalValue inside the loop on every iteration (-1 for X, 1 for O), so the > and < comparisons never track the best value seen so far: any move scoring better than the reset threshold overwrites nextState, and you end up keeping the last such move rather than the optimal one. Initialize once before the loop instead — Integer.MIN_VALUE for X and Integer.MAX_VALUE for O.
