Training an AI – part 3

In my first post on the subject, Training an AI, I described how I intended to use genetic algorithms to find the optimal priority values for each of the ten actions the strategic AI can perform. In the second post I used this model to improve the performance of the Humans and the Greenskins. In this third post I modify my algorithms based on my experiences from the Human/Greenskin trial and apply them to the Elf and Ende races.

Let's start by taking a more detailed look at my algorithms. The 'crossover' and 'mutation' operators are similar to those in the Human/Greenskin trial, but 'randomize winner' is new for the Elf/Ende trial described in this post. As this is an Android project, all code is in Java.
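
All three snippets below rely only on standard java.util classes; for completeness, these are the imports they need:

import java.util.HashMap;
import java.util.Map;
import java.util.Random;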

Crossover: The crossover picks each value at random from one of the two parents. For example, 'build scout' might be taken from parent 1 and 'conquer settlement' from parent 2 (named a and b in the code below).

/***************************************************************************
* Method makes a crossover of two priority maps and returns the new priority
* map.
***************************************************************************/
private static Map<Integer,Double> priorityCrossover(Map<Integer,Double> a, Map<Integer,Double> b) {
	// Create empty priority map
	Map<Integer,Double> result = new HashMap<Integer,Double>();
	
	// Create a crossover of map a and b
	// --> 50% that value is taken from map a
	// --> 50% that value is taken from map b
	for ( int i = 0; i < Planner.plannerTypes.length; i++) {
		if ( Math.random() > 0.5 ) {
			result.put(i, a.get(i));
		}
		else {
			result.put(i, b.get(i));
		}
	}
	return result;
}

Mutation: When I mutate a set of values, one randomly selected value is increased or decreased by 1 (the result is clamped between 0.5 and 10).

/***************************************************************************
* Method takes a priority map and mutates a random priority variable
***************************************************************************/
private static Map<Integer,Double> priorityMutation(Map<Integer,Double> original) {
	// Create a copy of the original priority map
	Map<Integer,Double> result = new HashMap<Integer,Double>();
	for ( int i = 0; i < Planner.plannerTypes.length; i++) {
		result.put(i, original.get(i));
	}
	
	// Randomly select which index to mutate
	Random random = new Random();
	int i = random.nextInt(Planner.plannerTypes.length);
	
	// Mutate the selected value
	// --> 50% that the value increases by 1 (maximum value = 10)
	// --> 50% that the value decreases by 1 (minimum value = 0.5)
	if ( Math.random() > 0.5 ) {
		result.put(i, result.get(i)+1);
		if ( result.get(i) > 10 ) {
			result.put(i, 10.0);
		}
	}
	else {
		result.put(i, result.get(i)-1);
		if ( result.get(i) < 0.5 ) {
			result.put(i, 0.5);
		}
	}
	return result;
}
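
As a small usage example (hypothetical, since the controlling trial code isn't shown until later), a child for the next generation can be bred by chaining the two operators, given the priority maps parentA and parentB of two well-performing AIs:

// Breed one child: a crossover of the two parents followed by one mutation
Map<Integer,Double> child = priorityMutation(priorityCrossover(parentA, parentB));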

Randomize winner: Instead of making a fully random set of values, as I did when training the Human and Greenskin races, I've made the tenth AI a randomized version of the winning AI from the previous generation. For each value there is a 40% chance that it is increased by 0-2.0, a 40% chance that it is decreased by 0-2.0 and a 20% chance that it is left unchanged.

/***************************************************************************
* Method takes a priority map and changes all the values randomly.
***************************************************************************/
private static Map<Integer,Double> priorityRandomize(Map<Integer,Double> original) {
	// Create a copy of the original priority map
	Map<Integer,Double> result = new HashMap<Integer,Double>();
	for ( int i = 0; i < Planner.plannerTypes.length; i++) {
		result.put(i, original.get(i));
	}
	
	// Randomly change the values in the list
	// --> 40% that the value increases by 0-2 (maximum value = 10)
	// --> 40% that the value decreases by 0-2 (minimum value = 0.5)
	// --> 20% that the value isn't changed
	for ( int i = 0; i < Planner.plannerTypes.length; i++) {
		double dice = Math.random();
		if ( dice < 0.4 ) {
			result.put(i, result.get(i)+Math.random()*2);
			if ( result.get(i) > 10 ) {
				result.put(i, 10.0);
			}
		}
		else if ( dice < 0.8 ) {
			result.put(i, result.get(i)-Math.random()*2);
			if ( result.get(i) < 0.5 ) {
				result.put(i, 0.5);
			}
		}
	}
	return result;
}

In truth, each small method is quite simple. What I found hardest was putting together the method that controls the trial: create a new world, create a new set of priority maps based on the results from the previous generation, populate the world with empires, assign an AI with one of the new priority maps to each empire, play 200 turns, evaluate the results and then repeat.
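
To make that flow concrete, here is a rough sketch of what such a controlling method could look like, shown for a single race's ten priority maps. World, Empire, playTurns, rankPriorityMapsByScore and createRandomPriorityMaps are hypothetical placeholder names, not the actual project code, and the exact mix of crossovers, mutations and randomized winners below is just one possible arrangement:

/***************************************************************************
* Hypothetical sketch of the controlling trial loop (not the actual code).
***************************************************************************/
private static void runTrial(int generations) {
	// Start from a fully random first generation
	List<Map<Integer,Double>> population = createRandomPriorityMaps(10);
	
	for ( int g = 0; g < generations; g++) {
		// Create a new world and populate it with empires, assigning
		// an AI with one of the priority maps to each empire
		World world = new World();
		for ( Map<Integer,Double> priorities : population ) {
			world.addEmpire(new Empire(priorities));
		}
		
		// Play 200 turns and rank the priority maps by how well
		// their empires performed
		world.playTurns(200);
		List<Map<Integer,Double>> ranked = world.rankPriorityMapsByScore();
		
		// Breed the next generation: nine mutated crossovers of the
		// two best maps, plus a randomized copy of the winner as the
		// tenth member
		population = new ArrayList<Map<Integer,Double>>();
		for ( int i = 0; i < 9; i++) {
			population.add(priorityMutation(
					priorityCrossover(ranked.get(0), ranked.get(1))));
		}
		population.add(priorityRandomize(ranked.get(0)));
	}
}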

Playing a game with 20 empires for 200 turns takes around 10-13 minutes on my computer. At that rate, 40 generations take approximately 460 minutes (40 × ~11.5 minutes), i.e. close to 8 hours.

One thought on “Training an AI – part 3”

  1. ‘aepso’, a Redditor in a thread (http://redd.it/1l8u4h) where I’ve written a bit about my tests to evolve the AI, asked if I could run the evolved AI against a fully random AI to see how well it performs.

    I ran 5 games of 200 turns each, with Humans on the Surface and Greenskins in the Netherworld. On each level, 5 of the AIs were evolved and 5 were fully random. After each game I checked how many of the top three positions the evolved AIs of each race had taken.

    For the Human race the result isn’t very positive: the evolved AIs took 8 of the possible 15 top-three positions. In line with the results in my articles here, it seems the Human AI hasn’t evolved very much (or in the right direction). The Surface level is very open and flexible, and a ‘winning’ strategy might be harder to evolve there.

    But for the Greenskins the result is better: the evolved AIs took 11 of the possible 15 top-three positions. A quick study shows that some of the random bots that took top positions have priority values structured much like the evolved AI’s. The Netherworld is very different from the Surface and requires more specialised gameplay, which might be why the evolved AI performs better there.