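So I built a Snake game with Pygame and Python. Then I wanted to create an AI to play it, using a genetic algorithm and a simple neural network. It seemed like fun, but things are not going well.
This is my genetic algorithm: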
import random

import numpy

import config  # the config file with the parameters


def calculate_fitness(population):
    """Calculate the fitness value for the entire population of the generation."""
    # First we create all_fit, an empty list. Then, for each chromosome of the population, we play a game
    # with it and compute its fit_value. Each fit_value is appended to all_fit, and the list is returned.
    all_fit = []
    for i in range(len(population)):
        fit_value = Fitness().fitness(population[i])
        all_fit.append(fit_value)
    return all_fit
def select_best_individuals(population, fitness):
    """Select X number of best parents based on their fitness score."""
    # Create an empty array with NUMBER_PARENTS_CROSSOVER rows (a value from the config file) and one column
    # per weight. It will hold the x best parents, where x is NUMBER_PARENTS_CROSSOVER. We then search for the
    # fittest parent inside the fitness array created by the calculate_fitness function. numpy.where returns
    # a tuple like (array([...], dtype=int64),) with the indices that satisfy the query, so we take the first
    # element of the tuple and then its first value (the index inside the fitness array). Once we have the
    # index, we copy all the weights of that chromosome in as a new parent. Finally, we overwrite the fitness
    # value of that chromosome inside the fitness array, so that we get all different parents and not only
    # the fittest one.
    parents = numpy.empty((config.NUMBER_PARENTS_CROSSOVER, population.shape[1]))
    for parent_num in range(config.NUMBER_PARENTS_CROSSOVER):
        index_fittest = numpy.where(fitness == numpy.max(fitness))
        index_fittest = index_fittest[0][0]
        parents[parent_num, :] = population[index_fittest, :]
        fitness[index_fittest] = -99999
    return parents
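A side note on the selection step: the loop above writes -99999 into `fitness`, so the list returned by `calculate_fitness` is modified for the caller as well. A minimal sketch of the same top-N selection using `numpy.argsort`, which leaves the input untouched. The name `select_top` and its parameters are hypothetical, not part of the original code:

```python
import numpy


def select_top(population, fitness, n_parents):
    """Return the n_parents rows of population with the highest fitness."""
    fitness = numpy.asarray(fitness)
    # argsort is ascending, so the last n_parents indices are the fittest;
    # reverse them so the best individual comes first.
    best = numpy.argsort(fitness)[-n_parents:][::-1]
    return population[best, :]


population = numpy.arange(12.0).reshape(4, 3)  # 4 toy chromosomes, 3 weights each
fitness = [5, 40, -10, 20]
parents = select_top(population, fitness, n_parents=2)
print(parents)  # rows 1 and 3 of the toy population, best first
```

With tied fitness values the order among the tied rows may differ from the loop version, which always takes the first maximum it finds.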
def crossover(parents, offspring_size):
    """Create a crossover of the best parents."""
    # First we create an empty array with the offspring_size we want, i.e. (number of offspring, number of
    # weights). If there is only 1 parent we can't do a crossover, so we return the parent itself; otherwise
    # we pick 2 different random parents and mix their weights, taking each weight from either parent with
    # probability 0.5.
    offspring = numpy.empty(offspring_size)
    if parents.shape[0] == 1:
        offspring = parents
    else:
        for offspring_index in range(offspring_size[0]):
            while True:
                index_parent_1 = random.randint(0, parents.shape[0] - 1)
                index_parent_2 = random.randint(0, parents.shape[0] - 1)
                if index_parent_1 != index_parent_2:
                    for weight_index in range(offspring_size[1]):
                        if random.uniform(0, 1) < 0.5:
                            offspring[offspring_index, weight_index] = parents[index_parent_1, weight_index]
                        else:
                            offspring[offspring_index, weight_index] = parents[index_parent_2, weight_index]
                    break
    return offspring
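For comparison, the same uniform crossover can be written with NumPy masks instead of a per-weight Python loop. This is only a sketch under the assumption that there are at least two parents. `uniform_crossover` and the `rng` argument are hypothetical names:

```python
import numpy


def uniform_crossover(parents, n_offspring, rng):
    """Build n_offspring children, each gene copied from one of two distinct parents."""
    n_parents, n_weights = parents.shape
    children = numpy.empty((n_offspring, n_weights))
    for child in range(n_offspring):
        # choice(..., replace=False) guarantees two different parent indices,
        # so no retry loop is needed.
        p1, p2 = rng.choice(n_parents, size=2, replace=False)
        # Boolean mask: True takes the gene from p1, False from p2.
        mask = rng.random(n_weights) < 0.5
        children[child] = numpy.where(mask, parents[p1], parents[p2])
    return children


rng = numpy.random.default_rng(0)
parents = numpy.array([[1.0] * 5, [2.0] * 5, [3.0] * 5])
children = uniform_crossover(parents, n_offspring=4, rng=rng)
print(children.shape)  # (4, 5)
```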
def mutation(offspring_crossover):
    """Mutating the offspring generated from crossover to maintain variation in the population."""
    # We cycle through the offspring_crossover population and change some random weights, controlled by
    # MUTATION_PERCENTAGE in the config file. We select a random index, generate a random value between
    # -1 and 1, and add it to the original weight, so that we get variation inside the population.
    for offspring_index in range(offspring_crossover.shape[0]):
        for _ in range(offspring_crossover.shape[1]):
            if random.uniform(0, 1) == config.MUTATION_PERCENTAGE:
                index = random.randint(0, offspring_crossover.shape[1] - 1)
                random_value = numpy.random.choice(numpy.arange(-1, 1, step=0.001), size=1, replace=False)
                offspring_crossover[offspring_index, index] = offspring_crossover[offspring_index, index] + random_value
    return offspring_crossover
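One thing worth checking in the mutation above: `random.uniform(0, 1) == config.MUTATION_PERCENTAGE` tests a random float for exact equality with 0.2, which is almost never true, so the offspring are most likely returned unchanged. Assuming the intent is "mutate each weight with probability MUTATION_PERCENTAGE", a minimal sketch could look like this. `mutate`, `mutation_rate` and `rng` are hypothetical names, and the sketch perturbs the weight being visited rather than a second random index:

```python
import numpy


def mutate(offspring, mutation_rate, rng):
    """Add uniform noise in [-1, 1) to each weight with probability mutation_rate."""
    # One random draw per weight; "<" makes the rate an actual probability.
    mask = rng.random(offspring.shape) < mutation_rate
    noise = rng.uniform(-1.0, 1.0, size=offspring.shape)
    # Return a new array so the caller's offspring are not modified in place.
    return offspring + mask * noise


rng = numpy.random.default_rng(0)
offspring = numpy.zeros((100, 80))
mutated = mutate(offspring, mutation_rate=0.2, rng=rng)
print((mutated != 0).mean())  # fraction of changed weights, close to 0.2
```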
My neural network takes 7 inputs:
is_left_blocked, is_front_blocked, is_right_blocked, apple_direction_vector_normalized_x,
snake_direction_vector_normalized_x, apple_direction_vector_normalized_y,snake_direction_vector_normalized_y
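Basically: whether the snake can move left, forward, or right, plus the apple direction and the snake direction. Then I have a hidden layer with 8 neurons, and finally 3 outputs indicating left, keep going straight, or right.
The neural network's forward() is computed like this: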
self.get_weights_from_encoded()
Z1 = numpy.matmul(self.__W1, self.__input_values.T)
A1 = numpy.tanh(Z1)
Z2 = numpy.matmul(self.__W2, A1)
A2 = self.sigmoid(Z2)
A2 = self.softmax(A2)
return A2
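Here self.__W1 and self.__W2 are the weights from the input to the hidden layer and from the hidden layer to the output, respectively. Softmax(A2) returns the index of the largest value in the [1,3] matrix, and I use that index as the direction chosen by my neural network.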
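Since only the index of the largest output is used, the sigmoid and the softmax do not change the decision: both preserve the ordering of the values. A self-contained sketch of an equivalent forward pass for a 7-8-3 network without bias terms (as in the code above) follows. The function name, the random weights and the example inputs are made up for illustration:

```python
import numpy


def forward(w1, w2, inputs):
    """Return the index (0, 1 or 2) of the strongest output for one input vector."""
    a1 = numpy.tanh(w1 @ inputs)   # hidden layer, shape (8,)
    z2 = w2 @ a1                   # output scores, shape (3,)
    # sigmoid and softmax are both order-preserving, so taking argmax of the
    # raw scores picks the same index as argmax after those activations.
    return int(numpy.argmax(z2))


rng = numpy.random.default_rng(0)
w1 = rng.uniform(-1, 1, size=(8, 7))   # input -> hidden
w2 = rng.uniform(-1, 1, size=(3, 8))   # hidden -> output
inputs = numpy.array([0, 1, 0, 0.6, 0.0, 0.8, 1.0])
print(forward(w1, w2, inputs))  # an index in {0, 1, 2}
```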
This is the config file with the parameters:
# GENETIC ALGORITHM
NUMBER_OF_POPULATION = 500
NUMBER_OF_GENERATION = 200
NUMBER_PARENTS_CROSSOVER = 50
MUTATION_PERCENTAGE = 0.2
# NEURAL NETWORK
INPUT = 7
NEURONS_HIDDEN_1 = 8
OUTPUT = 3
NUMBER_WEIGHTS = INPUT * NEURONS_HIDDEN_1 + NEURONS_HIDDEN_1 * OUTPUT
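With these values NUMBER_WEIGHTS is 7 * 8 + 8 * 3 = 80, so each chromosome is a flat vector of 80 numbers. The body of `get_weights_from_encoded` is not shown, so the layout below is an assumption: a sketch of how such a vector could be split into the two weight matrices, with `decode` as a hypothetical helper name.

```python
import numpy

INPUT = 7
NEURONS_HIDDEN_1 = 8
OUTPUT = 3
NUMBER_WEIGHTS = INPUT * NEURONS_HIDDEN_1 + NEURONS_HIDDEN_1 * OUTPUT  # 56 + 24 = 80


def decode(chromosome):
    """Split a flat chromosome into W1 (hidden x input) and W2 (output x hidden)."""
    split = INPUT * NEURONS_HIDDEN_1
    w1 = chromosome[:split].reshape(NEURONS_HIDDEN_1, INPUT)
    w2 = chromosome[split:].reshape(OUTPUT, NEURONS_HIDDEN_1)
    return w1, w2


chromosome = numpy.random.default_rng(0).uniform(-1, 1, size=NUMBER_WEIGHTS)
w1, w2 = decode(chromosome)
print(w1.shape, w2.shape)  # (8, 7) (3, 8)
```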
This is the main loop:
for generation in range(config.NUMBER_OF_GENERATION):
    snakes_fitness = genetic_algorithm.calculate_fitness(population)
    # Selecting the best parents in the population.
    parents = genetic_algorithm.select_best_individuals(population, snakes_fitness)
    # Generating next generation using crossover.
    offspring_crossover = genetic_algorithm.crossover(parents,
                                                      offspring_size=(pop_size[0] - parents.shape[0], config.NUMBER_WEIGHTS))
    # Adding some variations to the offspring using mutation.
    offspring_mutation = genetic_algorithm.mutation(offspring_crossover)
    # Creating the new population based on the parents and offspring.
    population[0:parents.shape[0], :] = parents
    population[parents.shape[0]:, :] = offspring_mutation
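To judge whether generations improve, it may help to record the best and mean fitness of each generation right after `calculate_fitness` and before selection, because selection overwrites entries with -99999. A small sketch with a hypothetical helper `summarize` and made-up fitness values:

```python
import numpy


def summarize(generation, fitness):
    """Return (generation, best, mean) for one generation's fitness values."""
    # Copy into an array first so later in-place changes to the list
    # (such as the -99999 markers) do not affect the summary.
    values = numpy.array(fitness, dtype=float)
    return generation, float(values.max()), float(values.mean())


history = [summarize(0, [1, 4, 2]), summarize(1, [3, 6, 3])]
for generation, best, mean in history:
    print(f"generation {generation}: best={best:.1f} mean={mean:.2f}")
```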
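I have two problems:
1) I don't see any improvement in new generations.
2) I actually run the games in a for loop, but waiting for all the snakes of a generation to die and then repeating with the new ones is very time-consuming. Is there a way to launch all, or at least more than one, game instances and keep filling the array with the results?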
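Regarding the second problem, one possible direction is the standard library's `concurrent.futures`, which can spread the per-chromosome games over several processes. The sketch below assumes each game can run independently of the others (for instance without sharing one Pygame window) and that the scoring function is defined at module level so it can be sent to worker processes. `evaluate_population`, `play_one` and `toy_game` are hypothetical names:

```python
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor


def evaluate_population(population, play_one, executor_cls=ProcessPoolExecutor, workers=4):
    """Score every chromosome with play_one, spreading the games over workers."""
    with executor_cls(max_workers=workers) as pool:
        # map() keeps the input order, so result i belongs to population[i].
        return list(pool.map(play_one, population))


def toy_game(weights):
    """Stand-in for a real game: the 'score' is just the sum of the weights."""
    return sum(weights)


population = [[1, 2], [3, 4], [5, 6]]
# ThreadPoolExecutor is used here only so the demo runs anywhere; for real,
# CPU-bound games keep the default ProcessPoolExecutor and make the call
# under an `if __name__ == "__main__":` guard.
scores = evaluate_population(population, toy_game, executor_cls=ThreadPoolExecutor)
print(scores)  # [3, 7, 11]
```

In the code from this post, `play_one` would presumably be a small module-level wrapper around `Fitness().fitness`.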
This is Fitness().fitness(population[i]):
def fitness(self, weights):
    game_manager = GameManager(weights)
    self.__score = game_manager.play_game()
    return self.__score
This is where it is called in the for loop:
def calculate_fitness(population):
    """Calculate the fitness value for the entire population of the generation."""
    # First we create all_fit, an empty list. Then, for each chromosome of the population, we play a game
    # with it and compute its fit_value. Each fit_value is appended to all_fit, and the list is returned.
    all_fit = []
    for i in range(len(population)):
        fit_value = Fitness().fitness(population[i])
        all_fit.append(fit_value)
    return all_fit
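This is the function that starts the game (GameManager(weights)) and returns the snake's score.
This is my first time working with AI, so this code may be a mess. Don't worry about pointing out what I did wrong; just please don't say "it's all wrong", because then I won't be able to learn.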