Genetic algorithm - creatures in a 2D world are not learning

artificial-intelligence genetic-algorithm game-ai
2021-11-13 20:58:29

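Goal - I am trying to implement a genetic algorithm that optimises the fitness of a species of creatures in a simulated two-dimensional world. The world contains edible food, placed at random, and a population of monsters (your basic zombies). I need the algorithm to find behaviours that keep the creatures well fed and alive.

What I have done -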

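I start by generating an 11x9 2D array in numpy, filled with random floats between 0 and 1. I then use np.matmul to multiply each row of weights by the percepts, so that each action is a weighted sum of the percepts: (w1*p1 + w2*p2 + ... + w9*p9) = a1.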
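
As a minimal sketch of that mapping (the percept values below are made up for illustration, and the variable names are not from the assignment code), each of the 11 actions is the dot product of one row of weights with the 9 percepts:

```python
import numpy as np

rng = np.random.default_rng(0)

# 11 actions x 9 percepts, weights drawn uniformly from [0, 1)
chromosome = rng.random((11, 9))

# Hypothetical percept vector of length 9 (values chosen only for the example)
percepts = np.array([0, 1, 0, 0, 2, 0, 0, 0, 1], dtype=float)

# One activation per action: actions[i] = sum_j chromosome[i, j] * percepts[j]
actions = np.matmul(chromosome, percepts)

# The first activation computed by hand: a1 = w1*p1 + w2*p2 + ... + w9*p9
a1 = sum(w * p for w, p in zip(chromosome[0], percepts))
```

Note that with weights restricted to [0, 1) and non-negative percepts, no activation can ever be negative, so a percept can only make an action more attractive, never less.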

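After running the first generation, I evaluate each creature's fitness as (energy + (time of death * 100)). From this I build a list of the creatures whose fitness is above the population average. I then put the best of these "elite" creatures back into the next population. To fill the remaining space I use a crossover function that takes two randomly selected "elite" creatures and mixes their genes. I have tested two different crossover functions: one does a two-point crossover on each row, and the other takes rows alternately from each parent until the child has a complete chromosome. My problem is that the creatures don't really seem to be learning: at 75 turns I only get 1 survivor every so often.

I am fully aware that this may not be enough to go on, but I am truly stuck on this and cannot figure out how to get these creatures to learn, even though I think I am implementing the correct procedures. Occasionally I get 3-4 survivors rather than 1 or 2, but it seems to happen completely at random, and there doesn't seem to be much learning going on.

Below is the main section of the code. It includes everything I have done, but not the code provided for the simulation.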

#!/usr/bin/env python
from cosc343world import Creature, World
import numpy as np
import time
import matplotlib.pyplot as plt
import random
import itertools


# You can change this number to specify how many generations creatures are going to evolve over.
numGenerations = 2000

# You can change this number to specify how many turns there are in the simulation of the world for a given generation.
numTurns = 75

# You can change this number to change the world type.  You have two choices - world 1 or 2 (described in
# the assignment 2 pdf document).
worldType=2

# You can change this number to modify the world size.
gridSize=24

# You can set this mode to True to have the same initial conditions for each simulation in each generation - good
# for development, when you want to have some determinism in how the world runs from generation to generation.
repeatableMode=False

# This is a class implementing your creature a.k.a MyCreature.  It extends the basic Creature, which provides the
# basic functionality of the creature for the world simulation.  Your job is to implement the AgentFunction
# that controls creature's behaviour by producing actions in response to percepts.
class MyCreature(Creature):

    # Initialisation function.  This is where your creature
    # should be initialised with a chromosome in a random state.  You need to decide the format of your
    # chromosome and the model that it's going to parametrise.
    #
    # Input: numPercepts - the size of the percepts list that the creature will receive in each turn
    #        numActions - the size of the actions list that the creature must create on each turn
    def __init__(self, numPercepts, numActions):

        # Place your initialisation code here.  Ideally this should set up the creature's chromosome
        # and set it to some random state.
        #self.chromosome = np.random.uniform(0, 10, size=numActions)
        self.chromosome = np.random.rand(11,9)
        self.fitness = 0
        #print(self.chromosome[1][1].size)

        # Do not remove this line at the end - it calls the constructors of the parent class.
        Creature.__init__(self)


    # This is the implementation of the agent function, which will be invoked on every turn of the simulation,
    # giving your creature a chance to perform an action.  You need to implement a model here that takes its parameters
    # from the chromosome and produces a set of actions from the provided percepts.
    #
    # Input: percepts - a list of percepts
    #        numAction - the size of the actions list that needs to be returned
    def AgentFunction(self, percepts, numActions):

        # At the moment the percepts are ignored and the actions is a list of random numbers.  You need to
        # replace this with some model that maps percepts to actions.  The model
        # should be parametrised by the chromosome.

        #actions = np.random.uniform(0, 0, size=numActions)

        actions = np.matmul(self.chromosome, percepts)

        return actions.tolist()


# This function is called after every simulation, passing a list of the old population of creatures, whose fitness
# you need to evaluate and whose chromosomes you can use to create new creatures.
#
# Input: old_population - list of objects of MyCreature type that participated in the last simulation.  You
#                         can query the state of the creatures by using some built-in methods as well as any methods
#                         you decide to add to MyCreature class.  The length of the list is the size of
#                         the population.  You need to generate a new population of the same size.  Creatures from
#                         old population can be used in the new population - simulation will reset them to their
#                         starting state (not dead, new health, etc.).
#
# Returns: a list of MyCreature objects of the same length as the old_population.

def selection(old_population, fitnessScore):
    elite_creatures = []
    for individual in old_population:
        if individual.fitness > fitnessScore:
            elite_creatures.append(individual)

    elite_creatures.sort(key=lambda x: x.fitness, reverse=True)

    return elite_creatures

def crossOver(creature1, creature2):
    child1 = MyCreature(11, 9)
    child2 = MyCreature(11, 9)
    child1_chromosome = []
    child2_chromosome = []

    #print("parent1", creature1.chromosome)
    #print("parent2", creature2.chromosome)

    for row in range(11):
        chromosome1 = creature1.chromosome[row]
        chromosome2 = creature2.chromosome[row]

        index1 = random.randint(1, 9 - 2)
        index2 = random.randint(1, 9 - 2)

        if index2 >= index1:
            index2 += 1
        else:  # Swap the two cx points
            index1, index2 = index2, index1

        child1_chromosome.append(np.concatenate([chromosome1[:index1],chromosome2[index1:index2],chromosome1[index2:]]))
        child2_chromosome.append(np.concatenate([chromosome2[:index1],chromosome1[index1:index2],chromosome2[index2:]]))

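    # Convert the lists of rows back into 11x9 arrays, matching the format set in __init__
    child1.chromosome = np.array(child1_chromosome)
    child2.chromosome = np.array(child2_chromosome)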

    #print("child1", child1_chromosome)

    return(child1, child2)

def crossOverRows(creature1, creature2):
    child = MyCreature(11, 9)

    child_chromosome = np.empty([11,9])

    i = 0

    while i < 11:
        if i != 10:
            child_chromosome[i] = creature1.chromosome[i]
            child_chromosome[i+1] = creature2.chromosome[i+1]
        else:
            child_chromosome[i] = creature1.chromosome[i]

        i += 2

    child.chromosome = child_chromosome

    return child

    # print("parent1", creature1.chromosome[:3])
    # print("parent2", creature2.chromosome[:3])
    # print("crossover rows ", child_chromosome[:3])


def newPopulation(old_population):
    global numTurns

    nSurvivors = 0
    avgLifeTime = 0
    fitnessScore = 0
    fitnessScores = []

    # For each individual you can extract the following information left over
    # from the evaluation.  This will allow you to figure out how well an individual did in the
    # simulation of the world: whether the creature is dead or not, how much
    # energy did the creature have at the end of simulation (0 if dead), the tick number
    # indicating the time of creature's death (if dead).  You should use this information to build
    # a fitness function that scores how the individual did in the simulation.
    for individual in old_population:

        # You can read the creature's energy at the end of the simulation - it will be 0 if creature is dead.
        energy = individual.getEnergy()

        # This method tells you if the creature died during the simulation
        dead = individual.isDead()

        # If the creature is dead, you can get its time of death (in units of turns);
        # survivors are treated as having lived for the whole simulation.
        if dead:
            timeOfDeath = individual.timeOfDeath()
        else:
            nSurvivors += 1
            timeOfDeath = numTurns
        avgLifeTime += timeOfDeath

        individual.fitness = energy + (timeOfDeath * 100)
        fitnessScores.append(individual.fitness)
        fitnessScore += individual.fitness
        #print("fitnessscore", individual.fitness, "energy", energy, "time of death", timeOfDeath, "is dead", individual.isDead())

    fitnessScore = fitnessScore / len(old_population)

    eliteCreatures = selection(old_population, fitnessScore)

    print(len(eliteCreatures))

    newSet = []

    for i in range(int(len(eliteCreatures)/2)):
        if eliteCreatures[i].isDead() == False:
            newSet.append(eliteCreatures[i])

    print(len(newSet), " elites added to pop")

    remainingRequired = w.maxNumCreatures() - len(newSet)

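    # Fill the remaining slots with children of adjacent elite pairs, cycling through
    # the elite list until the new population is the same size as the old one.
    pair = 0
    while len(newSet) < w.maxNumCreatures():
        parent1 = eliteCreatures[(pair + 1) % len(eliteCreatures)]
        parent2 = eliteCreatures[pair % len(eliteCreatures)]
        newSet.append(crossOver(parent1, parent2)[0])
        pair += 1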


    # Here are some statistics, which you may or may not find useful
    avgLifeTime = float(avgLifeTime)/float(len(old_population))
    print("Simulation stats:")
    print("  Survivors    : %d out of %d" % (nSurvivors, len(old_population)))
    print("  Average Fitness Score :", fitnessScore)
    print("  Avg life time: %.1f turns" % avgLifeTime)

    # The information gathered above should allow you to build a fitness function that evaluates fitness of
    # every creature.  You should show the average fitness, but also use the fitness for selecting parents and
    # spawning the new creatures.


    # Based on the fitness you should select individuals for reproduction and create a
    # new population.  At the moment this is not done, and the same population with the same number
    # of individuals is returned for the next generation.

    new_population = newSet

    return new_population

# Pygame window sometimes doesn't spawn unless a Matplotlib figure is created, so best to keep the following two
# calls here.  You might also want to use matplotlib for plotting average fitness over generations.
plt.close('all')
fh=plt.figure()

# Create the world.  The worldType specifies the type of world to use (there are two types to chose from);
# gridSize specifies the size of the world, repeatable parameter allows you to run the simulation in exactly same way.
w = World(worldType=worldType, gridSize=gridSize, repeatable=repeatableMode)

#Get the number of creatures in the world
numCreatures = w.maxNumCreatures()

#Get the number of creature percepts
numCreaturePercepts = w.numCreaturePercepts()

#Get the number of creature actions
numCreatureActions = w.numCreatureActions()

# Create a list of initial creatures - instantiations of the MyCreature class that you implemented
population = list()
for i in range(numCreatures):
   c = MyCreature(numCreaturePercepts, numCreatureActions)
   population.append(c)

# Pass the first population to the world simulator
w.setNextGeneration(population)

# Runs the simulation to evaluate the first population
w.evaluate(numTurns)

# Show the visualisation of the initial creature behaviour (you can change the speed of the animation to 'slow',
# 'normal' or 'fast')
w.show_simulation(titleStr='Initial population', speed='normal')

for i in range(numGenerations):
    print("\nGeneration %d:" % (i+1))

    # Create a new population from the old one
    population = newPopulation(population)

    # Pass the new population to the world simulator
    w.setNextGeneration(population)

    # Run the simulation again to evaluate the next population
    w.evaluate(numTurns)

    # Show the visualisation of the final generation (you can change the speed of the animation to 'slow', 'normal' or
    # 'fast')
    if i==numGenerations-1:
        w.show_simulation(titleStr='Final population', speed='normal')
1 Answer

I think there are two problems with your approach.

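First, your genetic algorithm includes crossover but no mutation at all. In a GA, crossover drives convergence, while mutation is the only "exploration" operator. That means your creatures are stuck with whatever genes happened to exist in their small initial population, and even under moderate selection pressure they will quickly converge to being identical to one another. A common way to add mutation is to assign a random value to each position in each child's genome with a small probability (e.g., 0.01 * 1/number_of_genes). Some researchers prefer higher values. I'm not sure it has been definitively shown which is better, and it probably depends on your problem.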
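
A minimal sketch of per-gene mutation along those lines is below. The function name `mutate`, its parameters, and the default rate of 1/number_of_genes are assumptions made for illustration, not part of your code or of any library; the replacement values are drawn from the same [0, 1) range as your initial weights.

```python
import numpy as np

def mutate(chromosome, rate=None, rng=None):
    """Return a copy of `chromosome` in which each gene is independently
    replaced by a fresh random value with probability `rate`."""
    rng = np.random.default_rng() if rng is None else rng
    chromosome = np.asarray(chromosome, dtype=float)
    if rate is None:
        # Assumed default: on average about one mutated gene per child
        rate = 1.0 / chromosome.size
    # Boolean mask marking the genes selected for mutation
    mask = rng.random(chromosome.shape) < rate
    mutated = chromosome.copy()
    # New values use the same [0, 1) range as the initial random weights
    mutated[mask] = rng.random(int(mask.sum()))
    return mutated
```

You would apply it to each child right after crossover, for example `child1.chromosome = mutate(child1_chromosome)` inside `crossOver`, and then experiment with the rate.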

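Second, discarding dead agents may not be the best selection mechanism. You might get more interesting behaviour if you tie reproduction to something else (e.g., eating a lot of food while still alive). Right now, your fitness function probably incentivises agents to hide in a corner and do nothing, since that maximises their chance of surviving until the end of the simulation.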
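
One possible way to do that, sketched under assumptions: score survival time as a fraction of the run so that it no longer dominates, give the energy term more weight, and pick parents with fitness-proportional selection so that no creature is discarded outright. The names `fitness` and `pick_parents` and the `energy_weight` value are hypothetical, and the right weighting depends on the scale of the energy values in your simulator, which I don't know.

```python
import random

def fitness(energy, time_alive, num_turns, energy_weight=10.0):
    """Hypothetical fitness: fraction of the run survived plus a weighted
    energy term, so that eating counts for more than merely lasting."""
    return time_alive / num_turns + energy_weight * energy

def pick_parents(population, scores, k=2, rng=random):
    """Fitness-proportional selection: every creature can be chosen,
    but higher-scoring ones are chosen more often."""
    # Small floor so creatures with a score of zero keep a nonzero chance
    weights = [s + 1e-9 for s in scores]
    return rng.choices(population, weights=weights, k=k)
```

Calling `pick_parents(old_population, scores)` once per child, instead of pairing neighbours in a pre-filtered elite list, keeps some low-fitness genes in circulation, which also slows the convergence described above.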

Hope this helps.