

Experiment 4: Game Search (3 class hours)
Class: 计科041    Name: 陆宇海    Student ID: 0407100232

I. Objective

Become familiar with the basic ideas and key implementation techniques of game-playing (adversarial) search, implement the generic minimax and alpha-beta pruning algorithms in Python, and verify them experimentally.

II. Principles

Game playing is a field in which artificial intelligence has achieved great success, the famous Deep Blue system being one example.

The foundation of all computer game-playing programs (or systems) is the alpha-beta pruning algorithm, that is, the minimax algorithm with pruning applied on top of it.

A firm grasp of these two algorithms is enough to solve most problems in the game-playing domain (although the support of a large database may of course be needed).
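To make the relationship between the two algorithms concrete, here is a minimal, self-contained sketch, independent of the gameproject code used below: minimax computes the backed-up value of an explicit game tree, and alpha-beta computes the same root value while skipping branches that cannot affect the result.

# A minimal sketch (not part of gameproject): a tree is either a leaf value
# or a list of subtrees.

def minimax_value(tree, maximizing):
    if not isinstance(tree, list):        # leaf: its static value
        return tree
    values = [minimax_value(sub, not maximizing) for sub in tree]
    return max(values) if maximizing else min(values)

def alphabeta_value(tree, maximizing, alpha=float("-inf"), beta=float("inf")):
    if not isinstance(tree, list):
        return tree
    for sub in tree:
        val = alphabeta_value(sub, not maximizing, alpha, beta)
        if maximizing:
            alpha = max(alpha, val)
        else:
            beta = min(beta, val)
        if alpha >= beta:                 # remaining siblings cannot matter: prune
            break
    # a pruned node returns a bound, but at the root the values agree
    return alpha if maximizing else beta

tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]   # a classic textbook example
assert minimax_value(tree, True) == alphabeta_value(tree, True) == 3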

III. Requirements

1. A Python interpreter, together with a development and debugging environment such as IDLE.

2. The Python files supplied with this experiment; unpack the archive gameproject.rar.

IV. Contents

1. Implement the minimax algorithm.
2. Implement the alpha-beta pruning algorithm.
3. Test the algorithms on the game of Tic-Tac-Toe (TicTacToe).
4. Test the algorithms on the three-pile game of Nim.

V. Procedure

1. Formalize Tic-Tac-Toe as a search problem:

import tictactoe
initialTTTState = tictactoe.TicTacToeGameState()

First try playing a game against the random Tic-Tac-Toe agent (it simply moves at random, trusting to luck):

import games
import gameagents
games.runGame(initialTTTState,
              {"X": gameagents.HumanGameAgent(),
               "O": gameagents.RandomGameAgent()},
              False, False)

The output is:

-------------
2 |   |   |   |
-------------
1 |   |   |   |
-------------
0 |   |   |   |
-------------
    0   1   2
Your move? 0,0
Opponent's move was (1, 1)
-------------
2 |   |   |   |
-------------
1 |   | O |   |
-------------
0 | X |   |   |
-------------
    0   1   2
Your move? 0,1
Opponent's move was (2, 0)
-------------
2 |   |   |   |
-------------
1 | X | O |   |
-------------
0 | X |   | O |
-------------
    0   1   2
Your move? 0,2
-------------
2 | X |   |   |
-------------
1 | X | O |   |
-------------
0 | X |   | O |
-------------
    0   1   2
Game finished with utilities {'X': 1, 'O': -1}
{'X': 1, 'O': -1}

# Since the agent's strategy is random, a human can beat it effortlessly.

2. Implement a simple agent that grabs the centre square first but afterwards only fills squares in a fixed order (it can hardly be called intelligent), and play a game against it: simply replace RandomGameAgent() with SimpleTTTGameAgent() in the call from step 1.
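The agents above interact with a game only through the state object built in step 1. Judging from the calls used throughout this report, a state exposes three operations: currentPlayer(), successors() (pairs of legal action and resulting state), and utility() (falsy while the game is running, a payoff dictionary once it ends). As an illustration, and as preparation for the Nim test in section IV, here is a sketch of how Nim could be formalized under that assumed interface; the real classes in gameproject.rar may differ in detail.

# Sketch only: class and player names are assumptions, not gameproject code.
class NimGameState:
    def __init__(self, piles=(3, 5, 7), player="A"):
        self.piles, self.player = tuple(piles), player

    def currentPlayer(self):
        return self.player

    def successors(self):
        # every legal move: take 1..n objects from a single pile
        other = "B" if self.player == "A" else "A"
        for i, n in enumerate(self.piles):
            for take in range(1, n + 1):
                piles = self.piles[:i] + (n - take,) + self.piles[i + 1:]
                yield (i, take), NimGameState(piles, other)

    def utility(self):
        # None (falsy) while the game is running; a payoff dict at the end
        if any(self.piles):
            return None
        # the player to move faces an empty board, so the previous player
        # took the last object and, under the normal-play convention
        # assumed here, wins
        other = "B" if self.player == "A" else "A"
        return {self.player: -1, other: 1}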

The output of a game against this agent is:

-------------
2 |   |   |   |
-------------
1 |   |   |   |
-------------
0 |   |   |   |
-------------
    0   1   2
Your move? 2,0
Opponent's move was (1, 1)
-------------
2 |   |   |   |
-------------
1 |   | O |   |
-------------
0 |   |   | X |
-------------
    0   1   2
Your move? 2,2
Opponent's move was (0, 0)
-------------
2 |   |   | X |
-------------
1 |   | O |   |
-------------
0 | O |   | X |
-------------
    0   1   2
Your move? 0,2
Opponent's move was (1, 0)
-------------
2 | X |   | X |
-------------
1 |   | O |   |
-------------
0 | O | O | X |
-------------
    0   1   2
Your move? 0,1
Opponent's move was (2, 1)
-------------
2 | X |   | X |
-------------
1 | X | O | O |
-------------
0 | O | O | X |
-------------
    0   1   2
Your move? 1,2
-------------
2 | X | X | X |
-------------
1 | X | O | O |
-------------
0 | O | O | X |
-------------
    0   1   2
Game finished with utilities {'X': 1, 'O': -1}

# In the game above I deliberately gave moves away so as to observe the
# agent's move order: it plays the centre square (1, 1) first; once (1, 1)
# is occupied, it scans from row 0 to row 2, and within each row from
# column 0 to column 2, for the first square still free.

Next let the random agent and the simple agent, two so-called intelligent agents, play each other: in the call from step 2, replace HumanGameAgent() with RandomGameAgent().
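Before looking at those games, note that the move order observed above can be captured in a few lines. This is only a reconstruction of the observed behaviour; the function name and board representation are assumptions, not the actual SimpleTTTGameAgent code.

# Hypothetical reconstruction of the observed rule, not gameproject code.
def simple_ttt_move(board):
    """board: dict mapping (col, row) to 'X'/'O'; returns a free square."""
    if (1, 1) not in board:            # take the centre if it is free
        return (1, 1)
    for row in range(3):               # otherwise scan row 0..2,
        for col in range(3):           # column 0..2 within each row
            if (col, row) not in board:
                return (col, row)
    return None                        # board is full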

The output is:

# Round 1:
-------------
2 | X | O | X |
-------------
1 | X | O | X |
-------------
0 | O | X | O |
-------------
    0   1   2
Game finished with utilities {'X': 0, 'O': 0}
# The two sides draw.

# Round 2:
-------------
2 | X | X | X |
-------------
1 | X | O | O |
-------------
0 | O | O | X |
-------------
    0   1   2
Game finished with utilities {'X': 1, 'O': -1}
# The random agent (RandomGameAgent) wins.

# Round 3:
-------------
2 | O | X |   |
-------------
1 | X | O | X |
-------------
0 | X | O | O |
-------------
    0   1   2
Game finished with utilities {'X': -1, 'O': 1}
# The fixed-pattern agent (SimpleTTTGameAgent) wins.

# From these three games we can see that, because neither agent considers
# utility and both play essentially aimlessly, their chances of winning
# against each other are about the same.

3. Implement the minimax algorithm

The minimax algorithm is encapsulated in a class named Minimax:

class Minimax(GameTreeSearcher):
    def getBestActionAndValue(self, currentState):
        # Entry point: compute the best action for the current state
        self.player = currentState.currentPlayer()   # the player to move ("O" or "X")
        return self.maximin(currentState)            # best action and its backed-up utility

    def maximin(self, currentState):
        # Best-move computation when it is Max's turn
        utility = currentState.utility()             # utility of the current position
        if utility: return None, utility[self.player]  # game over: return Max's final utility
        bestAction = None                            # initialise the best action
        bestValue = -9999.0                          # stands in for negative infinity
        for (action, succ) in currentState.successors():  # enumerate every legal move
            val = self.minimax(succ)[1]              # hand the resulting position to Min
                                                     # and take the value of his best reply
            if val > bestValue:
                bestAction, bestValue = action, val  # keep the move of largest value for Max
        return bestAction, bestValue                 # Max's best action and its utility

    def minimax(self, currentState):
        # Min's best-move computation; symmetric to maximin above,
        # so it needs no further comment
        utility = currentState.utility()
        if utility: return None, utility[self.player]
        bestAction = None
        bestValue = 9999.0                           # stands in for positive infinity
        for (action, succ) in currentState.successors():
            val = self.maximin(succ)[1]
            if val < bestValue:
                bestAction, bestValue = action, val
        return bestAction, bestValue

Let the minimax agent play against SimpleAgent:

minimaxAgent = gameagents.UtilityDirectedGameAgent(gameagents.Minimax())
games.runGame(initialTTTState,
              {"X": minimaxAgent,
               "O": gameagents.SimpleTTTGameAgent()},
              True, True)

Note that the minimax agent may spend several minutes thinking about its opening move; please be patient (or let SimpleAgent move first, which is quicker).

The output is:

-------------
2 |   |   |   |
-------------
1 |   |   |   |
-------------
0 |   |   |   |
-------------
    0   1   2
Got state value 0 with best action (0, 0)
Player X agent returned action (0, 0)
actions called 294778 times and successor called 549945 times in 23.2628102175 seconds
-------------
2 |   |   |   |
-------------
1 |   |   |   |
-------------
0 | X |   |   |
-------------
    0   1   2
Player O agent returned action (1, 1)
actions called 0 times and successor called 0 times in 0.00254697175183 seconds
-------------
2 |   |   |   |
-------------
1 |   | O |   |
-------------
0 | X |   |   |
-------------
    0   1   2
Got state value 0 with best action (1, 0)
Player X agent returned action (1, 0)
actions called 3864 times and successor called 7331 times in 0.318435164246 seconds
-------------
2 |   |   |   |
-------------
1 |   | O |   |
-------------
0 | X | X |   |
-------------
    0   1   2
Player O agent returned action (2, 0)
actions called 0 times and successor called 0 times in 0.00385188620339 seconds
-------------
2 |   |   |   |
-------------
1 |   | O |   |
-------------
0 | X | X | O |
-------------
    0   1   2
Got state value 0 with best action (0, 2)
Player X agent returned action (0, 2)
actions called 104 times and successor called 197 times in 0.0147457288565 seconds
-------------
2 | X |   |   |
-------------
1 |   | O |   |
-------------
0 | X | X | O |
-------------
    0   1   2
Player O agent returned action (0, 1)
actions called 0 times and successor called 0 times in 0.00271906066268 seconds
-------------
2 | X |   |   |
-------------
1 | O | O |   |
-------------
0 | X | X | O |
-------------
    0   1   2
Got state value 0 with best action (2, 1)
Player X agent returned action (2, 1)
actions called 8 times and successor called 13 times in 0.00692462310144 seconds
-------------
2 | X |   |   |
-------------
1 | O | O | X |
-------------
0 | X | X | O |
-------------
    0   1   2
Player O agent returned action (1, 2)
actions called 0 times and successor called 0 times in 0.00309676229813 seconds
-------------
2 | X | O |   |
-------------
1 | O | O | X |
-------------
0 | X | X | O |
-------------
    0   1   2
Got state value 0 with best action (2, 2)
Player X agent returned action (2, 2)
actions called 1 times and successor called 1 times in 0.00675309292114 seconds
-------------
2 | X | O | X |
-------------
1 | O | O | X |
-------------
0 | X | X | O |
-------------
    0   1   2
Game finished with utilities {'X': 0, 'O': 0}

# The result, a draw, is a little unexpected; perhaps the minimax agent
# (minimaxAgent) simply landed right on the fixed-pattern agent's
# (SimpleTTTGameAgent's) line of play.

You can also play against the minimax agent yourself:

games.runGame(initialTTTState,
              {"X": gameagents.HumanGameAgent(),
               "O": minimaxAgent},
              True, True)

The output is:

-------------
2 |   |   |   |
-------------
1 |   |   |   |
-------------
0 |   |   |   |
-------------
    0   1   2
Your move? 1,1
Player X agent returned action (1, 1)
actions called 1 times and successor called 0 times in 10.2226939457 seconds
-------------
2 |   |   |   |
-------------
1 |   | X |   |
-------------
0 |   |   |   |
-------------
    0   1   2
Got state value 0 with best action (0, 0)
Player O agent returned action (0, 0)
actions called 29633 times and successor called 55504 times in 2.45857560109 seconds
-------------
2 |   |   |   |
-------------
1 |   | X |   |
-------------
0 | O |   |   |
-------------
    0   1   2
Opponent's move was (0, 0)
-------------
2 |   |   |   |
-------------
1 |   | X |   |
-------------
0 | O |   |   |
-------------
    0   1   2
Your move? 1,0
Player X agent returned action (1, 0)
actions called 1 times and successor called 0 times in 8.68631013159 seconds
-------------
2 |   |   |   |
-------------
1 |   | X |   |
-------------
0 | O | X |   |
-------------
    0   1   2
Got state value 0 with best action (1, 2)
Player O agent returned action (1, 2)
actions called 582 times and successor called 1054 times in 0.0557467499361 seconds
-------------
2 |   | O |   |
-------------
1 |   | X |   |
-------------
0 | O | X |   |
-------------
    0   1   2
Opponent's move was (1, 2)
-------------
2 |   | O |   |
-------------
1 |   | X |   |
-------------
0 | O | X |   |
-------------
    0   1   2
Your move? 0,2
Player X agent returned action (0, 2)
actions called 1 times and successor called 0 times in 17.3988976253 seconds
-------------
2 | X | O |   |
-------------
1 |   | X |   |
-------------
0 | O | X |   |
-------------
    0   1   2
Got state value 0 with best action (2, 0)
Player O agent returned action (2, 0)
actions called 32 times and successor called 52 times in 0.0087340709506 seconds
-------------
2 | X | O |   |
-------------
1 |   | X |   |
-------------
0 | O | X | O |
-------------
    0   1   2
Opponent's move was (2, 0)
-------------
2 | X | O |   |
-------------
1 |   | X |   |
-------------
0 | O | X | O |
-------------
    0   1   2
Your move? 0,1
Player X agent returned action (0, 1)
actions called 1 times and successor called 0 times in 6.34426554213 seconds
-------------
2 | X | O |   |
-------------
1 | X | X |   |
-------------
0 | O | X | O |
-------------
    0   1   2
Got state value 0 with best action (2, 1)
Player O agent returned action (2, 1)
actions called 3 times and successor called 4 times in 0.00612368331713 seconds
-------------
2 | X | O |   |
-------------
1 | X | X | O |
-------------
0 | O | X | O |
-------------
    0   1   2
Opponent's move was (2, 1)
-------------
2 | X | O |   |
-------------
1 | X | X | O |
-------------
0 | O | X | O |
-------------
    0   1   2
Your move? 2,2
actions called 1 times and successor called 0 times in 9.0314******* seconds
-------------
2 | X | O | X |
-------------
1 | X | X | O |
-------------
0 | O | X | O |
-------------
    0   1   2
Game finished with utilities {'X': 0, 'O': 0}

# The minimax agent (minimaxAgent) is indeed not to be underestimated:
# if the human makes no mistake, the best attainable result against it
# is a draw.

4. Implement the alpha-beta pruning algorithm

class AlphaBeta(GameTreeSearcher):
    def getBestActionAndValue(self, currentState):
        # Entry point: obtain the best move and its utility
        self.player = currentState.currentPlayer()   # the player to move
        return self.maximin(currentState, -1, 1)     # best action and its utility for
                                                     # the current position

    def maximin(self, currentState, alpha, beta):
        # Max's strategy
        utility = currentState.utility()
        if utility: return None, utility[self.player]  # game over: return Max's utility
        bestAction = None                            # initialise the best move
        for (action, succ) in currentState.successors():  # every possible move here
            val = self.minimax(succ, alpha, beta)[1]  # pass the resulting position to Min
                                                      # and get the value of his best reply
            if val > alpha:
                bestAction, alpha = action, val      # a reply better than the current lower
                                                     # bound: update the best move and alpha
            if alpha >= beta: break                  # alpha has reached the upper bound beta
                                                     # (in fact the bound of the parent node):
                                                     # ignore all remaining moves, i.e. prune
        return bestAction, alpha

    def minimax(self, currentState, alpha, beta):
        # Min's strategy; symmetric to maximin (the report breaks off above,
        # so this half is reconstructed by symmetry)
        utility = currentState.utility()
        if utility: return None, utility[self.player]
        bestAction = None
        for (action, succ) in currentState.successors():
            val = self.maximin(succ, alpha, beta)[1]
            if val < beta:
                bestAction, beta = action, val       # tighten the upper bound beta
            if alpha >= beta: break                  # prune
        return bestAction, beta
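The report breaks off at this point. As a usage sketch only, by analogy with the minimax call above (and assuming the same UtilityDirectedGameAgent wrapper applies to any GameTreeSearcher), the pruning searcher would be exercised like this:

alphaBetaAgent = gameagents.UtilityDirectedGameAgent(gameagents.AlphaBeta())
games.runGame(initialTTTState,
              {"X": alphaBetaAgent,
               "O": gameagents.SimpleTTTGameAgent()},
              True, True)

If the pruning works as intended, the opening move should require far fewer actions/successor calls than the 294778/549945 recorded for plain minimax above.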
