Fermat's Library | DeepStack: Expert-Level Artificial Intelligence in Heads-Up No-Limit Poker annotated/explained version.

exceedsexperthumanplayersinmanygames,e.g.,backgammon(1),checkers(2),chess(3),

Jeopardy!(4),Atarivideogames(5),andgo(6).Thesesuccessesallinvolvegameswith

informationsymmetry,whereallplayershaveidenticalinformationaboutthecurrentstateof

thegame.This propertyofperfect informationisalso attheheart ofthealgorithms thatenabled

these successes, e.g., local search during play (7,8).

Thefounderofmoderngametheoryandcomputingpioneer,vonNeumann,envisioned

reasoning in games without perfect information.“Real life is not like that.Real life consists of

blufﬁng,oflittletacticsofdeception,ofaskingyourselfwhatistheothermangoingtothink

Imeantodo.Andthatiswhatgamesareaboutinmytheory.”(9)Onegamethatfascinated

vonNeumannwaspoker,whereplayersaredealtprivatecardsandtaketurnsmakingbetsor

blufﬁng on holdingthe strongest hand, calling opponents’ bets, or folding and giving up onthe

handandthebetsalreadyaddedtothepot.Pokerisagameofimperfectinformation,where

players’ private cards give them asymmetric information about the state of game.

Heads-upno-limitTexashold’em(HUNL)isatwo-playerversionofpokerinwhichtwo

cardsareinitiallydealtface-downtoeachplayer,andadditionalcardsaredealtface-upin

threesubsequentrounds.Nolimitisplacedon thesizeofthebetsalthoughthereisan overall

limittothetotalamountwageredineachgame(10).AItechniqueshavepreviouslyshown

success inthe simpler gameof heads-uplimit Texas hold’em, whereall betsare of aﬁxed size

resultinginjustunder10

decisionpoints(11).Bycomparison,computershaveexceeded

experthumanperformanceingo(6),aperfectinformationgamewithapproximately10

170

decision points (12).The imperfect information game HUNLis comparable insize to go,with

the number of decision points exceeding 10

160

(13).

Imperfectinformationgamesrequiremorecomplexreasoningthansimilarlysizedperfect

informationgames.Thecorrectdecisionataparticularmomentdependsupontheprobability

distributionoverprivateinformationthattheopponentholds,whichisrevealedthroughtheir

pastactions.However,howouropponent’sactionsrevealthat informationdependsupontheir

knowledgeofourprivateinformationandhowouractionsrevealit.Thiskindofrecursive

reasoningiswhyonecannoteasilyreasonaboutgamesituationsinisolation,whichisatthe

heartofheuristicsearchmethodsforperfectinformationgames.CompetitiveAIapproaches

in imperfect information games typically reason about the entiregame and produce a complete

strategypriortoplay(14–16).Counterfactualregretminimization(CFR)(14,17,18)isonesuch

techniquethatusesself-playtodorecursivereasoningthroughadaptingitsstrategyagainstitself

oversuccessiveiterations.Ifthe gameistoolarge tobesolved directly,thecommonresponse

istosolve asmaller,abstractedgame.Toplaytheoriginalgame,onetranslatessituationsand

actions from the original game to the abstract game.

AlthoughthisapproachmakesitfeasibleforprogramstoreasoninagamelikeHUNL,it

doessobysqueezingHUNL’s10

160

situationsdowntotheorderof10

abstractsituations.

Likelyasaresultofthislossofinformation,suchprogramsarebehindexperthumanplay.In

2015, thecomputer programClaudico lost toa teamof professional pokerplayers by amargin

of91mbb/g(19),whichisa“hugemarginofvictory”(20).Furthermore,ithasbeenrecently

shownthatabstraction-basedprogramsfromtheAnnualComputerPokerCompetitionhave

Let's block ads! (Why?)

from Poker

Low Flops

Header Ads

Fermat's Library | DeepStack: Expert-Level Artificial Intelligence in Heads-Up No-Limit Poker annotated/explained version.

Post a Comment

No comments

Like us on Facebook

Ad

Categories

Popular Posts

Blog Archive

Ads

Tags

Low Flops

Header Ads

Fermat's Library | DeepStack: Expert-Level Artificial Intelligence in Heads-Up No-Limit Poker annotated/explained version.

Related Posts

Post a Comment

No comments

Like us on Facebook

Ad

Categories

Popular Posts

Blog Archive

Ads

Tags