[intro music]
Hello, Time Itself here with part two of my gamer’s guide to TrueSkill. This guide is
aimed at giving gamers a better understanding of Microsoft’s TrueSkill system and how
it fits in their favorite games. No proficiency in calculus or advanced statistics required.
In part two we’ll talk about what a skill rating actually represents and how the TrueSkill
system is tuned for different games. As noted in part one, when two players face
off, the higher rated player is only more likely to win, not certain. Rating systems
attempt to tie the difference in player ratings to the increased likelihood of winning. With
TrueSkill we’ll talk about the number of rating points of difference between players at which
we expect the higher rated player to win roughly eighty percent of the time. This is the first
tuning parameter of the TrueSkill system. Each group of players who we can separate
by that eighty-twenty split is called a link in the skill chain. Because the TrueSkill
system rates players from zero to fifty the number of ratings points that represents a
link in the skill chain needs to be found and tuned. So the length of the skill chain
could be different, not just for every game but also for each playlist or game mode in each
game. If that number for the eighty-twenty difference is small, the game is said to have
a long skill chain, because there are many links, and to be a high skill game. If the number
is high it has a short skill chain and is said to be more luck based. As an example,
in Age of Empires 3 a difference of about four TrueSkill rating points corresponded
to an eighty percent chance of the higher rated player winning.
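To put rough numbers on the skill chain idea, here is a minimal Python sketch. It assumes a simple Gaussian performance model in which a single noise parameter, called `beta` here, sets the size of one link, and it ignores each player's rating uncertainty. The function names, the `beta` name, and the model itself are illustrative assumptions, not something stated in this guide. Under this model a gap of exactly one `beta` comes out a little under the "roughly eighty percent" figure.

```python
import math


def win_probability(rating_gap, beta):
    """Chance that the higher rated player wins, under an assumed Gaussian model.

    Each player's performance is modeled as their rating plus independent
    normal noise with standard deviation `beta`. Rating uncertainty is ignored.
    """
    # The difference of two noisy performances has standard deviation sqrt(2) * beta.
    z = rating_gap / (math.sqrt(2.0) * beta)
    # Standard normal CDF written with the error function.
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def chain_length(points_per_link, scale=50.0):
    """Number of links that fit on a 0-to-`scale` rating scale."""
    return scale / points_per_link


if __name__ == "__main__":
    points_per_link = 4.0  # the Age of Empires 3 figure quoted above
    print("links on a 0-50 scale:", chain_length(points_per_link))
    for links_apart in (0.5, 1, 2, 3):
        gap = links_apart * points_per_link
        p = win_probability(gap, beta=points_per_link)
        print(f"{links_apart} link(s) apart: {p:.3f}")
```

With four points per link, the 0 to 50 scale holds 12.5 links. The loop also shows how quickly the expected win rate climbs once players are two or three links apart.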
But these terms, luck based or high skill games, are a bit misleading, so let me say a bit more
before you tell me how your favorite game has the longest skill chain ever. A game like
poker may feature a significant amount of randomness, and thus luck, and require many
hands and games to be played to decide the better player. But we get back to this needing
an eighty-twenty difference in winning to add another link to our skill chain. I’d
say that a good part of this isn’t just how the top tier players stack up, but how
many levels of noobs you can establish. My guess is that games with long skill chains
also tend to be those with long, steep learning curves.
In part one we talked about TrueSkill’s two values for every player, rating and uncertainty
about that rating. Something that I need to be very clear about is that the result of
any game will not increase a player’s uncertainty number. Either the current ratings correctly
predicted the outcome and the system is more certain of a player’s rating, or an unexpected
outcome occurred and the system now has more information to make better ratings. Since
players’ skill levels can and do change, TrueSkill adds another parameter that only
serves to increase each player’s uncertainty before calculating new updated ratings after
each game. This is what keeps the system dynamic. If the amount of uncertainty added is too
high, the whole system won’t settle down and players’ ratings will fluctuate wildly.
Too low, and players find that their once accurate rating fails to match their current level
of skill. Something to note is that, because conservative
ratings are what players see, as covered in part one, if the result of a game was already
very likely, a player with a much higher rating winning, the reduction in uncertainty from
the result of the game may not be enough to offset the uncertainty added as part of every
ratings calculation. And the winning player may see a slight decrease in conservative
rating. Often, as part of tuning and maintaining the rating system, this will be prevented from
happening, but you may see it when some pro goes noob bashing. Rating always goes up for
wins and down for losses, but in some situations the conservative rating may not.
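The case of a heavy favorite winning can be checked with a small sketch. It uses the two-player, no-tie update from the published TrueSkill description, and it assumes the conservative rating is the rating minus three times the uncertainty. The names `beta` and `tau`, their default values, and the example players are illustrative assumptions, not values from any particular game.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def conservative(mu, sigma):
    """Displayed rating, assumed here to be mu - 3 * sigma."""
    return mu - 3.0 * sigma


def update_winner(mu_w, sigma_w, mu_l, sigma_l, beta=25.0 / 6.0, tau=25.0 / 300.0):
    """Return the winner's new (mu, sigma) after a two-player game with no tie."""
    # Step 1: the dynamics parameter (tau) only ever adds uncertainty.
    var_w = sigma_w ** 2 + tau ** 2
    var_l = sigma_l ** 2 + tau ** 2

    # Step 2: how expected was the result? A large t means "fully expected".
    c = math.sqrt(2.0 * beta ** 2 + var_w + var_l)
    t = (mu_w - mu_l) / c
    v = STD_NORMAL.pdf(t) / STD_NORMAL.cdf(t)
    w = v * (v + t)

    # Step 3: the win nudges mu up and shrinks the inflated variance.
    new_mu = mu_w + (var_w / c) * v
    new_var = var_w * (1.0 - (var_w / c ** 2) * w)
    return new_mu, math.sqrt(new_var)


if __name__ == "__main__":
    # A well-established pro beating a well-established beginner.
    mu, sigma = 40.0, 1.0
    new_mu, new_sigma = update_winner(mu, sigma, 15.0, 1.0)
    print("rating change:", new_mu - mu)
    print("conservative rating change:",
          conservative(new_mu, new_sigma) - conservative(mu, sigma))
```

In this example the rating itself moves up by a tiny amount, but the uncertainty added in step 1 is larger than the reduction earned in step 3, so the conservative rating slips slightly.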
We’ve also hit on one of the interesting parts of designing a rating system. That is,
while we’ve been talking about the number of rating points for a link in our skill chain,
what happens when players who are two links apart play? What about players half a link
apart? These assumptions are built into the design of the system. But TrueSkill does have
one additional way to handle this in yet another tuning parameter. That is the probability
of a tie, where neither player wins. Usually, with a large enough sample of games, this can
just be the percentage of ties that have happened in previously played games. If two nearly
equally skilled players face off, and a tie is a possible result, the system will gain
more information. This gets back to the whole idea of only using wins or losses to update ratings.
In fact, when a game doesn’t result in a tie when it could have, the system gives more weight
to the result. So, the parameter for the probability of a tie needs to go into every new rating
calculation. Something to note: one of the assumptions of the TrueSkill system is that
the game has no practical skill ceiling. For example, tic-tac-toe has a skill ceiling. Most
people who play can expect to tie in every single game.
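As a rough illustration of how a tie probability could feed into the calculation, the sketch below estimates it from past results and converts it into a "tie margin": how close two performances have to be for the game to count as a tie. The conversion assumes the Gaussian performance model from the published TrueSkill description. The function names, the `beta` value, and the sample counts are made up for illustration.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def tie_probability_from_history(ties, games_played):
    """Fraction of previously played games that ended in a tie."""
    if games_played <= 0:
        raise ValueError("need at least one played game")
    return ties / games_played


def tie_margin(tie_probability, beta, total_players=2):
    """Performance gap below which a game is treated as a tie.

    Assumes equally rated sides and normal performance noise `beta` per
    player, so that P(tie) = 2 * Phi(margin / (sqrt(n) * beta)) - 1.
    """
    if not 0.0 <= tie_probability < 1.0:
        raise ValueError("tie_probability must be in [0, 1)")
    scale = math.sqrt(total_players) * beta
    return STD_NORMAL.inv_cdf((tie_probability + 1.0) / 2.0) * scale


if __name__ == "__main__":
    # Hypothetical history: 120 ties in 1000 games.
    p_tie = tie_probability_from_history(120, 1000)
    print("estimated tie probability:", p_tie)
    print("tie margin in rating points:", tie_margin(p_tie, beta=25.0 / 6.0))
```

A game where ties never happen gets a margin of zero, and the margin grows as ties become more common.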
In the end, players within one link on the skill chain should see a reasonable correlation
between difference in rating and likelihood of winning. It’s when players three or more
links apart face off that the system can act strangely. The odds of the lower rated player
winning the game outright are very small as far as the system is concerned. Instead, other
issues in online multiplayer games, like bugs or connectivity problems, are just as likely
to influence the result of the game as the players’ respective
skills. The TrueSkill system doesn’t just work out
of the box. There are a few parameters that need to be tuned to make it work as intended.
The TrueSkill rating system is great at quickly getting players’ ratings to their proper values.
However, in some cases it can be painfully slow to change after that.
I hope this has helped give some additional insight into the TrueSkill system. Join me
next time for part three where we get to put all of this together and take on the phrase,
“get my 50”. If you’d like to know even more about the TrueSkill system check out
the links in the description below. Thanks for watching, guys, and I’ll see you
next time.
[outro music]