View Full Version : Mathemagic – Breaking Down Probabilities
Phoenix Ignition
08-26-2013, 02:37 PM
Recently I’ve seen a lot of decks pop up that try to assemble 3+ card combos to beat the opponent, with players stating things like “one out of every 10 times I goldfish, I get a turn 2 win!” Anecdotal evidence obscures what will actually happen when you play that deck, and so do small sample sizes. Maybe that really did happen to the person who posted the deck, but we can’t all spend our time goldfishing decks we suspect won’t work. So how do we answer the question: should I play this deck?
But what about normal decks? A real question came up recently from my friend: should he play Mox Diamond or Green Sun's Zenith for a mana boost in a classic-style Rock deck? He said he would very often win if he could play a turn 1 Dark Confidant, justifying Mox Diamond, but if that doesn’t happen very frequently, he would rather have GSZ, since it can always get Dryad Arbor on turn one.
Chance of getting a turn 1 DC? 0.834 (chance of getting 2 lands out of a 23-land deck) * 0.309 (chance of getting at least 1 Mox Diamond) * 0.259 (chance of getting at least 1 Dark Confidant) = 6.7% (see the note at the bottom if you're mathematically inclined). Chance of getting a turn 1 GSZ that can be used for a Dryad Arbor, with 3 GSZ and 1 Dryad Arbor in the deck? 0.315 (chance of getting at least 1 GSZ) * 0.948 (chance of getting at least 1 land that isn’t Dryad Arbor) * 0.914 (chance of NOT drawing Dryad Arbor) = 27.3%
------------ Quick Probability Discussion (skip if you know probabilities at least a little)--------
The probability of getting one thing is usually easy to work out in games. What’s the chance of my top card being a land? If I have 23 lands in my deck and 60 cards in total, it’s 23/60 = 0.3833 (or 38.33%). If I draw that card and it’s a land, the chance that the new top card is a land is 22/59 = 0.3729, slightly lower.
The probability of both of these being a land is just the two probabilities multiplied together (0.3833 * 0.3729 = 0.143)
When we do this repeatedly (e.g. “what’s the chance that at least one of the top 7 cards of my deck is a land?”), the distribution involved is called the Hypergeometric Distribution. Big word, easy enough concept: you just have to keep track of the number of cards in the “Population” (in Magic terms, the “Deck”) and how many cards you specifically care about (in this example, the lands), known as the “Successes in the Population.”
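For readers who prefer code to calculators, the same math fits in a few lines of Python. This is a minimal sketch using only the standard library; the function name is mine:

```python
from math import comb

def hypergeom_at_least(pop, successes, sample, want):
    """P(at least `want` of the `successes` cards show up when drawing
    `sample` cards, without replacement, from a `pop`-card deck)."""
    return sum(
        comb(successes, k) * comb(pop - successes, sample - k)
        for k in range(want, min(successes, sample) + 1)
    ) / comb(pop, sample)

# Top card of a 23-land, 60-card deck is a land:
print(hypergeom_at_least(60, 23, 1, 1))   # 23/60 ≈ 0.3833
# At least one land among the top 7 cards:
print(hypergeom_at_least(60, 23, 7, 1))   # ≈ 0.9733
```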
------------ Quick Discussion Over -----------------------------
How to do these for your own deck and situation
Now we can get to the thing I really wanted to cover: how to use a Hypergeometric Distribution Calculator to find probabilities for your own deck. ***Disclaimer: I am not at all affiliated with this website, it’s just a really easy-to-use, free Hypergeometric Distribution Calculator***
Here’s the one I usually use when calculating simpler things: http://stattrek.com/online-calculator/hypergeometric.aspx although if you want to get into the meat of probabilities in Magic, learning how to use the Excel functions will get you more precise numbers and more in-depth scenarios (like calculating what casting a Brainstorm does to your chances of finding a particular card).
How to use it:
There are 4 necessary fields to fill in.
Population Size: How many cards are left in the deck when you start drawing cards?
Number of Successes in Population: How many cards are you interested in? (ex. If you are doing a calculation on the probability of drawing one of your 4 Force of Wills, this would be 4)
Sample Size: How many cards are you going to draw (7 for your opening hand, 8 for "on the draw" hands, 1 if you care only about the top card of the deck)
Number of Successes in Sample: How many of the “Success” cards do you want to draw?
Quick Guide to calculating multiple things:
Getting back to the 3-card combo example: let’s say you need to draw 1 each of 3 different cards in your deck, and you run 4 copies of each.
Step 1: Find the chance of getting the first thing
(Population size = 60, # of successes in population = 4, Sample size = 7, Successes in sample = 1)
Plug those in, hit calculate, and the bottom number listed is the one you’re most interested in (the probability of getting X greater than or equal to 1) = 0.399
Step 2: Find the chance of getting the second thing. The difference from step 1 is that now you have 59 cards to choose from, and only 6 hand slots still available (since 1 is already taken by the card from step 1)
(Population size = 59, # of successes in population = 4, Sample size = 6, Successes in sample = 1) = 0.356
Step 3: Find the chance of getting the third thing. Now there are 58 cards left in the deck, and 5 hand slots left available to draw that card.
(Population size =58, # of successes in population = 4, Sample size 5, Successes in sample = 1) = 0.310
Step 4: Multiply the probabilities from everything you wanted. 0.399 * 0.356 * 0.310 = 0.044
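The four steps above can also be scripted, which makes it easy to rerun with different card counts. A sketch: `p_at_least_one` is my name for the calculator's bottom number (P of X greater than or equal to 1).

```python
from math import comb

def p_at_least_one(pop, successes, sample):
    """P(at least one of `successes` cards among `sample` draws,
    without replacement, from a `pop`-card deck)."""
    return 1 - comb(pop - successes, sample) / comb(pop, sample)

# Steps 1-3: shrink the deck and the available hand slots each time
p1 = p_at_least_one(60, 4, 7)   # ≈ 0.399
p2 = p_at_least_one(59, 4, 6)   # ≈ 0.356
p3 = p_at_least_one(58, 4, 5)   # ≈ 0.310
# Step 4: multiply everything you wanted
print(round(p1 * p2 * p3, 3))   # 0.044
```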
Overall, the point of me writing this is to hopefully get a few more people thinking in terms of the math. Hitting the “nuts” hand may win you the game immediately, but is it worth playing that deck in the first place?
Let me know if there are any typos or scenarios you have questions about.
Note: The probability given is very slightly off, as I use the probability of getting AT LEAST one of the proposed cards, but the following calculations use a population as if I had gotten EXACTLY 1 copy. Although not completely accurate, this error amounts to almost nothing overall.
apple713
08-26-2013, 03:46 PM
I'm confused, what's the point of this post? Are you trying to say 3-card combos shouldn't be played, or what?
I don't think anyone would say that a 3-card combo is preferable to a 2-card combo. This is the same reason that Sneak Attack is more consistent than Omni: 2-card combo vs. 3-card combo. Ad Nauseam is almost a 1-card combo in many ways. There are not many 3-card combos that are successful, because of the disruption in the format.
Phoenix Ignition
08-26-2013, 03:56 PM
Nope, not making any claims on what should or shouldn't be played, just trying to show people how to find probabilities for their own decks. I've seen a lot of random combo decks popping up with erroneous claims made on how good they are.
This method is useful for calculating the probability of anything, from my second example (should I play Mox Diamond for the chance I can play my 2cmc card on turn 1?) to the difference between a 1-land and a 2-land Belcher deck, to how many blue cards you need to reliably cast Force of Will.
Pro poker players know all the percentages of drawing into a given card; I just think it's useful for many people on these forums to know similar probabilities while deck building.
apple713
08-26-2013, 04:15 PM
Nope, not making any claims on what should or shouldn't be played, just trying to show people how to find probabilities for their own decks. I've seen a lot of random combo decks popping up with erroneous claims made on how good they are.
This method is useful for calculating the probability of anything, from my second example (should I play Mox Diamond for the chance I can play my 2cmc card on turn 1?) to the difference between a 1-land and a 2-land Belcher deck, to how many blue cards you need to reliably cast Force of Will.
Pro poker players know all the percentages of drawing into a given card; I just think it's useful for many people on these forums to know similar probabilities while deck building.
Cockatrice runs all the probabilities for you, I believe. All you have to do is build your deck and hit analyze.
Phoenix Ignition
08-26-2013, 04:28 PM
Cockatrice runs all the probabilities for you, I believe. All you have to do is build your deck and hit analyze.
It has an analyze button, but you can't tell it to do some specific things.
Tell me, using Cockatrice, is it better for me to run Treasure Hunt or Mulch in my 43 land deck (which plays 39 lands, believe it or not), if I want to see more cards and don't care where they go? If I'm running 8 free cycling cards (Gitaxian Probe and Street Wraith), how does that affect my chance of drawing a hand that lets me play a Violent Outburst by turn 2?
While these are slightly more difficult problems than I elaborated on in the OP, I gave the tools to calculate this sort of thing. If you don't want to do it, by all means don't. I don't think less of people for not knowing the probabilities; everyone has fun playing this card game in different ways. For those who were interested, hopefully they see it's easy enough to do for themselves.
I fully support this movement.
Excel also has a hypergeometric function which is insanely useful if you know how to set up the scenarios. Just ask online, and I'm sure someone can help set it up.
thecrav
08-26-2013, 09:15 PM
I fully support this movement.
Excel also has a hypergeometric function which is insanely useful if you know how to set up the scenarios. Just ask online, and I'm sure someone can help set it up.
Koby, you're someone, right?
HALP
Phoenix Ignition
08-26-2013, 10:42 PM
Excel also has a hypergeometric function which is insanely useful if you know how to set up the scenarios. Just ask online, and I'm sure someone can help set it up.
Oh, I agree completely, here's one I made for finding out how many cards Treasure Hunt would likely draw you based on the number of land cards in the deck (I assumed you were on turn 2 and had drawn an average distribution of cards out of your deck already):
http://i.imgur.com/MV5Iv0D.jpg?1
It's actually pretty easy to use after you've learned what numbers to plug in and where. I'll help people out if they have specific questions, and if there's enough interest I'll write a tutorial on using Excel for the more advanced stuff. Like I said, it's especially useful for knowing your Belcher percentages with 1 land left in the deck.
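For those without Excel, the same Treasure Hunt question can be answered by brute force in Python. This is a Monte Carlo sketch under simplified assumptions (a fresh, full deck and no cards drawn yet, unlike the turn-2 adjustment in the spreadsheet), using the 39-land, 60-card build mentioned later in the thread as an example:

```python
import random

def treasure_hunt_avg(lands, deck_size, trials=100_000):
    """Estimate the average number of cards a resolved Treasure Hunt
    puts in hand: reveal until a nonland, then take everything revealed."""
    deck = [True] * lands + [False] * (deck_size - lands)
    total = 0
    for _ in range(trials):
        random.shuffle(deck)
        drawn = 0
        for is_land in deck:
            drawn += 1
            if not is_land:
                break
        total += drawn
    return total / trials

# 39 lands in 60 cards: the exact answer is (60+1)/(21+1) = 61/22 ≈ 2.77
print(treasure_hunt_avg(39, 60))
```

The closed form (deck size + 1) / (nonlands + 1) is a standard expected-position result, handy for sanity-checking the simulation.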
phazonmutant
08-27-2013, 01:58 AM
I'm definitely a fan of maths getting used more in discussion. I've "livened up" storm discussions with some hypergeometric shenanigans, but you're right that this really is a fundamental tool that other deck designers should use as well.
0.948 (chance of getting at least 1 land that isn’t Dryad Arbor)* 0.914 (chance of NOT drawing Dryad Arbor)
I think this is a little off. The chance of drawing one Arbor in the opener is 0.1167, so the chance of not drawing Arbor should be 1 - 0.1167 = 0.8833 or 88.33%.
Also, in a 23-land Junk / Rock deck, there are 4-6 lands that will be unable to cast GSZ (Wastelands, Maze of Ith, Scrubland, Tower of the Magistrate, Basics), so the first number is probably also off. There is also the probability of getting a hand with Deathrite Shaman, which is strictly better than Turn 1 GSZ -> Arbor.
edit: And then you have to compare the chance of it being good to the chance of it being bad (aka drawing the Arbor, which is even worse in Junk/Rock than in Elves).
I am also in full support of this. Probabilities relating to Dryad Arbor have been a point of discussion in the Elves thread as well.
Phoenix Ignition
08-27-2013, 03:07 AM
I think this is a little off. The chance of drawing one Arbor in the opener is 0.1167, so the chance of not drawing Arbor should be 1 - 0.1167 = 0.8833 or 88.33%.
While this is accurate, it's not correct in context. Since I am already taking 2 cards out of the deck at this point, what with having GSZ and one of the 22 lands (thus leaving the deck at 58 instead of 60), I am also constraining my hand to have only 5 available slots in which to find Dryad Arbor. Therefore, only those 5 cards need to NOT be Dryad Arbor, giving us Sample size = 5, Population size = 58. We have to be careful when doing straight-up probabilities: drawn cards are not put back, whereas the probability most people learned in school assumes replacement.
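The conditioning argument can be checked directly. With 1 Arbor left among 58 cards and only 5 hand slots that could hold it, the hypergeometric "zero successes" term collapses to a single fraction (this check is mine, not part of the original post):

```python
from math import comb

# P(none of the 5 remaining hand slots is the lone Dryad Arbor),
# given that GSZ and one other land already occupy 2 of the 7 slots
p_no_arbor = comb(57, 5) / comb(58, 5)   # pop=58, 1 "success", sample=5
print(round(p_no_arbor, 3))   # 0.914, matching the number in the OP
# Equivalent shortcut: 53/58, since only the Arbor's position matters
```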
Also, in a 23-land Junk / Rock deck, there are 4-6 lands that will be unable to cast GSZ (Wastelands, Maze of Ith, Scrubland, Tower of the Magistrate, Basics), so the first number is probably also off. There is also the probability of getting a hand with Deathrite Shaman, which is strictly better than Turn 1 GSZ -> Arbor.
edit: And then you have to compare the chance of it being good to the chance of it being bad (aka drawing the Arbor, which is even worse in Junk/Rock than in Elves).
I totally agree with your analysis, but for the specific question I was asked (how often will I get a turn 1 Dark Confidant into play?), the relative goodness of each card was already decided upon. Basically he liked GSZ better, but turn 1 Dark Confidant had won him games before. If you have lands that wouldn't do well on turn 1, like Tower, you just don't count them in the 23 (I only counted 22 lands as viable turn 1 plays, since Dryad Arbor doesn't work there). I suppose I was wrong in that he ran a basic Swamp and a basic Plains, bringing the "playable" turn 1 lands down to 20, but either way, the example still works.
catmint
08-28-2013, 07:46 AM
Awesome OP. I like the idea of talking more math in MTG. People are so often polarized in their opinions, labeling cards/states/matchups "bad" or "good", "often" or "never" based on short-term results. It's the same mechanism that makes bad poker players, and in this case it makes bad Magic players. Hence understanding variance should also be a priority.
It would be great to edit the opening post into a collection of the most common generic "math problems" and/or give guides/templates on how to solve them, which can be applied to different situations. Then everyone can refer to this post when talking math, so problems don't have to be resolved redundantly (and sometimes incorrectly).
It is also very common that people post something with the comment "I don't have the math for this". These math problems could be posted here and some "geeks" or people liking to solve these problems can work on them.
I'm confused, what's the point of this post? Are you trying to say 3-card combos shouldn't be played, or what?
I think the point is that some people (myself included) use math to analyze situations and inform decisions. Some people playtest and implicitly assume their sample size was big enough to give them a good sense of the matchup (10 games really isn't...). Some just use anecdotal evidence (OMG I drew a T1 win 6 times against Luke last week, sucka! Nourishing Lich FTW). And others just don't care either way.
I assume this thread is for the first type of people. I, personally, have been figuring out the probabilities by doing the combinations (nCr) and factorials and stuff by hand with a calculator and that can be a pain in the ass. Didn't even think to look for a Hypergeometric calculator. That looks incredibly useful! Thanks for the link!
apple713
08-28-2013, 12:29 PM
I've made a few tables that I can post when I get home, but I'm not sure where to post an Excel file?
It's got the chance of getting X lands by turn Y. So if you play 18 lands, what are your chances of naturally drawing 3 lands by turn 3? Naturally meaning without Brainstorms/Ponders.
It also has basic probabilities, like the chance of getting 1 of 10 cards by any turn.
I wonder if I can design a more user-friendly version so people can easily input what they want. It becomes more difficult when you want to find the percentage chance of drawing 1 out of 8 cards and then 1 out of another set of 8. You just multiply the results, but what if you Brainstorm somewhere in there? You see an additional 3 cards, and that changes the percentages. If you Ponder, then shuffle, then Brainstorm, you've seen 6 additional cards, or the same if you fetch in between. These affect the percentages a lot. It's almost like drawing a new hand.
I'll see if I can brew something up. I still need to know where I can post an Excel document, though.
Google Docs (docs.google.com) is a good start.
Phoenix Ignition
08-28-2013, 09:23 PM
It would be great to edit the opening post into a collection of the most common generic "math problems" and/or give guides/templates on how to solve them, which can be applied to different situations. Then everyone can refer to this post when talking math, so problems don't have to be resolved redundantly (and sometimes incorrectly).
I'd be happy to format the OP to include additional things. Actually I could write a similar "primer" on things such as variance, how to factor in cantrips, how many of X card to play if you want to draw Y of them by turn Z, and other common math problems in Magic. It'll be a few weeks though, I'm way too busy right now.
It is also very common that people post something with the comment "I don't have the math for this". These math problems could be posted here and some "geeks" or people liking to solve these problems can work on them.
This sounds like a good idea as well, whenever I get the time I enjoy finding out the probabilities for random things happening in magic.
I think the point is that some people (myself included) use math to analyze situations and inform decisions. Some people playtest and implicitly assume their sample size was big enough to give them a good sense of the matchup (10 games really isn't...). Some just use anecdotal evidence (OMG I drew a T1 win 6 times against Luke last week, sucka! Nourishing Lich FTW). And others just don't care either way.
I assume this thread is for the first type of people. I, personally, have been figuring out the probabilities by doing the combinations (nCr) and factorials and stuff by hand with a calculator and that can be a pain in the ass. Didn't even think to look for a Hypergeometric calculator. That looks incredibly useful! Thanks for the link!
I was actually hoping there was a subset of people who are interested enough in Magic to learn the math behind it, just like good poker players have to do for that game. At some point you need to know the chance that the deck you built will work as intended, or that the opponent draws X card and Y card together (the only way they can beat you). So far it appears like the group interested is the ones who already know math, but either way people should really start using free resources like what I posted.
Google docs would be an interesting way to disseminate this idea for general use, although that website calculator is actually very simple.
Freggle
08-28-2013, 10:09 PM
I had been looking for this resource for a while, and had literally been teaching myself this for the last two years, piecing together tidbits here and there. This is very useful in deciding the chance and odds of an event or a combination. Since this is here, has anyone developed methods for evaluating the worthiness of a combo, or its "power level" in a given meta?
Side note: Excel is an extremely powerful tool with a lot of built-in functions that many don't know exist. As people have already stated, the hypergeometric functions are pre-programmed in (cumulative or not). You can also set up pivot tables and queries to do some cool stuff. Again, I have been teaching myself as I go, and a lot of what I've gotten out of Magic has taught me about life in the process. I'd love to see how this thread progresses.
I'd also love to learn how people set-up simulations so I can understand something more complex like the effect of Mirri's Guile with x shuffle effects to compute total average cards seen.
HammafistRoob
08-28-2013, 10:36 PM
http://www.kibble.net/magic/magic10.php
It's an old article, but the math still works, and I figured it might be of use. This thread is a great idea; I was actually thinking of starting one like it about a week ago, but I'm a pro at procrastination. I'm still a little shaky in some areas, but like Freggle I've been trying to teach myself new shortcuts and tricks. I feel math is even more useful when building decks for EDH than anything else. Good work so far guys, I'm eager to see what else comes up here; this deserves a sticky, no doubt.
apple713
08-28-2013, 11:20 PM
The following link is an excel document I created and use.
https://docs.google.com/file/d/0B3LnTitfm-SAOG16d0NqOVpRajA/edit?usp=sharing
It's a good start. It includes an interface for calculating the probability of drawing 1 of X cards in a deck of Y after drawing Z cards. It will also calculate the probability of 2 situations occurring. For example: what are the chances that I will get 1 of my 4 Leylines of Sanctity in my opening hand AND also get 1 of the 4 Brainstorms in my deck?
You can use it for EDH calculations if you change the # of cards in deck to 99.
There are tables included that allow for visual representations.
"X lands by turn Y" is maybe the most valuable, since mana curves are incredibly important.
Let me know what you think, or if anyone wants me to add something to it.
Also, if you notice any errors, let me know so I can correct them.
Phoenix Ignition
08-29-2013, 12:02 AM
I had been looking for this resource for a while, and had literally had been teaching myself this for the last two years piecing together tidbits here and there. This is very useful in deciding chance & odds of an event or a combination. Since this is here has anyone developed methods in developing the worthiness of a combo or it's "power level in a given "meta?"
This is an extremely non-trivial problem, in more things than just Magic. It boils down to the ability/inability of people to accurately describe incredibly complex systems, and I can guarantee no one in magic has come up with any way to "solve" the changing metagame. The problem with a lot of math is that connecting it to complex situations relies so much on the individual correctly "translating" the real world into a math problem that you're going to get enough errors just from that to make it all worthless.
Now, on the other hand, you can do a few things to increase your odds. For one, deck decisions can be made if you can trust people's playtesting statistics. Counterbalance decks, in playtesting, have a positive matchup versus combo decks. Using that percentage (let's say 75% for the sake of argument) paired with a general idea of how many people are playing that type of deck at a tournament (of course this is even less accurate, but you can at least ballpark it pretty well) can give you a slice of what to expect.
If you can get semi-accurate playtesting with semi-accurate guesses at the decks in the metagame you can actually take a good guess at which deck to play.
But as you can see, everything I just said means basically nothing since you can already kind of do that in your head, with similarly accurate results.
One thing I do want to say, though, is that you can do the same thing with your sideboard, determining the % increase in winrate of each card against the deck it's planned to beat. This helps TONS, as some people sideboard in things that increase their chance to win by ~5% against the popular deck, whereas it's better EV sometimes to just pack the dream killer cards for the slightly less popular decks.
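That sideboard EV comparison is easy to put in code. The numbers below are purely hypothetical placeholders, just to show the shape of the calculation (metagame shares and win-rate bumps are made up):

```python
# Hypothetical metagame shares and per-matchup win-rate improvements
meta = {"Popular Deck": 0.30, "Fringe Deck": 0.08}

# Plan A: a ~5% bump vs the popular deck; Plan B: a dream-killer card
# that swings the fringe matchup by 40%
plan_a = {"Popular Deck": 0.05, "Fringe Deck": 0.00}
plan_b = {"Popular Deck": 0.00, "Fringe Deck": 0.40}

def expected_gain(plan, meta):
    """Expected overall win-rate increase: sum of meta share * bump."""
    return sum(meta[deck] * plan.get(deck, 0.0) for deck in meta)

print(round(expected_gain(plan_a, meta), 3))   # 0.015
print(round(expected_gain(plan_b, meta), 3))   # 0.032 -- the dream killer wins
```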
This is an extremely non-trivial problem, in more things than just Magic. It boils down to the ability/inability of people to accurately describe incredibly complex systems, and I can guarantee no one in magic has come up with any way to "solve" the changing metagame. The problem with a lot of math is that connecting it to complex situations relies so much on the individual correctly "translating" the real world into a math problem that you're going to get enough errors just from that to make it all worthless.
Hmm.. This is actually quite an intriguing problem. I want to put my Math degrees up to the test...
Essentially you are looking for the ways to quantify "viability", to construct some sort of index to measure this. One method would be to boil the problem down into components and find a way to model them
For example, one could say the power of a combo deck is described by:
-the distribution of goldfishes by turn (i.e. probability you goldfish on turn 1, on turn 2, etc...)
-the probability of drawing/assembling the pieces (1 card combo > 2-card combo > 3 card combo, and tutors/cantripping helps improve odds)
-the number of hate cards available and the frequency of those cards being run in events with similar metagames (e.g. % of decks running Leyline of Sanctity, Chalice of the Void, etc.)
-the probability of you finding your SB answer to those hate cards (based on number of answers you run and the chance of you seeing it within X turns given your deck's digging power)
Combining those dimensions you could probably construct some kind of measure to quantify strength of a combo deck relative to a meta.
Another approach would be empirical... strictly based on matchup percentages (based on testing) * distribution of decks in meta (estimated by turnout at similar recent events).
It would be interesting to compare these, for say TES and Belcher and AllSpells and OmniTell, and see how the measures perform.
Phoenix Ignition
08-29-2013, 12:53 AM
Hmm.. This is actually quite an intriguing problem. I want to put my Math degrees up to the test...
Essentially you are looking for the ways to quantify "viability", to construct some sort of index to measure this. One method would be to boil the problem down into components and find a way to model them
For example, one could say the power of a combo deck is described by:
-the distribution of goldfishes by turn (i.e. probability you goldfish on turn 1, on turn 2, etc...)
-the probability of drawing/assembling the pieces (1 card combo > 2-card combo > 3 card combo, and tutors/cantripping helps improve odds)
-the number of hate cards available and the frequency of those cards being run in events with similar metagames (e.g. % of decks running Leyline of Sanctity, Chalice of the Void, etc.)
-the probability of you finding your SB answer to those hate cards (based on number of answers you run and the chance of you seeing it within X turns given your deck's digging power)
Combining those dimensions you could probably construct some kind of measure to quantify strength of a combo deck relative to a meta.
Another approach would be empirical... strictly based on matchup percentages (based on testing) * distribution of decks in meta (estimated by turnout at similar recent events).
It would be interesting to compare these, for say TES and Belcher and AllSpells and OmniTell, and see how the measures perform.
Honestly I think the best and only way to go about it is to strictly use win percentages, which weights the average pilot more heavily than it should (playskill is a HUGE factor when playing, even more so for combo).
Just a few things I think you'd need to include in addition to your list:
-How much the win chances are damaged when interaction is used against it (and at what time, since a counterspell on the final Goblin Charbelcher is more of a blowout than a discard spell on it)
-Comparing goldfish turn to "disruption density" alongside other decks' goldfish turns (winning on turn 3 is just fine if the opponent's disruption is bad and they don't have a quicker goldfish)
-Tournament size affecting playerbase. Even if you can guess correctly the deck distribution at a GP, the players playing Merfolk and the like are going to be worse on average (probably) than the Merfolk players at smaller tournaments (cheapest deck drawing new players)
I still think after getting all of it you'd be stuck forever tweaking how important each of these things is, relative to all of the other ones. I'd absolutely love to see (and help with) some attempts, though. If you can get a decent constraint going I could even write something up in Python to run it.
One method would be to boil the problem down into components and find a way to model them
For example, one could say the power of a combo deck is described by:
-the distribution of goldfishes by turn (i.e. probability you goldfish on turn 1, on turn 2, etc...) Fundamental Turn
-the probability of drawing/assembling the pieces (1 card combo > 2-card combo > 3 card combo, and tutors/cantripping helps improve odds) See note 1
-the number of hate cards available and the frequency of those cards being run in events with similar metagames (e.g. % of decks running Leyline of Sanctity, Chalice of the Void, etc.) See note 2
-the probability of you finding your SB answer to those hate cards (based on number of answers you run and the chance of you seeing it within X turns given your deck's digging power) See note 3
Combining those dimensions you could probably construct some kind of measure to quantify strength of a combo deck relative to a meta.
Another approach would be empirical... strictly based on matchup percentages (based on testing) * distribution of decks in meta (estimated by turnout at similar recent events).
It would be interesting to compare these, for say TES and Belcher and AllSpells and OmniTell, and see how the measures perform.
You bring up some good points about how to quantify this model. In the context of comparing combo decks, we can isolate a few key parameters that will allow us to compare one deck to another even though they are using completely different cards and strategies.
The first and easiest to measure is the goldfish fundamental turn. This is a simple matter of Monte Carlo simulation: play the deck out repeatedly and record the turn the victory condition is achieved. We've been using this informally on these forums to measure the speed/strength of combo decks in comparison to one another. Outside of combo complexity, this helps us categorize the speed of the deck. In the context of Legacy's efficiency, winning sooner means giving the opponent fewer options to impede the victory. Thus, fundamental turn is a strong measure of a deck's ability to win games, and to win them early.
1. This again speaks volumes about how well a deck can achieve A + B. For instance, decks like Sneak & Show play a set of 8 enablers and 8 creatures, giving the deck very easy wins when random luck hands you both moving pieces. When it doesn't, cantrips help filter the draws to find the missing piece. Also in this boat is pretty much every two-card combo built into a blue cantrip shell. This isn't like Belcher or Storm decks, however, as those are much higher-variance and are designed to play out a whole hand rather than a combination of two cards. The Storm engine has much different criteria.
2/3. Both this and your third point/criterion add a bit more complexity to the first pass of a potential model. I think we can devise a way to consistently measure effectiveness against sideboard cards, or of our own sideboard cards. The main issue with these criteria is the variance between pilots and decks; there is no control over these parameters. Effectively, when a swap occurs between sideboard and maindeck, we would need to examine a new deck altogether, and test potential builds against opponents' decks with their own variations. It increases the amount of data and analysis exponentially. We would need a method to isolate very specific instances of the main/side system in order to measure the benefits of such transformations. Additionally, each new build would invalidate past studies. This is certainly what you pointed out as a complex system.
There are a number of strategies and cards that could speed up or slow down the deck in order to make sideboard cards effective. For instance, Lim-dul's Vault can effectively tutor up your deck to find a card, but the permutations of the other cards both in your hand and the other remaining 4 cards from its resolution will lead to increased complexity.
Thus, for your points #2 and #3 we would want to use a qualitative measurement rather than a quantitative one, at least until we have better ways to describe them. Doing so doesn't change the value of the measurements, but they should be noted as "soft" factors rather than pure probabilities.
One key point that I think we could look at is the Belcher simulation program that was posted a few weeks back in its eponymous thread.
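A toy version of that fundamental-turn Monte Carlo is short to write. This sketch only models "draw until you hold every combo piece", ignoring mana, mulligans, and cantrips, so its numbers are illustrative rather than real goldfish turns; the deck and piece names are made up:

```python
import random

def avg_goldfish_turn(deck, pieces, opening=7, trials=20_000):
    """Average turn on which the hand first contains every combo piece
    (turn 0 = already in the opening hand). Purely a toy model."""
    total = 0
    for _ in range(trials):
        d = deck[:]
        random.shuffle(d)
        hand, turn = d[:opening], 0
        i = opening
        while not all(p in hand for p in pieces):
            hand.append(d[i])   # one natural draw per turn
            i += 1
            turn += 1
        total += turn
    return total / trials

# Hypothetical 60-card deck: 4 copies each of pieces "A" and "B"
deck = ["A"] * 4 + ["B"] * 4 + ["x"] * 52
print(round(avg_goldfish_turn(deck, ["A", "B"]), 2))
```

Replacing the "hold every piece" condition with a real decision rule per deck is exactly where the complexity discussed above comes in.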
catmint
08-29-2013, 03:19 AM
I once made a table of sideboard cards for a given deck and assigned each an "improves matchup %" to find better synergies, i.e. favoring sideboard cards that are useful against several decks. The easy part is calculating how likely you are to draw the SB card; the difficult part is calculating how large the impact is. Some calculations are relatively "easy": assume we are Goblins and board in 4 Thalia vs. Storm. If we assume the average opponent has 1 Karakas, 3 Decay, and 1 Massacre, plus on average X draw steps with Y cantrips to find an anti-hate spell, we can get some number for Thalia. What is (at least for me) impossible to calculate is how much slower Storm's goldfish becomes when it uses expensive cantrips to find not only the combo but also anti-hate. This factor is important, because if our aggro goldfish then becomes faster than theirs on average, Thalia might be a lot more effective than the original number suggested.
Other sideboard cards are so complex and especially skill-intensive that an impact calculation seems impossible. Examples of this are Meddling Mage or Cabal Therapy (vs. Storm - I guess vs. Show and Tell it is somewhat easier).
Other sideboard cards are very gamestate-dependent. For example, Blood Moon in Sneak Attack vs. Canadian Threshold (which I think is bad sideboard tech, by the way): how do you calculate whether there is already a threat down, and if so, how fast the clock is on average? And there are other factors, like having to fetch a red source before you go off - how often is that fetch wasted, and how does it affect future tax counters?
Other sideboard cards depend on other card choices. Surgical Extraction, for example, is a lot better versus combo if you play discard yourself.
I guess my point is that empirical analysis (i.e. testing) seems way more useful than the math approach for these sideboard "matchup impact %" numbers. Getting a large enough sample from testing is the other question, so I guess it is also up to you to actively ask yourself how good the card would (or would not) have been across a lot of given situations.
The math problems which would be nice to have in a template:
How much more likely is it to draw a card, counting not only draw steps but the DIFFERENT cantrips? I.e. you fill in the number of Brainstorms, Ponders, Preordains, and cyclers you run, then fill in the number of draw steps according to your estimate of how much time you have for the hate to be effective. It would also be interesting to see if and how much this is influenced by "keepable hands" - five cantrips with zero blue sources doesn't do anything. So maybe a basic reduction of this % based on the number of keepable hands would be nice, either a generic -x% or one dependent on the spell/land mix?
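That draw-odds template boils down to the hypergeometric "at least one" probability. Below is a minimal Python sketch; the crude assumption (mine, not catmint's) is that each resolved cantrip counts as roughly one extra card seen, which understates the real selection that Brainstorm or Ponder provides.

```python
from math import comb

def p_at_least_one(copies, deck_size, cards_seen):
    """Chance of seeing at least one of `copies` copies among `cards_seen`
    cards from a `deck_size`-card deck (hypergeometric)."""
    if cards_seen >= deck_size:
        return 1.0
    p_none = comb(deck_size - copies, cards_seen) / comb(deck_size, cards_seen)
    return 1.0 - p_none

# Example: a 4-of hate card in a 60-card deck, seen across the opening hand,
# 3 draw steps, and 2 resolved cantrips treated as ~1 extra card each.
cards_seen = 7 + 3 + 2
print(round(p_at_least_one(4, 60, cards_seen), 3))
```

A "keepable hands" correction could then be layered on as a simple multiplier, as suggested above.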
Honestly I think the best and only way to go about it is to strictly use win percentages, though that treats the average as more important than it really is (play skill is a HUGE factor, even more so for combo).
An easy fix is to use the pilot's own win percentages from testing/past events. Combine that with the expected frequency of those decks in the meta and you get a crude expected win % for an event, not factoring in how those decks themselves perform. This would also give you a better measure of the performance of a specific list and pilot instead of generalizing for the whole archetype. However, it would take a lot of testing to get a sample large enough for reliable win percentages...
This is what a lot of magic players already do, intuitively. They test against DTB, discover deck X has a positive matchup against RUG and Shardless BUG or whatever, and then bring that deck out if they expect to see a lot of those. It's a very natural (and rational) decision process and would not be difficult to just do it more quantitatively.
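That intuitive process can be made quantitative in just a few lines. All of the win rates and meta shares below are made-up illustrative numbers, not real matchup data:

```python
# Hypothetical personal win rates from testing, and expected meta shares.
matchups   = {"RUG Delver": 0.60, "Shardless BUG": 0.55, "Miracles": 0.40}
meta_share = {"RUG Delver": 0.20, "Shardless BUG": 0.15, "Miracles": 0.25}

known = sum(meta_share.values())
expected_win = sum(matchups[d] * meta_share[d] for d in matchups)
expected_win += 0.5 * (1 - known)  # assume a coin flip vs. the unknown rest
print(expected_win)
```

The coin-flip assumption for the untested remainder of the field is the weakest link; replacing it with tested numbers as they accumulate is the natural refinement.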
-How damaged the win chances are when interruption is used against it (and at what time, since a counterspell on the final Goblin Charbelcher is more of a blowout than a discard spell on it)
Good point. You could multiply the chance of seeing that hate by a "severity" factor between 0-1 that represents how much that card cripples your combo. With Belcher getting countered being somewhere between 0.9-0.99, Empty the Warrens getting countered being a 0, and someone casting Tome Scour on your pass-the-turn Doomsday pile being a 1.
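As a sketch, that probability-times-severity weighting might look like this (all probabilities and severity values are illustrative guesses, not measured data):

```python
# Each entry: (hate scenario, P(opponent has it in time), severity in [0, 1]).
hate = [
    ("Counterspell on the final Belcher",           0.30, 0.95),
    ("Counterspell on Empty the Warrens",           0.30, 0.0),
    ("Tome Scour on a pass-the-turn Doomsday pile", 0.02, 1.0),
]
effective_disruption = sum(p * s for _, p, s in hate)
print(effective_disruption)
```

Summing probability-weighted severities gives a single "expected disruption" number per matchup, which is easier to compare across decks than a raw list of hate cards.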
-Tournament size affecting playerbase. Even if you can guess correctly the deck distribution at a GP, the players playing Merfolk and the like are going to be worse on average (probably) than the Merfolk players at smaller tournaments (cheapest deck drawing new players)
True. Now suddenly you have a dynamic system, with each deck and pilot having specific matchup %s and traveling through a different path in the tournament and interacting with other decks as they travel through. Simulation would probably be the best approach. Once you have a way to measure how a deck will perform against other decks (based on a model or empirical MU %s or whatever), you could do a Monte Carlo simulation of a tournament process given an expected frequency of decks. Shuffle those decks around randomly for round 1, they play, a deck wins with a certain % depending on its opponent, then it moves on to another randomly assigned table based on its win record, etc. It wouldn't be hard to simulate a tournament process and repeat 10000 times to get good estimates, with number of rounds and expected frequency of decks as input parameters. Wouldn't be hard to implement in software either... hmm...
-Comparing goldfish turn to "Disruption Density" alongside other deck's goldfish (winning on turn 3 is just fine if the opponent's disruption is bad and they don't have a quicker goldfish)
True, which is where the complexity Koby mentioned comes in. Now we have many more variables to consider: not only how much hate the opponent has, but how much of a clock he can stick, the likelihood of him mulliganing into hate vs. a clock, the likelihood of keeping a hand with hate but no clock, how much playing the hate slows down the clock (e.g. Goblins playing T1 Relic/Pithing Needle vs. T1 Lackey/Vial is a huge deal), how well he can dig for additional hate, your own sideboard strategy, how much your combo is slowed down by boarding in anti-hate, whether you predicted the right hate card or not, how versatile your anti-hate cards are (Ray of Revelation vs. Tormod's Crypt... OOPS), how likely you are to actually be able to cast your answer (Abrupt Decay may not be easy through mana denial), the non-linearity of interaction between hate strategies (e.g. discard + Surgical both get better together), etc.
So there may have to be some sort of "soft" summary measure of how the deck handles hate, at least for starters. Better to start with a simpler model and add complexity than to start with an unwieldy problem.
Economists model this sort of stuff all the time and try to stick an index or measure on something that seems too complex to quantify. It's certainly feasible as long as it's regarded as an estimate and not a perfect exact binding truth.
Could be fun to tackle. Might be fun/easier/more practical to do the tournament path modeling first... It would allow players to ask practical questions like "How will my combo deck perform if a lot of Merfolk shows up but also a lot of Goblins (which may push Merfolk to lower tables)?"
apple713
08-29-2013, 09:18 AM
If y'all were looking to calculate all the statistical probabilities and matchups, it would be possible but incredibly time consuming. You would have to use a large amount of subjective data, like assigning values to cards based on how they affect different matchups.
If you wanted to take this approach you would have to program cards into a simulator, with each one programmed to affect every statistical interaction. Think for a moment how many calculations would have to take place in just the first 3 turns of a game... fetchlands, Brainstorms, GSZ. The easy part would be setting up a combo; the hard part would be how spells interact with creatures. What value would you assign a 10/10 Knight of the Reliquary vs. Miracles floating a Terminus on top? This situation, among many others, would make things way too hard to calculate.
If you just want to find combo probabilities, you can build off the program already constructed in the Belcher discussion. I would have built one for Sneak Attack or Omni-Tell but I couldn't figure out how to get C# running on my Mac. If you can help me get it running I'll build the program. Outside of combo decks this approach would be worthless.
Empirical evidence would be best for non-combo decks. It would be advisable to use the small pool of decks that made Top 8 so you could eliminate player error as a factor. You would of course have to assume that players making Top 8 didn't make any, but that's a much more reasonable assumption than creating a margin of error for all players in a larger pool, I think.
I know the document I posted earlier in the thread doesn't have the extremely detailed calculations that y'all are now getting into, but it can still be used for most relevant construction concerns.
Empirical evidence would be best for non-combo decks.
Yeah, I was only suggesting applying said detailed model to combo decks. The proposed tournament simulation would use empirical data for non-combo decks (e.g. Goblins beats Death and Taxes 60% of the time, or something crude like that). It would just be a way to visualize how, given a set of expected matchup percentages and expected meta composition, your deck might perform in the actual structure of a tournament. Computing an expectation assumes you have an equal chance of playing everyone, which as pointed out, does not really represent what happens when you play in a tournament. You only face some of the field, yet the rest of the field somewhat determines what decks make it to the tables you are at.
As a really simple example, let's say you are playing Rock. You are 100% vs Scissors and 0% vs Paper. If the composition of the field is roughly 50-50, you might expect your deck to perform about even at 50%. However, if you simulated the tournament process, Scissors would consistently crush Paper and so as you move up each table, you are much more likely to face Scissors than Paper and your deck will perform better and better each round you survive, so you would likely either drop out early or win a lot. If the split is not exactly 50-50, this changes too.
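A minimal Monte Carlo sketch of that Rock/Paper/Scissors tournament. The pairing logic is deliberately simplified (decks are shuffled, sorted by record, and paired off adjacently), which is only a rough stand-in for real Swiss pairings:

```python
import random

def swiss_round(field, win_p):
    """Play one round: shuffle, group by record, pair adjacent decks.
    win_p[a][b] is a's chance to beat b; mirror matches are a coin flip."""
    random.shuffle(field)
    field.sort(key=lambda d: -d[1])  # stable sort keeps the shuffle within records
    nxt = []
    for i in range(0, len(field) - 1, 2):
        (a, wa), (b, wb) = field[i], field[i + 1]
        if random.random() < win_p[a].get(b, 0.5):
            nxt += [(a, wa + 1), (b, wb)]
        else:
            nxt += [(a, wa), (b, wb + 1)]
    return nxt

win_p = {"Rock":     {"Scissors": 1.0, "Paper": 0.0},
         "Paper":    {"Rock": 1.0, "Scissors": 0.0},
         "Scissors": {"Paper": 1.0, "Rock": 0.0}}

random.seed(0)
totals = []
for _ in range(2000):
    field = [("Rock", 0)] + [("Scissors", 0)] * 63 + [("Paper", 0)] * 64
    for _ in range(7):  # 128 players, 7 rounds of Swiss
        field = swiss_round(field, win_p)
    totals.append(next(w for d, w in field if d == "Rock"))
print(sum(totals) / len(totals))  # Rock's average record across simulations
```

The distribution of Rock's records, not just the average, is the interesting output: because Scissors rises and Paper sinks, Rock's finishes should be bimodal (either crushed early or carried to the top tables), which a flat expected-value calculation hides completely.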
The Belcher simulation program (and similar ones) give a great estimate of Fundamental Turn (and more importantly, the entire goldfish distribution, not just the expected number of turns). But that is only one aspect. I do think the other aspects have value as well. I mean, Belcher clearly goldfishes earlier than TES yet that doesn't necessarily make it a better combo deck. So there's value in quantifying the other dimensions as well.
Re: the number of calculations, once a program shell is written to do them it's less of an issue, and there aren't that many interactions when you restrict yourself to just combo pieces, combo hate, and combo anti-hate - especially if you make some oversimplifying assumptions.
rufus
08-29-2013, 10:51 AM
If y'all were looking to calculate all the statistical probabilities and matchups, it would be possible but incredibly time consuming. You would have to use a large amount of subjective data, like assigning values to cards based on how they affect different matchups.
...
When you hit some complexity level (and these proposals are IMO well past that point) the usual method for doing things is to run a Monte Carlo simulation. Instead of trying to work out hard probabilities, you build a simulation and run it a bunch of times to get statistical approximations.
That does mean that the notion of 'deck' gets more complex, but I think that that sort of involvement is inevitable anyway.
Freggle
08-29-2013, 12:40 PM
When you hit some complexity level (and these proposals are IMO well past that point) the usual method for doing things is to run a Monte Carlo simulation. Instead of trying to work out hard probabilities, you build a simulation and run it a bunch of times to get statistical approximations.
That does mean that the notion of 'deck' gets more complex, but I think that that sort of involvement is inevitable anyway.
Yes, exactly. I know that in the math world this is the accepted approach, because the variance between making assumptions about a bunch of numbers and the statistical approximations is (I'm assuming) negligible for most purposes. ...so are there any resources for the modeling software? ...I'd really like to compute the number of cards seen when changing fetchland and Mirri's Guile counts, to better understand their effect per card slot. ...and I can not figure it out.
Phoenix Ignition
08-29-2013, 11:07 PM
...I'd really like to compute the number of cards seen when changing fetchland and Mirri's Guile counts, to better understand their effect per card slot. ...and I can not figure it out.
Can you elaborate on the scenario you're interested in? I could try to take a crack at it next week.
As a really simple example, let's say you are playing Rock.
I laughed thinking about all the mirror matches going to time. In this situation, they'd be 0w-0L-1draw and you'd ONLY be able to play a loser or winner, depending on your first match.
But yeah, that's nothing to do with anything, just a funny scenario to think of.
If y'all were looking to calculate all the statistical probabilities and matchups, it would be possible but incredibly time consuming. You would have to use a large amount of subjective data, like assigning values to cards based on how they affect different matchups.
It wouldn't really be that time consuming. I've worked with 2-D simulations of stars' convection regions, and that's wayyyyy more data to keep track of and equations to solve than an artificial weighting of 60 objects compared to 60 other objects.
But I do think the best way would be to just use the top ~16 or so decks and keep track of 1) the cards they have to disrupt your combo, 2) the cantrips and card drawing they use, and 3) the average goldfish turn (possibly with some standard deviation of turns, if necessary). From this you could also factor in how badly each of their disruption cards affects you.
Economists model this sort of stuff all the time and try to stick an index or measure on something that seems too complex to quantify. It's certainly feasible as long as it's regarded as an estimate and not a perfect exact binding truth.
Sometimes, although "Quantum Finance" and the like are generally used. This one hits close to home, as sadly most financial institutions have changed their targeted recruiting population from Physics/Math people (me) to Artificial Intelligence people to really model the changing system.