
Thread: Mathemagic – Breaking Down Probabilities

  1. #21
    Member
    Join Date: Sep 2011
    Posts: 4,777

    Re: Mathemagic – Breaking Down Probabilities

    Quote Originally Posted by Phoenix Ignition View Post
    This is an extremely non-trivial problem, in more things than just Magic. It boils down to the ability/inability of people to accurately describe incredibly complex systems, and I can guarantee no one in magic has come up with any way to "solve" the changing metagame. The problem with a lot of math is that connecting it to complex situations relies so much on the individual correctly "translating" the real world into a math problem that you're going to get enough errors just from that to make it all worthless.
    Hmm... This is actually quite an intriguing problem. I want to put my Math degrees to the test...

    Essentially you are looking for ways to quantify "viability": to construct some sort of index that measures it. One method would be to boil the problem down into components and find a way to model them.

    For example, one could say the power of a combo deck is described by:
    -the distribution of goldfishes by turn (i.e. the probability you goldfish on turn 1, on turn 2, etc.)
    -the probability of drawing/assembling the pieces (1-card combo > 2-card combo > 3-card combo, and tutors/cantripping help improve the odds)
    -the number of hate cards available and the frequency of those cards being run in events with similar metagames (e.g. % of decks running Leyline of Sanctity, Chalice of the Void, etc.)
    -the probability of finding your SB answer to those hate cards (based on the number of answers you run and the chance of seeing one within X turns given your deck's digging power)

    Combining those dimensions, you could probably construct some kind of measure to quantify the strength of a combo deck relative to a meta.
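
    A minimal sketch of what such an index could look like, assuming each component has already been scored on [0, 1] and with weights that are pure guesses:

    ```python
    # Hypothetical component scores for one combo deck vs. a given meta (all on [0, 1]).
    components = {
        "goldfish_speed": 0.80,  # from the goldfish-by-turn distribution
        "assembly_odds":  0.65,  # chance of drawing/assembling the pieces
        "hate_exposure":  0.40,  # higher = less hate expected in the field
        "answer_access":  0.55,  # chance of finding the SB answer in time
    }
    weights = {"goldfish_speed": 0.4, "assembly_odds": 0.3,
               "hate_exposure": 0.2, "answer_access": 0.1}

    viability = sum(weights[k] * components[k] for k in components)
    print(viability)  # 0.32 + 0.195 + 0.08 + 0.055 = 0.65
    ```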

    Another approach would be empirical... strictly matchup percentages (from testing) * the distribution of decks in the meta (estimated by turnout at similar recent events).

    It would be interesting to compare these for, say, TES, Belcher, AllSpells, and OmniTell, and see how the measures perform.

  2. #22
    Here I Rule!!!!!!!!!!
    Phoenix Ignition
    Join Date: Oct 2008
    Location: Minneapolis MN
    Posts: 2,287

    Re: Mathemagic – Breaking Down Probabilities

    Quote Originally Posted by FTW View Post
    Hmm... This is actually quite an intriguing problem. I want to put my Math degrees to the test...
    ...
    Honestly I think the best and only way to go about it is to strictly use win percentages, though that makes the average pilot count for more than it really should (playskill is a HUGE factor when playing, even more so for combo).

    Just a few things I think you'd need to include in addition to your list:
    -How much the win chances are damaged when interruption is used against it, and at what time (a counterspell on the final Goblin Charbelcher is more of a blowout than a discard spell on it)
    -Comparing goldfish turn to "Disruption Density" alongside other decks' goldfishes (winning on turn 3 is just fine if the opponent's disruption is bad and they don't have a quicker goldfish)
    -Tournament size affecting the playerbase. Even if you correctly guess the deck distribution at a GP, the players piloting Merfolk and the like are going to be worse on average (probably) than the Merfolk players at smaller tournaments (the cheapest deck draws new players)


    I still think after getting all of it you'd be stuck forever tweaking how important each of these things is relative to all of the others. I'd absolutely love to see (and help with) some attempts, though. If you can get a decent constraint going, I could even write something up in Python to run it.

  3. #23
    Vintage
    Join Date: Apr 2005
    Location: West Coast Degeneracy
    Posts: 5,135

    Re: Mathemagic – Breaking Down Probabilities

    Quote Originally Posted by FTW View Post
    One method would be to boil the problem down into components and find a way to model them.

    For example, one could say the power of a combo deck is described by:
    -the distribution of goldfishes by turn (i.e. the probability you goldfish on turn 1, on turn 2, etc.) Fundamental Turn
    -the probability of drawing/assembling the pieces (1-card combo > 2-card combo > 3-card combo, and tutors/cantripping help improve the odds) See note 1
    -the number of hate cards available and the frequency of those cards being run in events with similar metagames (e.g. % of decks running Leyline of Sanctity, Chalice of the Void, etc.) See note 2
    -the probability of finding your SB answer to those hate cards (based on the number of answers you run and the chance of seeing one within X turns given your deck's digging power) See note 3

    ...
    You bring up some good points about how to quantify this model. In the context of comparing combo decks, we can isolate a few key parameters that will allow us to compare one deck to another even though they are using completely different cards and strategies.

    The first and easiest to measure is the goldfish fundamental turn. This is a simple statistic from Monte Carlo simulations: deal hands and play them out solitaire-style until the victory condition is achieved. We've been using this informally on these forums to measure the speed/strength of combo decks in comparison to one another. Setting aside combo complexity, this helps us categorize the speed of the deck. In the context of Legacy's efficiency, winning sooner means giving the opponent fewer options to impede the victory. Thus, fundamental turn is a strong measure of a deck's ability to win games, and to win them early.
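
    For illustration, a stripped-down goldfish Monte Carlo for a hypothetical "A + B" deck: no mana, no mulligans, just how soon both pieces appear in hand (the 8/8 split mirrors the Sneak & Show numbers in the next note):

    ```python
    import random

    def goldfish_distribution(piece_a=8, piece_b=8, deck_size=60,
                              max_turn=6, trials=50000):
        """P(first have both combo pieces in hand on turn t), on the draw."""
        counts = [0] * (max_turn + 1)
        for _ in range(trials):
            deck = (["A"] * piece_a + ["B"] * piece_b
                    + ["x"] * (deck_size - piece_a - piece_b))
            random.shuffle(deck)
            for turn in range(1, max_turn + 1):
                hand = deck[:7 + turn]  # opening 7 plus one draw per turn
                if "A" in hand and "B" in hand:
                    counts[turn] += 1
                    break
        return {t: counts[t] / trials for t in range(1, max_turn + 1)}

    print(goldfish_distribution())
    ```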

    1. This again speaks volumes about how well a deck can achieve A + B. For instance, decks like Sneak & Show play a set of 8 enablers and 8 creatures, giving the deck very easy wins when random luck hands you both moving pieces. When it doesn't, cantrips help filter the draws to find either missing piece. In the same boat is pretty much every two-card combo that's built into a blue cantrip shell. This isn't like Belcher or Storm decks, however, as those are much higher variance and are designed to play out a whole hand rather than a combination of two cards. The Storm engine has much different criteria.

    2/3. Both this and your third criterion add a bit more complexity to the first pass of a potential model. I think we can devise a way to consistently measure effectiveness against sideboard cards, or of our own sideboard cards. The main issue with these criteria is the variance between pilots and decks; there is no control over these parameters. Effectively, when a swap occurs between sideboard and maindeck, we would need to examine a new deck altogether, and test potential builds vs. opponents' decks with their own variations. It increases the amount of data and analysis exponentially. We would need a method to isolate very specific instances of the main/side system in order to measure the benefits of such transformations. Additionally, each new build would invalidate past studies. This is exactly the complex system you pointed out.

    There are a number of strategies and cards that could speed up or slow down the deck in order to make sideboard cards effective. For instance, Lim-Dul's Vault can effectively dig through your deck to find a card, but the permutations of the cards in your hand, together with the other 4 cards remaining from its resolution, lead to increased complexity.

    Thus, for your points #2 and #3 we would want to use a qualitative measurement rather than a quantitative one, at least until we have better ways to describe them. Doing so doesn't change the value of the measurements, but they should be noted as "soft" factors rather than pure probabilities.

    One key point that I think we could look at is the Belcher simulation program that was posted a few weeks back in its eponymous thread.
    West side
    Find me on MTGO as Koby or rukcus -- @MTGKoby on Twitter
    * Maverick is dead. Long live Maverick!
    My Legacy stream
    My MTG Blog - Work in progress

  4. #24
    In response: Snapcaster Mage
    catmint
    Join Date: Feb 2011
    Posts: 923

    Re: Mathemagic – Breaking Down Probabilities

    I once did a table of sideboard cards for a given deck and assigned each an "improved matchup %" to find better synergies, i.e. sideboard cards which have good use against several different decks. The easy thing to calculate is how likely you are to draw the SB card; the difficult calculation is how large the impact is. Some calculations are relatively "easy": let's assume we are Goblins and board in 4 Thalia vs. Storm. If we then assume that the average opponent has 1 Karakas, 3 Abrupt Decay, and 1 Massacre, and on average X draw steps with Y cantrips to find an anti-hate spell, you can get some number for Thalia. What is (at least for me) impossible to calculate: how much slower Storm's goldfish becomes when it uses expensive cantrips to find not only the combo but also the anti-hate. This factor is important, because if our aggro goldfish is then on average faster than theirs, Thalia might be a lot more effective than the original number suggested.
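
    The "easy" part catmint mentions, the chance of having drawn at least one of the 4 boarded Thalias within a given number of cards, is plain hypergeometric; a quick sketch (ignoring mulligans and cantrips):

    ```python
    from math import comb

    def p_at_least_one(copies, deck_size, cards_seen):
        """P(>= 1 of `copies` appears among `cards_seen` random cards)."""
        return 1 - comb(deck_size - copies, cards_seen) / comb(deck_size, cards_seen)

    # 4 Thalias in 60 cards, opening 7 plus three draw steps:
    print(p_at_least_one(4, 60, 10))  # ~0.53
    ```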

    Other sideboard cards are so complex and especially so skill-intensive that an impact calculation seems impossible. Examples of this are Meddling Mage or Cabal Therapy (vs. Storm; I guess vs. Show and Tell it is somewhat easier).

    Other sideboard cards are very gamestate-dependent, for example Blood Moon in Sneak Attack vs. Canadian Threshold (which I think is bad sideboard tech, by the way). How do you calculate whether there is already a threat down, and if so, how fast the clock is on average? And then other factors: you have to fetch a red source before you go off, so how often is it Wasteland-bait, and how does that affect future tax counters?

    Other sideboard cards depend on other card choices. Surgical Extraction, for example, is a lot better versus combo if you play discard yourself.

    I guess my point is that empirical analysis (i.e. testing) seems way more useful than the math approach for these sideboard "matchup impact %" numbers. Testing with a high enough sample size is the other question, so I guess it is also up to you to actively ask yourself how good the card would or would not have been in a lot of given situations.

    One math problem that would be nice to have in a template:
    How much more likely is it to draw a card when counting not only draw steps but also the DIFFERENT cantrips? I.e., you fill in the number of Brainstorms, Ponders, Preordains, and cyclers you run, and then fill in the number of draw steps according to your estimate of how much time you have for the hate to be effective. Whether and how much this is influenced by "keepable hands" would also be interesting: 5 cantrips with 0 blue sources does not make sense to keep, so maybe a basic reduction of this % based on the number of keepable hands would be nice. Either a generic -x%, or one dependent on the spell/land mix, maybe?
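
    A rough Monte Carlo sketch of that template: how often you see a hate card within a budget of looks, where each cantrip you hit crudely buys `cantrip_depth` extra looks (this ignores Brainstorm put-backs, shuffle effects, and keepable-hand filtering, which would all shave the number down):

    ```python
    import random

    def p_find_hate(deck_size=60, hate=4, cantrips=8, cantrip_depth=3,
                    draw_steps=3, trials=20000):
        hits = 0
        deck_template = (["hate"] * hate + ["cantrip"] * cantrips
                         + ["other"] * (deck_size - hate - cantrips))
        for _ in range(trials):
            deck = deck_template[:]
            random.shuffle(deck)
            looks, seen = 7 + draw_steps, 0  # opening hand + draw steps
            while looks > 0 and seen < deck_size:
                card = deck[seen]
                seen += 1
                looks -= 1
                if card == "hate":
                    hits += 1
                    break
                if card == "cantrip":
                    looks += cantrip_depth  # crude model: a cantrip digs deeper
        return hits / trials

    print(p_find_hate(cantrips=0), p_find_hate(cantrips=8))
    ```
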
    Currently playing: Elves

  5. #25
    Member
    Join Date: Sep 2011
    Posts: 4,777

    Re: Mathemagic – Breaking Down Probabilities

    Quote Originally Posted by Phoenix Ignition View Post
    Honestly I think the best and only way to go about it is to strictly use win percentages, though that makes the average pilot count for more than it really should (playskill is a HUGE factor when playing, even more so for combo).
    An easy fix is to use the pilot's own win percentages from testing/past events. Combine that with the expected frequency of those decks in the meta and you can get a crude expected win % at an event, not factoring in how those decks themselves perform. This would also give you a better measure of the performance of a specific list and pilot instead of generalizing over the whole archetype. However, it would necessitate a lot of testing to get a large enough sample for reliable win %s...

    This is what a lot of Magic players already do, intuitively. They test against the DTB, discover deck X has a positive matchup against RUG and Shardless BUG or whatever, and then bring that deck out if they expect to see a lot of those. It's a very natural (and rational) decision process, and it would not be difficult to do it more quantitatively.
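
    That crude expected win % is just a frequency-weighted average, E[win] = sum over decks of (meta share x matchup win rate); a tiny sketch with invented numbers:

    ```python
    # Invented matchup win rates (from your own testing) and expected meta shares.
    matchups = {"RUG Delver": 0.62, "Shardless BUG": 0.58, "Miracles": 0.45, "Other": 0.50}
    meta     = {"RUG Delver": 0.20, "Shardless BUG": 0.15, "Miracles": 0.25, "Other": 0.40}

    expected_win = sum(meta[d] * matchups[d] for d in matchups)
    print(expected_win)  # 0.124 + 0.087 + 0.1125 + 0.20 = 0.5235
    ```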

    -How much the win chances are damaged when interruption is used against it, and at what time (a counterspell on the final Goblin Charbelcher is more of a blowout than a discard spell on it)
    Good point. You could multiply the chance of seeing that hate by a "severity" factor between 0 and 1 that represents how much that card cripples your combo, with Belcher getting countered being somewhere between 0.9 and 0.99, Empty the Warrens getting countered being a 0, and someone casting Tome Scour on your pass-the-turn Doomsday pile being a 1.
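
    As a sketch, the expected damage from a given hate card would then be P(seen in time) x severity; every number here is invented for illustration:

    ```python
    # Hypothetical hate cards vs. Belcher: (chance it's seen in time, severity 0-1).
    hate = {
        "Force of Will": (0.60, 0.95),  # countering the Belcher is near-fatal
        "Thoughtseize":  (0.40, 0.35),  # a discard spell hurts far less
        "Chalice at 0":  (0.15, 0.80),
    }
    for card, (p_seen, severity) in hate.items():
        print(card, round(p_seen * severity, 3))  # expected crippling effect
    ```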

    -Tournament size affecting the playerbase. Even if you correctly guess the deck distribution at a GP, the players piloting Merfolk and the like are going to be worse on average (probably) than the Merfolk players at smaller tournaments (the cheapest deck draws new players)
    True. Now suddenly you have a dynamic system, with each deck and pilot having specific matchup %s, traveling a different path through the tournament, and interacting with other decks along the way. Simulation would probably be the best approach. Once you have a way to measure how a deck will perform against other decks (based on a model or empirical MU %s or whatever), you could do a Monte Carlo simulation of a tournament process given an expected frequency of decks. Shuffle those decks around randomly for round 1, they play, a deck wins with a certain % depending on its opponent, then it moves on to another randomly assigned table based on its win record, and so on. It wouldn't be hard to simulate a tournament process and repeat it 10,000 times to get good estimates, with the number of rounds and the expected frequency of decks as input parameters. Wouldn't be hard to implement in software either... hmm...
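
    A compressed sketch of that tournament Monte Carlo under heavy simplifications (fixed archetype-vs-archetype win rates, random pairings within identical records, no draws or drops; win_prob is a placeholder you would fill in from testing or a model):

    ```python
    import random
    from collections import defaultdict

    def simulate_tournament(decks, win_prob, rounds):
        """decks: one archetype name per player.
        win_prob(a, b): P(archetype a beats archetype b)."""
        records = [(d, 0) for d in decks]  # (archetype, wins)
        for _ in range(rounds):
            brackets = defaultdict(list)
            for deck, wins in records:
                brackets[wins].append(deck)  # Swiss: pair within same record
            records = []
            for wins, group in brackets.items():
                random.shuffle(group)
                for a, b in zip(group[::2], group[1::2]):
                    a_won = random.random() < win_prob(a, b)
                    records.append((a, wins + 1) if a_won else (a, wins))
                    records.append((b, wins) if a_won else (b, wins + 1))
                if len(group) % 2:  # odd player out gets a bye
                    records.append((group[-1], wins + 1))
        return records

    # Repeat simulate_tournament(...) ~10,000 times and average your deck's
    # final win count for each assumed meta composition.
    ```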


    -Comparing goldfish turn to "Disruption Density" alongside other decks' goldfishes (winning on turn 3 is just fine if the opponent's disruption is bad and they don't have a quicker goldfish)
    True, which is where the complexity Koby mentioned comes in. Now we have many, many more variables to consider: not only how much hate the opponent has, but how much of a clock he can stick, the likelihood of him mulling into hate vs. a clock, the likelihood of keeping a hand with hate but no clock, how much playing the hate slows down the clock (e.g. Goblins playing T1 Relic/Pithing Needle vs. T1 Lackey/Vial is a huge deal), how well he can dig for additional hate, your own sideboard strategy, how much your combo is slowed down by boarding in anti-hate, whether you predicted the right hate card or not, how versatile your anti-hate cards are (Ray of Revelation vs. Tormod's Crypt... OOPS), how likely you are to actually be able to cast your answer (Abrupt Decay may not be easy through mana denial), the non-linearity of interactions between hate strategies (e.g. discard + Surgical get better together), etc.

    So there may have to be some sort of "soft" summary measure of how the deck handles hate, at least for starters. Better to start with a simpler model and add complexity than to start with an unwieldy problem.

    Economists model this sort of stuff all the time and try to stick an index or measure on something that seems too complex to quantify. It's certainly feasible as long as it's regarded as an estimate and not a perfect exact binding truth.

    Could be fun to tackle. Might be fun/easier/more practical to do the tournament path modeling first... It would allow players to ask practical questions like "How will my combo deck perform if a lot of Merfolk shows up but also a lot of Goblins (which may push Merfolk to lower tables)?"

  6. #26
    Site Contributor
    apple713
    Join Date: Jan 2012
    Location: Manhattan, NY
    Posts: 2,086

    Re: Mathemagic – Breaking Down Probabilities

    If y'all were looking to calculate all the statistical probabilities and matchups, it would be possible but incredibly time-consuming. You would have to use a large amount of subjective data, like assigning values to cards based on how they affect each matchup and such.

    If you wanted to take this approach you would have to program the cards into a simulator and have each one programmed to affect every statistical interaction. Think for a moment how many calculations would have to take place in just the first 3 turns of a game... fetchlands, Brainstorms, GSZ. The easy part would be setting up a combo; the hard part would be how spells interact with creatures... what value would you assign a 10/10 Knight of the Reliquary vs. Miracles floating a Terminus on top? This situation, amongst many others, would make things way too hard to calculate.

    If you just want to find combo probabilities, you can build off the program already constructed in the Belcher discussion. I would have built one for Sneak Attack or OmniTell but I couldn't figure out how to get C# running on my Mac. If you can help me get it running I'll build the program. Outside of combo decks this approach would be worthless.

    Empirical evidence would be best for non-combo decks. It would be advisable to use the small pool of decks that made Top 8, so you could eliminate player error as a possible factor. You would of course have to assume that players making Top 8 didn't make any, but that's a much more reasonable assumption than creating a margin of error for all players in a larger pool, I think.


    I know the document I posted earlier in the thread doesn't have the extremely detailed calculations that y'all are now getting into, but it can still be used for most relevant construction concerns.
    Play 4 Card Blind!

    Currently Playing
    Legacy: Dark Depths
    EDH: 5-Color Hermit Druid

    Currently Brewing: [Deck] Sadistic Sacrament / Chalice NO Eldrazi

    why cards are so expensive...hoarders

  7. #27
    Member
    Join Date: Sep 2011
    Posts: 4,777

    Re: Mathemagic – Breaking Down Probabilities

    Quote Originally Posted by apple713 View Post
    Empirical evidence would be best for non-combo decks.
    Yeah, I was only suggesting applying said detailed model to combo decks. The proposed tournament simulation would use empirical data for non-combo decks (e.g. Goblins beats Death and Taxes 60% of the time, or something crude like that). It would just be a way to visualize how, given a set of expected matchup percentages and an expected meta composition, your deck might perform in the actual structure of a tournament. Computing a simple expectation assumes you have an equal chance of playing everyone, which, as pointed out, does not really represent what happens when you play in a tournament. You only face some of the field, yet the rest of the field somewhat determines which decks make it to the tables you are at.

    As a really simple example, let's say you are playing Rock. You are 100% vs Scissors and 0% vs Paper. If the composition of the field is roughly 50-50, you might expect your deck to perform about even at 50%. However, if you simulated the tournament process, Scissors would consistently crush Paper, so as you move up the tables you are much more likely to face Scissors than Paper, and your deck will perform better and better each round you survive; you would likely either drop out early or win a lot. If the split is not exactly 50-50, this changes too.
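
    A toy simulation of just the Scissors/Paper field (you, the Rock player, excluded) shows how quickly the top tables fill with Scissors; the field size and the 50-50 split are arbitrary:

    ```python
    import random

    # Scissors beats Paper 100%; mirrors are a coin flip in this toy model.
    field = ["Scissors"] * 5000 + ["Paper"] * 5000
    for rnd in range(1, 5):
        random.shuffle(field)
        winners = []
        for a, b in zip(field[::2], field[1::2]):
            if a == b:
                winners.append(a if random.random() < 0.5 else b)
            else:
                winners.append("Scissors")  # Scissors always cuts Paper
        field = winners
        share = field.count("Scissors") / len(field)
        print(f"round {rnd}: {share:.0%} of undefeated decks are Scissors")
    ```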

    The Belcher simulation program (and similar ones) gives a great estimate of Fundamental Turn (and more importantly, the entire goldfish distribution, not just the expected number of turns). But that is only one aspect. I do think the other aspects have value as well. I mean, Belcher clearly goldfishes earlier than TES, yet that doesn't necessarily make it a better combo deck. So there's value in quantifying the other dimensions as well.

    Re: the number of calculations, once a program shell is built to do them it is less of an issue, and there aren't that many interactions when you restrict yourself to just combo pieces, combo hate, and combo anti-hate. Especially if you make some oversimplifying assumptions.

  8. #28

    Re: Mathemagic – Breaking Down Probabilities

    Quote Originally Posted by apple713 View Post
    if yall were looking to calculate all the statistical probabilities and matchups it would be possible but incredibly time consuming. You would have have to use a large amount of subjective data like assigning values to cards on how they effect other matchups and such.
    ...
    When you hit some complexity level (and these proposals are IMO well past that point), the usual method is to run a Monte Carlo simulation. Instead of trying to work out hard probabilities, you build a simulation and run it a bunch of times to get statistical approximations.

    That does mean that the notion of 'deck' gets more complex, but I think that sort of involvement is inevitable anyway.

  9. #29
    Site Contributor
    Freggle
    Join Date: Apr 2011
    Location: Orlando, FL
    Posts: 854

    Re: Mathemagic – Breaking Down Probabilities

    Quote Originally Posted by rufus View Post
    When you hit some complexity level (and these proposals are IMO well past that point), the usual method is to run a Monte Carlo simulation. Instead of trying to work out hard probabilities, you build a simulation and run it a bunch of times to get statistical approximations.

    That does mean that the notion of 'deck' gets more complex, but I think that sort of involvement is inevitable anyway.
    Yes, exactly. I know that in the math world this is the accepted approach, because the variance between making assumptions on a bunch of numbers and the statistical approximations is (I'm assuming) negligible for most purposes. ...So are there any resources for the modeling software? ...I'd really like to compute the number of cards seen as you change the fetchland and Mirri's Guile counts, to better understand their effect per card slot, ...and I cannot figure it out.
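
    Not a full answer, but one crude way to frame it as a Monte Carlo: count distinct cards looked at over a number of turns, where a Mirri's Guile in play peeks at the top 3 each upkeep and cracking a fetch shuffles the library, refreshing what the Guile can see. This ignores Guile's reordering, card types, and when the Guile and fetches actually arrive; all parameters are made up:

    ```python
    import random

    def avg_cards_seen(deck_size=60, guile_in_play=False, fetches=8,
                       turns=10, trials=5000):
        total = 0
        for _ in range(trials):
            deck = list(range(deck_size))
            random.shuffle(deck)
            seen, fetches_left = set(), fetches
            for _ in range(turns):
                if guile_in_play:
                    seen.update(deck[:3])  # Guile: peek at the top 3 each upkeep
                seen.add(deck.pop(0))      # draw step
                if fetches_left > 0:       # crack a fetch, shuffling the library
                    fetches_left -= 1
                    random.shuffle(deck)
            total += len(seen)
        return total / trials

    print(avg_cards_seen(guile_in_play=False), avg_cards_seen(guile_in_play=True))
    ```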

  10. #30
    Here I Rule!!!!!!!!!!
    Phoenix Ignition
    Join Date: Oct 2008
    Location: Minneapolis MN
    Posts: 2,287

    Re: Mathemagic – Breaking Down Probabilities

    Quote Originally Posted by Freggle View Post
    ...I'd really like to compute the number of cards seen as you change the fetchland and Mirri's Guile counts, to better understand their effect per card slot, ...and I cannot figure it out.
    Can you elaborate on the scenario you're interested in? I could try to take a crack at it next week.
    Quote Originally Posted by FTW View Post
    As a really simple example, let's say you are playing Rock.
    I laughed thinking about all the mirror matches going to time. In this situation they'd be 0W-0L-1D, and you'd ONLY be able to play a loser or a winner, depending on your first match.

    But yeah, that's nothing to do with anything, just a funny scenario to think of.
    Quote Originally Posted by apple713 View Post
    if yall were looking to calculate all the statistical probabilities and matchups it would be possible but incredibly time consuming. You would have have to use a large amount of subjective data like assigning values to cards on how they effect other matchups and such.
    It wouldn't really be that time consuming. I've worked with 2-D simulations of stars' convection regions, and that's wayyyyy more data to keep track of and equations to solve than an artificial weighting of 60 objects compared to 60 other objects.

    But I do think the best way would be to just use the top ~16 or so decks and keep track of 1) the cards they have to screw up your combo, 2) the cantrips and card drawing they use, and 3) the average goldfish turn (possibly with some standard deviation around it, if necessary). From this you could also factor in how badly each of the cards they use to screw you up affects you.
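
    A sketch of a data structure for those per-deck profiles; the field names and numbers are invented placeholders:

    ```python
    from dataclasses import dataclass

    @dataclass
    class DeckProfile:
        name: str
        hate_cards: dict       # card -> (copies, severity 0-1 vs. your combo)
        cantrip_count: int     # how hard they can dig for the hate
        goldfish_mean: float   # average goldfish turn
        goldfish_sd: float     # spread around that average

    merfolk = DeckProfile(
        name="Merfolk",
        hate_cards={"Force of Will": (4, 0.90), "Cursecatcher": (4, 0.50)},
        cantrip_count=0,
        goldfish_mean=5.5,
        goldfish_sd=1.1,
    )
    ```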


    Quote Originally Posted by FTW View Post
    Economists model this sort of stuff all the time and try to stick an index or measure on something that seems too complex to quantify. It's certainly feasible as long as it's regarded as an estimate and not a perfect exact binding truth.
    Sometimes, although "quantum finance" and the like are generally used. This one hits close to home: sadly, most financial institutions have changed their targeted recruiting population from physics/math people (me) to artificial intelligence people to really model the changing system.
