View Poll Results: Most bannable card in Legacy? (not that they will touch it)

Voters
192
  • Brainstorm

    16 8.33%
  • Force of Will

    4 2.08%
  • Lion's Eye Diamond

    35 18.23%
  • Counterbalance

    34 17.71%
  • Sensei's Divining Top

    103 53.65%
  • Tarmogoyf

    46 23.96%
  • Phyrexian Dreadnaught

    2 1.04%
  • Goblin Lackey

    4 2.08%
  • Standstill

    6 3.13%
  • Natural Order

    8 4.17%
Multiple Choice Poll.

Thread: All B/R update speculation.

  1. #23181

    Re: All B/R update speculation.

    Quote Originally Posted by Reeplcheep View Post
    Here is the data from the legacy data project. Hard non-mirror winrates collected by hand. This was a ton of work so if you like stuff like this please consider helping out with data collection or the Patreon.
    TBH I don't really care about MTG anymore to even bother.
    The fact that it needs the community to collect and process the data instead of WotC is not helping.
    As a physicist I just (should) know more about statistics, and it bugs me when people use them wrong.

    Back on topic, it's striking that even though monke decks are the most played and the meta is tailored for that at least UR is still at >=50% WR.

  2. #23182

    Re: All B/R update speculation.

    Quote Originally Posted by Zoid View Post
    As a physicist I just (should) know more about statistics, and it bugs me when people use them wrong.
    Quote Originally Posted by Zoid View Post
    Assuming 50% win rate seems a stretch.

  3. #23183

    Re: All B/R update speculation.

    ?

    I still think that is a bad assumption.
    There's no reason to assume that any matchup besides the mirror is 50%.
    Why even assume anything in the first place?
    You just present the matchup data as it is, done.

  4. #23184

    Re: All B/R update speculation.

    Quote Originally Posted by Zoid View Post
    ?

    I still think that is a bad assumption.
    There's no reason to assume that any matchup besides the mirror is 50%.
    Why even assume anything in the first place?
    You just present the matchup data as it is, done.
    If you are really a physicist you should know the definition of the null. You always assume no difference and try to prove otherwise. From thermodynamics you should also know the principle of informational entropy; the null has to be the same as assigning labels at random. Otherwise you would conclude that snow-basics D&T having a 25-24 head-to-head win rate vs non-snow D&T is telling you something.

    H_0 is in the vast majority of cases p = 1/(# of options) (in this case 2: W or L).
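
    A minimal sketch of that check, assuming an exact two-sided binomial test against p = 1/2. The function name `binom_two_sided_p` is made up for illustration, and this is not the data project's actual code:

```python
from math import comb

def binom_two_sided_p(wins: int, losses: int, p0: float = 0.5) -> float:
    """Exact two-sided binomial p-value under H_0: win probability = p0.

    Sums the probability of every outcome that is no more likely
    than the observed one (the usual 'as or more extreme' rule).
    """
    n = wins + losses
    pmf = [comb(n, k) * p0**k * (1 - p0) ** (n - k) for k in range(n + 1)]
    observed = pmf[wins]
    # Small relative tolerance so ties are not lost to float rounding.
    total = sum(q for q in pmf if q <= observed * (1 + 1e-9))
    return min(1.0, total)

# The snow vs non-snow D&T example from the post above.
print(binom_two_sided_p(25, 24))
```

    With a 50% null, 25-24 sits right at the peak of the distribution, so the p-value comes out at essentially 1 and the null is nowhere near rejected.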

  5. #23185

    Re: All B/R update speculation.

    Quote Originally Posted by Zoid View Post
    ?

    I still think that is a bad assumption.
    There's no reason to assume that any matchup besides the mirror is 50%.
    Why even assume anything in the first place?
    You just present the matchup data as it is, done.
    It's basic statistics that you test your results against flipping a coin.
    A coin will predict the correct outcome 50% of the time, and if you say that a MU is 75/25 you're claiming you can predict it 75% of the time.

  6. #23186
    Member

    Join Date

    Feb 2014
    Posts

    1,198

    Re: All B/R update speculation.

    Fully agreeing with Zoid here.
    You present the data, transformation of said data is useful only if it is more informative.

    Here I do not see it:
    600-400 is more meaningful than 60-40, itself better than 6-4.
    I do not see what statistical treatment you could apply that would make it easier to understand for the audience: Magic players who do understand what a MU is, both in terms of the result and the reliability of said result.

  7. #23187

    Re: All B/R update speculation.

    Quote Originally Posted by dte View Post
    Fully agreeing with Zoid here.
    You present the data, transformation of said data is useful only if it is more informative.

    Here I do not see it:
    600-400 is more meaningful than 60-40, itself better than 6-4.
    I do not see what statistical treatment you could apply that would make it easier to understand for the audience: Magic players who do understand what a MU is, both in terms of the result and the reliability of said result.
    If only there was a way to see if the data you collected was as good as flipping coins.
    Oh well, I'm sure it's the field of statistics that's wrong here.

  8. #23188
    Member

    Join Date

    Feb 2014
    Posts

    1,198

    Re: All B/R update speculation.

    Quote Originally Posted by FourDogsinaHorseSuit View Post
    If only there was a way to see if the data you collected was as good as flipping coins.
    Oh well, I'm sure it's the field of statistics that's wrong here.
    The field of statistics cannot answer that. There are formulas that tell you whether, from some data, you can pin down a given probability range with a given confidence.

    So here you could say that there is >95% probability that deck 1 has a win rate between 55 and 65% over deck 2.

    It would still be reduced data, ie less information than the actual numbers, eg 600-400 (numbers not corresponding to above statement).

    It is very useful to do statistical treatment, but only if it gives you a faster, better understanding. I do not think that it is the case here.
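
    For reference, a rough sketch of how such a range can be computed from W-L, assuming a Wilson score interval with z = 1.96 (roughly 95%). The function name is hypothetical, and the interval is a frequentist one, so the exact wording of what it "means" is a separate question:

```python
from math import sqrt

def wilson_interval(wins: int, losses: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a win rate, given a W-L record.

    z = 1.96 corresponds to roughly 95% confidence (an assumption;
    pick whatever level you prefer).
    """
    n = wins + losses
    p_hat = wins / n
    denom = 1 + z * z / n
    center = (p_hat + z * z / (2 * n)) / denom
    half = z * sqrt(p_hat * (1 - p_hat) / n + z * z / (4 * n * n)) / denom
    return center - half, center + half

for w, l in [(6, 4), (60, 40), (600, 400)]:
    lo, hi = wilson_interval(w, l)
    print(f"{w}-{l}: {lo:.3f} to {hi:.3f}")
```

    The interval for 6-4 comfortably spans 50%, while the one for 600-400 sits entirely above it, which is the same thing the raw numbers suggest but put on a common scale.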

  9. #23189

    Re: All B/R update speculation.

    Quote Originally Posted by dte View Post
    The field of statistics cannot answer that.
    It's literally the definition of null hypothesis testing.

  10. #23190

    Re: All B/R update speculation.

    Quote Originally Posted by dte View Post
    So here you could say that there is >95% probability that deck 1 has a win rate between 55 and 65% over deck 2.
    You can’t say that. The confidence interval is for future reproductions not the current event. You can only say that a random fair sample (in this case flipping a coin) would have produced this result <5% of the time, so you can reject the null that both sides are the same.

  11. #23191
    Member

    Join Date

    Sep 2011
    Posts

    4,776

    Re: All B/R update speculation.

    Quote Originally Posted by dte View Post
    Here I do not see it:
    600-400 is more meaningful than 60-40, itself better than 6-4.
    I do not see what statistical treatment you could apply that would make it easier to understand for the audience: Magic players who do understand what a MU is, both in terms of the result and the reliability of said result.
    600-400 is obviously more meaningful than 6-4. But let's look at less extreme cases.

    If HomeBrew.dec goes 6-4 vs Delver, does that mean your homebrew is favored against Delver or was that just a lucky streak? Maybe the matchup is about even? Maybe it's unfavored? (Those are the most common matchup classifications players use)

    Maybe players would intuitively know that's too few games and they need to test more (although some take a single League 5-0 as proof a deck is good, so you never know). But what if that result was 12-8? 18-12? 24-16? 60-40? That's more than 6-4, but is it enough? At what point is it enough games to be reasonably sure HomeBrew is favored against Delver? That's not easy to intuitively know from looking at the raw results. And that's where a statistical treatment adds value. If you do a 1-tailed test with null 50%, it basically tells you whether you had enough games to conclude the matchup is favorable (technically you're rejecting that the matchup is even or unfavorable, but close enough).
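
    To make that concrete, here is a small sketch that runs the 1-tailed test with a 50% null on the 60% records listed above. It assumes an exact binomial tail probability, and `one_sided_p` is just an illustrative name:

```python
from math import comb

def one_sided_p(wins: int, losses: int, p0: float = 0.5) -> float:
    """P(at least `wins` wins in wins+losses matches) if the true win rate is p0."""
    n = wins + losses
    return sum(comb(n, k) * p0**k * (1 - p0) ** (n - k) for k in range(wins, n + 1))

# Same 60% win rate every time, only the number of matches changes.
records = [(6, 4), (12, 8), (18, 12), (24, 16), (60, 40)]
for w, l in records:
    print(f"{w}-{l}: one-sided p = {one_sided_p(w, l):.3f}")
```

    For 6-4 this is exactly 386/1024, about 0.377, so nowhere near significant. The p-value shrinks as the match count grows, and of the records in this list only 60-40 drops below 5%.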

    It shouldn't come at the cost of presenting the real data. Sometimes people report only a p-value without presenting any of the actual data, but that isn't the only way to present it. You can show both.

    2 Examples:
    1) 60%*
    (N=60)

    2) 36-24*

    That's clean and simple and still has no information loss from the original data. Both contain enough to tell you that 60 matches were played, 60% were wins, 40% were losses, overall result of 36-24, AND that the matchup was favorable at some standard level of statistical significance you can mention outside the table (e.g. alpha=5%, alpha=10%). The statistical treatment adds value to the result. It tells a player that was "enough" data to classify that as favorable, while 6-4 isn't enough.

    Or you could color-code the cells
    Green = Favorable (statistically significant at X% confidence)
    Yellow = About even (not statistically different from 50-50 at X% confidence)
    Orange = Unfavorable (statistically significant at X% confidence)

    That should be easy to digest and does tell you more than just the matchup data without any further treatment.
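
    A sketch of how the color-coding could be automated. It assumes two one-sided exact binomial tests against 50%, each at level alpha; the function names and the default alpha are placeholders, not a recommendation:

```python
from math import comb

def tail_at_least(k: int, n: int) -> float:
    """P(X >= k) for X ~ Binomial(n, 0.5)."""
    return sum(comb(n, j) for j in range(k, n + 1)) / 2**n

def classify(wins: int, losses: int, alpha: float = 0.05) -> str:
    """Color for one matchup cell under a 50% null."""
    n = wins + losses
    if tail_at_least(wins, n) < alpha:
        return "Green"   # favorable
    # Under a 50% null, P(X <= wins) equals P(X >= losses) by symmetry.
    if tail_at_least(losses, n) < alpha:
        return "Orange"  # unfavorable
    return "Yellow"      # not distinguishable from 50-50

for w, l in [(6, 4), (36, 24), (60, 40), (40, 60)]:
    print(f"{w}-{l}: {classify(w, l)}")
```

    Note that the cell color depends on the alpha you pick: in this sketch 36-24 comes out Yellow at alpha=5% but Green at alpha=10%, so the level should be stated next to the table.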

  12. #23192
    Member

    Join Date

    Feb 2014
    Posts

    1,198

    Re: All B/R update speculation.

    Quote Originally Posted by Reeplcheep View Post
    You can’t say that. The confidence interval is for future reproductions not the current event. You can only say that a random fair sample (in this case flipping a coin) would have produced this result <5% of the time, so you can reject the null that both sides are the same.
    I wrote "has", not "had"?

    But my question stands : how is decreasing the information, and giving for a given MU a range + confidence, giving a clearer picture than the actual data, i.e. Win-loss ?

    Edit: FTW answered meanwhile. In the example above, I see option 2) as an easier readout than option 1). I still do think that W-L is a better representation, cleaner and simpler, than adding some arbitrarily chosen confidence interval. It is just discretizing the confidence, rather than keeping a continuum.

  13. #23193

    Re: All B/R update speculation.

    Quote Originally Posted by dte View Post
    I wrote "has", not "had"?

    But my question stands : how is decreasing the information, and giving for a given MU a range + confidence, giving a clearer picture than the actual data, i.e. Win-loss ?
    Because you can't record all data so range + confidence also performs validation on how good the data even is. It also provides insight on the greater population while the recorded results only provide information about the sample they are a part of.
    This is what statistics is. It's about understanding the greater population of matches given a sample.

  14. #23194
    Member

    Join Date

    Feb 2014
    Posts

    1,198

    Re: All B/R update speculation.

    Quote Originally Posted by FourDogsinaHorseSuit View Post
    Because you can't record all data so range + confidence also performs validation on how good the data even is. It also provides insight on the greater population while the recorded results only provide information about the sample they are a part of.
    This is what statistics is. It's about understanding the greater population of matches given a sample.
    Recording all data or not has no influence here.
    You have recorded data, in the form of W-L.
    You do not get more or better data by performing whatever treatment you want on it, you are only modifying the representation of said data.
    That you would settle on a given probability threshold, likely 90% or 95%, is simply discretizing the confidence, which is dependent on the sample size, which you see from W-L in a pseudo-continuous fashion.

  15. #23195
    Member

    Join Date

    Sep 2011
    Posts

    4,776

    Re: All B/R update speculation.

    Quote Originally Posted by dte View Post
    Edit: FTW answered meanwhile. In the example above, I see option 2) as an easier readout than option 1). I still do think that W-L is a better representation, cleaner and simpler
    I presented both because I think players may have different opinions on this when the numbers get messier. For 60-40, it's simple to do the mental math for win% and total number of matches, so the cleaner presentation is sufficient. If it was 47-36, the mental math to get win % is more of a burden on the reader, especially if there are 100+ cells in the table.

    The 2nd is cleaner, but the 1st gives easier access to different information. It depends on which is of more interest. But I agree it should be done in a way without information loss.

    Quote Originally Posted by dte View Post
    That you would settle on a given probability threshold, likely 90% or 95%, is simply discretizing the confidence, which is dependent on the sample size, which you see from W-L in a pseudo-continuous fashion.
    It's also establishing a consistent benchmark for all cells, based on a fixed probability threshold instead of the difference between W and L. Otherwise this is not intuitive looking at W-L with different numbers of matches played in each cell.

  16. #23196
    Member

    Join Date

    Feb 2014
    Posts

    1,198

    Re: All B/R update speculation.

    Quote Originally Posted by FTW View Post
    I presented both because I think players may have different opinions on this when the numbers get messier. For 60-40, it's simple to do the mental math for win% and total number of matches, so the cleaner presentation is sufficient. If it was 47-36, the mental math to get win % is more of a burden on the reader, especially if there are 100+ cells in the table.
    I find 47-36 perfectly clear, but that some would find a win% easier to read is a valid point indeed.

  17. #23197

    Re: All B/R update speculation.

    Quote Originally Posted by Reeplcheep View Post
    If you are really a physicist you should know the definition of the null. You always assume no difference and try to prove otherwise. From thermodynamics you should also know the principle of informational entropy; the null has to be the same as assigning labels at random. Otherwise you would conclude that snow-basics D&T having a 25-24 head-to-head win rate vs non-snow D&T is telling you something.

    H_0 is in the vast majority of cases p = 1/(# of options) (in this case 2: W or L).
    Quote Originally Posted by FourDogsinaHorseSuit View Post
    It's basic statistics that you test your results against flipping a coin.
    A coin will predict the correct outcome 50% of the time, and if you say that a MU is 75/25 you're claiming you can predict it 75% of the time.
    I still don't know why you're so hung up on hypothesis testing.
    There is no reason to assume anything.
    You just present the data and that's it.

    What I was initially suggesting was how to give an uncertainty to the win rates.
    Here we either take the frequentist approach or use Bayesian statistics where we need a prior.
    That's where you can start to assume things which need to be well motivated and it depends on what you want to show.
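
    As a rough illustration of the Bayesian route, here is a sketch assuming a Beta prior on the win rate, which gives a Beta posterior after observing W wins and L losses. The uniform Beta(1, 1) default and the function name are assumptions chosen for the example, and the prior would need to be motivated as said above:

```python
from math import sqrt

def beta_posterior(wins: int, losses: int,
                   prior_a: float = 1.0, prior_b: float = 1.0) -> tuple[float, float]:
    """Posterior mean and standard deviation of the win rate.

    Prior: Beta(prior_a, prior_b). Posterior: Beta(prior_a + W, prior_b + L).
    """
    a = prior_a + wins
    b = prior_b + losses
    mean = a / (a + b)
    sd = sqrt(a * b / ((a + b) ** 2 * (a + b + 1)))
    return mean, sd

for w, l in [(6, 4), (60, 40), (600, 400)]:
    mean, sd = beta_posterior(w, l)
    print(f"{w}-{l}: win rate {mean:.3f} +/- {sd:.3f}")
```

    With the uniform prior, 6-4 gives a posterior mean of 7/12, and the uncertainty shrinks as more matches are recorded.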

  18. #23198
    Member

    Join Date

    Sep 2011
    Posts

    4,776

    Re: All B/R update speculation.

    Quote Originally Posted by Zoid View Post
    I still don't know why you're so hung up on hypothesis testing.
    There is no reason to assume anything.
    You just present the data and that's it.
    There's no reason not to present the data and also test it, showing both. If some don't trust the testing, they can ignore that part, but for those who do they are given more rather than less.

    Whether you use hypothesis testing or Bayesian methods, both make similar assumptions (prior or null). 50% is reasonable because players tend to classify matchups as:
    Favorable
    Even
    Unfavorable

    A null of 50% allows you to do that. A null of 35% could tell you your deck has >35% win rate against Delver, but that's not how most players want to think about their matchup info, at least not before knowing if it's favorable or not, so the result of that test has less practical value. You can always test different nulls afterwards. 50% makes sense as a starting point.

    This is a 2 player 0-sum game with a lot of chance. If neither player has an edge from the deck construction, you expect 50-50 odds by default. If your data contradict that, it tells you one deck is favored over the other.
    (Player skill is a more relevant factor if you include LGS weeklies with a lot of new players, but if this is ripped from top tournament results then most players are good at their deck)

  19. #23199

    Re: All B/R update speculation.

    Quote Originally Posted by Zoid View Post
    I still don't know why you're so hung up on hypothesis testing.
    There is no reason to assume anything.
    You just present the data and that's it.

    What I was initially suggesting was how to give an uncertainty to the win rates.
    Here we either take the frequentist approach or use Bayesian statistics where we need a prior.
    That's where you can start to assume things which need to be well motivated and it depends on what you want to show.
    Amazing

  20. #23200

    Re: All B/R update speculation.

    Seriously, this is pretty dumb.

    On one hand, replacing the W & L numbers with the maximum-likelihood estimate of the win rate (pmle = W/(W+L)) plus a second value to represent uncertainty (like the width of the 95% confidence interval for pmle, or the quasi-std sqrt(pmle*(1-pmle)/n) (*), with n = W+L) doesn't reduce the available information, since from those two values you can reconstruct both W & L.

    On the other hand, you don't need to assume anything to establish those. There is no hypothesis to make or test against. It's simple mle.

    (*) I'm saying quasi-std as this is improper; the only actual std is sqrt(p*(1-p)/n), where p is the actual value of the parameter. But:
    - this doesn't change the fact that this allows the reconstruction of original W & L numbers if one so desires,
    - this still does quite adequately match expectations / will properly represent what the standard deviation of the process is, a) given that real matchups never go outside 0.2-0.8 for p, and b) as long as you don't go out of your way to use it wrong, ie if you have like only 5 matches.
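
    A quick sketch of that reconstruction, assuming the second value is the quasi-std sqrt(pmle*(1-pmle)/n) with n = W+L. The function names are made up for illustration:

```python
from math import sqrt

def summarize(wins: int, losses: int) -> tuple[float, float]:
    """Return (pmle, quasi-std) for a W-L record."""
    n = wins + losses
    pmle = wins / n
    return pmle, sqrt(pmle * (1 - pmle) / n)

def reconstruct(pmle: float, quasi_std: float) -> tuple[int, int]:
    """Recover (W, L) from (pmle, quasi-std).

    Only works when the quasi-std is non-zero, i.e. the record
    is not a clean sweep either way.
    """
    if quasi_std == 0:
        raise ValueError("cannot recover n from a 100% or 0% record")
    n = round(pmle * (1 - pmle) / quasi_std**2)
    wins = round(pmle * n)
    return wins, n - wins

for record in [(6, 4), (47, 36), (600, 400)]:
    print(record, "->", reconstruct(*summarize(*record)))
```

    The one case where this breaks is a clean sweep (pmle of 0 or 1), where the quasi-std is zero and n can no longer be recovered.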
    Quote Originally Posted by cdr View Post
    140x Relentless Rats
    Quote Originally Posted by Ben Bleiweiss
    I wish that Wizards would have just gone ahead and done away with the Reserved List entirely. It is nothing but a blight on the game and one that long outlived its purpose. [...] I am wholeheartedly in favor of getting rid of the Reserved List and reprinting higher-dollar staple cards from EDH and Legacy. Pete Hoefling the owner of StarCityGames.com agrees with my point of view as well.
    - Ben Bleiweiss, SCG General Manager, Feb 2010
