Nice analysis! Where is jitte though? Was there not enough data or did you just not include it here?
Edit: never mind, I see it now. Looking ftw
It's there under Brainstorm.
Edit: Could someone refresh me on what the Chi-square means?
Now this is more what I meant by my request for "productive conversation," rather than "zomg, let's all be bff and share our animal crackers."
I've pretty much always suspected the white splash to be the best one available, so it's not a huge surprise that the data reflects that. Ditto on the fact that maindeck Relic doesn't pack a lot of punch.
I guess the main thing the data analysis makes me wonder is why exactly the green splash seems to be performing SO terribly for people....
Anyhow, it's nice to have the conversation a bit more back on track. :smile:
@Forbiddian: nice analysis, but can you include sideboards? Your post seems to indicate that you only looked at the maindeck, but sideboards conceivably affect up to 2/3s of games. After all, the question isn't simply when to include Kira in the maindeck, but whether to include her at all.
Summarizing recent posts, it seems that for a control/combo heavy meta, we should play mono-U with maindeck bounce, Stifles, Kira, or Jitte depending on expected minority decks. Top contenders nowadays seem to be extra countermagic, Kira, and bounce, though some folks are sticking with maindecked Stifles. Whatever doesn't make it into the first 60 cards is an option for the side.
For an aggro-heavy meta regardless of flavor, we run the white splash with StP in the maindeck, which leaves little room for either Jitte or Kira. With Burn, StP, PtE, and Zoo running around, it seems to make sense to include Kira at least in the board if not trying to find room in the main. Jitte isn't considered nearly as good here, and for extra utility white splash decks would choose to run 4 PtE in the side to boost their maindecked StPs.
Any thoughts/objections on that?
For those who maindeck Kira, how often does it screw up Sovereign and Reejerey tricks (assuming you want to untap Merfs and make them unblockable)?
Because it's just for a non-synergistic creature that doesn't solve our problems (aggro matchups)? We could do that without the splash with Wake Thrashers. Grip in the sideboard seems useful, but bounce and annuls could conceivably take care of the problem while maintaining a more resilient manabase.
In the SCG tourneys, which are arguably the biggest events in the US, mono blue and green splash have been doing well and white has not done anything. Larger tournaments are much harder to do well in, and white splash has not shown up while the green has. This might have something to do with a phenomenal Merfolk player, but why would such a good player splash green if white was better? Also... Jitte is good.
http://sales.starcitygames.com//deck...AL&city=Boston
http://sales.starcitygames.com//deck...city=Charlotte
http://sales.starcitygames.com//deck...y=Philadelphia
Well I did place 18th at the SCG Philly 5k with a white splash... and my list was terrible.
Chi-square is the square of the deviation between the experimental value and the expected value, divided by the expected value. It's a quick reference number statisticians sometimes use to see if numbers vary significantly from expected values. It's not the best test, but it's easy as hell to calculate, and I don't have statistics mod downloaded into Excel because HP is a bunch of fucks and they didn't ship me an MS Office disk. Anyway.
For example, say you're trying to see if a die is weighted. You roll it 60 times and come up with 7 deuces. A high Chi-square indicates a large deviation between the observed value and the expected value, so in this case a high Chi-square would suggest that the die is weighted.
1/6 * 60 = 10 (expected number of deuces).
(7-10)^2 / 10 = 0.9 (the chi-square for that value: value found minus expected value, squared, divided by the expected value).
Any Chi-square higher than about 0.5 is pretty unusual for a binomial distribution/proportion, assuming that they're from the same distribution. In this case, it's 20% likely that a fair die would come up with 60 rolls at least that unfair.
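For anyone who wants to check the die example themselves, here is a minimal Python sketch of the arithmetic above. The function names are made up for illustration, and two things in it go beyond what the post calculates: the term for the non-deuce rolls (which a full goodness-of-fit statistic would also include) and an exact binomial tail for "7 or fewer deuces in 60 rolls", which is one way to put a probability on the result.

```python
from math import comb

def chi_square_term(observed, expected):
    """One chi-square term: squared deviation divided by the expected value."""
    return (observed - expected) ** 2 / expected

def binomial_cdf(k, n, p):
    """Exact probability of k or fewer successes in n independent trials."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))

rolls = 60
observed_deuces = 7
expected_deuces = rolls / 6  # a fair die shows each face 1/6 of the time

# The single term computed in the post: (7 - 10)^2 / 10
deuce_term = chi_square_term(observed_deuces, expected_deuces)

# Assumption beyond the post: also count the "not a deuce" category
# (53 observed vs. 50 expected), giving the two-category statistic.
other_term = chi_square_term(rolls - observed_deuces, rolls - expected_deuces)
full_statistic = deuce_term + other_term

# Exact chance that a fair die gives 7 or fewer deuces in 60 rolls
lower_tail = binomial_cdf(observed_deuces, rolls, 1 / 6)

print(f"deuce term:     {deuce_term:.2f}")
print(f"full statistic: {full_statistic:.2f}")
print(f"P(7 or fewer deuces from a fair die): {lower_tail:.3f}")
```

The tail probability it prints can be compared directly with the rough 20% figure quoted above.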
The problem with case studies is that they suck.
The problem with looking solely at number of top 8s/top 16s without seeing a full metagame breakdown is that there's no way to see how many decks scrubbed out. You're only looking at how the luckiest people did, not the average person.
Goblins, for instance, has nearly 500 top 8s, far more than any other deck and more than some STRATEGIES. It has more than every combo deck combined. Yet it has never been considered the best deck in Legacy, ever.
Why? Because it only wins like 50-55% of its matches, and there's always at least one deck over 60%. Historically, there was a six-month period where it only won 40% of its Top 8 matches and was the worst deck in my entire meta analysis. And I looked at, like, 15 different decks. It gets so many top 8s because so many people are playing it. Even if 20 Goblins players go 0-2 drop, there are still 20 more who end up 2-0. Even though it's not actually doing better than flipping a coin, since so many people are playing it, it's bound to get into the Top 8.
Recently it's been doing well, though.
The only thing that I would find useful at all out of this is that we can say:
This deck: http://sales.starcitygames.com//deck...p?DeckID=29171 played a match and lost. Really not enough of a sample size to draw a good conclusion.
Ok, so now that I've been brainwashed by numbers and Chinese squares - is this something worth testing?
4 Cursecatcher
4 Lord
4 Adept
4 Reejerey
3 Merfolk Sovereign
2 Kira
4 Vial
4 FoW
4 Daze
4 Standstill
3 Spell Snare
4 Mutavault
4 Wasteland
12 Island
I didn't actually look at the analysis (tl;dr), but if I recall correctly, a low Chi-square means that it has no significant impact on the expected value. That is, a card on that list with a low Chi-square will not significantly improve your chance of winning. (Please correct me if I am wrong about any of this.)
Chi-square has nothing to do with China... the Chi comes from the Greek letter χ...
The chi-value thresholds depend a lot on the power of the test and the amount of difference you expect to see. Anything higher than 0.5 is very unusual, but that doesn't mean that values less than 0.5 are better explained by the null hypothesis than by the alternative.
I didn't really want to get into this, so I won't. If you want more information, you can look up "Power" in a second-year statistics text.
If the Chi-square value is low (low means less than 0.1), then the card probably does not have a significant impact on your win percentage. If the Chi-square value is higher than 0.5, it probably does. Obviously there's some breaking point in the middle, but the sample sizes for the test and the expected differential are insufficient for a low-power approximation like Chi-square. I reported the numbers mainly in case anybody who knows statistics is interested, but you can feel free to ignore it.
Again, only about 60% of the variation is explained by a random process. It's clear from the test on correlation that the cards have some impact on the win probabilities.
I am very impressed with Forbiddian's analysis because he actually backed up what he was saying with numbers. Now if you can include the SB in your analysis, that would be very helpful.
My bad I thought this game had skill involved. There's a reason you see the same people at the top of tournament standings. The average person is bad at magic. I don't know why you would care how the average person does with a deck. They likely made many play mistakes which is what can distort data.
I also don't care if some random top 8s his 20-person tournament. You won the equivalent of an FNM. Good job. The big tournaments are where we get some real results, as you can't always just "get lucky". Unfortunately there are not a lot of big Legacy tournaments yet.
Unfortunately, you don't seem to understand the problem at all.
The problem is that only counting top 8 appearances tells you a function of both quality AND popularity. Suppose there are 100 decks running the Green splash and 1 deck running the white splash. If 2 Green decks make the Top 8 and the only white splash deck also makes the Top 8, I would think that's evidence that the white splash is better than the green splash.
If you're ONLY looking at a raw top 8 count, you might do something stupid like conclude that green splash is better (or even twice as good). Clearly you can't only look at top 8 count when concluding whether a deck (or worse a card within a deck) is good or bad.
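To put the hypothetical above into numbers, here is a small Python sketch. It uses only the made-up figures from the example (100 green splash decks with 2 Top 8s, 1 white splash deck with 1 Top 8), and the function and variable names are just for illustration. It is not the method used in the analysis, only a demonstration of how a raw count and a per-entrant rate can point in opposite directions.

```python
def top8_rate(top8s, entrants):
    """Fraction of a deck's entrants that reached the Top 8."""
    return top8s / entrants

# Hypothetical field from the example: deck -> (Top 8 appearances, decks entered)
field = {
    "green splash": (2, 100),
    "white splash": (1, 1),
}

# Ranking by raw Top 8 count rewards popularity as well as quality
best_by_count = max(field, key=lambda deck: field[deck][0])

# Ranking by Top 8s per entrant removes the popularity effect
best_by_rate = max(field, key=lambda deck: top8_rate(*field[deck]))

for deck, (top8s, entrants) in field.items():
    print(f"{deck}: {top8s} Top 8s from {entrants} entrants "
          f"({top8_rate(top8s, entrants):.0%})")
print("best by raw count:", best_by_count)
print("best by rate:     ", best_by_rate)
```

With a single white splash entrant the rate is obviously based on a tiny sample, so this only illustrates the direction of the bias, not how confident anyone should be in the result.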
Unfortunately, we don't have complete metagame breakdown for every deck. So it's not that easy. We have to make up some method that will isolate quality from popularity. I did that. Your method looks at the popularity of the deck multiplied by the quality (actually, it's a power function of quality). My analysis looks only at the quality.
But if you'd like, we can do a head to head comparison between your method of counting top 8s and my method of actually looking at whether or not the cards are winning. I found a good candidate, within my data. It's the only conclusion from my data that I'm 95% confident about.
Relic of Progenitus made more Top 8s than Kira, but I'm 95% sure that decks running Relic do worse than decks running Kira.
According to you, Relic is a better card, because it made more Top 8s.
According to me, it's a worse card because after making the top 8, it crapped out over half the time, whereas any deck running Kira was a 2:1 favorite. I hypothesize that Relic made more top 8s because it's played far more than Kira.
Most of this makes perfect sense, but there are a couple of points I would refine a little bit.
For one thing, if you have reason to suspect Jitte is actually going to be good in your meta (and it legitimately might be, it just depends), I think you can still run it in the sideboard of the white splash list. I think space would be too tight in the maindeck, but some enterprising people might cut two cards from the main and run maindeck Jitte in the white splash. For example, if my meta was just crawling with Goblins and Elves, I could see doing some shit like that.
Anyhow. On another note, Kira really is that good from when I've used her, especially now that point removal spells are one of the most common things we have to worry about. I find untapping Merfolk with Reejy isn't a very common play; I usually use his trigger to untap my land or tap down their blockers. So no fret there. I can't really speak to how often people are using Sovereign's ability, but it seems like a small loss for the huge gain of Kira's pseudo-shroud.
On still another note, I agree that in a combo/control meta, one is probably better off staying mono blue. I think bounce and extra counters are the strongest spells for the extra slots. As far as the extra counters though, I think one has to think about this and do some guesswork on what decks you think you'll play. I think against aggro/control decks, Spell Snare is probably the best extra counterspell, since those decks are largely defined by their two-drops. Against more pure control or combo decks though, I might try out Spell Pierce (although probably not against Storm decks, because they can mana ramp to avoid getting their stuff countered a lot of the time if they know you use Spell Pierce.)
Those are pretty much my only quibbles, my good sir.
Alright, thanks. I have actually taken statistics and probability theory before, just forgotten most of it. "Above 0.5 means it's probably relevant, below 0.1 means it's probably not" was basically all I was asking for in this case.
Would be sort of cool if someone started a Nate Silver-esque blog (or just thread) about various statistical things in Magic. I'd read it.
I disagree 100%. Has this ever happened to you? We still have 4 Wasteland, 3-4 Daze, 4 Cursecatcher, and possibly Stifles to slow them down. There's always FoW backup for whatever else they may be planning, but honestly I have never in tournament play had an ANT player ramp past a Spell Pierce. We have enough of a clock to force them into bad positions, and can seal the deal with a Standstill. Spell Pierce is awesome against combo.