Friday, April 17, 2015

Rolling Large

Post-game night dinner. The question comes up, “How many dice do you have to roll to see results that are similar to what is expected theoretically?” A fair question.

**Note, for the TL;DR types, you can skip to the very bottom for some graphs that demonstrate the answer to this question. This will be a very long post. But the short answer is pretty worthless. And even as long as this post is, it really should be longer. This still is just scratching the surface.**
A few years back, I wrote a couple of posts about some of the basics of probability theory as it pertains to dice. In those, I referenced the “Law of Large Numbers” (LLN): basically, as the number of trials increases, the average of the observed outcomes approaches the expected value.

So, how many dice does it take to make a large enough sample? The simple answer – a lot.
I have heard some people state that they don’t ‘believe’ in statistics as it applies to games. I doubt such statements are meant literally; the explanation that usually follows refers to the observation that when a handful of dice is rolled, the outcome usually does not ‘match’ the theoretical probabilities. Actually, that is completely as expected if you understand statistics and combinatorics.

For example, since I know that each number on a D6 is equally likely, if I roll 6 dice I might expect to see one each of 1, 2, 3, 4, 5, 6. In fact, this outcome is far less likely than seeing five different numbers with one of them repeated. I think I should save the mathematical explanation of that for another time, as this is already going to be a very long post, but I do plan to get to it soon-ish.
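In the meantime, here is a quick sanity check of that claim in R (just the two exact probabilities; the full combinatorics can wait for that later post):

```r
# Probability that 6 dice show one of each face, versus five different
# faces with one of them repeated (the frequency pattern 0,1,1,1,1,2).
p_all_different <- factorial(6) / 6^6                 # 6!/6^6, about 1.5%
p_one_repeat    <- 6 * 5 * (factorial(6) / 2) / 6^6   # about 23.1%
c(all_different = p_all_different, one_repeat = p_one_repeat)
```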
For this post, I will present the results of a few simulations to try to illustrate this point. However, the fact is that most game events will not involve a number of dice that comes anywhere near “large”. So does this mean that theoretical probability is useless for game strategy? Not exactly. But that explanation needs far more expansion than I can cover in one post, so the examples presented now are just an introduction. We’ll see how this topic progresses, as I plan to return to it often.

Some basic concepts used here:
“Error” refers to the degree by which an observed outcome (empirical data) differs from the expected/theoretical value. In examples below, I may use the term ‘difference’ instead of error as it seems more natural in this context.

“Distribution” describes the ‘shape’ of data or expected values. This is not necessarily confined to graphical representations, but it makes concepts a bit more concrete if there is some picture to refer to. I am actually using a very loose definition of distribution here, as it can include many different specifications, but the full explanation on this could fill a book. In this post, I will specifically be looking at a uniform distribution based on a D6. Later posts will expand on this to consider other distributions based on how we view certain outcomes.
“Sample Space” refers to the set of all possible outcomes/results/events for an experiment/trial. Here, each D6 roll is an experiment/trial, and the sample space is {1,2,3,4,5,6}. The “sample” is the set of data obtained from a number of trials.

The “theoretical probability” of an event (a particular outcome) is a number between 0 and 1 that indicates how likely that event is to occur, generally denoted P[E], where E is one element of the sample space. In a uniform probability distribution, all outcomes are equally likely, so here each event in the sample space has a probability of 1/6 = 16.667%.

The “Expected Value” technically refers to the long-run average of repeated measurements of an outcome; for a D6 that is 3.5, which is pretty nonsensical for a single roll. I only point this out because it is easily confused with the expected number of observations, which is the probability of an event multiplied by the number of trials (for example, in 36 rolls each face is expected 36 × 1/6 = 6 times).
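To make those last two ideas concrete with the D6 setup used throughout this post, both are one-line calculations (a small illustration, using the same die/p.die names as the R code at the bottom):

```r
# Expected value of one D6: the probability-weighted average of the faces.
die <- 1:6
p.die <- rep(1/6, 6)
sum(die * p.die)   # 3.5

# Expected number of observations of any single face, for the sample sizes below.
n <- c(36, 72, 144)
n * (1/6)          # 6, 12, 24
```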


Now, the fun stuff: Data!
For this first piece, I wanted to take a focused look at a relatively small data set, to talk about how small samples can differ greatly from what is expected theoretically and how this applies to gaming. I used a function in Excel to generate 2 sets (A & B) of 144 numbers between 1 and 6 (note: technical details are at the bottom in case you want to run your own simulations). I arranged the data so that it can mimic gaming scenarios: each set has 6 rows and 24 columns, and I look at the first 6 columns, the first 12, and then the full 24. Reading one column is like seeing the results of rolling 6 dice, so the three cuts show 36, 72, and 144 dice rolls. The frequency of each number is counted and summed in the last column, and that total is compared to the expected number to calculate the difference. I have the full data sets available here: DataSetA  DataSetB
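If you would rather not build the Excel sheet, here is a rough R sketch of the same setup (an approximation of the workbook layout, not the code that produced sets A and B): one 6 × 24 grid of rolls, tallied over the first 6, 12, and all 24 columns.

```r
# Generate a 6-row x 24-column grid of D6 rolls (144 total), then tally the
# first 6, 12, and all 24 columns (the 36-, 72-, and 144-roll cuts).
rolls <- matrix(sample(1:6, 144, replace = TRUE), nrow = 6, ncol = 24)

for (cols in c(6, 12, 24)) {
  n        <- 6 * cols                       # rolls included in this cut
  counts   <- table(factor(rolls[, 1:cols], levels = 1:6))
  expected <- n / 6
  print(data.frame(face     = 1:6,
                   actual   = as.integer(counts),
                   expected = expected,
                   diff     = as.integer(counts) - expected))
}
```

No seed is set, so every run gives a different sample, just like rolling fresh dice.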

Here are the summaries for the first 36 dice:

| Face | A: Actual | A: Expected | A: # Diff | A: % Diff | B: Actual | B: Expected | B: # Diff | B: % Diff |
|------|-----------|-------------|-----------|-----------|-----------|-------------|-----------|-----------|
| 1 | 4 | 6 | -2 | -33.3% | 5 | 6 | -1 | -16.7% |
| 2 | 4 | 6 | -2 | -33.3% | 5 | 6 | -1 | -16.7% |
| 3 | 3 | 6 | -3 | -50.0% | 8 | 6 | 2 | 33.3% |
| 4 | 13 | 6 | 7 | 116.7% | 6 | 6 | 0 | 0.0% |
| 5 | 6 | 6 | 0 | 0.0% | 3 | 6 | -3 | -50.0% |
| 6 | 6 | 6 | 0 | 0.0% | 9 | 6 | 3 | 50.0% |

I swear I didn’t manipulate the data sets at all. But wow! It really does look like we broke statistics in set A: ‘4’ appears in 13/36 trials, more than twice what would be expected. However, ‘5’ and ‘6’ are right on target with 6 occurrences each. The B set is a bit closer to expectations, with ‘5’ occurring 50% less than expected and ‘6’ 50% more.

See how this changes if we double the number of dice to 72:

| Face | A: Actual | A: Expected | A: # Diff | A: % Diff | B: Actual | B: Expected | B: # Diff | B: % Diff |
|------|-----------|-------------|-----------|-----------|-----------|-------------|-----------|-----------|
| 1 | 7 | 12 | -5 | -41.7% | 12 | 12 | 0 | 0.0% |
| 2 | 14 | 12 | 2 | 16.7% | 9 | 12 | -3 | -25.0% |
| 3 | 10 | 12 | -2 | -16.7% | 14 | 12 | 2 | 16.7% |
| 4 | 17 | 12 | 5 | 41.7% | 12 | 12 | 0 | 0.0% |
| 5 | 11 | 12 | -1 | -8.3% | 11 | 12 | -1 | -8.3% |
| 6 | 13 | 12 | 1 | 8.3% | 14 | 12 | 2 | 16.7% |

In A, ‘4’ is still the most frequent, but mostly on the strength of its huge lead from the first 36: in the second 36 trials, only 4 ‘4s’ occurred, which is less than the expected 6. Where ‘4’ is over, ‘1’ is under, so this is the sort of break from average that we usually love to see! Set B is starting to show less variation, which is exactly what the LLN says should occur.

The last set includes the full 144 dice sample:

| Face | A: Actual | A: Expected | A: # Diff | A: % Diff | B: Actual | B: Expected | B: # Diff | B: % Diff |
|------|-----------|-------------|-----------|-----------|-----------|-------------|-----------|-----------|
| 1 | 23 | 24 | -1 | -4.2% | 26 | 24 | 2 | 8.3% |
| 2 | 22 | 24 | -2 | -8.3% | 21 | 24 | -3 | -12.5% |
| 3 | 21 | 24 | -3 | -12.5% | 23 | 24 | -1 | -4.2% |
| 4 | 32 | 24 | 8 | 33.3% | 20 | 24 | -4 | -16.7% |
| 5 | 25 | 24 | 1 | 4.2% | 26 | 24 | 2 | 8.3% |
| 6 | 21 | 24 | -3 | -12.5% | 28 | 24 | 4 | 16.7% |

In A, ‘4’ is still over expectation, but by a smaller relative margin, even though it gained 15 more observations (versus the expected 12) since the 72-roll cut. All of the others are getting close to the expected frequencies. Set B seems to show more deviation from expectation if you just scan the percentages, but look more closely: the highest and lowest % differences are actually closer together than in the 72-roll sample (72: 16.7% to -25%, a spread of 41.7 points; 144: 16.7% to -16.7%, a spread of 33.3 points).

I could have run a lot more data samples, but I think these are sufficient to show that, as the number of trials increases, the observed frequencies are on the whole getting closer to the expected frequencies.
Before I move on to the second simulation, I want to take a closer look at the individual columns, each representing a roll of 6 dice. You can look at all of these in the PDFs; I just copied a few here to illustrate the extremes.

Out of the 48 total columns between the two sets, there were 0 instances where each number occurred exactly once. I know some might use this to justify the belief that probability has nothing to do with what we observe. Actually, anticipating such arguments may motivate me to write up the mathematical justification for why we aren’t likely to see such a result in such a small sample. Oh, yeah, back to the title – is 144 dice large? Not really. Getting closer, but this is still a small sample set.
However, I pulled out the columns showing the next-best scenario: five different numbers, with one of them repeated. There were 9 of these out of 48 columns, or 18.75%. Mathematically, each of these particular outcomes is equally likely, since each is just a different assignment of the frequency pattern 0,1,1,1,1,2 to the six faces. From a gaming perspective, however, they are NOT equal at all. In the table below I rated each one Good – Yeah!, Bad – Boo!, or E – even, based on a scenario of wanting to roll 4+. In this particular sample, most of these sets of 6 are not what we would generally prefer; only one beats the odds (odds vs. probability is another topic, I am using the term loosely here) in our favor. Again though, this is just the nuance of this sample. Variation is expected if we are truly using random generation; sometimes it works in our favor, sometimes it doesn’t, as exemplified by the second selection further down, from the other extreme.

| Column | 1 | 2 | 3 | 4 | 5 | 6 | 4+ Rating |
|--------|---|---|---|---|---|---|-----------|
| A14 | 1 | 1 | 1 | 1 | 2 | 0 | E |
| A17 | 2 | 0 | 1 | 1 | 1 | 1 | E |
| A18 | 1 | 1 | 2 | 1 | 0 | 1 | Boo! |
| A19 | 1 | 1 | 2 | 1 | 1 | 0 | Boo! |
| A22 | 2 | 1 | 1 | 1 | 1 | 0 | Boo! |
| B7 | 1 | 0 | 1 | 2 | 1 | 1 | Yeah! |
| B21 | 1 | 1 | 1 | 1 | 0 | 2 | E |
| B23 | 1 | 2 | 0 | 1 | 1 | 1 | E |
| B24 | 2 | 1 | 1 | 0 | 1 | 1 | Boo! |

For this second selection, I pulled all the columns that represent the least likely distributions in the sample: any column where one number showed up 4 or more times. Notice that there were no instances in the sample with 5 or 6 of the same number. We have all probably experienced a really lucky roll of 5 ‘5s’ and a ‘6’, or something like it, but such results are extremely unlikely (that particular outcome can occur in only 6 of the 6^6 = 46,656 possible sequences, roughly 0.01%). If it is any consolation, 5 ‘2s’ and a ‘1’ is exactly as improbable. Of these 3 columns, 2 would probably make us pretty happy.

| Column | 1 | 2 | 3 | 4 | 5 | 6 | 4+ Rating |
|--------|---|---|---|---|---|---|-----------|
| A5 | 1 | 0 | 0 | 4 | 1 | 0 | Yeah! |
| A20 | 4 | 0 | 0 | 0 | 2 | 0 | Boo! |
| B19 | 1 | 0 | 0 | 0 | 4 | 1 | Yeah! |

Now to answer that question – how large?
Excel is pretty useful, and it is kinda neato to generate lots of random samples and look at the variation. Okay, maybe it is only really cool if you are a total math nerd like me. Anyway, if you want to do some real statistical computing, you need a proper statistics package. For these simulations I used “R”, which is not the most user-friendly program, but it has the supreme advantage of being free. I will include links and the code I used at the end. I also just found out that there is a package someone has written specifically for running all kinds of dice scenarios, so I may use that for future posts once I have played around with it.

On to the graphs – I know you’re tired of reading by now. I ran two simulations at each of the following sample sizes: 36, 72, 144, 360, 1200, 6000, 12000. (Actually, I ran many more sims of each just to verify that the graphs shown below seemed like reasonable examples. Of course, I didn't realize until I posted these that the 2nd 36 sim had exactly 0 '1s' in it, which is unusual. I did not, however, cherry-pick these examples; I just grabbed the last two after running enough to be satisfied.)

[Graphs: bar charts of the observed face counts for two simulations at each sample size – 36, 72, 144, 360, 1200, 6000, and 12000 rolls.]

So, the empirical distribution doesn’t really start to settle around the theoretical distribution until I get up to 12,000 trials. Probably not the answer that most people would hope for. Granted, a standard game probably includes 500 to 1000 dice rolls, so that is really only one or two dozen games’ worth.
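If you would rather see that settling as numbers than as bar charts, here is a small sketch (same uniform D6 assumption as everything above) that reports, for each sample size I used, the worst-case percentage difference between observed and expected face counts in a fresh simulation:

```r
# For each sample size, roll that many D6s and report the largest
# percentage difference between observed and expected face counts.
die   <- 1:6
p.die <- rep(1/6, 6)
sizes <- c(36, 72, 144, 360, 1200, 6000, 12000)

max_pct_diff <- sapply(sizes, function(n) {
  counts <- table(factor(sample(die, size = n, prob = p.die, replace = TRUE),
                         levels = die))
  max(abs(counts - n/6) / (n/6)) * 100
})
round(setNames(max_pct_diff, sizes), 1)
```

Each run is random, so individual results vary, but the largest % difference generally shrinks as the sample size grows, which is the LLN at work.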
Okay, that’s way more than enough for one day. Will start working on a post discussing the mathematical side of probabilities as mentioned above. Until then, keep praying to the dice gods if that’s your thing!

Notes on Excel functions and R code:
Excel: Random D6 generation:   =RANDBETWEEN(1,6)
           Counting occurrences of a value (here, counting 1s in one column of six rolls):   =COUNTIF(A1:A6,"=1")
"R" basic program
Rstudio (makes R have a much nicer interface):
"dice" package

Simulation code; the first two lines only need to be run once. To alter the number of trials, change the size value in the sample() call. You also need to adjust the y-axis upper limit (ylim) accordingly, to approximately 1/4 of the number of trials.

# The six faces and their (uniform) probabilities; these two lines only need to be run once.
p.die <- rep(1/6, 6)
die <- 1:6

# Roll the dice: size= sets the number of trials.
s <- table(sample(die, size=6000, prob=p.die, replace=TRUE))

# Percentage labels for each face, and a bar chart of the observed counts.
# Set the ylim upper bound to roughly 1/4 of the number of trials.
lbls <- sprintf("%0.1f%%", s/sum(s)*100)
barX <- barplot(s, ylim=c(0,1500))
text(x=barX, y=s+10, labels=lbls)


Tuesday, April 14, 2015

What is dead may never die

Kind of apropos to reboot this blog during season 5 premiere week.

So, yeah.... so much for using this as an outlet to document my progression from "too scared to go up and talk to people I don't already know" to "god, why won't she shut up already". Basically, doctoral coursework in mathematics is hard. Being bipolar makes things even more interesting. So, things fell apart. Not just in terms of this blog. When I decide it is time for a change in my life, I don't tend to just walk away. I burn what's behind me to the ground. Not at all a productive adaptation, but it is what it is.

Changed the name, as I really have been thinking a lot lately on probability and game theory, tournament design, and the psychology of gamers (and myself, but I can never escape that). And this just flows better. The dice and decks part should be obvious. As well as the play on TicTacToe. Death, well wargaming involves a lot of pseudo-death. And being crazy means always walking under the shadow to some extent.

So since this was just sitting in a corner getting moldy, it seemed easier to do some spring cleaning rather than start a whole new thing. Plus, I read back over the old posts and I am feeling right proud of some of the prose I had written, though also a bit bemused at what I seemed to care about at that time.

Most importantly, however, right now I need a space where I can talk about my moods on occasion, not necessarily to the world at large, but to the people whom I interact with regularly in the hobby. Several times in the last few days I have sent personal apologies because I was a little manic the last few weeks. And it is always embarrassing to realize your behavior was somewhat out of whack. Or maybe I am the only one who notices. It gnaws at me regardless.

But I recognize that it seems odd that a person would disclose personal information, unsolicited. Addressing that is kind of the main point of this entire post. Because if a person had a back problem and said "I need to just lay on the couch for a few days until I heal" few people would give such disclosure a second thought. But the invisible, psychological, medical problems are still considered, by many, topics that should be hidden. TMI.

Back when I started this blog, I decided I was done trying to put on the impenetrable façade all the time. It was too exhausting, too isolating, and most of all, it was a habit born of fear. And for myself, I needed to be brutally honest and realize that massive walls don't make you strong. So I have made an effort to be as honest as possible about my illness. Because somebody has to, if the stigma is ever going to be reduced. Because it is better to have an explanation for odd behavior than just being labeled a weirdo. Because crazy is just one small, inseverable piece of me and if people have an issue with it, or with me being so candid about it, then obviously there is no friendship there in the first place. At least I am certain I am not boring.

And none of that previous paragraph is directed at anyone I know. It is more a statement of choosing to advocate for myself, and maybe some others in the mix. Really though, it is just the result of a long (lifelong), tortuous exercise in learning not to be afraid. Of learning to let the mask fall. Trying to hide would be a huge fall backwards. Because I wasted too much time there already, and I know that only leads to more pain and decay.

So, this is just me reiterating some truth. And leaving it out here so I can quit feeling anxious about how people react to it. So I can put this link back in my signature, and just own it.

And really, truly - this should be the last deeply personal post for a long while. Because I have been researching some awesome tournament/ranking design theory that I can't wait to write about. And so many probability topics that I think are fascinating in their complexity. And I have actually gotten to a point where I am often quite happy with my painting, enough that I want to share. So there are lots more interesting things to write about. Will try to make it at least once a week.

And thank you to my friends for putting up with me the last few years, and making me feel welcome. It has been a novel and happy experience.