I kept working on data analysis today. Vidushi needed a set of processed data to run normalization tests on, so I gave that to her. I then started working on displaying the 12 graphs, 4 for each text type, with 3 subplots per graph. It took a little while, because matplotlib is a very weird mix of object-oriented and plain functional interfaces, but once I figured it out it made sense. The results were not gratifying though: the graphs showed what was essentially a random data set with regard to whether A or B was harder. I tried to limit the number of IDs used in the data by only allowing those with an average cps greater than the mean variation across all texts. This cut our data pool down to 10 IDs and didn't reveal anything much more interesting. Vidushi said that all the data sets were theoretically normally distributed, which doesn't help a whole lot. The last thing I did was tweak my function to graph a word's cps together with the cps values of the preceding 5 words and the succeeding 5 words, to get a sense for trends. On Monday we will decide which graphs to use and keep preparing the presentation.
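For reference, the context-window idea (a word's cps plus the 5 words before and after it) can be sketched roughly as below. This is a minimal sketch, not the actual function from the project: it assumes the per-word cps values for one text are already in a plain list, and the name `cps_window`, the `radius` parameter, and the sample numbers are all made up for illustration.

```python
from typing import List, Optional


def cps_window(cps_values: List[float], index: int, radius: int = 5) -> List[Optional[float]]:
    """Return the cps of the word at `index` plus `radius` words on each side.

    Hypothetical helper: positions that fall outside the text are filled
    with None, so the target word always sits at position `radius`.
    """
    if not 0 <= index < len(cps_values):
        raise IndexError("index is outside the text")
    window: List[Optional[float]] = []
    for i in range(index - radius, index + radius + 1):
        # Keep the real value when the neighbor exists, otherwise pad with None.
        window.append(cps_values[i] if 0 <= i < len(cps_values) else None)
    return window


# Made-up cps values for an 8-word text (illustration only, not real data).
sample = [4.0, 5.5, 3.2, 6.1, 2.8, 5.0, 4.4, 3.9]
print(cps_window(sample, 2, radius=2))
```

With the default `radius=5` the result always has 11 entries, which lines up with x-axis offsets from -5 to 5. The `None` padding can be converted to NaN before plotting so that words near the start or end of a text show up as gaps rather than shifting the target word off center.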