Most of my day today was spent struggling with numpy. I have to combine the existing data compiled about our text — for instance, in the program as Adam gave it to me, there is already a mapping between time elapsed and acceleration, between acceleration and words on screen, and so on — with the existing features of numpy to try to come up with some useful data analysis. From Prof Medero’s email, a couple of things seemed easily doable, so I looked at those first. For instance, it didn’t take too long to figure out the average word length of the text, although this is currently computed in a pretty awkward spot in the code, so I should probably fix that soon. After this I also computed the standard deviation of the word length, but at the moment I haven’t done anything useful with these numbers.
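The word-length statistics themselves are straightforward with numpy; a minimal sketch (the function name and the idea of splitting on whitespace are my assumptions, not how the actual program tokenizes the text) might look like:

```python
import numpy as np

def word_length_stats(text):
    """Return (mean, std) of word lengths in a text string.

    Hypothetical helper: assumes whitespace tokenization, which may
    differ from how the real program splits the text into words.
    """
    lengths = np.array([len(w) for w in text.split()])
    return lengths.mean(), lengths.std()

mean_len, std_len = word_length_stats("the quick brown fox jumps")
# lengths are [3, 5, 5, 3, 5], so the mean is 4.2
```

Note that `np.std` defaults to the population standard deviation (`ddof=0`); if a sample estimate is wanted, it would need `lengths.std(ddof=1)` instead.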
I then set about finding the average acceleration corresponding to each word. This is now stored in a dictionary as part of the class I was working with (DataAnalyzer), and it should tell us something about how people interact with that word. To be honest, it is not very useful information, as I realized once I looked at the numbers in the dictionary. A raw number out of context doesn’t tell you whether the reader slowed down, sped up, or stayed constant at that particular word. So these numbers give an interesting indication of an individual reader’s speed and interaction with the text, but they don’t let us compare across people.
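The per-word averaging step could look something like the sketch below. This is my own illustration, not the actual DataAnalyzer code: I'm assuming the time→acceleration and acceleration→word mappings have already been joined into (word, acceleration) pairs.

```python
import numpy as np
from collections import defaultdict

def average_acceleration_per_word(samples):
    """Map each word to its mean acceleration.

    Hypothetical sketch: `samples` is assumed to be an iterable of
    (word, acceleration) pairs, produced by joining the program's
    time->acceleration and acceleration->word mappings.
    """
    buckets = defaultdict(list)
    for word, accel in samples:
        buckets[word].append(accel)
    # Average all acceleration readings recorded while each word was on screen
    return {word: float(np.mean(vals)) for word, vals in buckets.items()}

avg = average_acceleration_per_word([("the", 1.0), ("the", 3.0), ("cat", 2.0)])
# avg == {"the": 2.0, "cat": 2.0}
```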
To fix this, I thought I could create a similar dictionary that maps each word to either “faster” or “slower” — an average indication of whether people slow down or speed up when they see that particular word. This is what I am in the middle of doing right now, and I will hopefully complete it next week. It is complicated only because I need both the time data and the acceleration data to be linked to the words, and I have just started doing that linking manually. I decided “faster” and “slower” would be more useful than raw numbers because, as I said above, the raw numbers on their own are not very helpful. Hopefully this pans out next week.
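Once the per-word averages exist, the labeling step itself is simple. A sketch of the idea, assuming (my assumption, not stated in the program) that a positive average acceleration means the reader sped up at that word:

```python
def label_words(avg_accel):
    """Map each word to "faster" or "slower".

    Hypothetical sketch: assumes `avg_accel` is a dict of word -> mean
    acceleration, and that positive acceleration means speeding up.
    Zero is lumped in with "slower" here; a real version might want a
    third "constant" label for near-zero values.
    """
    return {word: ("faster" if a > 0 else "slower")
            for word, a in avg_accel.items()}

labels = label_words({"the": 0.5, "cat": -0.2})
# labels == {"the": "faster", "cat": "slower"}
```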