Today, I spent time implementing the interpolation of the general-English and by-learners corpora. This gave me the completed table:
| Corpora to Train Language Model On | Perplexity on Test Set of Corpus for ESOLs (2k words) |
| --- | --- |
| General English (12m words) | 22.5623 |
| English written by ESOL (532k words) | 23.2972 |
| General English + English written by ESOL (linear interpolation) | 21.7679 |
| General English + English written by ESOL (log-linear interpolation) | 21.5548 |
| English written for ESOL (230k words) | 20.7549 |
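For reference, the two interpolation schemes in the table can be sketched as follows. This is a minimal Python illustration with made-up per-word probabilities, not my actual toolchain; the log-linear version omits the normalisation over the vocabulary that a real implementation needs, and the weight `lam` is an assumption that would be tuned on held-out data.

```python
import math

# Hypothetical per-word probabilities from the two models for the same
# test-set words (illustrative values, not from the real corpora).
p_general = [0.012, 0.034, 0.008]
p_learner = [0.009, 0.041, 0.005]

lam = 0.5  # interpolation weight (assumption; tuned on held-out data in practice)

def linear_interp(p1, p2, lam):
    # P(w) = lam * P1(w) + (1 - lam) * P2(w)
    return [lam * a + (1 - lam) * b for a, b in zip(p1, p2)]

def log_linear_interp(p1, p2, lam):
    # P(w) proportional to P1(w)^lam * P2(w)^(1-lam).
    # A real implementation renormalises over the vocabulary; this sketch
    # skips that, so perplexities computed from it are only approximate.
    return [math.exp(lam * math.log(a) + (1 - lam) * math.log(b))
            for a, b in zip(p1, p2)]

def perplexity(probs):
    # Perplexity = exp of the average negative log-probability per word.
    return math.exp(-sum(math.log(p) for p in probs) / len(probs))
```

The point of the sketch is just the shape of the two combinations: linear interpolation mixes probabilities, log-linear interpolation mixes log-probabilities.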
Perfect! The interpolation (kind of) does what we expect it to do, which is to lower the perplexity. The next major question I have to answer is: how do we know whether the difference between two perplexities is statistically significant? Answering this will let me compare two language models with some confidence. My next task is to figure this out. Today, I've been working toward extending my Bash script to do this significance testing automatically. Hopefully, I'll have something working tomorrow. Once I do, the next thing I can start doing is tinkering with the parameters of each language model to see how varied I can make my results.
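One candidate recipe for the significance question (an option I'm considering, not a method I've committed to) is a paired bootstrap test over per-sentence log-probabilities: both models score the same test sentences, and we resample the paired differences to see how often the mean difference flips sign. A sketch in Python:

```python
import random

def bootstrap_pvalue(logprobs_a, logprobs_b, n_resamples=10000, seed=0):
    """Paired bootstrap test for a difference in model scores.

    logprobs_a / logprobs_b: total log-probability each model assigns to the
    same test sentences, paired by sentence (hypothetical inputs here).
    Returns the fraction of resamples whose mean difference has the opposite
    sign to (or cancels) the observed mean difference.
    """
    rng = random.Random(seed)
    diffs = [a - b for a, b in zip(logprobs_a, logprobs_b)]
    observed = sum(diffs) / len(diffs)
    flips = 0
    for _ in range(n_resamples):
        # Resample sentence-level differences with replacement.
        sample = [rng.choice(diffs) for _ in diffs]
        if (sum(sample) / len(sample)) * observed <= 0:
            flips += 1
    return flips / n_resamples
```

A small returned value would suggest the perplexity gap is unlikely to be a resampling artefact; since perplexity is a monotone function of average log-probability, testing the log-probability difference is equivalent to testing the perplexity difference.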