Continuing from yesterday, I added illustrations drawn with Photoshop to the poster and ran it by Prof. Medero, who suggested some small changes. After adding some finishing touches to the poster, I continued updating documentation and making sure all my files are readily accessible for future use.
Yesterday, I worked a bit on the Python data analysis program so that I can leave it at a place where the basic building blocks are all there. I also fixed some issues with GitHub uploading with Prof. Medero and Michael and updated the documentation accordingly.
Today I started making the poster that I spent part of yesterday planning. Below is where it’s currently at. I’m planning to add a few illustrations demonstrating how the app works in the white space at the top right. I hope that it’ll be ready for printing tomorrow after some final touches.
Today, I completed making my custom linear interpolation script work with higher order n-grams. I still need to figure out how to interpolate the back-off weights, but I have placed that on the back burner for now.
Now, my attention has been focused on researching a topic that Prof. Medero pointed me to, called “domain adaptation.” It basically investigates how to train a language model on one distribution of data (for us, it’d be the general corpus and by-learners corpus), and then have it perform well when we test it on another distribution of data (the for-learners corpus). The research I have done so far seems promising. I have ended the day with two key takeaways:
- I should set another baseline for my system, and that is training a language model that simply concatenates the general corpus and by-learners corpus together. This was inspired by the paper “Experiments in Domain Adaptation for Statistical Machine Translation” by Koehn and Schroeder.
- I should look into modifying the corpus that I am training on! One common domain adaptation method involves “selecting, joining, or weighting the datasets upon which the models (and by extension, systems) are trained” (from this paper by Axelrod et al.). So one idea I came up with is training a model with the by-learners corpus, and using it to compute sentence-level perplexities for all of the sentences in the general corpus. I can then filter out all of the sentences that scored above a certain perplexity. And finally, I can train a new model based off of the sentences that remain, and use that to test performance. I am still working on implementing this, but I will hopefully have results by tomorrow.
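The filtering idea in the second takeaway can be sketched in a few lines of Python. This is only a toy illustration, not my actual pipeline: it stands in an add-one-smoothed unigram model for the by-learners model that would really be trained with SRILM, it ignores end-of-sentence tokens, and the function names and the threshold value are all made up for the example.

```python
import math
from collections import Counter

def train_unigram(sentences):
    """Toy add-one-smoothed unigram model (a stand-in for a real LM).

    Returns a function mapping a word to its log10 probability.
    """
    counts = Counter(w for s in sentences for w in s.split())
    total = sum(counts.values())
    vocab = len(counts) + 1  # one extra slot shared by all unseen words

    def log10_prob(word):
        return math.log10((counts[word] + 1) / (total + vocab))

    return log10_prob

def sentence_perplexity(sentence, log10_prob):
    """Perplexity of one sentence: 10^(-logprob / number of words)."""
    words = sentence.split()
    logprob = sum(log10_prob(w) for w in words)
    return 10 ** (-logprob / len(words))

def filter_by_perplexity(general, in_domain, threshold):
    """Keep general-corpus sentences that the in-domain model scores at or below threshold."""
    log10_prob = train_unigram(in_domain)
    return [
        s for s in general
        if s.split() and sentence_perplexity(s, log10_prob) <= threshold
    ]

# Hypothetical miniature corpora, just to exercise the functions.
by_learners = ["i like cats", "i like dogs"]
general = ["i like cats", "quantum chromodynamics lagrangian"]
kept = filter_by_perplexity(general, by_learners, threshold=6.0)
print(kept)
```

The sentences in `kept` would then be used to train the new model. Choosing the threshold is the open question; it would have to be tuned on held-out data.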
Today, I finished performing my own linear interpolation of the “general” corpus and “by-learners” corpus. And I know it works because its perplexity across the test set is exactly the same as the perplexity given by the SRILM-interpolated model across the same corpus. The only catch is that the linear interpolation script does not work with higher order n-grams. So, today I am going to spend time making that change. With that, I also need to research how the back-off weights are interpolated for higher order n-grams.
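For reference, the unigram case of linear interpolation looks roughly like the sketch below. It is a simplified illustration rather than my actual script: it assumes each model is just a dictionary from word to log10 probability (ARPA files store log10 values), it treats a word missing from one model as having a log10 probability of -99 (an assumed floor), and the names `interpolate_log10`, `interpolate_unigrams`, and `lam` are made up for the example. Back-off weights are not handled at all.

```python
import math

def interpolate_log10(logp_a, logp_b, lam):
    """Return log10(lam * P_a + (1 - lam) * P_b) given log10 P_a and log10 P_b."""
    return math.log10(lam * 10 ** logp_a + (1 - lam) * 10 ** logp_b)

def interpolate_unigrams(model_a, model_b, lam, floor=-99.0):
    """Linearly interpolate two unigram tables (word -> log10 probability).

    Words missing from one table get the assumed floor value, which is
    effectively a probability of zero.
    """
    vocab = set(model_a) | set(model_b)
    return {
        w: interpolate_log10(model_a.get(w, floor), model_b.get(w, floor), lam)
        for w in vocab
    }

# Hypothetical two-word models standing in for the two corpora.
general = {"the": math.log10(0.5), "cat": math.log10(0.5)}
by_learners = {"the": math.log10(0.2), "dog": math.log10(0.8)}
mixed = interpolate_unigrams(general, by_learners, lam=0.5)
print({w: round(10 ** lp, 3) for w, lp in sorted(mixed.items())})
```

Because the mixing happens on probabilities rather than on log-probabilities, the interpolated values still sum to one over the combined vocabulary.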
Last Friday and today were spent troubleshooting the weird visual bugs and nil errors that appeared in TestViewController after I transferred the working code from TutorialViewController over. After rearranging the code (and dearranging the code if I made things worse) I fixed the nil exception errors, but then found out that GPUImage can’t handle images larger than 2048×2048 at the moment, and the sample texts in the main experiment have a width of around 34,000. For the time being, the blur filter does not work for these text samples, and I’m currently searching for a fix. I’ve tried reimplementing other third-party extensions I’ve found, but none of them worked, and uninstalling/reinstalling frameworks was quite a headache.
Goals for the end of this week are:
- Make a poster
- Leave TextScroll and documentation in a place where a new researcher can continue
- Get basic data analysis working so that new researchers can easily build on that
Today, I completed work on modifying the Bash script. It can now take in a configuration file that sets the parameters of the language models I want to create. It then creates all of the language models, computes their perplexity across a common test set, and prints out a table that compares the perplexities of each language model. Whew! With that done, I can turn my attention to somehow combining the “general” corpus and the “by-learners” corpus (whether it be interpolation or something else) to see if I can lower the perplexity even more. For now, as an exercise, I am going to start by recreating SRILM’s linear interpolation. And for that, I have begun looking at the Python module arpa that lets me parse the language models of the corpora. Then, I can hopefully linearly interpolate the two language models. I should have an update on that tomorrow.
Today, I’m starting to work on refactoring the Bash script code, so that it can read in a configuration file and train/test the language model based off of those parameters. This will be helpful to generate the table that I mentioned previously for different configurations of language models (e.g., for bigrams or trigrams). So, I’ve spent most of the day looking up the best Bash approach to passing in the configuration file. Hopefully, I’ll have this running by next week so that I can start experimenting with ways to interpolate the corpora that I have.
- Added a tutorial for the constant acceleration tilt mapping and troubleshot visual/transition bugs as I was adding these new things in.
- Updated the TestViewController to the new blur filter program. Still a couple of visual bugs, such as the blurred text not aligning or not appearing at all, despite me being pretty sure that the code is identical to the working version in TutorialViewController.
- Updated Remote Config with the new stuff
- Got my Python program automatically fetching the .json file from Firebase.
Tomorrow I hope to fix those two bugs and continue working on the data analysis code.
Today, I came to terms with the results from the significance tests that I performed yesterday. Prof. Medero and I tried the sign test as well, and saw that it also reported that the differences in perplexity were statistically significant. So, they are statistically significant. But why, if the overall perplexities are so close in value? I feel like there could be a few explanations. The best one I came up with is that the overall perplexity statistic that we’re using (in the table) is not directly related to an average of the sentence-level perplexities given by each model. According to the SRILM website, it is computed by the expression
10^(-logprob / (words - OOVs + sentences))
where logprob is the (base-10) log-probability of seeing all of the tokens in the test set. So, it could be that this expression is understating the differences between the sentence-level perplexities, which is likely given that the test set is small in number of sentences.
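To make the gap between the two statistics concrete, here is a small Python sketch of the expression above. The per-sentence numbers are invented purely for illustration (they are not from my test set), and the function name `ppl` is my own.

```python
def ppl(logprob, words, oovs, sentences):
    """Perplexity as in the expression above: 10^(-logprob / (words - OOVs + sentences))."""
    return 10 ** (-logprob / (words - oovs + sentences))

# Invented (logprob, words, OOVs) for two test sentences.
sentence_stats = [(-4.0, 3, 0), (-30.0, 9, 0)]

# Sentence-level perplexities: each sentence is scored on its own.
per_sentence = [ppl(lp, w, o, 1) for lp, w, o in sentence_stats]
mean_sentence_ppl = sum(per_sentence) / len(per_sentence)

# Corpus-level perplexity: pool the log-probabilities and counts first.
corpus_ppl = ppl(
    sum(lp for lp, _, _ in sentence_stats),
    sum(w for _, w, _ in sentence_stats),
    sum(o for _, _, o in sentence_stats),
    len(sentence_stats),
)

print(per_sentence, mean_sentence_ppl, corpus_ppl)
```

In this made-up case the corpus-level value is well below the mean of the sentence-level values. The pooled statistic averages in log space, so a few very high-perplexity sentences move it much less than they move an arithmetic mean.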
Now that we have established significance testing, the goal is to explore ways to change the language model parameters and see how that affects the distributions of perplexities. Some suggestions that Prof. Medero had for me were changing the order of the language models we’re using (using bigrams or trigrams), as well as building unigram language models based off of the non-terminals as well as parts-of-speech in the syntax trees of the corpora. I’m going to need to make changes to the Bash script to make exploring those easier, so that’s going to be my focus for today.
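For reference, the sign test on paired sentence-level perplexities can be sketched with the Python standard library alone. This is an illustrative exact two-sided version that drops ties, not necessarily the implementation we ran. The function name and the sample numbers are made up.

```python
from math import comb

def sign_test(xs, ys):
    """Exact two-sided sign test for paired samples; tied pairs are dropped.

    Returns the p-value under the null hypothesis that each pair is
    equally likely to favor either side.
    """
    lower = sum(x < y for x, y in zip(xs, ys))
    higher = sum(x > y for x, y in zip(xs, ys))
    n = lower + higher
    if n == 0:
        return 1.0  # every pair tied: no evidence either way
    k = min(lower, higher)
    # Probability of a split at least this lopsided on one side, then doubled.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)

# Invented perplexities: model B is slightly worse on every sentence.
model_a = [100.0, 120.0, 95.0, 210.0, 150.0, 180.0, 99.0, 130.0]
model_b = [p + 1.0 for p in model_a]
print(sign_test(model_a, model_b))
```

The test only looks at the direction of each paired difference and not its size, so small but consistent gaps can come out as significant even when the overall perplexities are close.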
Figured it out! The current layout is two scrollviews stacked on each other: the one on top that holds the moving sharp text is masked to look transparent on the left/right so you can see the blurred text moving on the scrollview underneath. This nice effect is achieved with far fewer lines of code than I expected. It took a while to properly synchronize these two scrollviews with autolayout and everything in the code that interacts with the text, movement, etc. I also implemented a new “characters visible on screen” feature, where you can set the size of the viewing window (the sharp, dark text in the center) by specifying how many characters you would like to have the user see at a time. It’s not super precise, but works well so far. I also had to adjust the calculation for the text’s travel distance since the size of the blur filter changes dynamically now. At the moment, this new look runs smoothly, the blurred/clear texts are synchronized, and blur amount, font, font size, and characters visible on screen are all easily adjustable.
- Update the documentation
- Implement this update in the TestViewController
- Add “characters visible on screen” setting to Remote Config