During this past week I continued my investigation into Moses. More specifically, I analyzed my results from last week, coded up a simple feature function, and scoured the web for more information about Moses.
Early in the week, I found out that the simplicity (i.e., small size) of our language model is starting to show in subtle ways. Prof. Julie and I were looking at the output from the system I got up and running last week, and we found a sentence from the PWKP corpus that illustrates the problems a simple language model can trigger. The input sentence was:
the first and broadest sense of art is the one that has remained closest to the older latin meaning , which roughly translates to " skill " or " craft , " and also from an indo-european root meaning " arrangement " or " to arrange " .
And the corresponding output sentence from Moses was:
the first and broader sense of art is the one that has remained closer to the older latin meaning , which translates " skill or " craft , and also from an indo-european root meaning " arrangement " or " to arrange " . " " to roughly
Just for reference, this represents approximately what we are looking for:
the first and broadest sense of " art " means " arrangement " or " to arrange . "
Notice that Moses changed “broadest” to “broader.” After a bit of clever sleuthing by Prof. Julie, it turns out that this happened because the language model had the bigram “and broader” but not “and broadest”! So it ranked the former translation higher during phrase translation, and it found its way into the output. Obviously this is not ideal. To increase the quality of our output, creating a better language model is now a priority on our list. To get the data required for that, Prof. Julie mentioned starting the process of getting a membership to the Linguistic Data Consortium. So, hopefully in a few weeks, I can implement a “smarter” language model.
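To make the bigram effect concrete, here is a minimal toy sketch in Python. It is not the language model Moses actually uses; the two-sentence corpus, the add-one smoothing, and the function names are all assumptions made up for illustration. It only shows why a bigram the model has seen ("and broader") outscores one it has never seen ("and broadest").

```python
import math
from collections import Counter


def train_bigram_counts(sentences):
    """Count unigrams and bigrams over whitespace-tokenized sentences."""
    unigrams = Counter()
    bigrams = Counter()
    for sentence in sentences:
        tokens = sentence.split()
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))
    return unigrams, bigrams


def bigram_logprob(prev, word, unigrams, bigrams):
    """Add-one smoothed log P(word | prev). A toy estimate, assumed for illustration."""
    vocab_size = len(unigrams)
    return math.log((bigrams[(prev, word)] + 1) / (unigrams[prev] + vocab_size))


# Hypothetical training data: "and broader" occurs, "and broadest" never does.
toy_corpus = [
    "a wider and broader view",
    "the broadest sense of the word",
]

unigrams, bigrams = train_bigram_counts(toy_corpus)
score_broader = bigram_logprob("and", "broader", unigrams, bigrams)
score_broadest = bigram_logprob("and", "broadest", unigrams, bigrams)

print("and broader :", score_broader)
print("and broadest:", score_broadest)
```

Even though "broadest" appears in the toy corpus, it never follows "and", so the seen bigram gets the higher score. A larger training corpus makes this kind of gap less likely, which is the motivation for getting more data.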
After thinking about the language model, I turned to other issues. Last week, I mentioned standardizing the way that my PPDB parsing script and the source translation escape special characters, such as single-quote, double-quote, and ampersand. That has now been fixed, so Moses no longer thinks that phrases such as “"” are missing from the phrase table when they are actually there.
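The idea behind the fix can be sketched roughly as follows: route both the phrase table entries and the input sentences through one shared escaping function, so the two sides can never disagree. This is a simplified Python illustration, not my actual script; the function name and the entity-style replacements are assumptions chosen for the example.

```python
# Hypothetical escape mapping (an assumption for this sketch).
# The ampersand must come first, otherwise the "&" introduced by the
# other replacements would itself be escaped again.
ESCAPES = [
    ("&", "&amp;"),
    ('"', "&quot;"),
    ("'", "&apos;"),
]


def escape_special_chars(text):
    """Apply the same escaping to any text headed for Moses."""
    for raw, escaped in ESCAPES:
        text = text.replace(raw, escaped)
    return text


# Both sides go through the same function, so a phrase in the input
# matches its phrase table entry character for character.
phrase_table_side = escape_special_chars('" skill "')
input_side = escape_special_chars('" skill "')

print(phrase_table_side)
print(phrase_table_side == input_side)
```

One caveat of this sketch: the function is not idempotent, so running already-escaped text through it a second time would double-escape the ampersands. It should be applied exactly once per side.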
That fix put me in a position to finally develop a baseline FeatureFunction class. The Moses website has a great tutorial on it, which I was able to follow without much difficulty. It took a lot less time than I thought it would! I was able to create a feature that penalizes hypotheses with a larger number of words. This changed Moses’s output from what I had before to:
the first and broader sense of art is the one that has remained closer to the meaning , craft , and also from an indo-european root meaning " arrangement " or " to arrange " " " " or roughly which translates to older latin " skill
Since the translation is different, we know the feature is definitely being incorporated into the translation! With this breakthrough behind me, I feel like the work ahead will require even more innovation on my part. In the upcoming weeks, I have to figure out what data is available to me and how best to use it to create an effective quantitative measure of difficulty.
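For reference, the word-count feature described above works roughly like the toy Python sketch below. This is not the Moses FeatureFunction API (which is C++) and not my actual class; the function names, the example hypotheses, the base scores, and the weight are all hypothetical. It only illustrates how a weighted feature that penalizes length can change which hypothesis wins.

```python
def word_count_feature(hypothesis):
    """Feature value: negative number of words, so longer hypotheses score lower."""
    return -float(len(hypothesis.split()))


def total_score(hypothesis, base_score, weight):
    """Combine a base model score with the weighted feature (a log-linear style sum)."""
    return base_score + weight * word_count_feature(hypothesis)


# Two made-up hypotheses with made-up base scores (higher is better).
long_hyp = "which roughly translates to skill or craft"
short_hyp = "which translates to skill"
base_scores = {long_hyp: -4.0, short_hyp: -4.5}

# With weight 0.0 the feature is ignored; with weight 0.5 it penalizes length.
best_without = max(base_scores, key=lambda h: total_score(h, base_scores[h], 0.0))
best_with = max(base_scores, key=lambda h: total_score(h, base_scores[h], 0.5))

print("without feature:", best_without)
print("with feature   :", best_with)
```

A difficulty measure would slot into the same place as the word count here: some function of the hypothesis that returns a number, combined with the other scores through a weight.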