This week, I got a better sense of what the field of text simplification research looks like. I read a particularly good summary of the state of the field as of 2014 by Advaith Siddharthan of the University of Aberdeen. His paper surveys how automatic text simplification systems have evolved since the late 1990s. In particular, he goes into detail on “Text simplification as monolingual machine translation” (section 3.2.2), which is exactly what I want to do: treat text simplification as a monolingual (one language, English) machine translation problem. Siddharthan describes how researchers such as David Kauchak (a professor at nearby Pomona College!) use phrase-based machine translation systems, mainly Moses, to accomplish their work. I explored David Kauchak’s paper, “Learning to Simplify Sentences using Wikipedia,” and saw that his work mirrors a lot of what Prof. Julie and I talked about. Unfortunately, I haven’t yet found much on automatic text simplification with an emphasis on measuring difficulty (which is my focus), so I still have some research ahead of me. I’m going to ask Prof. Julie whether composing some sort of “literature review” would be a good idea for this project, and when it would be due.
Alongside this preliminary research, I’m also becoming familiar with software used for computational linguistics. In particular, I’m reading about Moses, a statistical machine translation system. Prof. Julie says that it’s the software package she had in mind to test our theory, so we’ll be using it throughout the semester. She also mentioned Box, a machine translation research platform available on AWS. Next week, I’m going to try playing around with it and see if I can get an instance running. Fingers crossed!