Week 7, October 19-October 23 – Maury

As I mentioned in last week’s post, this week was more of a reading and writing week for me. The main task was to assemble a literature review on what has been done in text simplification and how my work improves on existing approaches. I mostly focused on works that achieved simplification with phrase-based machine translation techniques (including Zhu et al. (2010), Coster and Kauchak (2011), Wubben et al. (2012), and Specia (2010)). I discussed how many of these techniques rely on Simple Wikipedia as the basis for the simplification corpus, and I pointed out how more recent work (like Xu et al. (2015)) critiques this reliance on Wikipedia and calls for using other datasets for simplification. This is where my work comes in. As I mentioned in the first post, I am going to avoid using simplification corpora for phrase-based translation. Instead, I plan to use an extensive English paraphrase database, and I will use quantitative difficulty measures to prioritize paraphrases that perform simplifications.
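To make that last idea a bit more concrete, here is a minimal Python sketch of what prioritizing paraphrases by difficulty could look like. Everything in it is an assumption for illustration: the `difficulty` function is a toy stand-in (average syllables per word plus a small word-length term) rather than the measure I will actually use, the names `count_syllables`, `difficulty`, and `rank_paraphrases` are hypothetical, and none of it touches Moses or the real paraphrase database.

```python
import re


def count_syllables(word):
    """Rough syllable estimate: count runs of consecutive vowels (assumption, not a real syllabifier)."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))


def difficulty(phrase):
    """Toy difficulty score: average syllables per word plus 0.1 * average word length."""
    words = re.findall(r"[A-Za-z']+", phrase)
    if not words:
        return 0.0
    avg_syllables = sum(count_syllables(w) for w in words) / len(words)
    avg_length = sum(len(w) for w in words) / len(words)
    return avg_syllables + 0.1 * avg_length


def rank_paraphrases(source, candidates):
    """Keep only candidates scored as easier than the source, easiest first."""
    baseline = difficulty(source)
    scored = sorted((difficulty(c), c) for c in candidates)
    return [c for score, c in scored if score < baseline]


if __name__ == "__main__":
    # Candidates harder than the source phrase are dropped; the rest are sorted by score.
    print(rank_paraphrases("utilize", ["employ", "instrumentalize", "use"]))
```

In the real system the score would presumably enter the translation model as a feature rather than act as a hard filter, but the filter keeps the sketch short.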

After finishing the review, I spent a little time investigating which parts of Moses I will need to modify to add the quantitative difficulty measures. At a preliminary glance, I expect the majority of my changes to happen in the translation model folder. I plan to make this idea more concrete over the next week. Once I finish with that, I plan on revising the paraphrase database that I created in my previous post to ensure that I am getting the best possible paraphrases.

Works Cited

  • Zhemin Zhu, Delphine Bernhard, and Iryna Gurevych. 2010. A monolingual tree-based translation model for sentence simplification. In Proceedings of the 23rd International Conference on Computational Linguistics, pages 1353–1361. Association for Computational Linguistics.
  • William Coster and David Kauchak. 2011. Learning to simplify sentences using Wikipedia. In Proceedings of the Workshop on Monolingual Text-To-Text Generation, MTTG ’11, pages 1–9.
  • Sander Wubben, Antal van den Bosch, and Emiel Krahmer. 2012. Sentence simplification by monolingual machine translation. In Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics: Long Papers-Volume 1, pages 1015–1024. Association for Computational Linguistics.
  • Lucia Specia. 2010. Translating from complex to simplified sentences. In Computational Processing of the Portuguese Language, pages 30–39. Springer.
  • Wei Xu, Chris Callison-Burch, and Courtney Napoles. 2015. Problems in current text simplification research: New data can help. Transactions of the Association for Computational Linguistics, 3:283–297.
