June 27 (Week 2) – Maury

So, last week, I nailed down how I wanted to simplify text for second-language English speakers. I also worked out how to find samples of written text that illustrate what they find difficult.

To be honest, for much of today I was bogged down thinking about how exactly we are going to make the syntactic simplifications. A lot of other papers, such as Zhu et al. in their paper and Aluisio et al. in theirs, relied on elaborate, hand-written rules for simplification. And in other scenarios, such as in this paper, the rules were learned from Wikipedia corpora. I honestly got overwhelmed for much of the day, and it took me a while to realize that I cannot think about that stage right now. First, I have to identify exactly which syntactic simplifications we want to make. Only then can I worry about how we are going to perform them.
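To make the idea of a hand-written simplification rule concrete, here is a minimal sketch of what one such rule might look like. This is purely my own illustration, not a rule from any of the papers mentioned above: it splits a sentence containing a non-restrictive "who" relative clause into two simpler sentences using a regular expression (real systems operate on parse trees rather than surface patterns).

```python
import re

def split_relative_clause(sentence):
    # Toy hand-written rule (illustrative, not from the cited papers):
    # "SUBJ, who CLAUSE, REST." -> "SUBJ REST. SUBJ CLAUSE."
    m = re.match(r"^(\w+), who ([^,]+), (.+)$", sentence)
    if m:
        subj, clause, rest = m.groups()
        return f"{subj} {rest} {subj} {clause}."
    # If the pattern does not apply, leave the sentence unchanged.
    return sentence

print(split_relative_clause(
    "Maury, who studies text simplification, wrote a blog post."
))
```

A rule system like the ones in those papers would be a large collection of such transformations, each keyed to a syntactic construction, which is exactly why the hand-written approach feels so daunting.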

That being said, I have identified a few corpora that can help me find out more about those syntactic simplifications. They are all corpora of text written by, or targeted toward, second-language English speakers. They are:

Tomorrow I am going to continue looking for tools that I can use to analyze them. I will start by trying to find a parser (such as Stanford’s sentence parser) and getting a basic sentence parse working on my computer. After that, I will work on retrieving the dataset.
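As a first step toward that, here is a sketch of what "a basic sentence parse" looks like. Since the Stanford parser needs a Java setup and model downloads, this stand-in uses NLTK's chart parser with a tiny hand-written grammar of my own invention; the grammar and sentence are assumptions for illustration only, but the tree structure it prints is the kind of output a real constituency parser produces.

```python
import nltk

# Toy context-free grammar (my own, purely for illustration).
grammar = nltk.CFG.fromstring("""
S -> NP VP
NP -> Det N | 'students'
VP -> V NP
Det -> 'the'
N -> 'sentence'
V -> 'parse'
""")

parser = nltk.ChartParser(grammar)
trees = list(parser.parse("students parse the sentence".split()))
print(trees[0])
```

With a full parser like Stanford's, the grammar is learned from treebanks instead of written by hand, but the resulting trees can be inspected the same way, which is what I will need when looking for simplification patterns.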
