Format of hw tagged text?

greenspun.com : LUSENET : Brandeis CS114 : One Thread

For the third part of the current assignment, we are given a tagged version of a text. It's not in list form, however. Does anybody have a properly tokenized version of the text? (There are a few errors, too; e.g., 'Gen.' is tagged NNP everywhere in the text except the first instance, where it's tokenized as 'Gen', '.'.)

Even a simple sed/awk script would work. I'm not handy enough with either to write one myself.

Andrew

-- Anonymous, March 25, 1999

Answers

You can do it in Python with two calls to string.split: once to split on whitespace (the default), and again to split each token on '/'.
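A minimal sketch of that two-split approach, assuming the file is a run of word/TAG tokens separated by whitespace (the thread doesn't show the actual file, so the sample sentence and the function name parse_tagged are made up for illustration). It uses the str.rsplit method with a split limit of 1 rather than a plain split on '/', so that a word which itself contains a slash keeps its last field as the tag:

```python
def parse_tagged(text):
    """Turn 'word/TAG word/TAG ...' into a list of (word, tag) pairs."""
    pairs = []
    # First split: on any run of whitespace (the default for split()).
    for token in text.split():
        # Second split: on the LAST '/', so '1/2/CD' gives ('1/2', 'CD').
        parts = token.rsplit("/", 1)
        if len(parts) == 2:
            pairs.append((parts[0], parts[1]))
        else:
            # No slash at all: keep the token and mark the tag as missing.
            pairs.append((token, None))
    return pairs


# Hypothetical sample line, not taken from the assignment text.
sample = "Gen./NNP Smith/NNP arrived/VBD ./."
pairs = parse_tagged(sample)
```

This only reshapes the text into a list; it won't repair tagging errors such as the one 'Gen', '.' split mentioned in the question, which would still need fixing by hand.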

-- Anonymous, March 26, 1999
