I originally wanted to generate text from a web scraper, but every source I tried produced incredibly incoherent output. For example, text generated from Twitter is often nonsensical: the nature of Twitter lends itself to hashtags, acronyms, and various shortenings that require context. Reddit, on the other hand, produced sentences that all sounded the same. I started looking for a corpus that had a distinct tone but varied content, and realized that using a real-time source might not be to my advantage. I turned my attention toward finding meaning, and settled on the debate text to explore the state of American politics and the bipartisan system.