Relationship between texts

Started by November 02, 2005 06:55 AM
2 comments, last by Bessi 19 years ago
How do computers find out whether two texts are about the same thing? What methods do sites like Google News use to work out which texts are related? Out of curiosity I wanted to learn more about these methods. I have been trying to google them, but my searches have returned very few useful results. The problem is that I have no clue where to start. I would therefore appreciate it if someone could give me a hint about what these methods are called, or point me to some papers or books.
Bessi
Finding what texts are about is in the area of "Data mining".

Finding whether two texts are related is in the area of "Pattern Matching".

Check out scientific journals on those two subjects, and you will likely find relevant articles.
'Statistically Improbable Phrases' are another good method of finding related texts from within a larger body of generally unrelated texts, since they are a reasonable guide to what the text is about.
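To make the idea a bit more concrete, here is a rough Python sketch. It is only one possible reading of the approach, not how any particular site does it: phrases are taken to be two-word sequences, a phrase counts as "improbable" when it is much more frequent in one text than in the whole collection, and two texts are compared by the overlap of their top phrases. The function names (`bigrams`, `improbable_phrases`, `relatedness`) and the scoring formula are made up for illustration.

```python
from collections import Counter
import re


def bigrams(text):
    """Lowercase the text and return its adjacent word pairs."""
    words = re.findall(r"[a-z']+", text.lower())
    return list(zip(words, words[1:]))


def improbable_phrases(doc, corpus, top_n=5):
    """Return up to top_n bigrams of `doc` that are most over-represented
    relative to the whole corpus (an assumed scoring, for illustration)."""
    doc_counts = Counter(bigrams(doc))
    corpus_counts = Counter()
    for text in corpus:
        corpus_counts.update(bigrams(text))
    doc_total = sum(doc_counts.values()) or 1
    corpus_total = sum(corpus_counts.values()) or 1

    def score(phrase):
        p_doc = doc_counts[phrase] / doc_total
        # Add-one smoothing so phrases absent from the corpus do not divide by zero.
        p_corpus = (corpus_counts[phrase] + 1) / (corpus_total + 1)
        return p_doc / p_corpus

    # Highest score first; ties are broken alphabetically for repeatability.
    ranked = sorted(doc_counts, key=lambda p: (-score(p), p))
    return set(ranked[:top_n])


def relatedness(doc_a, doc_b, corpus, top_n=5):
    """Jaccard overlap of the two texts' improbable phrases, in [0, 1]."""
    a = improbable_phrases(doc_a, corpus, top_n)
    b = improbable_phrases(doc_b, corpus, top_n)
    union = a | b
    return len(a & b) / len(union) if union else 0.0
```

Calling `relatedness(text_a, text_b, corpus)` gives a number between 0 and 1. On a tiny collection the scores are not very meaningful, since the approach relies on a large background corpus to tell which phrases are genuinely unusual.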
Thanks a lot. Exactly what I needed to get started.
Bessi