How a Computer Tells Fiction Apart From Non-Fiction

Alex

All you and I have to do to tell fiction from non-fiction is read a piece of text, but how can a computer tell the difference? It's tricky, but doable:

Joseph Stevanak and Lincoln Carr at the Colorado School of Mines in Golden have come up with a way to do it. They say that the key is to look at the networks that form when you examine how often words appear close together in each type of text.

The type of network they examined creates a graph in which each word in the text forms a vertex. A line connects two vertices if these words appear next to each other in the text. It is possible to explore longer range links by connecting vertices when they appear two or three or four words apart and so on.
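To make the construction concrete, here is a minimal sketch of such a word network in Python. It is not Stevanak and Carr's code: the function name `build_word_network`, the `max_distance` parameter, and the tokenization (lowercased runs of letters and apostrophes) are all assumptions made for illustration, and the links are treated as undirected and unweighted.

```python
import re
from collections import defaultdict


def build_word_network(text, max_distance=1):
    """Map each word to the set of words seen within max_distance positions of it.

    Hypothetical helper: tokenization and the undirected, unweighted links
    are illustrative assumptions, not details taken from the study.
    """
    words = re.findall(r"[a-z']+", text.lower())
    graph = defaultdict(set)
    for i, word in enumerate(words):
        graph[word]  # touch the key so every word becomes a vertex
        # Look ahead up to max_distance words; links are symmetric, so
        # looking ahead only is enough to cover both directions.
        for j in range(i + 1, min(i + max_distance + 1, len(words))):
            other = words[j]
            if other != word:  # ignore self-loops from repeated words
                graph[word].add(other)
                graph[other].add(word)
    return dict(graph)


adjacent_only = build_word_network("the cat sat on the mat")
longer_range = build_word_network("the cat sat on the mat", max_distance=2)
```

With `max_distance=1` only neighboring words are linked; raising it adds the longer range links described above.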

Stevanak and Carr say that just two properties of this kind of network can help distinguish fiction from non-fiction stories. The first is the power law that describes the distribution of the number of links per vertex in the network. The second is the clustering coefficient, which describes how densely a vertex's neighbors are connected to one another.
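As a rough illustration of the two measurements, the sketch below computes the degree distribution (the raw data a power law would be fitted to) and the average local clustering coefficient for a small hand-made graph. The function names and the example graph are assumptions for illustration, and the power-law fit itself is not shown; how the researchers performed that fit is not described here.

```python
from collections import Counter


def degree_counts(graph):
    """Map each degree k to the number of vertices with exactly k links."""
    return Counter(len(neighbors) for neighbors in graph.values())


def local_clustering(graph, vertex):
    """Fraction of pairs of the vertex's neighbors that are linked to each other."""
    neighbors = list(graph[vertex])
    k = len(neighbors)
    if k < 2:
        return 0.0  # fewer than two neighbors means no pairs to check
    linked_pairs = sum(
        1
        for a in range(k)
        for b in range(a + 1, k)
        if neighbors[b] in graph[neighbors[a]]
    )
    return linked_pairs / (k * (k - 1) / 2)


def average_clustering(graph):
    """Mean of the local clustering coefficient over all vertices."""
    return sum(local_clustering(graph, v) for v in graph) / len(graph)


# Hypothetical example: a triangle a-b-c with one extra vertex d hanging off a.
example = {
    "a": {"b", "c", "d"},
    "b": {"a", "c"},
    "c": {"a", "b"},
    "d": {"a"},
}
print(degree_counts(example))
print(average_clustering(example))
```

In a real text, the degree distribution would be computed over thousands of vertices before any power law is fitted to it.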

Measuring these two quantities alone can identify the type of story with remarkable accuracy. "Our analysis yielded a 73.8±5.15% accuracy for the correct classification of novels and 69.1 ± 1.22% for news stories," say Stevanak and Carr.
