Given a text document, what is the 'best' way to concisely represent the content within say - a 600x600 pixel region ? One procedure that would probably give good output is this:
This would be a time-consuming and expensive option. What is the best automated way to solve the same problem ? Perhaps software that reads the text and automatically produces a summary ? I don't have any experience with the state-of-the-art in auto-summarization but I suspect it often doesn't work very well.
How about tag clouds of the most frequent non-trivial words ? They would highlight high-frequency words but don't show any real structure within the document. I'm sure we can do better.
I suspect software that detects named entities (people, places, organizations, products etc) might be a useful component of a solution. Perhaps something that creates a diagram illustrating the key entities and relationships between them would be useful.
Any ideas ?