I figured I'd share a commercial tool that recently turned freeware. I've been analyzing my own writing to see what it conveys, and the results are more accurate than I'd expect from free software with no human involved. I love data mining and analysis software. I'm not selling this, I'm not affiliated with the company, and I'm not advertising it, but if you're ripping Wikipedia or working with XML and large amounts of text, you'd most likely be interested in it. I've been running it on Wikipedia pages and other websites as well, and it has produced some very interesting results. Without further ado, here's how they describe it:

Designed for Information Science, Market Research, Sociological Analysis and Scientific studies, Tropes is a Natural Language Processing and Semantic Classification software that guarantees pertinence and quality in Text Analysis.

http://www.semantic-knowledge.com/

http://www.semantic-knowledge.com/download.htm

There are French, Brazilian, Portuguese and Spanish versions. It's definitely worth your time to test it out. I'm going to be using it often, since I write a lot both for myself and as a freelancer. I'm also using it to proof and edit a book I've been writing for the past month, before I publish it in a few weeks. It's too bad there's only a Windows binary and no source code available.

From their landing page ( http://www.semantic-knowledge.com/ )

Semantic-Knowledge is a leading provider of Natural Language Processing (NLP) software, including Semantic Search Engine, Text Analysis, Intelligent Desktop Search, Text Mining, Knowledge Discovery and Classification systems:


Tropes -  High Performance Text Analysis

Designed for Semantic Classification, Keyword Extraction, Linguistic and Qualitative Analysis, Tropes software is a perfect tool for Information Science, Market Research, Sociological Analysis, Scientific and Medical studies, and more.


Zoom -  Semantic Search Engine

With its fast Natural Language Information Retrieval system, Integrated Web Spider, built-in Semantic Networks and on-the-fly Semantic classifications, Zoom is a powerful Windows Search Engine designed for Document Management, Competitive Intelligence, Press Analysis and Text Mining.


Overtext Index -  Semantic Data Processing for Servers

Overtext Index is a classification system designed for large-scale Customer Relationship Management (CRM), Natural Language Third Party Information Retrieval, Knowledge Management, Business Intelligence and Strategic Watch systems.


This software is designed to help you face an increasingly dense information flood:

  • accelerate your reading rate,
  • analyze in-depth and objectively,
  • extract relevant information,
  • classify automatically, therefore structure information.

Because they offer considerable time savings and enhanced visibility of strategic data, Tropes and Zoom software yield an exceptional Return On Investment (ROI). They generally show a profit as of their first use, sometimes in a matter of hours! You don't believe us? Try the free version of Tropes Zoom in our download area.

Our software is based on powerful text analysis technology, using dictionaries that contain hundreds of thousands of preset semantic classifications, and reliable analysis techniques resulting from years of scientific research.

Products available now for the English, French, Spanish, Portuguese and Brazilian languages.
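
For anyone curious what "dictionaries of preset semantic classifications" might look like in principle, here's a toy Python sketch. To be clear, this is my own illustration and not how Tropes works internally (there's no source code to check). The dictionary, the category names and the classify function are all made up for the example, and a real product would handle word senses, grammar and far larger dictionaries.

```python
import re
from collections import Counter

# Hypothetical toy dictionary mapping a word to a semantic category.
# This is an assumption for illustration only; a real tool would ship
# a vastly larger dictionary and much smarter matching.
TOY_DICTIONARY = {
    "doctor": "health",
    "hospital": "health",
    "medicine": "health",
    "market": "business",
    "profit": "business",
    "investment": "business",
}


def classify(text):
    """Count how many words in the text fall into each category."""
    # Lowercase the text and pull out runs of letters as words.
    words = re.findall(r"[a-z]+", text.lower())
    # Keep only words the dictionary knows, tallied by category.
    return Counter(TOY_DICTIONARY[w] for w in words if w in TOY_DICTIONARY)


sample = "The hospital hired a doctor. The market rewarded the investment with profit."
print(classify(sample).most_common())
# prints [('business', 3), ('health', 2)]
```

The most frequent categories give a rough idea of what a text is about, which is the general flavor of the automatic classification described above.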


Enjoy!


Thomas C. Stowe
Email/GChat/MS Live Messenger: stowe.thomas@gmail.com 
Texas Computer Services: http://www.txpcservices.com
Portfolio/VCard/Resume: http://www.thomasstowe.info
Blog: http://www.sc3ne.com
Survive2 SHTF/Disaster Prep/Homesteading Information: http://www.survive2.com
Phone/SMS/VoiceMail: +1-210-704-7289
Skype: thomasstowe