Today, the most formidable tools for managing unstructured data include natural language processing, ontologies, and data mining, which together make that data usable.
As I previously wrote in Forbes, companies ranging from IBM to Google to Microsoft are racing to combine natural language processing with huge Big Data systems in the cloud that we can access from anywhere.
Deutsche Bank has chosen to use the Reuters NewsScope database, which contains data from 2003 onward, and applies natural language processing algorithms to the wire stories to measure sentiment.
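To make "measuring sentiment" in a wire story concrete, here is a minimal lexicon-based sketch. It is not the method Deutsche Bank or Reuters uses, which the text does not describe; the word lists and the `sentiment_score` function are hypothetical and chosen only for illustration.

```python
import re

# Hypothetical word lists, for illustration only; a real system would use
# a much larger lexicon or a trained model.
POSITIVE = {"gain", "gains", "beat", "beats", "growth", "upgrade", "profit"}
NEGATIVE = {"loss", "losses", "miss", "misses", "decline", "downgrade", "lawsuit"}


def sentiment_score(story: str) -> float:
    """Score a story in [-1, 1] as (positive - negative) / matched words.

    Returns 0.0 when no word from either list appears.
    """
    tokens = re.findall(r"[a-z]+", story.lower())
    pos = sum(t in POSITIVE for t in tokens)
    neg = sum(t in NEGATIVE for t in tokens)
    total = pos + neg
    return (pos - neg) / total if total else 0.0


if __name__ == "__main__":
    print(sentiment_score("Profit beats forecasts on strong growth"))
    print(sentiment_score("Shares decline after lawsuit and downgrade"))
```

A score near 1 suggests mostly positive wording and a score near -1 mostly negative wording. Word counting of this kind ignores negation and context, so it should be read as a rough baseline only.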
The Cascalog system combines three pieces: the Cascading system, which abstracts advanced data processing on Hadoop and other systems; the Datalog declarative programming language, which is well suited to expressing database queries in an abstract form; and the Clojure language. The result is a declarative language for queries that can be run on Apache Hadoop.
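The idea of a declarative, Datalog-style query can be sketched in plain Python. This is not Cascalog syntax and it does not run on Hadoop; the `AGE` and `FOLLOWS` relations and the `followers_of_people_under` function are hypothetical, in-memory stand-ins that only show how a query states which facts must hold and joins relations on a shared variable.

```python
# Hypothetical in-memory relations standing in for datasets stored on Hadoop.
AGE = [("alice", 28), ("bob", 33), ("chris", 40), ("david", 25)]
FOLLOWS = [("alice", "bob"), ("bob", "david"), ("chris", "alice")]


def followers_of_people_under(limit: int) -> list[tuple[str, str]]:
    """Datalog-style rule, written as a comprehension:

    result(?follower, ?person) :-
        FOLLOWS(?follower, ?person), AGE(?person, ?age), ?age < limit.
    """
    return sorted(
        (follower, person)
        for follower, person in FOLLOWS
        for name, age in AGE
        # Join the two relations on the shared ?person variable.
        if name == person and age < limit
    )


if __name__ == "__main__":
    print(followers_of_people_under(30))
```

The query says what must be true of each result row and leaves the evaluation order unspecified. That separation is what allows a system such as Cascalog to compile a query into jobs for a cluster.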