Our team owns and actively develops its own Search Toolkit.
We are developing a scientific text mining system capable of performing complex information extraction over large datasets.
RASP-based Pipeline
We convert PDF papers to SciXML, parse these files with RASP, and then index the RASP-based XML with Solr/Lucene. This pipeline builds on our earlier work within the FlySlip project.
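The indexing step at the end of this pipeline could be sketched roughly as follows. This is only an illustration under stated assumptions: the XML element and attribute names (`paper`, `sentence`, `text`, `gr`), the Solr field names, and both helper functions are hypothetical, and the real RASP-based XML and Solr schema may look quite different. The sketch turns each parsed sentence into one flat document carrying its grammatical relations, then serialises the documents as a JSON array of the kind Solr's JSON update handler accepts.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical parser output. Element and attribute names are illustrative
# assumptions, not the actual RASP-based XML format.
SAMPLE = """<paper id="p1">
  <sentence n="1">
    <text>The gene regulates wing development.</text>
    <gr type="ncsubj" head="regulate" dep="gene"/>
    <gr type="dobj" head="regulate" dep="development"/>
  </sentence>
</paper>"""


def sentences_to_docs(xml_text):
    """Turn each <sentence> into one flat, index-ready document (a dict)."""
    root = ET.fromstring(xml_text)
    paper_id = root.get("id")
    docs = []
    for sent in root.iter("sentence"):
        docs.append({
            "id": f"{paper_id}-s{sent.get('n')}",
            "paper_id": paper_id,
            "text": sent.findtext("text", default=""),
            # One "type:head:dep" string per grammatical relation
            # (assumed to be a multi-valued field in the index).
            "gr": [
                f"{g.get('type')}:{g.get('head')}:{g.get('dep')}"
                for g in sent.iter("gr")
            ],
        })
    return docs


def solr_update_body(docs):
    """Serialise the documents as a JSON array for an update request."""
    return json.dumps(docs).encode("utf-8")


docs = sentences_to_docs(SAMPLE)
body = solr_update_body(docs)
```

The resulting `body` could then be sent in an HTTP POST with `Content-Type: application/json` to the update handler of a Solr core. Indexing one document per sentence is a design assumption made here so that relation queries match within a single sentence.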
Simple Web-based Interface
We are actively developing a web-based interface that lets users perform complex information extraction.
Intuitive Search
Users will be able to build complex linguistic searches in an intuitive way. The system helps them construct each search, so they will not need to understand the linguistic structure that underlies it.
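One way such a translation might work is sketched below. It assumes a hypothetical index in which every sentence has a multi-valued `gr` field holding grammatical relations as `type:head:dep` strings, plus a plain `text` field; the function name, the field names, and the relation encoding are all illustrative assumptions and not a description of the actual system. The user fills in plain words for "who does what to what", and the function emits a query string in standard Lucene syntax.

```python
def build_query(verb, subject=None, obj=None):
    """Build a Lucene-syntax query from plain words typed by the user.

    Assumes a hypothetical `gr` field of "type:head:dep" strings and a
    `text` field. Input is assumed to be already lemmatised and free of
    quote characters; a real system would need to escape it.
    """
    clauses = []
    if subject:
        # Non-clausal subject relation between the verb and its subject.
        clauses.append(f'gr:"ncsubj:{verb}:{subject}"')
    if obj:
        # Direct object relation between the verb and its object.
        clauses.append(f'gr:"dobj:{verb}:{obj}"')
    if not clauses:
        # Nothing but a verb was given: fall back to a plain text match.
        clauses.append(f'text:"{verb}"')
    return " AND ".join(clauses)


query = build_query("regulate", subject="gene", obj="development")
```

The user only ever supplies three ordinary words; the relation names stay hidden inside the generated query.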
Similar Image Searches
We have used the open-source LIRE project to integrate similar-image search into our generic search toolkit.