README for Curator v1.0.6
=========================
Author: Cognitive Computation Group, UIUC
Date: 12/1/2013
We thank the many Curator users who have helped us improve
this documentation.
Table of Contents
=================
1 README for Curator v1.0.6
1.1 What is the Curator?
1.2 Fine print
2 Requirements
3 Installation
3.1 Prerequisites
3.2 Download
3.3 Compilation and Installation
3.3.1 Uncompress the downloaded tarball
3.3.2 Set environment variables
3.3.3 Setting up MongoDB for Curator
3.3.4 Installing the Curator itself
4 Usage
4.1 Starting the Curator and its components
4.2 Testing the curator
4.3 Using the Curator
5 Understanding the Configuration Files
5.1 curator.properties
5.2 database.properties
5.3 annotators-example.xml
5.4 Specifying Pipelines
6 Troubleshooting
6.1 Problems installing Thrift
6.2 bootstrap.sh
6.3 c++/Charniak
6.4 SRL fails with the message: Error adding attributes to predicate!
7 Known Issues
8 Further reading
8.1 Citation
8.2 Papers that have used the Curator
9 Contact
10 Version History
1 README for Curator v1.0.6
============================
1.1 What is the Curator?
-------------------------
The Curator is a system that acts as a central server for providing
annotations for text. It is responsible for requesting annotations
from multiple natural language processing servers, caching and
storing previous annotations, and refreshing stale annotations. All
the components will run on one or more of your machines, but your
programs only need to know about the main Curator service.
Visit [http://cogcomp.org/page/software_view/Curator] for
more information.
1.2 Fine print
---------------
The Curator is available under a Research and Academic use
license. For more details, visit the Curator website and click the
download link.
2 Requirements
===============
The Curator was developed on and for GNU/Linux, specifically CENTOS
(2.6.18-238.12.1.el5) and Scientific Linux
(2.6.32-279.5.2.el6.x86_64). There are no guarantees for running it
under any other operating system. The instructions below assume a
Linux OS.
For the sake of reasonably concise and non-insanity-inducing
instructions, we assume that either you will run all the processes
on a machine with a large (say, 24G+) memory, or that you will
install curator on a partition that is shared between multiple
machines. If this is not available to you, the easiest but most
annoyingly redundant thing to do is to install the entire curator
on each machine that will host curator processes. Beyond that,
feel free to improve our code and tell us how we should have done
it (ideally, together with working examples...) and we'll push the
improvements to the curator.
3 Installation
===============
3.1 Prerequisites
------------------
1. Apache Ant
2. gcc version 4.1.2
3. Java 1.5 or later
4. Boost version 1.33.1 or later: Boost is needed for the Charniak
parser (which in turn is used by the semantic role labeler). It
is also needed to enable the C++ features of Thrift. Your Linux
distribution may come with boost installed. If not, you can
download it from [http://www.boost.org] and follow the "Getting
Started" guide to install Boost.
5. Thrift version 0.8.0: Apache Thrift can be downloaded from
[https://dist.apache.org/repos/dist/release/thrift/0.8.0/thrift-0.8.0.tar.gz].
Follow the instructions here
[http://thrift.apache.org/docs/BuildingFromSource/] to install Thrift.
!! NOTE 0: ***YOU MUST USE THRIFT VERSION 0.8.0***: later versions will
!! create build problems.
NOTE 1: you can ignore the first step (./bootstrap.sh) as you have
downloaded a release version from the URL above.
NOTE 2: assuming you do not wish to use thrift for developing
applications in languages other than those used for Curator,
you can run Thrift's "configure" command with the "--without-XYZ"
flags as shown below (the options avoid installing the bindings
for other languages). The sample configure command also specifies
the 'prefix' flag to force thrift to install to a user-specified
location; this may be necessary if you don't have root access.
> ./configure --without-csharp --without-erlang --without-ruby \
--without-haskell --without-go --prefix="/jsmith/lib/thrift"
Troubleshooting:
- if you see errors when running 'make check' of the form
'no rule to make target /usr/lib/libboost_unit_test_framework.a'
you may have an unexpected (at least by thrift) path to the
relevant library. First, check where (and whether) you have
the relevant boost library installed:
> locate libboost_unit_test
if you see no directories listed, you need to install boost-devel:
as root, run:
> yum install boost-devel
-- and try 'make' and 'make check' again. (The Thrift web site
advises running a setup step for different OSs that includes
a 'yum install' step for CENTOS-like systems that obviates the
need for this command; see the workaround next.)
If the directories are there but the build fails, you may need
a workaround, which requires sudo/root access.
First, look at the output from the 'make check' command, and
look for a line like this:
libtool: link: g++ -Wall -g -O2 -o .libs/AllProtocolsTest AllProtocolTests.o -L/usr/lib64/lib ./.libs/libtestgencpp.a /scratch/downloads/thrift-0.8.0/lib/cpp/.libs/libthrift.so -lssl -lcrypto -lrt -lpthread -Wl,-rpath -Wl,/usr/local/lib
The last entry is where make is looking for the
libboost_unit_test_framework.a library. Suppose your copy of the
library is in /usr/lib64/:
to resolve the problem you would run the commands:
> cd /usr/local/lib
> ln -s /usr/lib64/libboost_unit_test_framework.a .
If you can't locate the libboost_unit_test_framework.a library,
you will need to install boost locally. The Boost website gives
detailed instructions, including paths to use when specifying
include and linker library paths. Suppose you install Boost
to the directory /jsmith/lib/boost/. When you configure thrift,
you need to specify the Boost location by adding the option
'--with-boost="/jsmith/lib/boost"' to the configure command above.
Of course, these different tools with their various install
options may yet conspire to trip you up. The bottom line is that
Thrift 0.8.0 (and possibly other versions of thrift) looks for
the Boost static library in a specific place, so even after
installing boost and configuring with the flags set correctly,
you may end up with the library not being found. The good news
is that if boost built the library, you can tell configure
where that link directory is by modifying the value of an
environment variable. In Bash, you modify the ~/.bash_profile
file of the user who will run the code (probably you, but
could be another user with read access to the directories
where you installed everything): open .bash_profile in a
text editor and add the line
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/jsmith/lib/boost/stage/lib
-- naturally, the '/jsmith/...' path must be the path to
the boost libraries on your machine.
Alternatively, you can create a symbolic link to the actual
directory in the place where thrift expects to find it. On
the CCG test machine, the local boost install put the
missing library in boost_1_55_0/stage/lib (rather than in
boost_1_55_0/lib). Thrift reports the locations it tried
at the point of failure -- in our case, it was expecting
boost_1_55_0/lib -- which makes this a little easier.
To create the symbolic link (symlink),
> cd boost_1_55_0
> ln -s stage/lib lib
For us, this fixed the problem.
Next obstacle: thrift will try to install to standard locations
for each language, ignoring the --prefix="/my/preferred/loc"
argument you give to configure. You will need to specify
environment variables for these.
For the languages we use by default, this means specifying
PY_PREFIX, JAVA_PREFIX, PHP_PREFIX, PHP_CONFIG_PREFIX,
and PERL_PREFIX.
Suppose you want everything under your newly created local
directory /jsmith/lib/; you could set these variables in
your/your user's .bash_profile thus:
MAIN_PREFIX="/jsmith/lib"
export PY_PREFIX="$MAIN_PREFIX/python"
export JAVA_PREFIX="$MAIN_PREFIX/java"
export PHP_PREFIX="$MAIN_PREFIX/php"
export PHP_CONFIG_PREFIX="$MAIN_PREFIX/php.d"
export PERL_PREFIX="$MAIN_PREFIX/perl"
If you write new components in these languages, you will need
to either use these environment variables or directly point
the compiler/interpreter to the relevant location for the
thrift library.
NOTE: these environment variables must be set before you
run 'configure' for thrift installation.
6. Mongodb: Get the Mongodb software from
[http://www.mongodb.org/downloads]. Installation instructions can
be found at [http://www.mongodb.org/display/DOCS/Quickstart].
You will need to make sure that the Java and Ant binary directories
are on your system's PATH. Java and Ant require environment variables
to be set when they are installed -- JAVA_HOME and ANT_HOME. You
should check that these are set to a non-empty value:
> echo $JAVA_HOME
/some/path/on/your/machine/sun-jdk-1.6.0-latest-el6-x86_64
> echo $ANT_HOME
/some/other/path/apache-ant-1.8.1
If no value is displayed, consult the documentation for installing
the offending software.
Next, check whether the relevant bin/ files are on your PATH:
> echo $PATH
-- you should see a set of entries including the values of
$JAVA_HOME/bin and $ANT_HOME/bin (for the example above, this means entries like
"/some/path/on/your/machine/sun-jdk-1.6.0-latest-el6-x86_64/bin:...
...:/some/other/path/apache-ant-1.8.1/bin").
If these values don't appear in the output, you need to add them
to your system's PATH. For convenience, you can modify the
.bashrc or .cshrc file in your home directory. For Bash, it is:
PATH=$PATH:$JAVA_HOME/bin:$ANT_HOME/bin
export PATH
for C-shell, it is:
setenv PATH $PATH:$JAVA_HOME/bin:$ANT_HOME/bin
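The checks above can also be scripted. The following is a minimal sketch,
not part of the Curator distribution: the function names path_contains and
check_bin_on_path are hypothetical, and it only reports what it finds
without changing anything.

```shell
# Hypothetical helper: succeeds if directory $1 is one of the entries in PATH.
path_contains() {
    case ":$PATH:" in
        *":$1:"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Hypothetical helper: $1 is a variable name (used in messages only),
# $2 is that variable's value. Prints one status line.
check_bin_on_path() {
    if [ -z "$2" ]; then
        echo "$1 is not set"
    elif path_contains "$2/bin"; then
        echo "$1/bin is on PATH"
    else
        echo "$1/bin is missing from PATH"
    fi
}

check_bin_on_path JAVA_HOME "${JAVA_HOME:-}"
check_bin_on_path ANT_HOME "${ANT_HOME:-}"
```

If either line reports a problem, fix the variable or extend PATH as
described above and run the check again.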
3.2 Download
-------------
The Curator can be downloaded from the website of the Cognitive
Computation Group at
[http://cogcomp.org/page/download_view/Curator]. The
download is a 50 MB tarball.
3.3 Compilation and Installation
---------------------------------
The following installation instructions were tested on the bash
shell. If you are using a different shell, you might have to make
minor modifications.
3.3.1 Uncompress the downloaded tarball.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
$ tar xfvz curator-1.0.6.tgz
$ cd curator/
3.3.2 Set environment variables
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The file setEnvVars.sh in the main Curator directory lists all the
environment variables needed. Modify this file based on your
system's configuration (i.e. where you installed Thrift, Boost, and
Curator).
To use the Curator, the following environment variables need to be
set:
1) CURATOR_HOME: The main curator directory that was extracted
from the download.
2) BOOST_INC_DIR: The "include" directory of the Boost
installation. To check if you have the right directory, verify
that the directory contains a subdirectory called boost, which
in turn contains many .hpp files.
3) THRIFT_ROOT: The root directory of your thrift installation.
4) MONGO_BIN: The directory that contains the mongo and mongod
executables.
5) PATH: The path must be extended to include the directory
containing the thrift executable (should be $THRIFT_ROOT/bin)
You will do this by setting the values of the first four variables in the
file setEnvVars.sh; the rest should be correct (based on the
distribution of Thrift we used). After modifying the variables in
the file setEnvVars.sh, export the variables to your shell using
$ source setEnvVars.sh
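Once setEnvVars.sh has been sourced, a quick way to catch a variable that
was left empty is sketched below. The function name missing_vars is
hypothetical and not part of Curator; the variable names are the four
listed above.

```shell
# Hypothetical helper: prints the name of each listed variable that is
# unset or empty, one per line.
missing_vars() {
    for name in "$@"; do
        # Look up the variable whose name is stored in $name.
        eval "value=\${$name:-}"
        [ -n "$value" ] || echo "$name"
    done
}

missing=$(missing_vars CURATOR_HOME BOOST_INC_DIR THRIFT_ROOT MONGO_BIN)
if [ -n "$missing" ]; then
    echo "Not set:" $missing
else
    echo "All four variables are set"
fi
```

Any name reported as not set should be corrected in setEnvVars.sh before
you continue.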
You can verify that the variables have been set:
$ echo $THRIFT_CPP_INCLUDE
and you should see a path with the prefix being the value to which
you set the variable THRIFT_ROOT, and the suffix being
"/include/thrift".
3.3.3 Setting up MongoDB for Curator
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
If you wish to specify non-default locations for the db archive
and log directories, you can set these values in setEnvVars.sh as
well; if you change the defaults you will need to create the new
directories yourself.
There are two modes in which the Curator can interact with the
database. The Curator database can be either local to the Curator
instance (Option 1) or a remote instance (Option 2).
* OPTION 1: Local Mongodb instance
These instructions assume you will run the mongodb server on the same
machine as you run the main Curator server; Curator will, by default,
try to connect to a mongodb instance running on the same host
machine with the username "curator" and password "curator".
Create the database directories.
$ mkdir -p dist/{db-log,db-archive}
Once you have set the environment variables, you can start the
mongodb server by running
$ ./startMongo.sh
You will need to create the curator database, user and password in
your new db instance:
$ $MONGO_BIN/mongo
MongoDB shell version: 2.0.6
connecting to: test
> use curator
switched to db curator
> db.addUser("curator", "curator")
{ "n" : 0, "connectionId" : 4, "err" : null, "ok" : 1 }
{
"user" : "curator",
"readOnly" : false,
"pwd" : "dcc462829872d978d8d45692952f2bd0",
"_id" : ObjectId("50184a1be500ad3e8fd5329d")
}
> exit
* OPTION 2: Remote Mongodb instance
It is possible to set up a Mongodb server process that listens on
a port, and which may be accessed remotely. On the host machine,
after you have installed Mongodb, create the directory that will
hold the database files and log, and copy the script
'startMongoRemote.sh' there.
Log into the mongodb host machine and navigate to the directory
you have created.
In a text editor, open the file 'startMongoRemote.sh' and change
the value of the variable 'MONGO_BIN' to the directory containing
the Mongodb executable. For example, if you installed Mongodb
into /scratch/, you would change the value of the MONGO_BIN
variable to something like the following (version number could be
different, for example):
MONGO_BIN="/scratch/mongodb-linux-x86_64-2.0.6/bin"
Choose the port number you will use for the mongodb instance,
e.g. 21987, and set the MONGO_PORT value in the
startMongoRemote.sh script:
MONGO_PORT=21987
You can also change the default locations for the Mongodb archive
and log directories; by default, they will be created under the
directory the script is started in.
Exit the text editor and start the script:
$ chmod 755 startMongoRemote.sh
$ ./startMongoRemote.sh
If this is the first time you have run the script, it should
create two new directories for the log and archive of the
database instance. You can check that the process started
correctly by looking at the log file.
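Next, you need to log in locally to create the database and user that
the Curator will use to access the database. To connect, run the
following, substituting the MONGO_BIN and MONGO_PORT values you set in
startMongoRemote.sh:
$ <MONGO_BIN>/mongo --port=<MONGO_PORT>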
In the example here, this command would read:
$ /scratch/mongodb-linux-x86_64-2.0.6/bin/mongo --port=21987
At this point, you should get the mongo prompt:
MongoDB shell version: 2.0.6
connecting to: 127.0.0.1:21987/test
>
and now you can create the curator database, user and password as
described under Option 1 above.
To configure Curator to use this remote mongodb instance: In a
text editor, open the file
curator/curator-server/configs/database.properties
In the line 'database.url' change the value from 'localhost' to
'<hostname>:<port>'.
For example, if you will run your mongodb server process on a
machine named "macha.cs.uiuc.edu" on port "21987" you would
change the entry to read:
database.url = macha.cs.uiuc.edu:21987
IMPORTANT: If you have already installed Curator before deciding
to set up a remote Mongodb instance, you will instead need to
change the database.properties file in curator/dist/configs/.
NOTE: to stop a mongo instance, you can log into the mongo shell
and execute the following commands:
$ $MONGO_BIN/mongo
MongoDB shell version: 2.0.6
connecting to: test
> use admin
> db.shutdownServer()
3.3.4 Installing the Curator itself
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
0. It is assumed that you have already set Curator's
environment variables:
> source setEnvVars.sh
1. The file 'bootstrap.sh' in the main curator directory will
download all the dependencies and data files that are
not available from Maven repositories.
First, edit the file to select only the annotators you want to
install. For example, if you want all annotators except the
Wikifier, you should have the following setting:
BASIC=1
NEW_NER=1
STANFORD=1
SRL=1
WIKIFIER=0
The variable CLEANUP, if set to 1, will delete all temporary
directories that are created during the installation process.
Finally, if you don't have curl on your computer, change the
value of the variable NOCURL to "1". To check if you have curl,
use
$ which curl
If you don't get a path like /usr/bin/curl in response, just a
blank line or a warning that it doesn't exist, you have to
change the value of NOCURL.
2. Run
$ ./bootstrap.sh
3. You should now be able to build everything using ant.
$ ant dist
This will create a `dist` directory which contains all the
jars, scripts and documentation required to run the Curator and
the annotators.
NOTE: If you selected only a subset of the annotators in
step 1, you have to comment out the entries for the annotators you
did not select in the file build.xml. For example, if you do not
wish to install the Wikifier, comment out its entries (lines
110-122 in build.xml).
4. Non-Java annotators
The non-java annotators (presently, just our server-ized version of
the Charniak parser) are not completely installed by the "ant dist"
process described above.
You will need to compile the Charniak server and set the permissions of
the 'runIt.pl' script:
$ cd dist/CharniakServer/parser05May26fixed/PARSE
$ make charniakThriftServer && make charniakThriftServerKbest
$ cd ../../
$ chmod 755 runIt.pl
5. Installing WordNet files
If you did NOT install wikifier, the WordNet files will probably not be
installed correctly, and they are needed by other tools. To install them,
navigate to the main curator directory and run:
$ tar xzf resources/WordNet.tgz -C dist/data
and then
$ ls dist/data
should show the directory WordNet/.
Aside: Each individual java component can be built using ant from
the component's directory. Some even have test targets.
4 Usage
========
4.1 Starting the Curator and its components
--------------------------------------------
You will need to determine which ports the component annotators and
the main curator service will listen on, and on which machines they
will reside. You must change curator's configuration file to
reflect these decisions; in the example config,
dist/configs/annotators-example.xml, all machines are set to
"localhost".
The tokenizer server is run by the curator server process (the
tokenizer entry in the config file is specified as "local"), so you
don't need to explicitly start/stop/track it.
You then need to start each component running on the machine and port
specified in the config file. Component startup scripts can be found
in curator/dist/bin/, and are set up to be run from the curator/dist/
directory. The different components may require different arguments
to be passed to them; for now, the examples below should get you
started. The examples assume you use the port specified in
dist/configs/annotators-example.xml.
The example below creates a log directory and directs output from
the annotator processes to corresponding log files; it also runs them
as background processes so that all can be started from the same
terminal.
You can also use the startServers.sh script in the curator/ directory --
copy it to the curator/dist directory and run it from there. It just
replicates the examples below.
$ mkdir logs
$ bin/illinois-pos-server.sh -p 9091 >& logs/pos.log &
$ bin/illinois-lemmatizer-server.sh -p 12345 >& logs/lemma.log &
$ bin/illinois-chunker-server.sh -p 9092 >& logs/chunk.log &
$ bin/illinois-ner-server.sh -p 9093 -c configs/ner.config >& logs/ner.log &
$ bin/illinois-coref-server.sh -p 9094 >& logs/coref.log &
$ bin/stanford-parser-server.sh -p 9095 >& logs/stanford.log &
$ bin/illinois-verb-srl-server.sh -p 14810 >& logs/verb-srl.log &
$ bin/illinois-nom-srl-server.sh -p 14910 >& logs/nom-srl.log &
$ bin/illinois-wikifier-server.sh -p 15231 >& logs/wikifier.log &
The non-java components need to be started too:
$ cd CharniakServer
$ ./start_charniak.sh 9987 charniak9987 >& ../logs/charniak.log
$ cd ..
You can start the curator server in the same way:
$ bin/curator.sh --annotators configs/annotators-example.xml --port 9010 --threads 10 >& logs/curator.log &
NOTE that it may take some time (a few minutes) for the annotators
to load their models into memory, so the curator may initially be
unable to connect to them (i.e., you may see some
"ConnectException" messages in the curator log file; after the
components have loaded their models, curator should nevertheless be
able to connect to them).
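Rather than watching the logs for ConnectException messages, you can wait
until an annotator's port accepts connections before starting the curator.
The sketch below is not part of Curator: the function names port_open and
wait_for_port are hypothetical, it relies on bash's /dev/tcp feature (so
it needs bash rather than plain sh), and it assumes an annotator only
opens its port once its models are loaded.

```shell
# Hypothetical helper: succeeds if a TCP connection to host $1, port $2
# can be opened. Uses bash's /dev/tcp pseudo-device.
port_open() {
    (exec 3<>"/dev/tcp/$1/$2") 2>/dev/null
}

# Hypothetical helper: polls host $1, port $2 about once per second until
# the port opens or $3 attempts (default 300) have been made.
wait_for_port() {
    host=$1
    port=$2
    tries=${3:-300}
    while [ "$tries" -gt 0 ]; do
        if port_open "$host" "$port"; then
            echo "$host:$port is accepting connections"
            return 0
        fi
        tries=$((tries - 1))
        sleep 1
    done
    echo "gave up waiting for $host:$port" >&2
    return 1
}
```

For instance, running "wait_for_port localhost 9093" before bin/curator.sh
delays the curator until the NER server on its example port is listening.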
4.2 Testing the curator
------------------------
A basic test that the curator (and a subset of its servers) is
running properly can be run from the client-examples/ subdirectory
of dist. Navigate to client-examples/java, and compile using the
command 'ant':
$ cd dist/client-examples/java
$ ant
Create a test sentence, e.g. using the command
$ echo "Mr. Smith saw the dog with a telescope." > test.txt
You can then run the 'runclient' script:
$ ./runclient.sh localhost 9010 test.txt
This should generate a long stream of text outputs corresponding to
different annotation resources.
4.3 Using the Curator
----------------------
In the curator annotators config file, the "field" entry/entries
for a given component identify the fields in a Curator Record
object that the relevant annotator will populate. The bottom line
is that you can call curator with a Record and the name of a field,
and it will populate that field from the relevant resource. The
client only needs to know about the curator and the names of the
fields it can provide.
One relatively painless way to use curator is via the Edison
library ([http://cogcomp.org/software/edison/]).
5 Understanding the Configuration Files
=======================================
The Curator uses three main configuration files:
curator.properties
database.properties
annotators-example.xml
5.1 curator.properties
----------------------
There are three flags you may want to change in this file:
client.timeout
specifies the time in seconds before the Curator server throws
an exception due to a component server taking too long.
curator.reporttime
curator.versiontime
specify how often Curator checks that its component servers are
active (and what version they return, as a way of checking
for component updates), and how often it logs the information
they report. The units are minutes.
5.2 database.properties
-----------------------
This file specifies properties of the Curator's database
behavior.
database.url
specifies the hostname and port of Curator's MongoDB instance.
database.reporttime
sets the interval in minutes between Curator's report of
database activity for the previous interval.
database.maintenancetime
specifies the time in minutes between Curator's cleanup of
the database by removing files that have not been accessed
recently.
database.updatecount
specifies the number of records that can be accessed before
Curator automatically generates a report, regardless of
time elapsed since the last report (this prevents the
maintenance event from taking too long).
database.expiretime
specifies the time in days before a record that has not
been accessed will be deleted. This is intended to keep
the database compact by keeping only "popular" data.
5.3 annotators-example.xml
--------------------------
This file specifies the hosts, ports, and field names for
each Curator component. The ports must agree with the port
arguments you pass to the component startup scripts, and
the hosts must agree with the machine on which you run
those scripts. By default, everything is set up to run
on a single machine.
The field names are used to label the views containing
the annotations for the corresponding components in the
Curator output (i.e., the Record data structure it returns).
As such, you need to use the same label when you access
the views in the Record.
If you are using Edison
(http://cogcomp.org/page/software_view/Edison),
you should know that it assumes that views are named
as they are in the Curator's default annotators file
(so 4-label Named Entity Recognizer output is expected to
be in a field named "ner", for example).
5.4 Specifying Pipelines
------------------------
Note that each entry in the annotators-example.xml file
may specify a set of prerequisites ("requirement" fields).
These use the same field names just described. If you
want to experiment with different pipelines, you can
specify different annotators that use the same output
labels.
For example: suppose I have a Coreference system that
requires POS and Chunker inputs; also, that I have three
different part-of-speech taggers and two different chunkers.
Assuming that the chunkers use the same set of output labels,
and that the POS taggers do as well, I can assign each of
these a unique field name (e.g. "pos1", "pos2", "pos3",
"chunk1", and "chunk2"). I can then specify combinations
of interest in the "requirements" field of my Coreference
component. If I want to try multiple combinations concurrently,
I will need to add a separate Coreference component for each
combination of requirements I want to use.
(This is inefficient, but relatively simple to set up; we
will explore a more efficient mechanism in future work.)
6 Troubleshooting
==================
6.1 Problems installing Thrift
-------------------------------
1. If there is a warning message indicating that the boost library is
not installed, check the path of the boost library.
2. If there is a problem with the ruby configuration, you can
choose to either update ruby or install Thrift without ruby
support. To do so, you need to configure the Makefile using
$ ./configure --without-ruby
Alternatively, you can update ruby and run:
$ gem install bundler
$ bundle exec rake
3. If you get an error indicating that 'groupid attribute not
supported', you may need to use an older version of apache ant:
version 1.8.1 and 1.8.2 worked for us, but version 1.8.4 did
not.
6.2 bootstrap.sh
-----------------
Some users have reported that the tgz packages for NER and Wiki are
too large for curl to handle on their systems, leading to
incomplete copies. In this case you will need to download them to
curator/tmp/ via your browser, and comment out the download lines
in the bootstrap.sh script. You can then run the bootstrap.sh
script, which will move the packages to the appropriate locations.
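Whether a package in curator/tmp/ was downloaded completely can be checked
before rerunning bootstrap.sh. The following is a sketch only: the function
names archive_ok and check_archives are hypothetical, and it assumes the
packages are gzip-compressed tarballs with a .tgz or .tar.gz extension.

```shell
# Hypothetical helper: succeeds only if $1 is a non-empty file that passes
# gzip's integrity test (a truncated download fails this test).
archive_ok() {
    [ -s "$1" ] && gzip -t "$1" 2>/dev/null
}

# Hypothetical helper: tests every .tgz and .tar.gz file in directory $1,
# prints one line per file, and returns non-zero if any file is broken.
check_archives() {
    dir=$1
    status=0
    for f in "$dir"/*.tgz "$dir"/*.tar.gz; do
        [ -e "$f" ] || continue
        if archive_ok "$f"; then
            echo "OK      $f"
        else
            echo "BROKEN  $f"
            status=1
        fi
    done
    return $status
}
```

From the main curator directory, "check_archives tmp" would then list any
package that needs to be downloaded again.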
6.3 c++/Charniak
-----------------
Some users have reported problems with the Charniak annotator. The
fixes required center around missing #include directives; this
varies by platform.
6.3.1: "uint32_t does not name a type"
If you see errors like "uint32_t does not name a type" you need to add
#include <stdint.h>
immediately before the other #include directives at the top of Thrift.h
(found in $THRIFT_ROOT/lib/cpp/src/)
Note: for one user, this apparently was not sufficient. He resolved his
problems by adding the same include directive to other files under
curator/curator-interfaces/gen-cpp/ :
- Parser.h
- BaseService.h
- base_types.h
- curator_types.h
- MultiParser.h
6.4 SRL fails with the message: Error adding attributes to predicate!
----------------------------------------------------------------------
If the SRL fails with the message: Error adding attributes to
predicate! Unable to install
net.didion.jwnl.dictionary.FileBackedDictionary
then the path to WordNet needs to be set in
curator/dist/configs/jwnl_properties.xml. If you followed the
instructions for installing WordNet, it should be in
curator/dist/data/WordNet/; check that the path entry in the
jwnl_properties.xml file is correct.
NOTE: The value shown assumes you start the SRL component from the
dist/ directory.
7 Known Issues
===============
- Presently, if an annotator takes too long to process a piece of
data, the curator will return an exception indicating timeout.
However, the client may continue to process the data, resulting in
spurious timeout exceptions for subsequent calls until the
annotator is finished. This and other problems will be fixed,
creator willing, in a future release.
- The Stanford Parser component uses an outdated version of the
Stanford Parser.
8 Further reading
==================
8.1 Citation
-------------
To cite the Curator, use the following publication:
An NLP Curator (or: How I Learned to Stop Worrying and Love NLP
Pipelines). J. Clarke and V. Srikumar and M. Sammons and D. Roth,
LREC, 2012.
8.2 Papers that have used the Curator
--------------------------------------
TBA
9 Contact
==========
Please send a message to illinois-ml-nlp-users@cs.uiuc.edu for any
questions about installing or using the Curator.
10 Version History
==================
0.6.x Versions using Thrift 0.4.0, with increments representing
additional components, bug fixes, and improvements to
documentation
1.0.0 Updated Curator to use Thrift 0.8.0
1.0.1-1.0.3 bug fixes/documentation fixes
1.0.4 Added Illinois Lemmatizer; improved documentation
1.0.5 Fixed problems with lemmatizer install
1.0.6 Improved documentation, more bug fixes