References Identifier Service
Introduction
This tutorial presents a workflow for creating a web service for TCF processing. The example imitates a reference identifier service: it processes POST requests containing TCF data with token, part-of-speech and named entity annotation layers, and uses these annotations to produce reference annotations.
This web service imitates the case in which the processing tool needs a model to identify the references. Since a model can consume a lot of memory and/or take a long time to load, the tool instance is created only once (so the corresponding model is loaded only once), when the application is created. The example shows the case in which the tool is thread-safe, so it can be shared among clients without any synchronization.
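The "load once, share among clients" idea can be sketched in plain Java. This is only an illustration and not the code generated by the archetype: the class names SharedTool and SharedToolDemo, the toy word-list "model", and the findReferences method are all hypothetical. The sketch assumes the tool's state is immutable after construction, which is what makes unsynchronized sharing safe here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Hypothetical stand-in for a tool whose model is expensive to load. */
final class SharedTool {
    // Never modified after construction, so concurrent reads need no locking.
    private final Set<String> model;

    SharedTool() {
        // Stands in for loading a large model; a real tool would read it from disk.
        Set<String> loaded = new HashSet<>(Arrays.asList("he", "she", "it", "they"));
        this.model = Collections.unmodifiableSet(loaded);
    }

    /** Returns the indices of tokens that the toy model treats as references. */
    List<Integer> findReferences(List<String> tokens) {
        List<Integer> hits = new ArrayList<>();
        for (int i = 0; i < tokens.size(); i++) {
            if (model.contains(tokens.get(i).toLowerCase(Locale.ROOT))) {
                hits.add(i);
            }
        }
        return hits;
    }
}

public class SharedToolDemo {
    // Created once, when the class is initialised; every request reuses it.
    private static final SharedTool TOOL = new SharedTool();

    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        List<String> tokens = Arrays.asList("Anna", "said", "she", "liked", "it");
        List<Future<List<Integer>>> results = new ArrayList<>();
        // Four simulated clients call the same tool instance concurrently.
        for (int i = 0; i < 4; i++) {
            results.add(pool.submit(() -> TOOL.findReferences(tokens)));
        }
        for (Future<List<Integer>> result : results) {
            System.out.println(result.get());
        }
        pool.shutdown();
    }
}
```

If a tool keeps mutable state between calls, this pattern no longer applies as is, and access to the shared instance would have to be synchronized or each request given its own instance.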
Prerequisites
The tutorial assumes you have the following software installed:
- NetBeans IDE 7.2.1
- wget or curl command line tool (optional)
Adding Clarin Repository
The example WebLicht service is provided as a Maven archetype stored in the Clarin repository. You therefore need to add the Clarin repository to your list of Maven repositories. Skip this step if it is already among them.
In NetBeans IDE, go to the list of Maven Repositories under the Services tab:
Right-click on "Maven Repositories" and select the "Add Repository" option. Fill in the following information in the "Add Repository" window:
- Repository ID: clarin
- Repository Name: Clarin Repo
- Repository URL: http://catalog.clarin.eu/ds/nexus/content/repositories/Clarin/
Finish by pressing "Add".
Creating a Project from an Archetype
Once the Clarin Repository is accessible, we can start using the archetype at once. Press the "New Project" button in the menu bar and select: Maven -> Project From Archetype
In the next screen, find and select "WebLicht References Webservice Archetype".
Provide a name for your project and a directory to store it in, as you would with any NetBeans project. You can also provide a group ID for your Maven artifact and the package name you would like to use.
That's it! You have just created a WebLicht webservice.
Testing Webservices
To test the service, run it on your local server. Right-click on the project and select the "Run" option. In the next screen, select the Tomcat server and click the OK button.
The most straightforward way to test a web service is to use the wget or curl command line tool. For example, to POST TCF data from "input.xml" to the service and display the service's output in the terminal window, run curl:
curl -H 'content-type: text/tcf+xml' -d @input.xml -X POST http://localhost:8080/mywlproject/annotate/
Or wget:
wget --post-file=input.xml --header='Content-Type: text/tcf+xml' http://localhost:8080/mywlproject/annotate/
Make sure the current directory actually contains a file named "input.xml" in TCF 0.4 format with tokens, part-of-speech and named entity annotation layers. Such a file is provided for testing under "Web Pages" in your project; just copy it to your current directory.
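The same request can also be sent from Java, which may be handy if you want to call the service from a test or another program. The following is a minimal sketch using only the JDK's HttpURLConnection. The class name TcfPostClient and its post method are hypothetical, the default URL is the one from the curl example above, and the response is assumed to be UTF-8 encoded.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

public class TcfPostClient {

    /** POSTs the given bytes as text/tcf+xml and returns the response body. */
    static String post(String serviceUrl, byte[] tcf) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) new URL(serviceUrl).openConnection();
        conn.setRequestMethod("POST");
        conn.setRequestProperty("Content-Type", "text/tcf+xml");
        conn.setDoOutput(true);
        try (OutputStream out = conn.getOutputStream()) {
            out.write(tcf);
        }
        int status = conn.getResponseCode();
        // Error responses arrive on the error stream, not the input stream.
        InputStream in = status < 400 ? conn.getInputStream() : conn.getErrorStream();
        String body = in == null ? "" : readAll(in);
        if (status != 200) {
            throw new IOException("HTTP " + status + ": " + body);
        }
        return body;
    }

    /** Reads the whole stream, assuming the service answers in UTF-8. */
    private static String readAll(InputStream in) throws IOException {
        try (InputStream src = in) {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            byte[] chunk = new byte[8192];
            int n;
            while ((n = src.read(chunk)) != -1) {
                buffer.write(chunk, 0, n);
            }
            return new String(buffer.toByteArray(), StandardCharsets.UTF_8);
        }
    }

    public static void main(String[] args) throws IOException {
        // The default URL matches the curl example; pass another one as the first argument.
        String url = args.length > 0 ? args[0] : "http://localhost:8080/mywlproject/annotate/";
        byte[] tcf = Files.readAllBytes(Paths.get("input.xml"));
        System.out.println(post(url, tcf));
    }
}
```

Run it from the directory that contains "input.xml" while the service is running locally; the annotated TCF document is printed to the terminal.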
What's next?
You will probably want to customize the provided code. Let's take a look at the files we have in the project:
- ReferencesService.java
- The application definition resides here. Use it to define the path to your application and/or add more resources. In this example, the ReferencesResource resource is added as a singleton resource, which means that only one instance of the resource is created for the application.
- ReferencesResource.java
- This is the definition of a resource. If more resources are required, you can use it as a template for them. (Don't forget to add them to ReferencesService.java.)
- ReferencesTool.java
- The actual implementation of the tool resides here. This template provides an imitation of a reference detector. If you are writing a service wrapper for an existing tool, this is where you would call your tool, translating input and output data from and into TCF format. The wlfxb library can help with this, as used in this resource implementation.