Getting Started

Revision as of 09:50, 25 May 2012

1 Quick Start - Introduction
2 Input Specification
3 Processing Chains
- 3.1 Building a Chain
- 3.2 Running a Chain

Quick Start - Introduction

WebLicht is a service orchestration and execution environment for incremental automatic annotation of text corpora, built upon Service-Oriented Architecture principles. Its main components are:

A set of distributed services for data processing
A repository containing metadata about the services
A web application that offers a user-friendly graphical interface for building chains of services and executing them

WebLicht's services are hosted on servers distributed throughout Europe, which allows the developer of each tool to retain full control of it. Each tool within WebLicht is registered in a repository that records, among other things, its input requirements and its output. Because each tool in a processing chain uses the output of the previous tool as its input, this information determines which tools can be added at any given point while building a chain. Only tools that are able to work with the data produced so far are offered as choices.
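The compatibility check described above can be illustrated with a minimal sketch. It is not WebLicht's actual implementation: the `Tool` class, the `next_choices` function, the layer names, and the three registry entries are all hypothetical, and the real repository metadata is richer than a set of layer names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tool:
    """Hypothetical registry entry: what a tool needs and what it adds."""
    name: str
    requires: frozenset  # annotation layers the tool needs as input
    produces: frozenset  # annotation layers the tool adds to the data


def next_choices(available_layers, registry):
    """Return the tools whose input requirements are met by the data so far."""
    return [tool for tool in registry if tool.requires <= available_layers]


# Made-up registry for illustration only.
REGISTRY = [
    Tool("tokenizer", frozenset({"text"}), frozenset({"tokens"})),
    Tool("pos_tagger", frozenset({"tokens"}), frozenset({"pos"})),
    Tool("parser", frozenset({"tokens", "pos"}), frozenset({"parse"})),
]

# With plain text as input, only the tokenizer can be added.
print([tool.name for tool in next_choices(frozenset({"text"}), REGISTRY)])
```

Once the tokenizer has run, the available layers would include `tokens`, so the hypothetical `pos_tagger` would also be offered.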

The main focus of this Quick Start Guide is on WebLicht's web application and how it can be used to linguistically annotate text and subsequently to visualize the annotations. We will create and run a very simple tool chain to annotate a text and view the results.

Input Specification

This section describes several ways of specifying the input for a processing chain, which must be done before a chain can be built.

Create a new text input

Choose File, then New from the main menu, then select the type of data you would like to enter. A window will then appear where the input can be entered. The input data should also be given a title for later reference. If applicable, further details about the input data (such as its language) can be specified.

Use one of the sample files provided

Choose File, then Open Sample from the main menu, then select the input that you would like to use. The contents of the file will be displayed in the Preview area, and any chains that are associated with the selected input will be displayed.

Upload a file from your computer

Choose File, then Upload from the main menu, then select the type of data you would like to upload. The contents of the file will be displayed in the Preview area if possible. Please note that text files should be encoded in UTF-8.
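If you are unsure whether a text file is UTF-8 encoded, you can check it locally before uploading. The following is a minimal sketch using only the Python standard library. The helper names `is_utf8` and `to_utf8` are made up for this example, and you need to know (or guess) the original encoding of the file in order to convert it.

```python
def is_utf8(data: bytes) -> bool:
    """Return True if the bytes can be decoded as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def to_utf8(data: bytes, source_encoding: str) -> bytes:
    """Re-encode bytes from a known source encoding to UTF-8."""
    return data.decode(source_encoding).encode("utf-8")


# Example: a German word stored as Latin-1 is not valid UTF-8.
latin1_bytes = "Größe".encode("latin-1")
print(is_utf8(latin1_bytes))
print(is_utf8(to_utf8(latin1_bytes, "latin-1")))
```

To apply this to a file, read it in binary mode (for example with `open(path, "rb").read()`), convert the bytes, and write them to a new file in binary mode.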

Open a previously used input

Choose File, then Open Recent from the main menu, then select the input that you would like to use. The contents of the file will be displayed in the Preview area if possible, and any chains that are associated with the selected input will be displayed.

Processing Chains

Once the input data has been chosen, it is possible to build a processing chain that operates on the input. Typically, each tool in a processing chain has requirements for its input data (for example, it may need certain annotation layers to be already defined) and produces output that has one or more additional linguistic annotation layers. Each time a tool is selected, a new set of tools is presented, each of which is able to operate on the output of the chain so far. This process can be repeated until all of the desired annotation layers have been added, or until there are no more tools available.
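The repeat-until-done process can be sketched as a simple loop. This is only an illustration under stated assumptions: `build_chain`, the registry dictionary, and the layer names are hypothetical, and the sketch picks the first candidate automatically, whereas in WebLicht the user makes each choice. It also skips tools whose layers already exist, which is a simplification of this sketch and not a rule stated by WebLicht.

```python
def build_chain(input_layers, wanted_layers, registry):
    """Add tools until the wanted layers exist or no further tool applies.

    registry maps a tool name to a (requires, produces) pair of layer sets.
    """
    layers = set(input_layers)
    chain = []
    while not wanted_layers <= layers:
        # Tools that can run on the data so far and would add something new.
        candidates = [
            name
            for name, (requires, produces) in registry.items()
            if requires <= layers and not produces <= layers
        ]
        if not candidates:
            break  # no more tools available
        chosen = candidates[0]  # stand-in for the user's choice
        chain.append(chosen)
        layers |= registry[chosen][1]
    return chain, layers


# Made-up registry for illustration only.
REGISTRY = {
    "tokenizer": ({"text"}, {"tokens"}),
    "pos_tagger": ({"tokens"}, {"pos"}),
    "lemmatizer": ({"tokens"}, {"lemmas"}),
}

chain, layers = build_chain({"text"}, {"pos"}, REGISTRY)
print(chain)
```

The loop has the same two stopping conditions as the text: either every desired layer has been produced, or no remaining tool can operate on the current output.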

Building a Chain

To choose a tool, simply drag its icon from the Next Choices area to the Current Tool Chain area. The icons in the Next Choices area will then be automatically updated so that they are compatible with the tools that have already been added to the chain. Repeat this process to add additional tools.

Running a Chain

Run the processing chain by clicking on the Run Tools button.

Notice that as each tool finishes running, several icons are added to the tool's lower icon bar:

Click on the download icon to transfer the result to your local computer.

Click on the view annotations icon to open the TCF Annotations Viewer in a separate browser tab.

Click on the raw results icon to view the raw XML results. If a service returns HTML, this icon will display the rendered HTML page instead of the raw HTML code.
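Once a result has been downloaded, its XML can also be inspected with a short script. The sketch below counts the elements in an XML document using Python's standard library. The `SAMPLE` document is a made-up stand-in and does not reproduce the actual format of WebLicht results; the helper names `local_name` and `count_elements` are likewise hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Made-up example document, not the real result format.
SAMPLE = """<result>
  <text>Das ist ein Test.</text>
  <tokens>
    <token>Das</token>
    <token>ist</token>
  </tokens>
</result>"""


def local_name(tag: str) -> str:
    """Strip an XML namespace prefix such as '{http://...}' from a tag."""
    return tag.rsplit("}", 1)[-1]


def count_elements(xml_text: str) -> Counter:
    """Count how often each element name occurs in the document."""
    root = ET.fromstring(xml_text)
    return Counter(local_name(element.tag) for element in root.iter())


for name, count in count_elements(SAMPLE).items():
    print(name, count)
```

For a downloaded file, read its contents first and pass the text to `count_elements`. The result gives a quick overview of what the file contains.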

From WebLichtWiki