Using WebLicht from the Commandline

From WebLichtWiki

Revision as of 09:23, 23 August 2013 by Yana (Talk | contribs)

WebLicht web services are implemented as REST-style web services. This means they can be called not only from WebLicht's graphical user interface, but also from the command line.

On Unix-like systems such as Linux or Mac OS X, two command-line tools can be used for this task: wget and curl. A generic call to a WebLicht web service looks like this with curl:

curl -H 'content-type: text/plain' --data-binary @input.tcf -X POST "http://url-to-webservice-with-parameters" -o output.tcf

with wget:

wget --post-file=input.tcf --header='Content-Type: text/plain' "http://url-to-webservice-with-parameters" -O output.tcf

where input.tcf is the input file and output.tcf is the output file (the file extension does not matter). Some web services require additional parameters, which are passed as URL query string parameters.
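The generic curl call above can be wrapped in a small shell function. This is only a sketch: the name weblicht_post and its argument order are made up for illustration and are not part of WebLicht. It adds curl's --fail option so that an HTTP error from the service produces a non-zero exit status instead of an error page saved as the output file.

```shell
# weblicht_post CONTENT_TYPE INPUT_FILE URL OUTPUT_FILE
# Hypothetical helper (not part of WebLicht): POSTs INPUT_FILE to URL
# and writes the response body to OUTPUT_FILE.
weblicht_post() {
    if [ "$#" -ne 4 ]; then
        echo "usage: weblicht_post CONTENT_TYPE INPUT_FILE URL OUTPUT_FILE" >&2
        return 2
    fi
    if [ ! -r "$2" ]; then
        echo "cannot read input file: $2" >&2
        return 1
    fi
    # --fail: exit with a non-zero status on HTTP error responses.
    # --silent --show-error: hide the progress meter but keep error messages.
    curl --fail --silent --show-error \
         -H "content-type: $1" \
         --data-binary "@$2" \
         -X POST "$3" \
         -o "$4"
}
```

It would be called with the same values as the generic command, for example: weblicht_post text/plain input.tcf "http://url-to-webservice-with-parameters" output.tcf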


For example, to convert a file of UTF-8 encoded plain text (here named input.tcf, although it contains plain text) to a TCF file, the complete command looks like this with curl:

curl -H 'content-type: text/plain' --data-binary @input.tcf -X POST "http://weblicht.sfs.uni-tuebingen.de/rws/convert-all/qp?informat=plaintext&language=de&outformat=tcf04" -o output.tcf

with wget:

wget --post-file=input.tcf --header='Content-Type: text/plain' "http://weblicht.sfs.uni-tuebingen.de/rws/convert-all/qp?informat=plaintext&language=de&outformat=tcf04" -O output.tcf


This command will send the data of the file input.tcf to the converter web service, which sends back TCF data. This TCF data is stored in the file output.tcf. In addition, the converter web service needs some parameters (input format, language and output format) which are appended to the URL as URL query string parameters.
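The query string can also be assembled from its parts, which makes it easier to change a single parameter such as the language. The following is a minimal sketch; the function name converter_url is hypothetical, and the values are inserted as-is without URL encoding, so they must not contain characters such as spaces or "&".

```shell
# Base URL of the converter service, taken from the commands above.
CONVERTER_URL="http://weblicht.sfs.uni-tuebingen.de/rws/convert-all/qp"

# converter_url INFORMAT LANGUAGE OUTFORMAT
# Hypothetical helper: prints the converter URL with its three
# query string parameters appended (no URL encoding is applied).
converter_url() {
    printf '%s?informat=%s&language=%s&outformat=%s\n' \
        "$CONVERTER_URL" "$1" "$2" "$3"
}

# Prints the same URL that is used in the commands above.
converter_url plaintext de tcf04
```

The result can be passed to curl or wget in place of the literal URL, for example as "$(converter_url plaintext de tcf04)". The double quotes matter, because an unquoted "&" would be interpreted by the shell.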

In the next step, the output of the converter web service (output.tcf) can be used as input for another service, for example a tokenizer. With curl:

curl -H 'content-type: text/tcf+xml' --data-binary @output.tcf -X POST "http://weblicht.sfs.uni-tuebingen.de/rws/service-opennlp/annotate/tok-sentences" -o tokSen.tcf

with wget:

wget --post-file=output.tcf --header='Content-Type: text/tcf+xml' "http://weblicht.sfs.uni-tuebingen.de/rws/service-opennlp/annotate/tok-sentences" -O tokSen.tcf

Please note that the content-type has now switched from "text/plain" to "text/tcf+xml". This web service doesn't need any additional parameters.
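Both steps can be chained in one shell function so that the tokenizer is only called if the conversion succeeded. This is a sketch under assumptions: the function name plaintext_to_tokens and the temporary file handling are illustrative, the language is fixed to German as in the examples, and --fail is added so that HTTP errors stop the chain.

```shell
# plaintext_to_tokens INPUT_TXT OUTPUT_TCF
# Hypothetical helper: converts UTF-8 plain text to TCF, then sends
# the TCF data to the tokenizer service used in the examples above.
plaintext_to_tokens() {
    wl_converter="http://weblicht.sfs.uni-tuebingen.de/rws/convert-all/qp?informat=plaintext&language=de&outformat=tcf04"
    wl_tokenizer="http://weblicht.sfs.uni-tuebingen.de/rws/service-opennlp/annotate/tok-sentences"

    # Intermediate TCF file between the two services.
    wl_tmp=$(mktemp "${TMPDIR:-/tmp}/weblicht.XXXXXX") || return 1

    # Step 1 sends plain text; step 2 sends TCF, so the content type changes.
    # The "&&" ensures step 2 runs only if step 1 exited successfully.
    curl --fail -H 'content-type: text/plain' --data-binary "@$1" \
         -X POST "$wl_converter" -o "$wl_tmp" &&
    curl --fail -H 'content-type: text/tcf+xml' --data-binary "@$wl_tmp" \
         -X POST "$wl_tokenizer" -o "$2"
    wl_status=$?

    # Clean up the intermediate file and report the chain's exit status.
    rm -f "$wl_tmp"
    return "$wl_status"
}
```

A call such as plaintext_to_tokens input.txt tokSen.tcf would then replace the two separate commands. Further services could be appended to the chain in the same way, provided each one accepts the output of the previous one.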