Using WebLicht from the Commandline
From WebLichtWiki
Revision as of 09:23, 23 August 2013
WebLicht web services are implemented as REST-style web services. This means they can be called not only from WebLicht's graphical user interface, but also from the command line.
On Unix-like systems such as Linux or Mac OS X, two CLI tools can be used for this task: wget or curl. A generic call of a WebLicht web service looks like this, with curl:
curl -H 'content-type: text/plain' --data-binary @input.tcf -X POST "http://url-to-webservice-with-parameters" -o output.tcf
with wget:
wget --post-file=input.tcf --header='Content-Type: text/plain' "http://url-to-webservice-with-parameters" -O output.tcf
where input.tcf is the input file and output.tcf is the output file (the file extension does not matter). Some web services require additional parameters, which are passed as URL query string parameters.
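The way query string parameters are attached to a service URL can be sketched with a small shell helper. This is only an illustration: the function name build_url, the placeholder base URL, and the parameter names are assumptions, not part of WebLicht, and the helper assumes that all values are already URL-safe.

```shell
#!/bin/sh
# Hypothetical helper: join a base URL and key=value pairs into one URL.
# Assumption: keys and values contain no characters that need URL encoding.
build_url() {
    base=$1
    shift
    query=""
    for pair in "$@"; do
        if [ -z "$query" ]; then
            query=$pair              # first parameter follows the '?'
        else
            query="$query&$pair"     # later parameters are joined with '&'
        fi
    done
    if [ -n "$query" ]; then
        printf '%s?%s\n' "$base" "$query"
    else
        printf '%s\n' "$base"        # no parameters: URL is unchanged
    fi
}

# Placeholder URL and made-up parameter names, for illustration only.
build_url "http://url-to-webservice" "param1=value1" "param2=value2"
# prints: http://url-to-webservice?param1=value1&param2=value2
```

When the resulting URL is passed to curl or wget, it should stay inside double quotes, because an unquoted & would be interpreted by the shell.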
For example, to convert a file with UTF-8 encoded plain text to a TCF file, the whole command looks like this, with curl:
curl -H 'content-type: text/plain' --data-binary @input.tcf -X POST "http://weblicht.sfs.uni-tuebingen.de/rws/convert-all/qp?informat=plaintext&language=de&outformat=tcf04" -o output.tcf
with wget:
wget --post-file=input.tcf --header='Content-Type: text/plain' "http://weblicht.sfs.uni-tuebingen.de/rws/convert-all/qp?informat=plaintext&language=de&outformat=tcf04" -O output.tcf
Either command sends the contents of the file input.tcf to the converter web service, which sends back TCF data. This TCF data is stored in the file output.tcf. The converter web service also needs some parameters (input format, language, and output format), which are appended to the URL as query string parameters.
In the next step, the output of the converter (output.tcf) can be used as input for a tokenizer. For example, with curl:
curl -H 'content-type: text/tcf+xml' --data-binary @output.tcf -X POST "http://weblicht.sfs.uni-tuebingen.de/rws/service-opennlp/annotate/tok-sentences" -o tokSen.tcf
with wget:
wget --post-file=output.tcf --header='Content-Type: text/tcf+xml' "http://weblicht.sfs.uni-tuebingen.de/rws/service-opennlp/annotate/tok-sentences" -O tokSen.tcf
Please note that the content-type has now switched from "text/plain" to "text/tcf+xml". This web service doesn't need any additional parameters.
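The two steps can also be chained in a small script. The following is a minimal sketch, not an official WebLicht tool: the function names post_file and run_chain and the CURL override variable are assumptions made for this example, while the service URLs and content types are the ones used above. The --fail option makes curl return a non-zero exit status on HTTP errors, so the tokenizer is only called if the conversion succeeded.

```shell
#!/bin/sh
# Hypothetical wrapper: POST a file to a web service and save the response.
# Usage: post_file CONTENT_TYPE INPUT_FILE URL OUTPUT_FILE
# CURL can be overridden (for example CURL=echo) to print the arguments
# instead of contacting the service.
post_file() {
    "${CURL:-curl}" --fail -H "content-type: $1" --data-binary "@$2" -X POST "$3" -o "$4"
}

# Service URLs as given in the examples above.
CONVERTER="http://weblicht.sfs.uni-tuebingen.de/rws/convert-all/qp?informat=plaintext&language=de&outformat=tcf04"
TOKENIZER="http://weblicht.sfs.uni-tuebingen.de/rws/service-opennlp/annotate/tok-sentences"

# Convert plain text to TCF, then tokenize the TCF result.
# Usage: run_chain PLAIN_TEXT_FILE TOKENIZED_OUTPUT_FILE
run_chain() {
    post_file 'text/plain' "$1" "$CONVERTER" output.tcf &&
    post_file 'text/tcf+xml' output.tcf "$TOKENIZER" "$2"
}
```

A call such as run_chain input.tcf tokSen.tcf would then write the intermediate TCF data to output.tcf and the tokenized result to tokSen.tcf.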