Chinese Text Project
CTP API
The Chinese Text Project Application Programming Interface (CTP API) provides methods for integrating content and functionality of the CTP with other web-based or web-aware sites and applications. The API consists of two main components: a plugin API which integrates external functionality into the CTP, and a JSON API which allows CTP functionality to be integrated into external sites.
This page contains technical documentation for those interested in creating their own plugins. If you would like to learn how to use existing plugins from a user perspective, you may wish to read the Plugins page first.
Plugin API
The plugin API defines functional link points within the CTP, allowing these to be connected to external websites and user-defined tools. Users can then choose to install these plugins without requiring technical knowledge. Some examples of existing plugins are shown below.
Plugin | Description | Type | Example | Install |
---|---|---|---|---|
Text tools | Tools for textual analysis. | chapter, book | [Text tools] | [Install] |
Annotate | Tools for textual annotation. | chapter | [Annotate] | [Install] |
Text tools (beta version) | Tools for textual analysis (beta version). | chapter, book | [Text tools (beta version)] | [Install] |
Plain text | Export as plain text. | book, chapter | [Plain text] | [Install] |
TextRef | List editions of a title on TextRef.org. | book | [TextRef] | [Install] |
MHDB | MHDB character lookup. | character, word | [MHDB] | [Install] |
Technically, a plugin is a description in XML of a programmatic way of linking to an external resource. Plugins must be valid XML conforming to the CTPPlugin DTD. An example plugin XML file is as follows:

    <?xml version="1.0" encoding="UTF-8"?>
    <!DOCTYPE CTPPlugin PUBLIC "CTPPlugin" "http://ctext.org/plugins/ctpplugin.dtd">
    <CTPPlugin xmlns="http://schema.ctext.org/Plugin">
      <Plugin>
        <ShortName xml:lang="en">Plain text</ShortName>
        <ShortName xml:lang="zh">全文輸出</ShortName>
        <Description xml:lang="en">Export a chapter as plain text.</Description>
        <Description xml:lang="zh">輸出原典全文。</Description>
        <Url template="https://ctext.org/plugins/textexport/#{textRef}" pluginType="chapter" fieldEncoding="utf8" method="get" />
        <Update src="https://ctext.org/plugins/textexport/plugin.xml" />
      </Plugin>
    </CTPPlugin>

The XML source of the current version of this plugin can be downloaded from the specified update URL.
To create your own plugin, start with this template and modify the appropriate elements as follows:
- ShortName - a name to use as the title of your plugin. Maximum 20 characters.
- Description - a description of the main purpose of your plugin. Maximum 250 characters.
- Url - describes a URL template to which a GET or POST request is made to execute this plugin. Attributes for this element:
  - template (required) - a schema into which data is inserted as described below.
  - pluginType (required) - a comma-separated list of the types of data this plugin can accept. Valid values are "character", "word", "string", "chapter" and "book".
  - fieldEncoding - one of the following values:

Value | Meaning | Example (仁) |
---|---|---|
utf8 (default) | UTF-8 encoding | %E4%BB%81 |
gb | GB18030 encoding | %C8%CA |
big5 | Big5 encoding | %A4%AF |
big5.hex | Big5 encoding, expressed as lowercase hexadecimal* | a4af |
big5.HEX | Big5 encoding, expressed as uppercase hexadecimal* | A4AF |
codepoint.hex | Unicode codepoint, expressed as lowercase hexadecimal* | 4ec1 |
codepoint.HEX | Unicode codepoint, expressed as uppercase hexadecimal* | 4EC1 |

  - method - one of "get" (default) or "post".
- Update - an HTTP resource containing the XML plugin data for this plugin. If present, the src attribute must contain the URL. The Chinese Text Project system will poll this URL at regular intervals; if the code available at this URL changes (and is still a valid CTP Plugin), the plugin will automatically be updated for users who have enabled automatic updates for this plugin.
If you wish to provide different names and descriptions for English and Chinese users, the elements ShortName and Description can be repeated with xml:lang set to "en" for English, and "zh" for Chinese as shown in the above example.
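As an illustration of the fieldEncoding values listed above, the following is a minimal sketch that reproduces three of the documented encodings for the example character 仁. The helper name encodeField is hypothetical and not part of the CTP API. The sketch also assumes that the codepoint encodings apply to a single character, and it leaves out gb and big5, which need a legacy-encoding table that JavaScript does not provide as standard.

```javascript
// Hypothetical helper: encodes a value according to a plugin's fieldEncoding.
// Only utf8, codepoint.hex and codepoint.HEX are covered in this sketch.
function encodeField(value, fieldEncoding = 'utf8') {
  switch (fieldEncoding) {
    case 'utf8':
      // Percent-encoded UTF-8 bytes.
      return encodeURIComponent(value);
    case 'codepoint.hex':
    case 'codepoint.HEX': {
      // Assumption: only the first character's code point is used.
      const hex = value.codePointAt(0).toString(16);
      return fieldEncoding === 'codepoint.HEX' ? hex.toUpperCase() : hex;
    }
    default:
      // gb, big5, big5.hex and big5.HEX require a legacy-encoding table.
      throw new Error(`Unsupported fieldEncoding in this sketch: ${fieldEncoding}`);
  }
}

console.log(encodeField('仁'));                  // %E4%BB%81
console.log(encodeField('仁', 'codepoint.hex')); // 4ec1
console.log(encodeField('仁', 'codepoint.HEX')); // 4EC1
```

The three printed values match the corresponding rows of the fieldEncoding table.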
URL schemas
The "Url" element of a plugin must contain the "template" attribute, which specifies a URL schema allowing the CTP to programmatically generate appropriate links to the specified resource. The "template" attribute contains a URL with one or more of the following fields, into which the appropriate data is substituted.
Field | Applicable types | Contents | Example |
---|---|---|---|
searchTerms | character, word, string | One or more Unicode characters. | 仁 |
character.hanyudazidian | character | The page number on which a character appears in the Hanyu Da Zidian. | 107 |
character.gsr | character | The page number on which a character appears in Grammata Serica Recensa. | 388 |
textRef | book, chapter | The CTP URN corresponding to a textual object. | ctp:analects/xue-er |
title | book | The title of a top-level textual object. | 韓非子 |
authority-ctext | data | The identifier corresponding to an entity or date in the Data Wiki. | ctext:291374 |
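To make the substitution step concrete, here is a rough sketch of how a template might be filled in. It assumes that fields are written in braces, as in the {textRef} placeholder of the example plugin. The helper name fillTemplate is hypothetical, and the sketch inserts values exactly as given, leaving any encoding required by fieldEncoding to the caller.

```javascript
// Hypothetical helper: replaces {fieldName} placeholders in a URL template.
// The brace syntax is inferred from the example plugin's template attribute.
function fillTemplate(template, fields) {
  return template.replace(/\{([A-Za-z.\-]+)\}/g, (placeholder, name) => {
    if (!Object.prototype.hasOwnProperty.call(fields, name)) {
      throw new Error(`No value supplied for field ${name}`);
    }
    // Values are inserted as-is; encode them beforehand if needed.
    return fields[name];
  });
}

const exportUrl = fillTemplate(
  'https://ctext.org/plugins/textexport/#{textRef}',
  { textRef: 'ctp:analects/xue-er' }
);
console.log(exportUrl); // https://ctext.org/plugins/textexport/#ctp:analects/xue-er
```

Throwing on a missing field, rather than leaving the placeholder in the URL, makes a mismatch between pluginType and the fields used in the template easier to spot during development.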
Installing plugins
Each CTP user has their own personal plugin file, which is an XML file consisting of a list of zero or more CTP plugins. You can view and edit your own plugin file via the Plugins section of the Settings page. Installing a plugin simply means adding it to a user's plugin file.
To make installation straightforward, you can ask a user to install a CTP plugin that is available as an XML file via HTTP by opening a URL in the user's web browser. First ensure that your plugin code is valid XML, conforms to the CTP plugin format, and is available via HTTP. Then direct the user to a link composed as follows:

    https://ctext.org/account.pl?if=en&installplugin=[Plugin URL]

If you wish the user to return to your website after installing the plugin, you may also pass the additional parameter return, with the value set to the URL you wish them to be redirected to after they have installed the plugin.
A user who follows the link and does not have the specified plugin installed will be given the opportunity to install it. If the user already has the specified plugin installed and a return URL is specified, the user will be redirected to that URL.
Please note that, if specified, the return URL must be on the same domain as the referring URL.
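The install link described above could be composed along the following lines. This is a sketch only: the helper name buildInstallLink is hypothetical, and it assumes that the plugin URL and return URL should be percent-encoded as ordinary query parameter values.

```javascript
// Hypothetical helper: builds the link that asks a user to install a plugin.
function buildInstallLink(pluginUrl, returnUrl) {
  const link = new URL('https://ctext.org/account.pl');
  link.searchParams.set('if', 'en');
  link.searchParams.set('installplugin', pluginUrl);
  if (returnUrl) {
    // Optional: where to send the user after installation.
    link.searchParams.set('return', returnUrl);
  }
  return link.toString();
}

console.log(buildInstallLink('https://ctext.org/plugins/textexport/plugin.xml'));
// https://ctext.org/account.pl?if=en&installplugin=https%3A%2F%2Fctext.org%2Fplugins%2Ftextexport%2Fplugin.xml
```

Remember that any return URL passed as the second argument must be on the same domain as the page that links to the install URL.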
JSON API
CTP API functions are primarily intended to be called from client-side JavaScript applications using CORS in conjunction with the Plugin system. Please note that usage restrictions and other terms and conditions apply to all usage of the API.
If you would like to write code using the JSON API, please start by reading the documentation. Please note that as this is a pre-release version, functions, parameters, and response formats may change slightly with future updates.
CTP URNs
CTP URNs are unique identifiers describing textual items such as books or parts of books. The CTP API deals with textual information by exchanging these identifiers. For example, textual plugins pass a URN to an external website or tool to uniquely identify the textual item that a user wishes to manipulate; this URN can then be passed to JSON API functions to obtain textual data and metadata about the text. API users must treat these as opaque identifiers and must not attempt to parse them in any way, as new identifiers will be created in the future that may be dissimilar to current URNs.
Some examples of CTP URNs are:
As shown in these examples, you can easily transform a CTP URN into a direct link to the corresponding text by linking directly to the getlink API function with "redirect" set to 1. To obtain the URN corresponding to a https://ctext.org URL programmatically, use the readlink API function.
Textual data response format
Textual data is obtained by passing a CTP URN to the gettext function. This function returns one or more of the following three elements:
- title - the title in Chinese of the requested item.
- fulltext - an ordered list of paragraphs of text.
- subsections - an ordered list of URNs for subsections of the requested item. This element is only available to authenticated users (i.e. subscribers or those accessing with a valid API key).
Requests for chapters of text, e.g. ctp:analects/xue-er, will return a fulltext element, while requests for larger works or parts of larger works, e.g. ctp:analects, will typically return a subsections element if the client is authenticated, or ERR_REQUIRES_AUTHENTICATION if not.
If a client application is designed to handle only the "fulltext" element, it should use the "chapter" pluginType only; if it can handle both "fulltext" and "subsections", it should use a pluginType of "book,chapter".
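A client that accepts both kinds of response might distinguish them roughly as follows. This is a sketch that assumes the response has already been parsed from JSON into an object with the title, fulltext and subsections elements described above. The helper name describeTextResponse and the sample data are made up for illustration.

```javascript
// Hypothetical helper: reports whether a parsed gettext response carries
// paragraphs of text (fulltext) or a list of subsection URNs (subsections).
function describeTextResponse(response) {
  const result = { title: response.title ?? null, kind: 'empty', items: [] };
  if (Array.isArray(response.fulltext)) {
    result.kind = 'fulltext';
    result.items = response.fulltext;
  } else if (Array.isArray(response.subsections)) {
    result.kind = 'subsections';
    result.items = response.subsections;
  }
  return result;
}

// Made-up sample data in the documented shape, not real API output.
const chapter = describeTextResponse({
  title: 'sample chapter title',
  fulltext: ['first paragraph', 'second paragraph'],
});
console.log(chapter.kind, chapter.items.length); // fulltext 2

const book = describeTextResponse({
  title: 'sample book title',
  subsections: ['ctp:analects/xue-er'],
});
console.log(book.kind, book.items[0]); // subsections ctp:analects/xue-er
```

A "chapter"-only plugin would only ever need the fulltext branch, while a "book,chapter" plugin would pass each URN from the subsections branch back to gettext.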
Error handling
If an API request cannot be fulfilled, an "error" object is returned in place of the normal response body. This object contains the following fields:
Field | Content |
---|---|
code | a constant (see table below) identifying the type of error; it does not vary with user interface selection or other factors |
description | human-readable description of the error in HTML (which may include links to help pages or resolution methods) |
Applications should use the "code" field to decide how to respond to an error, and display the HTML-formatted "description" field to the end user where necessary. Displaying the description is particularly recommended for errors such as ERR_REQUEST_LIMIT that may require end user action to resolve.
Possible error codes are as follows:
Code | Example text |
---|---|
ERR_NOT_SUPPORTED | Not supported. |
ERR_INVALID_URN | Invalid URN. |
ERR_UNDEFINED_URN | Resource does not exist. |
ERR_MISSING_PARAM | ______: Missing required parameter '______'. |
ERR_REQUEST_LIMIT | Request limit reached. Please log in to allow access to more data. |
ERR_INVALID_VALUE | '______' is not a valid value for parameter '______'. |
ERR_INVALID_PARAM | ______: '______' is not a valid parameter for this function. |
ERR_INVALID_FUNCTION | Unknown function. |
ERR_INVALID_APIKEY | The apikey parameter was supplied, but the key was invalid or expired. |
ERR_GENERIC | [Some other error condition.] |
ERR_REQUIRES_AUTHENTICATION | The requested function requires authentication to continue. Please access the API from a registered IP address or supply a valid API key. |
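One possible way to act on these codes is sketched below. It assumes the error object appears under a top-level "error" key of the parsed JSON response. The helper name checkApiResponse is hypothetical, as is the choice to treat ERR_REQUIRES_AUTHENTICATION as needing user action alongside ERR_REQUEST_LIMIT.

```javascript
// Assumption: these are the codes whose description should be shown to the
// end user. Only ERR_REQUEST_LIMIT is named as such in the documentation.
const USER_ACTION_CODES = new Set([
  'ERR_REQUEST_LIMIT',
  'ERR_REQUIRES_AUTHENTICATION',
]);

// Hypothetical helper: separates normal responses from error responses.
function checkApiResponse(body) {
  if (!body || typeof body !== 'object' || !body.error) {
    return { ok: true, data: body };
  }
  const { code, description } = body.error;
  return {
    ok: false,
    code,                                    // stable constant for program logic
    showToUser: USER_ACTION_CODES.has(code), // whether to display the message
    html: description,                       // HTML-formatted text for display
  };
}

const limited = checkApiResponse({
  error: {
    code: 'ERR_REQUEST_LIMIT',
    description: 'Request limit reached. Please log in to allow access to more data.',
  },
});
console.log(limited.ok, limited.code, limited.showToUser); // false ERR_REQUEST_LIMIT true
```

Because the description is HTML, an application that displays it should insert it as markup rather than as plain text.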
Rate limiting
The primary purpose of the CTP API is to allow the creation of client-side applications which extend CTP functionality in innovative ways, and to allow offline use of reasonable amounts of textual data. All users are welcome to make use of the API; however the frequency of API requests for textual data is limited according to user group:
- Unauthenticated users - Users who have not logged in to an account and are not accessing the site from a subscribing institution will be able to access a limited amount of data.
- CTP account users - Users who are logged in to their CTP account will be able to access a larger amount of data.
- Institutional subscribers - Users accessing the API from registered IP addresses will be granted access as provided by their institutional agreement.
To determine the current status of a user, use the getstatus function.
Client libraries
- ctext - Python library for CTP API access.
A series of tutorials using this module, aimed primarily at newcomers to Python, is available on the Digital Sinology site.
JavaScript access
JavaScript clients can access the API using the Cross-Origin Resource Sharing (CORS) mechanism. In order to allow the API server to grant additional access privileges to logged in users, it is recommended that you set the "withCredentials" property of the XMLHttpRequest object to "true" before making your request. For instance:

    var xhr = new XMLHttpRequest();
    xhr.open('GET', 'https://api.ctext.org/getstatus', true);
    xhr.withCredentials = true;
    xhr.send(null);
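In environments that provide the Fetch API, the same request could be made as sketched below. Setting credentials to "include" is the fetch counterpart of withCredentials. The helper name callApi is hypothetical, and any query parameters passed to it are supplied by the caller, since this sketch does not assume particular parameter names.

```javascript
// Hypothetical helper: calls a CTP API function and returns a promise for
// the parsed JSON body. fetchImpl can be replaced for testing.
function callApi(functionName, params = {}, fetchImpl = globalThis.fetch) {
  const url = new URL(functionName, 'https://api.ctext.org/');
  for (const [key, value] of Object.entries(params)) {
    url.searchParams.set(key, value);
  }
  // credentials: 'include' sends cookies so that logged-in users are recognised.
  return fetchImpl(url.toString(), { credentials: 'include' })
    .then((response) => response.json());
}
```

A call such as callApi('getstatus') would then resolve to the parsed response, which can be checked for an "error" object before use.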