Chinese Text Project

Annotation client

The Annotation client is a ctext plugin designed to enable efficient semantic annotation of Chinese texts. Before reading these instructions, it is recommended that you familiarise yourself with the principles and goals of semantic annotation on ctext.org.

Installing the annotation plugin

Before using the annotation client to annotate texts on ctext.org, you must create a free account and log in. Next, add the Annotation plugin to your account by clicking here and then clicking "Install" on the confirmation page that appears.

Loading a text

The easiest way to load a text from ctext.org is to navigate to a chapter of text, and then click the "Annotate" link at the top-right of the screen. (If you do not see this link, you may not have installed the Annotation client plugin.)

The text is loaded into the client, together with any annotations that have previously been saved into the text.

Automatic annotation

The annotation client can suggest possible annotations based on the contents of the text, and information in the Data Wiki knowledge base. Click the "Annotate" link; in a few moments, the system will indicate any identified candidates for linking. These will display as dark grey highlights, with colored underlines indicating the suggested type of annotation and entity. All annotations that are displayed with a grey background are unconfirmed annotations; this means that they will not be saved when saving the text to ctext or exporting the text, because they have not yet been confirmed by a human.

Each unconfirmed annotation proposed by the client will provide information about one or more possible annotations that could be made at that location. For example, in a sentence containing the characters "乾德", these two characters might be a reference to the 乾德 era of 宋太祖, or the 乾德 era of 前蜀後主, or 李乾德, son of 李日尊 - or something else. Obviously, these are different entities; the annotation task is to specify which entity is being referred to in each particular sentence.

To see the suggestions that have been proposed, click on one of the unconfirmed annotations. The system will show a list of one or more suggestions; each suggestion states a particular entity type (e.g. 'era', 'person', 'place') and shows a name for that entity. Beside each suggestion is a series of one or more links.

Use the links provided to determine which of the choices (if any) is the correct one. To confirm an annotation, click the "Y" link beside the item to which the selected instance refers. The annotation will change to show a solid color background, indicating that the annotation has been confirmed. In case you make a mistake, click the "Change" link to choose a different item, or click "X" to remove the annotation entirely.

Manual annotation

You can create a new annotation by dragging with your mouse to select the region of text to which the annotation should apply. Note that when doing this, the region you select must not contain or overlap with any other annotations - if it does, you should remove each of these first by clicking the "X" at the top right of the popup box for each annotation.
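
The no-overlap rule above can be pictured as an interval check. The following is a minimal sketch only, not the client's actual code: it assumes annotations are stored as half-open character offsets, and the names `Span`, `overlaps`, and `blocking_annotations` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Span:
    start: int  # offset of the first character (inclusive)
    end: int    # offset just past the last character (exclusive)


def overlaps(a: Span, b: Span) -> bool:
    """True if the two half-open spans share at least one character."""
    return a.start < b.end and b.start < a.end


def blocking_annotations(selection: Span, existing: list[Span]) -> list[Span]:
    """Existing annotations that would have to be removed before annotating."""
    return [s for s in existing if overlaps(selection, s)]


existing = [Span(0, 2), Span(5, 9)]
print(blocking_annotations(Span(2, 5), existing))  # touches both, overlaps neither
print(blocking_annotations(Span(1, 6), existing))  # overlaps both
```

A selection that merely touches a neighbouring annotation is allowed under this model; one that contains or is contained by an existing annotation is reported as blocked.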

When a new annotation is created, the system may propose some possible matches; if any of these are correct, select the correct choice as in automatic annotation. If none of these are correct, there are two options:

  1. Locate the correct entity by searching - You can do this by typing in search terms (e.g. full or alternative name of the entity, or an entity identifier) in the box marked "Search". If a correct match is shown, click "Y" to confirm it.
  2. Create a new entity - Click the link corresponding to the type of entity and annotation you want to create. This will appear as a new confirmed annotation. When you subsequently save the text to ctext, a new entity will be created.
Always try to confirm that a matching entity does not exist before creating a new one.

Annotating dates

Please read the background notes on date annotation before annotating dates. When an annotation is created that is of the "date" type, additional fields will be available in the popup box, labeled Year, Month, and Day. These must be set to correspond to the meaning of the date in its actual context. For example, a date "三年二月甲子" should have Year set to 3, Month set to 2, and Day set to 甲子. The same should be done in cases where the context of the date makes clear what these values should be, even if the literal date does not contain them. For example, if the text containing the previous example date then went on to refer to a date "庚午", the correct annotation for that date would be Year 3, Month 2, Day 庚午. When a date refers only to a month and not a particular day, the Day field should be set to "N/A"; similarly, when a date refers only to a year, both Month and Day should be set to "N/A".
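
The rules for filling in Year, Month, and Day can be sketched as follows. This is an illustration under stated assumptions rather than the client's implementation: the `DateFields` class and the `resolve` and `form_values` functions are hypothetical, and the day is kept as the cyclical characters exactly as in the examples above.

```python
from dataclasses import dataclass
from typing import Optional

NA = "N/A"


@dataclass(frozen=True)
class DateFields:
    year: Optional[int] = None
    month: Optional[int] = None
    day: Optional[str] = None  # cyclical day such as "甲子"


def resolve(literal: DateFields, context: Optional[DateFields]) -> DateFields:
    """Fill in values the literal date omits but the preceding date supplies.

    Only fields more significant than what the literal states are inherited:
    a bare day inherits year and month, a bare month inherits only the year.
    """
    year, month, day = literal.year, literal.month, literal.day
    if context is not None and year is None:
        year = context.year
        if month is None and day is not None:
            month = context.month
    return DateFields(year, month, day)


def form_values(d: DateFields) -> dict:
    """Values as they would be entered in the popup box."""
    return {
        "Year": d.year if d.year is not None else NA,
        "Month": d.month if d.month is not None else NA,
        "Day": d.day if d.day is not None else NA,
    }


first = resolve(DateFields(3, 2, "甲子"), None)    # "三年二月甲子"
second = resolve(DateFields(day="庚午"), first)    # later "庚午" in the same passage
print(form_values(second))
print(form_values(resolve(DateFields(year=3), None)))  # a year-only date
```

In practice the context is a matter of reading the text, so the inherited values should always be checked against the passage rather than accepted mechanically.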

In addition to year, month, and day, every date annotation must be linked to an era entity (or, for rulers whose reign dates are used without eras, a person entity). The annotation client will offer suggestions based on confirmed era references occurring prior to the selected date in the text. The simplest way to correctly annotate a date like "大中祥符三年四月十四日" is therefore to first confirm an annotation marking "大中祥符" as referring to the era 大中祥符, and then confirm the entity and values that the client suggests for the date. In some cases, the correct era (and other values) will not be identified automatically, and it will be necessary to supply the correct values. To choose a different era, type the era name into the "Search" box, and confirm the correct selection.
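
One way to picture the era suggestion is as a lookup of the nearest confirmed era reference before the date. The sketch below makes that assumption explicitly: the text above says only that suggestions are based on confirmed era references occurring earlier, so choosing the most recent one is a guess at the behaviour, and the `Annotation` class and `suggest_era` function are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Annotation:
    start: int       # character offset in the text
    text: str
    kind: str        # e.g. "era", "person", "date"
    confirmed: bool


def suggest_era(date_start: int, annotations: list[Annotation]) -> Optional[Annotation]:
    """Most recent confirmed era reference located before the date (assumed rule)."""
    candidates = [
        a for a in annotations
        if a.kind == "era" and a.confirmed and a.start < date_start
    ]
    return max(candidates, key=lambda a: a.start, default=None)


annotations = [
    Annotation(0, "大中祥符", "era", confirmed=True),
    Annotation(40, "乾德", "era", confirmed=False),  # unconfirmed, so ignored
]
# The date "三年四月十四日" directly follows the era name, starting at offset 4.
print(suggest_era(4, annotations))
print(suggest_era(60, annotations))
```

Under this model, confirming era references in reading order before annotating the dates that follow gives the suggestion the best chance of being right.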

Saving and exporting

If you are confident that the changes you have made are correct and in accordance with these guidelines, you can contribute them by clicking the "Save to ctext" link (this link is only shown after you have made changes to the annotations). Note that only confirmed annotations are saved; it is therefore not necessary to remove unconfirmed annotations suggested by the annotation client.

You can also save a local copy of your annotated text in XML, by clicking the "Export as XML" link. This will allow your web browser to download a file containing your annotations. As in the case of saving to ctext, only confirmed annotations are saved.
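
The rule shared by saving and exporting amounts to a filter on confirmation status. A minimal sketch, assuming each annotation carries a `confirmed` flag; the `Annotation` class and `annotations_to_save` function are hypothetical, and the XML export format itself is not modelled here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    text: str
    kind: str        # e.g. "era", "person", "place"
    confirmed: bool  # False for suggestions still shown with a grey background


def annotations_to_save(annotations: list[Annotation]) -> list[Annotation]:
    """Keep only annotations a human has confirmed; suggestions are dropped."""
    return [a for a in annotations if a.confirmed]


working_copy = [
    Annotation("大中祥符", "era", confirmed=True),
    Annotation("乾德", "era", confirmed=False),
    Annotation("林則徐", "person", confirmed=True),
]
print([a.text for a in annotations_to_save(working_copy)])
```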

Annotating using the keyboard

In some materials, certain types of annotation are repetitive, and moving to the next annotation and approving it with a mouse can be inefficient. To help in these cases, certain keys on the keyboard can be used together with mouse actions to make the task more efficient.

By default, these keys move through all defined annotations. In some cases - particularly annotation of dates in historical texts - it may be useful to set these keys to move over only certain types of annotation, such as eras and dates. This can be done by deselecting tag types from the list at the top right of the annotation client: only those annotations which are of the selected types (or for which the first suggested candidate is of one of the selected types) will be included in the keyboard navigation functions.
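
The filtering rule for keyboard navigation can be sketched as a predicate over annotations. This is an illustration only: the `Annotation` class, `is_navigable`, and `next_index` are hypothetical names, and whether navigation wraps around at the end of the text is not stated above, so the sketch simply stops.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Annotation:
    text: str
    kind: Optional[str] = None  # type of a confirmed annotation, else None
    suggested_kinds: list[str] = field(default_factory=list)  # candidates, first one first


def is_navigable(a: Annotation, selected: set[str]) -> bool:
    """Included if its type is selected, or if its first candidate's type is."""
    if a.kind is not None:
        return a.kind in selected
    return bool(a.suggested_kinds) and a.suggested_kinds[0] in selected


def next_index(annotations: list[Annotation], current: int, selected: set[str]) -> Optional[int]:
    """Index of the next annotation the keys would move to, or None."""
    for i in range(current + 1, len(annotations)):
        if is_navigable(annotations[i], selected):
            return i
    return None


annotations = [
    Annotation("大中祥符", kind="era"),
    Annotation("林則徐", kind="person"),
    Annotation("三年二月甲子", suggested_kinds=["date"]),
    Annotation("乾德", suggested_kinds=["person", "era"]),
]
selected = {"era", "date"}
print(next_index(annotations, 0, selected))  # skips the person annotation
print(next_index(annotations, 2, selected))  # 乾德 is skipped: first candidate is a person
```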

Extracting knowledge claims

Texts that have been partially or completely marked up can be used as evidence for knowledge claims. A knowledge claim represents a piece of information about an historical entity - such as a person or place. For instance, a primary source might contain information like this:

庚子欽差大臣林則徐道卒,
This annotated fragment - from the 清史稿 - can be used as primary source evidence that the person 林則徐 (ctext:186523) died on a particular date: specifically, 道光三十年十一月庚子 (which corresponds to 15 December 1850 in the Gregorian calendar).

In the annotation client, there are two ways of creating new knowledge claims:

  1. Manual extraction - using the mouse, drag to select the exact sentence or sentence fragment that supports the claim you want to add (note: this fragment must contain at least one annotation). The annotation client will suggest candidate subjects for your claim; click the appropriate subject - for instance in the above example, we would click on "林則徐". The client will then display on the right a form allowing you to add a claim about the selected entity based on your chosen evidence; select the appropriate verb (e.g. 'died-date') and target (e.g. 道光三十年十一月庚子 1850/12/15 [date:583078/30/11/37] ), and click "Add" to save the new claim.
  2. Automatic extraction - click the "Extract" button. Suggested claims will be extracted and highlighted in the text. To review and add a claim, click the "►" icon at the left of the highlighted section of text. If a complete claim has been identified, and you are satisfied that the evidence highlighted supports the claim, click the "Save" link to save it to the data wiki.
In either case, when annotating please make sure to follow the annotation conventions, and in general try to use existing annotations as a guide.
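
The target shown in the manual extraction example, `date:583078/30/11/37`, appears to follow the pattern entity identifier/year/month/day, with the day given as a position in the sixty-day cycle: 庚子 is the 37th term, and 30 and 11 match 三十年十一月. The sketch below checks that reading. The field order is inferred from this single example and is an assumption, and the function names are hypothetical; the cycle arithmetic itself is standard.

```python
STEMS = "甲乙丙丁戊己庚辛壬癸"          # ten heavenly stems
BRANCHES = "子丑寅卯辰巳午未申酉戌亥"    # twelve earthly branches


def sexagenary_number(pair: str) -> int:
    """1-based position in the sixty-term cycle (甲子 = 1, 癸亥 = 60)."""
    stem = STEMS.index(pair[0])
    branch = BRANCHES.index(pair[1])
    for n in range(60):
        if n % 10 == stem and n % 12 == branch:
            return n + 1
    raise ValueError(f"{pair} is not a valid stem-branch combination")


def parse_date_ref(ref: str) -> dict:
    """Split a reference like 'date:583078/30/11/37' (field order assumed)."""
    prefix, _, rest = ref.partition(":")
    if prefix != "date":
        raise ValueError(f"not a date reference: {ref}")
    entity, year, month, day = (int(part) for part in rest.split("/"))
    return {"entity": entity, "year": year, "month": month, "day": day}


ref = parse_date_ref("date:583078/30/11/37")
print(ref)
print(sexagenary_number("庚子") == ref["day"])
```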