Title: Chinese Word Segmentation Applied to Written Cantonese
Abstract: Chinese text is written without spaces between words, and an individual character can be used both as a whole word and as part of a compound word (a word of two or more characters). For example, 學 (learn), 學生 (student) and 數學 (mathematics) all contain a common character. This leads to ambiguous sentences, which could be read differently depending on the word boundaries, or word segmentation, inferred by the reader. This causes no problem for human readers, who understand the meaning of the text they read: most incorrect segmentations would not make sense. A computer, however, does not understand the meaning of a sentence; it only executes an algorithm, which makes segmentation a challenging problem to automate. Chinese word segmentation (CWS) is incorporated into other text processing tasks, such as information retrieval, text-to-speech and machine translation, and has been the subject of several competitions. The corpora used for these competitions are all standard written Chinese (SWC), mostly in simplified characters. However, SWC uses different words and a different word order from spoken Cantonese, and there is another written form, written Cantonese, which is used widely in informal writing and when the exact wording of the speaker is important. There are very few applications of CWS to written Cantonese in the literature. In this talk I will introduce the problem of CWS, including some of the complications of Chinese text in general and problems specific to written Cantonese. I will then describe some CWS algorithms and their application to a written Cantonese corpus.
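To make the ambiguity concrete, the following is a minimal Python sketch, not one of the algorithms from the talk. It assumes a toy five-entry lexicon built around the words for learn, student and mathematics, and a made-up test sentence; the names `LEXICON`, `all_segmentations` and `forward_maximum_matching` are hypothetical. The first function lists every way a string can be split into lexicon words, and the second applies forward maximum matching, a common dictionary-based baseline that commits greedily to the longest match.

```python
from typing import List, Set

# Toy lexicon: an assumption for illustration, not drawn from any real corpus.
LEXICON: Set[str] = {"學", "生", "數", "學生", "數學"}


def all_segmentations(text: str, lexicon: Set[str]) -> List[List[str]]:
    """Return every way to split `text` into words found in `lexicon`."""
    if not text:
        return [[]]  # one way to segment the empty string: no words
    results: List[List[str]] = []
    for end in range(1, len(text) + 1):
        word = text[:end]
        if word in lexicon:
            # Keep this word and segment the remainder recursively.
            for rest in all_segmentations(text[end:], lexicon):
                results.append([word] + rest)
    return results


def forward_maximum_matching(text: str, lexicon: Set[str],
                             max_len: int = 2) -> List[str]:
    """Greedy left-to-right segmentation preferring the longest lexicon match."""
    words: List[str] = []
    pos = 0
    while pos < len(text):
        # Try the longest candidate first, shrinking down to one character.
        for size in range(min(max_len, len(text) - pos), 0, -1):
            candidate = text[pos:pos + size]
            # A single character is always accepted so the loop makes progress
            # even when the character is missing from the lexicon.
            if size == 1 or candidate in lexicon:
                words.append(candidate)
                pos += size
                break
    return words


if __name__ == "__main__":
    sentence = "學生學數學"  # made-up example: "students learn mathematics"
    for segmentation in all_segmentations(sentence, LEXICON):
        print(" | ".join(segmentation))
    print("greedy:", " | ".join(forward_maximum_matching(sentence, LEXICON)))
```

With this toy lexicon the five-character sentence has four dictionary-consistent segmentations, only one of which a reader would accept. The greedy matcher happens to find it here, but in general longest-match heuristics can commit to a wrong boundary, which is one reason the task is hard to automate.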