Keyword Analysis in Corpus Research and Its Extension
Keywords
☻Positive and negative keywords ☺ In a corpus of academic English, words such as morning, afternoon and evening are likely to be negative keywords if the corpus is to be compared with a general corpus.
Applications of…
☺ Compiling a specialized dictionary
☺ Detecting the topic
☺ Genre analysis
☺ Contrastive Interlanguage Analysis
☺ ……
Limitations to…
Keywords
☻Ways to find keywords:
☺ Top-down: corpus-based
☺ Bottom-up: corpus-driven
Applications of…
☺ Comparison across users
☺ Comparison across genres
☺ Comparison across times
☺ Comparison across (varieties of) languages
Keywords
Chi-square with Yates's correction
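As an illustration of the statistic named on this slide, here is a minimal sketch of chi-square with Yates's continuity correction for a single word, computed from a 2x2 table (word vs. all other words, study corpus vs. reference corpus). The function name, the parameter names and the counts in the usage line are hypothetical and are not taken from the slides.

```python
def chi_square_yates(freq_study, size_study, freq_ref, size_ref):
    """Chi-square with Yates's correction for one word in two corpora.

    2x2 table (names are illustrative assumptions):
        a = word in study corpus      b = word in reference corpus
        c = other words in study      d = other words in reference
    """
    a, b = freq_study, freq_ref
    c, d = size_study - freq_study, size_ref - freq_ref
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        # A row or column total is zero, so the statistic is undefined;
        # returning 0.0 here is a simplifying choice of this sketch.
        return 0.0
    # Yates's correction subtracts n/2 from |ad - bc|, never going below zero.
    diff = max(abs(a * d - b * c) - n / 2, 0.0)
    return n * diff ** 2 / denom


# Hypothetical counts: 50 hits in 1,000 tokens vs. 10 hits in 1,000 tokens.
score = chi_square_yates(50, 1000, 10, 1000)
```

With one degree of freedom, a score above 3.84 corresponds to p < 0.05.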
Keywords
Log-likelihood
References: /llwizard.html
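A minimal sketch of the log-likelihood keyness score, assuming the two-term form widely used for corpus frequency comparison: expected frequencies are derived from the pooled frequency and the two corpus sizes, and the observed counts are compared against them. The function and parameter names and the counts in the usage line are hypothetical.

```python
import math


def log_likelihood(freq_study, size_study, freq_ref, size_ref):
    """Log-likelihood (G2) for one word in a study and a reference corpus.

    Assumes both corpus sizes are positive.
    """
    total_freq = freq_study + freq_ref
    total_size = size_study + size_ref
    # Expected frequencies if the word were equally common in both corpora.
    expected_study = size_study * total_freq / total_size
    expected_ref = size_ref * total_freq / total_size
    ll = 0.0
    # A zero observed frequency contributes nothing to the sum.
    if freq_study > 0:
        ll += freq_study * math.log(freq_study / expected_study)
    if freq_ref > 0:
        ll += freq_ref * math.log(freq_ref / expected_ref)
    return 2 * ll


# Hypothetical counts: 50 hits in 1,000 tokens vs. 10 hits in 1,000 tokens.
score = log_likelihood(50, 1000, 10, 1000)
```

As with chi-square at one degree of freedom, 3.84 is the critical value for p < 0.05.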
Keywords
☻Previous research has shown that log-likelihood is a better measure than chi-square when comparing word frequencies across corpora.
Keywords
☻Calculating keyness (Rayson et al. 2004, Oakes 1998) ☺ Chi-square
Keywords
Chi-square
☺ Do keywords have to be single words? Phraseology seems more interesting!
☺ Do keywords have to be lexical words? POS tag sequences may also be interesting.
Limitations to…
☻Bottom-up: the problem is that I have been given a long list of keywords, only some of which are interesting, buried among many others which do not seem interesting at all.
Keywords
☻Negative keywords: ☺ Words which occur less often than would be expected by chance in comparison with the reference
corpus.
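The direction of a keyword (positive or negative) can be sketched by comparing relative frequencies in the two corpora. This only gives the direction; whether the difference is larger than chance would allow is decided separately by a keyness statistic. The function name and the example counts are hypothetical.

```python
def keyword_direction(freq_study, size_study, freq_ref, size_ref):
    """Return 'positive', 'negative' or 'neutral' for one word.

    Assumes both corpus sizes are positive. Direction only: this does
    not test whether the difference is statistically significant.
    """
    rel_study = freq_study / size_study
    rel_ref = freq_ref / size_ref
    if rel_study > rel_ref:
        return "positive"   # relatively more frequent in the study corpus
    if rel_study < rel_ref:
        return "negative"   # relatively less frequent in the study corpus
    return "neutral"


# Hypothetical counts: a word seen 2 times in 10,000 tokens of the study
# corpus and 300 times in 1,000,000 tokens of the reference corpus.
direction = keyword_direction(2, 10_000, 300, 1_000_000)
```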
Keywords
☻Positive and negative keywords ☺ In a corpus of business English, words such as business, profit and companies are likely to be positive keywords if the corpus is to be compared with a general corpus.
☺ Can we bring together the bottom-up
approach and the top-down approach?
Limitations to…
☻Top-down: the problem is I do not yet
Demo
Keywords
☻Keywords: ☺ Keywords are words whose frequency is unusually high (or low) in comparison with some norm.
(Scott, 2003)
Keywords
☻Positive keywords: ☺ Words which occur more often than would be expected by chance in comparison with the reference corpus.
An extension to the keyword approach in corpus analysis
National Research Centre for Foreign Language Education, Liang Maocheng (梁茂成)
Outline
☻Keywords
☻Applications of corpus comparison
☻Limitations to the keyword approach
☻Keywords+
Keywords+
☻Support multiword sequences
☻Support online search
☻Support POS tag sequences
☻Support regex search
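To make two of the listed features concrete, the sketch below counts multiword sequences (n-grams) in a token list and filters them with a regular expression. It is an illustration only and makes no claim about how Keywords+ itself is implemented; the function names and the sample sentence are hypothetical. The same counting works for POS tag sequences if the input list holds tags instead of words.

```python
import re
from collections import Counter


def ngram_counts(tokens, n):
    """Count every contiguous n-token sequence, joined by single spaces."""
    return Counter(
        " ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)
    )


def filter_by_regex(counts, pattern):
    """Keep only the sequences that match the regular expression."""
    rx = re.compile(pattern)
    return {seq: freq for seq, freq in counts.items() if rx.search(seq)}


# Hypothetical sample text, already lowercased and tokenized on whitespace.
tokens = "in terms of the cost in terms of time".split()
trigrams = ngram_counts(tokens, 3)
hits = filter_by_regex(trigrams, r"^in terms")
```

The resulting frequencies can be fed into a keyness statistic in the same way as single-word frequencies, with the number of n-grams in each corpus standing in for corpus size.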
Demo
☻demo
Thank you.