Syntactic Distance and Its Cross-Linguistic Properties

Abstract: This study quantifies the syntactic distance between any two words in a phrase-structure syntactic tree and examines its cross-linguistic properties and its value as a measure of syntactic complexity. Combining path length, hierarchical depth, and the number of intervening words, we propose a tree-path-based method for computing syntactic distance. A quantitative analysis of treebanks in nine languages yields four findings: 1) syntactic distance reflects the cognitive effort associated with language processing and serves as a quantitative metric of syntactic complexity; 2) the mean syntactic distance of human languages is constrained by working-memory capacity and probably falls between 4 and 6; 3) syntactic distance follows a negative binomial distribution and is positively correlated with sentence length; 4) these properties hold across languages. The findings suggest that syntactic distance is an important theoretical construct, comparable in value to dependency distance. The study supplies phrase-structure grammar with a quantitative metric of general applicability, making it possible to apply quantitative syntactic analysis more widely in fields such as linguistic typology, comparative linguistics, and second language teaching.
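The abstract names three ingredients of the measure (path length, hierarchical depth, intervening word count) but does not state how they are combined. The following is only a minimal sketch, under stated assumptions, of how those raw ingredients could be read off a phrase-structure tree. The nested-tuple tree encoding and the names `leaf_paths` and `tree_path_features` are hypothetical, and the sketch deliberately stops short of combining the three values into a single distance, since that formula is not given here.

```python
from typing import Dict, List, Tuple, Union

# Assumed encoding: a leaf is a word (str); an internal node is
# (label, child, child, ...). This is an illustration, not the paper's format.
Tree = Union[str, tuple]


def leaf_paths(tree: Tree, prefix: Tuple[int, ...] = ()) -> List[Tuple[str, Tuple[int, ...]]]:
    """Return (word, root-to-leaf path of child indices) for each leaf, left to right."""
    if isinstance(tree, str):
        return [(tree, prefix)]
    paths: List[Tuple[str, Tuple[int, ...]]] = []
    for idx, child in enumerate(tree[1:]):  # tree[0] is the node label
        paths.extend(leaf_paths(child, prefix + (idx,)))
    return paths


def tree_path_features(tree: Tree, i: int, j: int) -> Dict[str, int]:
    """Raw tree-path ingredients for the words at positions i and j (0-based)."""
    leaves = leaf_paths(tree)
    p, q = leaves[i][1], leaves[j][1]
    # Length of the shared prefix = depth of the lowest common ancestor.
    k = 0
    while k < min(len(p), len(q)) and p[k] == q[k]:
        k += 1
    return {
        "path_length": (len(p) - k) + (len(q) - k),  # edges between the two leaves
        "lca_depth": k,                              # depth of their lowest common ancestor
        "intervening_words": max(abs(i - j) - 1, 0), # words strictly between them
    }


# (S (NP (DT the) (NN dog)) (VP (VBD chased) (NP (DT a) (NN cat))))
example = (
    "S",
    ("NP", ("DT", "the"), ("NN", "dog")),
    ("VP", ("VBD", "chased"), ("NP", ("DT", "a"), ("NN", "cat"))),
)
print(tree_path_features(example, 1, 2))  # "dog" vs "chased"
print(tree_path_features(example, 0, 1))  # "the" vs "dog"
```

In this toy tree, "the" and "dog" are linearly adjacent and share the NP node, while "dog" and "chased" are equally adjacent but meet only at the root, so their tree path is longer. That contrast is the kind of information a purely linear distance would miss.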

     
