Towards Semantics-Enhanced Pre-Training: Can Lexicon Definitions Help Learning Sentence Meanings?

Abstract

Self-supervised pre-training techniques, relying on large amounts of text, have enabled rapid growth in bi-directional language representations for natural language understanding. However, as empirical models of sentences, they are bound to the input data distribution and inevitably absorb data bias and reporting bias, which may lead to inaccurate understanding of sentences. To address this problem, we propose to adopt the human learner's approach: when we cannot make sense of a word, we often consult the dictionary for its specific meanings; but can the same work for empirical models? In this work, we try to inform pre-trained models of word meanings through a further semantics-enhanced pre-training stage. To give the model a contrastive and holistic view of word meanings, a definition pair of two related words is presented to the masked language model, so that the model can better associate a word with its crucial semantic features. Both intrinsic and extrinsic evaluations validate the proposed approach on semantics-oriented tasks, with an almost negligible increase in training data.
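To make the idea concrete, below is a minimal sketch, assuming a BERT-style masked language model from the HuggingFace transformers library, of how a definition pair of two related words might be packed into a single input and trained with a standard 15% masking scheme. The word pair, the definitions, and the masking strategy here are illustrative assumptions, not the paper's exact recipe.

```python
# Sketch: semantics-enhanced pre-training step on a lexicon definition pair.
# Assumes a BERT-style masked LM; the example pair and 15% masking are illustrative.
import torch
from transformers import BertTokenizer, BertForMaskedLM

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForMaskedLM.from_pretrained("bert-base-uncased")

# Two related words and their dictionary definitions (hypothetical example).
pair = [
    ("glad", "feeling pleasure or happiness"),
    ("sorrowful", "feeling or showing grief"),
]

# Pack the definition pair into one sequence: [CLS] word_a : def_a [SEP] word_b : def_b [SEP]
text_a = f"{pair[0][0]} : {pair[0][1]}"
text_b = f"{pair[1][0]} : {pair[1][1]}"
inputs = tokenizer(text_a, text_b, return_tensors="pt")

# Standard masked-language-model corruption: mask 15% of non-special tokens.
labels = inputs["input_ids"].clone()
probability_matrix = torch.full(labels.shape, 0.15)
special_tokens_mask = torch.tensor(
    tokenizer.get_special_tokens_mask(labels[0].tolist(), already_has_special_tokens=True),
    dtype=torch.bool,
).unsqueeze(0)
probability_matrix.masked_fill_(special_tokens_mask, value=0.0)
masked_indices = torch.bernoulli(probability_matrix).bool()
labels[~masked_indices] = -100  # compute loss only on masked positions
inputs["input_ids"][masked_indices] = tokenizer.mask_token_id

# One training step: predicting masked tokens from the paired definitions
# encourages the model to tie each word to its defining semantic features.
outputs = model(**inputs, labels=labels)
outputs.loss.backward()
```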

Publication
The Thirty-Fifth AAAI Conference on Artificial Intelligence, AAAI 2021 (to appear)
Xuancheng Ren
