Title: BERTweet: A pre-trained language model for English Tweets
Abstract: We present BERTweet, the first public large-scale pre-trained language model for English Tweets. Our BERTweet is trained using the RoBERTa pre-training procedure (Liu et al., 2019), with the same model configuration as BERT-base (Devlin et al., 2019). Experiments show that BERTweet outperforms the strong baselines RoBERTa-base and XLM-R-base (Conneau et al., 2020), producing better results than the previous state-of-the-art models on three Tweet NLP tasks: part-of-speech tagging, named-entity recognition, and text classification. We release BERTweet to facilitate future research and downstream applications on Tweet data. Our BERTweet is available at: this https URL
Authors: Dat Quoc Nguyen, Thanh Vu, Anh Tuan Nguyen
Paper: https://arxiv.org/abs/2005.10200
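Since the abstract notes that BERTweet is released for downstream applications, the sketch below shows one way to extract its contextual embeddings with the Hugging Face `transformers` library. The model id `vinai/bertweet-base` and the `normalization=True` tokenizer option are assumptions not stated in this post; check the authors' release page for the official usage.

```python
# Minimal sketch: extracting BERTweet features via Hugging Face transformers.
# Assumptions (not stated in this post): the checkpoint is published on the
# Hub as "vinai/bertweet-base", and the tokenizer accepts normalization=True
# to apply the authors' Tweet normalization (user mentions -> @USER,
# links -> HTTPURL; requires the `emoji` package at runtime).
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("vinai/bertweet-base", normalization=True)
model = AutoModel.from_pretrained("vinai/bertweet-base")

tweet = "SC has first two presumptive cases of coronavirus, DHEC confirms via @USER"
inputs = tokenizer(tweet, return_tensors="pt")

with torch.no_grad():
    outputs = model(**inputs)

# Per the abstract, the model uses the BERT-base configuration, so the
# contextual embeddings have shape (batch_size, sequence_length, 768).
print(outputs.last_hidden_state.shape)
```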