CLUE1.0 Classification Task Leaderboard | CLUE1.1/1.0 Submission Rules
| Project page
Difference between CLUE1.1 and CLUE1.0: unlike the original CLUE1.0, CLUE1.1 adopts new test sets for some tasks while keeping the training and validation sets unchanged; CLUE1.0 retains the CMNLI natural language inference task.
Models
| Rank | Model | Institution | Date | Score1.0 | Status | AFQMC | TNEWS1.0 | IFLYTEK | CMNLI | OCNLI_50K | WSC1.0 | CSL |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 1 | TI-NLP | YouTu Lab & Tencent Cloud | 21-10-19 | 83.251 | Pending | 82.7 | 79.3 | 65.23 | 84.31 | 84.57 | 96.55 | 90.1 |
| 2 | ShenZhou | QQ Browser Lab | 21-09-19 | 83.247 | Pending | 80.55 | 74.15 | 67.65 | 86.49 | 86.37 | 96.55 | 90.97 |
| 3 | HUMAN | CLUE | 19-12-01 | 82.943 | Verified | 81 | 71 | 80.3 | 76 | 90.3 | 98 | 84 |
| 4 | Mengzi | Langboat Technology & Sinovation Ventures | 21-09-14 | 82.436 | Pending | 81.79 | 75.06 | 65.08 | 86.13 | 82.57 | 96.55 | 89.87 |
| 5 | BERTSG | Sogou Search | 21-06-25 | 81.991 | Pending | 79.85 | 74.15 | 64.54 | 85.3 | 85.93 | 95.17 | 89 |
| 6 | Motian | QQ Browser Search | 21-06-25 | 81.764 | Pending | 78.3 | 73.18 | 65.46 | 85.44 | 84.97 | 94.83 | 90.17 |
| 7 | Pangu | Huawei Cloud & Recurrent AI | 21-04-23 | 81.016 | Pending | 78.11 | 72.07 | 65.19 | 85.19 | 83.3 | 95.52 | 87.73 |
| 8 | PLUG | Alibaba DAMO NLP | 21-04-18 | 80.614 | Pending | 77.44 | 73.06 | 64 | 84.95 | 83.27 | 94.48 | 87.1 |
| 9 | Bert | lihaiyu | 21-04-08 | 79.663 | Pending | 75.6 | 70.32 | 64.92 | 84.55 | 81.73 | 93.45 | 87.07 |
| 10 | MT-BERTs | Meituan NLP | 21-03-10 | 79.624 | Pending | 77.36 | 70.03 | 64.31 | 85.14 | 83.47 | 89.66 | 87.4 |
| 11 | Knowledge-based | 姜汁柠檬 | 21-07-23 | 79.611 | Pending | 76.87 | 70.2 | 63.73 | 87.97 | 81.13 | 92.41 | 84.97 |
| 12 | LICHEE | Tencent Kandian | 21-01-08 | 79.364 | Pending | 76.97 | 70.5 | 64.15 | 84.54 | 81.3 | 90.69 | 87.4 |
| 13 | roberta_selfrun | OPPO Xiaobu Assistant | 21-09-29 | 79.269 | Pending | 77.88 | 69.37 | 63.92 | 82.94 | 80.4 | 93.1 | 87.27 |
| 14 | UER-ensemble | TencentPretrain & TI-ONE | 20-11-28 | 79.154 | Pending | 76.82 | 72.2 | 64 | 84.09 | 80.8 | 90.34 | 85.83 |
| 15 | BERTs | BERTs | 20-12-24 | 79.107 | Pending | 76.77 | 69.94 | 63.92 | 84.48 | 82.9 | 88.97 | 86.77 |
| 16 | Archer-24E-SINGLE | search-nlp | 20-12-24 | 79.086 | Pending | 77.26 | 69.54 | 62.27 | 85.23 | 83.57 | 90 | 85.73 |
| 17 | selfrun-ensemble | OPPO Xiaobu Assistant | 20-12-22 | 78.674 | Pending | 76.09 | 69.1 | 63.92 | 82.56 | 80.4 | 91.38 | 87.27 |
| 18 | dfasdfadfa | dfafdaf | 22-10-12 | 78.661 | Pending | 76.72 | 68.31 | 63.31 | 84.98 | 81.1 | 88.28 | 87.93 |
| 19 | roformer&erlangshen | huangjh | 22-11-09 | 78.661 | Pending | 76.72 | 68.31 | 63.31 | 84.98 | 81.1 | 88.28 | 87.93 |
| 20 | Archer-24l | search-nlp | 20-11-30 | 78.550 | Pending | 77.44 | 69.96 | 62.69 | 84.78 | 82.57 | 87.24 | 85.17 |
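For reference, the Score1.0 column matches the unweighted mean of the seven task columns, rounded to three decimals (checked here against the top two rows); the helper name below is illustrative, not from the CLUE codebase:

```python
def clue_score(task_scores):
    """Overall score as the unweighted mean of per-task scores,
    rounded to three decimals (matches the Score1.0 column)."""
    return round(sum(task_scores) / len(task_scores), 3)

# Task columns for the top two rows: AFQMC, TNEWS1.0, IFLYTEK, CMNLI,
# OCNLI_50K, WSC1.0, CSL.
ti_nlp   = [82.7, 79.3, 65.23, 84.31, 84.57, 96.55, 90.1]
shenzhou = [80.55, 74.15, 67.65, 86.49, 86.37, 96.55, 90.97]

print(clue_score(ti_nlp))    # 83.251
print(clue_score(shenzhou))  # 83.247
```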
https://github.com/dbiir/UER-py
tiger
In-house TI-NLP pre-trained large model; tens of billions of parameters; 6T of high-quality corpus.
https://mp.weixin.qq.com/s/PODShmOo0tg9cmchNhzvtw
In-house ShenZhou pre-trained large model; tens of billions of parameters; 2 TB of high-value data (ensemble).
10B
https://github.com/CLUEbenchmark/CLUE
Score from human evaluation.
-
https://langboat.com/
Mengzi pre-trained model; ~1B parameters; hundreds of GB of high-quality corpus (web pages, community forums, news, e-commerce, finance, etc.). A Transformer-based denoising pre-trained model.
~1B
None
BERTSG
BERTSG
https://mp.weixin.qq.com/s/HQL0Hk49UR6kVNtrvcXEGA
Motian pre-trained large model; ~1B parameters; 1 TB of high-value data; improved masked-language-model masking scheme; a newly developed relative position encoding scheme; optimized training for large-scale / large-batch pre-training.
1B
https://mp.weixin.qq.com/s/gHoeUiZ2b4IvAb-S-wMdtw
PanGu large model.
110 billion parameters; 40 TB of industry text data and over 4 million hours of industry speech data.
https://mp.weixin.qq.com/s/DAQomIkDa52Sef-ruyH5qg
PLUG: 27 billion parameters, 1 TB+ of training data.
-
N/A
Bert ensemble average
bert
MT-BERTs
MT-BERTs
https://github.com/CLUEbenchmark/CLUE
clue
24 layers
https://mp.weixin.qq.com/s/em_mzM71edVA7XzrkeLDnw
Large-scale pre-training with architectural innovation. https://github.com/BitVoyage/lichee
Large-scale pre-training with architectural innovation.
https://github.com/CLUEbenchmark/ZeroCLUE.git
na
na
https://github.com/dbiir/UER-py
Semi-supervised learning; dynamic masking; span masking; MLM objective; adversarial training; DUMA; WWM (whole-word masking); (cross-lingual) multi-task learning; model ensembling.
36-layer Transformer
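Several of the techniques listed for UER-ensemble (dynamic masking, span masking, WWM) differ only in how MLM targets are chosen. As an illustration only (the function name, the `##` WordPiece convention, and the 15% rate are assumptions, not this team's actual implementation), whole-word masking can be sketched as:

```python
import random

def whole_word_mask(tokens, mask_prob=0.15, seed=0):
    """Mask whole words: WordPiece continuation tokens (prefixed with
    '##') are masked together with the piece they extend."""
    rng = random.Random(seed)
    # Group token indices into whole words.
    groups = []
    for i, tok in enumerate(tokens):
        if tok.startswith("##") and groups:
            groups[-1].append(i)  # continuation piece joins the current word
        else:
            groups.append([i])    # a new word starts here
    masked = list(tokens)
    for group in groups:
        if rng.random() < mask_prob:  # decide per word, not per piece
            for i in group:
                masked[i] = "[MASK]"
    return masked

print(whole_word_mask(["play", "##ing", "is", "fun"], mask_prob=1.0))
# ['[MASK]', '[MASK]', '[MASK]', '[MASK]']
```

The key property is that the masking decision is made per word group, so a continuation piece is never masked without its head piece.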
BERTs
BERTs
https://github.com/zhouxincheng/super-bert
24-layer ALBERT
24-layer ALBERT
https://github.com/CLUEbenchmark/CLUE
na
na
dafasd
daffeda
dafa
https://github.com/CLUEbenchmark/CLUENER2020
-
-
https://github.com/zhouxincheng/super-bert
24-layer BERT
-
ALBERT(Ensemble)
GitHub / model URL:
Submission date: September 17
Score:
More details:
Parameter notes
Single-task fine-tuning. We start from the model optimized on MNLI for RTE, STS, and MRPC.
Diagnostic information
Diagnostic main confusion matrix
|   | C | N | E |
| --- | --- | --- | --- |
| C | 182 | 36 | 40 |
| N | 81 | 189 | 116 |
| E | 17 | 69 | 374 |
C = contradiction
N = neutral (not entailed)
E = entailment
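A few summary metrics can be read off the diagnostic confusion matrix. The sketch below assumes rows are gold labels and columns are predictions (the usual convention; the page does not state which axis is which):

```python
# Diagnostic confusion matrix, transcribed from the table above.
matrix = {
    "C": {"C": 182, "N": 36, "E": 40},
    "N": {"C": 81, "N": 189, "E": 116},
    "E": {"C": 17, "N": 69, "E": 374},
}

total = sum(sum(row.values()) for row in matrix.values())  # 1104 examples
correct = sum(matrix[lbl][lbl] for lbl in matrix)          # 745 on the diagonal
accuracy = correct / total

# Per-class recall: diagonal cell divided by its row sum.
recall = {lbl: matrix[lbl][lbl] / sum(matrix[lbl].values()) for lbl in matrix}

print(round(accuracy, 3))                        # 0.675
print({k: round(v, 2) for k, v in recall.items()})
# {'C': 0.71, 'N': 0.49, 'E': 0.81}
```

Neutral is clearly the hardest class here: most of its errors are neutral pairs predicted as entailment (116 of 386).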