Natural language inference leaderboard: best recorded scores (currently only the OCNLI task set is active)
Rank | Model | Organization | Date | Score | Verification | OCNLI_50K | OCNLI_30K |
---|---|---|---|---|---|---|---|
1 | HUMAN-OCNLI | CLUE | 20-10-07 | 90.300 | Verified | 90.300 | 90.300 |
2 | STBert | PCG搜索中台 | 22-11-25 | 84.483 | Pending | 86.933 | 82.033 |
3 | ShenZhou | QQ浏览器实验室(QQ Browser Lab) | 21-09-19 | 83.534 | Pending | 86.367 | 80.700 |
4 | Motian | QQ浏览器搜索 | 21-06-25 | 82.834 | Pending | 84.967 | 80.700 |
5 | wyd1 | wyd1 | 24-05-10 | 80.700 | Pending | 80.700 | 80.700 |
6 | NM-BERT | CMB-AI-zh | 22-03-31 | 79.284 | Pending | 79.267 | 79.300 |
7 | Macbert_text_understanding | 游击队 | 22-03-22 | 79.267 | Pending | 79.167 | 79.367 |
8 | ensemble_model | 奋斗的咸鱼 | 22-04-18 | 79.217 | Pending | 79.367 | 79.067 |
9 | ensemble_model | CMB AI Lab | 22-04-19 | 79.217 | Pending | 79.367 | 79.067 |
10 | bert | 别动我的汉堡包 | 22-04-21 | 79.217 | Pending | 79.367 | 79.067 |
11 | M6-LM | ETONTHEFLY | 21-01-21 | 79.050 | Pending | 79.100 | 79.000 |
12 | BERT | ETONTHEFLY | 21-01-15 | 78.916 | Pending | 78.833 | 79.000 |
13 | Macbert_text_understanding | 游击队 | 22-04-13 | 77.867 | Pending | 76.667 | 79.067 |
14 | robert-base | 我爱吃羊肉 | 22-09-27 | 70.900 | Pending | 70.900 | 70.900 |
15 | bert_chinese_base | wyd1 | 24-04-22 | 70.467 | Pending | 70.467 | 70.467 |
16 | test | - | 20-11-15 | 69.816 | Pending | 71.000 | 68.633 |
17 | bert-chinese | 123 | 24-04-20 | 69.750 | Pending | 70.300 | 69.200 |
18 | test | t | 22-03-30 | 52.667 | Pending | 52.667 | 52.667 |
19 | adsf | asdf | 21-07-29 | 49.283 | Pending | 98.567 | 0.0 |
20 | asd | asdf | 21-07-29 | 49.267 | Pending | 98.533 | 0.0 |
https://github.com/CLUEbenchmark/OCNLI
Score from human evaluation
-
|   | C | E | N |
| --- | --- | --- | --- |
| C | 0.0 | 0.0 | 0.0 |
| E | 0.0 | 0.0 | 0.0 |
| N | 0.0 | 0.0 | 0.0 |
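The HUMAN-OCNLI baseline above links to the OCNLI repository. For readers who want to reproduce numbers locally, here is a minimal sketch of loading the dev split referenced elsewhere on this page; the `sentence1`/`sentence2`/`label` field names and the local file path are assumptions based on the public OCNLI data format, not something stated on this page.

```python
import json

# Minimal OCNLI reader (illustrative). Assumes dev.json has been downloaded
# from the CLUEbenchmark/OCNLI repository and is JSON Lines with
# "sentence1" (premise), "sentence2" (hypothesis) and "label" fields.
def load_ocnli(path: str = "dev.json"):
    examples = []
    with open(path, encoding="utf-8") as f:
        for line in f:
            row = json.loads(line)
            examples.append((row["sentence1"], row["sentence2"], row["label"]))
    return examples

if __name__ == "__main__":
    dev = load_ocnli()
    print(f"{len(dev)} dev examples; first premise/hypothesis/label: {dev[0]}")
```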
PCG搜索中台
Self-training on a large amount of unlabeled data (a generic pseudo-labeling sketch follows this entry)
340M
|   | C | E | N |
| --- | --- | --- | --- |
| C | 0.0 | 0.0 | 0.0 |
| E | 0.0 | 0.0 | 0.0 |
| N | 0.0 | 0.0 | 0.0 |
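The STBert entry above cites self-training on unlabeled data. The team's actual pipeline is not described on this page; the sketch below only illustrates the generic pseudo-labeling loop behind that idea, with a simple scikit-learn classifier standing in for the fine-tuned encoder. All names, sizes, and thresholds are assumptions.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def self_train(X_lab, y_lab, X_unlab, rounds=3, threshold=0.9):
    """Generic pseudo-labeling loop: train, label confident unlabeled points, repeat."""
    X, y = X_lab.copy(), y_lab.copy()
    clf = LogisticRegression(max_iter=1000).fit(X, y)
    for _ in range(rounds):
        if len(X_unlab) == 0:
            break
        probs = clf.predict_proba(X_unlab)
        keep = probs.max(axis=1) >= threshold          # only high-confidence pseudo-labels
        if not keep.any():
            break
        X = np.vstack([X, X_unlab[keep]])
        y = np.concatenate([y, clf.classes_[probs[keep].argmax(axis=1)]])
        X_unlab = X_unlab[~keep]                       # drop the examples we just consumed
        clf = LogisticRegression(max_iter=1000).fit(X, y)
    return clf

# Toy usage: random features stand in for sentence-pair encodings, 3 NLI classes.
rng = np.random.default_rng(0)
model = self_train(rng.normal(size=(60, 8)), rng.integers(0, 3, size=60),
                   rng.normal(size=(500, 8)))
print(model.predict(rng.normal(size=(5, 8))))
```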
https://mp.weixin.qq.com/s/PODShmOo0tg9cmchNhzvtw
Self-developed ShenZhou (神舟) pretrained large model; on the order of ten billion parameters, 2 TB of high-value data (ensemble; a generic ensembling sketch follows this entry)
10B
|   | C | E | N |
| --- | --- | --- | --- |
| C | 0.0 | 0.0 | 0.0 |
| E | 0.0 | 0.0 | 0.0 |
| N | 0.0 | 0.0 | 0.0 |
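The ShenZhou entry mentions an ensemble without saying what was combined. A common baseline is to average per-class probabilities from several fine-tuned checkpoints; the sketch below shows only that generic approach, and every name in it is illustrative.

```python
import numpy as np

def ensemble_predict(prob_matrices):
    """Average (n_examples, n_classes) probability arrays from several models,
    then take the argmax class per example."""
    return np.mean(np.stack(prob_matrices, axis=0), axis=0).argmax(axis=1)

# Toy usage: three "models" scoring 4 examples over the 3 NLI classes.
rng = np.random.default_rng(0)
per_model_probs = [rng.dirichlet(np.ones(3), size=4) for _ in range(3)]
print(ensemble_predict(per_model_probs))
```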
https://mp.weixin.qq.com/s/HQL0Hk49UR6kVNtrvcXEGA
Motian (摩天) pretrained large model; on the order of one billion parameters, 1 TB of high-value data, an improved masked-language-model masking scheme, an in-house relative position encoding scheme, and optimized training for large-scale / large-batch pretraining (a generic relative-position sketch follows this entry)
1B
|   | C | E | N |
| --- | --- | --- | --- |
| C | 0.0 | 0.0 | 0.0 |
| E | 0.0 | 0.0 | 0.0 |
| N | 0.0 | 0.0 | 0.0 |
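The Motian entry mentions an in-house relative position encoding scheme that is not described here. As background only, the sketch below shows one widely used flavor of the idea: a learned bias indexed by clipped relative distance, added to attention logits. This is not Motian's scheme; names and sizes are assumptions.

```python
import torch
import torch.nn as nn

# Generic learned relative-position bias for attention logits (illustrative only).
class RelativePositionBias(nn.Module):
    def __init__(self, num_heads: int, max_distance: int = 128):
        super().__init__()
        self.max_distance = max_distance
        # one learned bias per (clipped relative distance, head)
        self.bias = nn.Embedding(2 * max_distance + 1, num_heads)

    def forward(self, seq_len: int) -> torch.Tensor:
        pos = torch.arange(seq_len)
        rel = pos[None, :] - pos[:, None]                               # (seq, seq) offsets
        rel = rel.clamp(-self.max_distance, self.max_distance) + self.max_distance
        return self.bias(rel).permute(2, 0, 1)                          # (heads, seq, seq)

# Usage: bias for a 16-token sequence with 8 attention heads, to be added to logits.
print(RelativePositionBias(num_heads=8)(16).shape)  # torch.Size([8, 16, 16])
```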
https://github.com/CLUEbenchmark/OCNLI/blob/main/data/ocnli/dev.json
1
1
|   | C | E | N |
| --- | --- | --- | --- |
| C | 0.0 | 0.0 | 0.0 |
| E | 0.0 | 0.0 | 0.0 |
| N | 0.0 | 0.0 | 0.0 |
https://cluebenchmarks.com/submit.html
null
~1B
|   | C | E | N |
| --- | --- | --- | --- |
| C | 0.0 | 0.0 | 0.0 |
| E | 0.0 | 0.0 | 0.0 |
| N | 0.0 | 0.0 | 0.0 |
https://github.com/Kerry0123/text_understanding.git
A text understanding model trained from MacBERT, with no additional external data
330M
|   | C | E | N |
| --- | --- | --- | --- |
| C | 0.0 | 0.0 | 0.0 |
| E | 0.0 | 0.0 | 0.0 |
| N | 0.0 | 0.0 | 0.0 |
https://gitee.com/DianaFly/clue-test
ensemble_model
94,250,496
|   | C | E | N |
| --- | --- | --- | --- |
| C | 0.0 | 0.0 | 0.0 |
| E | 0.0 | 0.0 | 0.0 |
| N | 0.0 | 0.0 | 0.0 |
https://github.com/CLUEbenchmark/CLUE
ensemble_model
94,250,496
|   | C | E | N |
| --- | --- | --- | --- |
| C | 0.0 | 0.0 | 0.0 |
| E | 0.0 | 0.0 | 0.0 |
| N | 0.0 | 0.0 | 0.0 |
https://cluebenchmarks.com/submit.html
bert
~3B
|   | C | E | N |
| --- | --- | --- | --- |
| C | 0.0 | 0.0 | 0.0 |
| E | 0.0 | 0.0 | 0.0 |
| N | 0.0 | 0.0 | 0.0 |
https://www.cluebenchmarks.com/
M6-LM small
M6-LM small
|   | C | E | N |
| --- | --- | --- | --- |
| C | 0.0 | 0.0 | 0.0 |
| E | 0.0 | 0.0 | 0.0 |
| N | 0.0 | 0.0 | 0.0 |
https://www.cluebenchmarks.com/
bert large
bert large
|   | C | E | N |
| --- | --- | --- | --- |
| C | 0.0 | 0.0 | 0.0 |
| E | 0.0 | 0.0 | 0.0 |
| N | 0.0 | 0.0 | 0.0 |
https://github.com/Kerry0123/text_understanding
A text understanding model trained from MacBERT, with no additional external data
330M
|   | C | E | N |
| --- | --- | --- | --- |
| C | 0.0 | 0.0 | 0.0 |
| E | 0.0 | 0.0 | 0.0 |
| N | 0.0 | 0.0 | 0.0 |
https://arxiv.org/abs/1907.11692
This result was obtained by fine-tuning roberta-base (a generic fine-tuning sketch follows this entry)
Total parameter count: 125M
|   | C | E | N |
| --- | --- | --- | --- |
| C | 0.0 | 0.0 | 0.0 |
| E | 0.0 | 0.0 | 0.0 |
| N | 0.0 | 0.0 | 0.0 |
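This entry reports a fine-tuned roberta-base result. The sketch below shows a generic Hugging Face sentence-pair fine-tuning setup for OCNLI; the checkpoint name, the `clue`/`ocnli` dataset ID, and the hyperparameters are assumptions rather than the submitter's actual configuration (the RoBERTa in the linked paper is English-only, so a Chinese variant is assumed here).

```python
from datasets import load_dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

# Illustrative sketch only: checkpoint name, dataset ID, label handling and
# hyperparameters are assumptions, not the submitter's actual configuration.
checkpoint = "hfl/chinese-roberta-wwm-ext"  # assumed Chinese RoBERTa-style checkpoint
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSequenceClassification.from_pretrained(checkpoint, num_labels=3)

dataset = load_dataset("clue", "ocnli")  # OCNLI as packaged on the Hugging Face hub

def encode(batch):
    # Premise/hypothesis pairs are tokenized together as one sequence pair.
    return tokenizer(batch["sentence1"], batch["sentence2"],
                     truncation=True, max_length=128)

encoded = dataset.map(encode, batched=True)
encoded = encoded.filter(lambda ex: ex["label"] != -1)  # drop unlabeled rows, if any

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="ocnli-roberta", num_train_epochs=3,
                           per_device_train_batch_size=32, learning_rate=3e-5),
    train_dataset=encoded["train"],
    eval_dataset=encoded["validation"],
    tokenizer=tokenizer,
)
trainer.train()
```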
https://github.com/CLUEbenchmark/OCNLI/blob/main/data/ocnli/dev.json
1
-
|   | C | E | N |
| --- | --- | --- | --- |
| C | 0.0 | 0.0 | 0.0 |
| E | 0.0 | 0.0 | 0.0 |
| N | 0.0 | 0.0 | 0.0 |
-
-
-
|   | C | E | N |
| --- | --- | --- | --- |
| C | 0.0 | 0.0 | 0.0 |
| E | 0.0 | 0.0 | 0.0 |
| N | 0.0 | 0.0 | 0.0 |
https://github.com/CLUEbenchmark/CLUE
qwe
qwe
|   | C | E | N |
| --- | --- | --- | --- |
| C | 0.0 | 0.0 | 0.0 |
| E | 0.0 | 0.0 | 0.0 |
| N | 0.0 | 0.0 | 0.0 |
https://cluebenchmarks.com/submit.html
test
300B
|   | C | E | N |
| --- | --- | --- | --- |
| C | 0.0 | 0.0 | 0.0 |
| E | 0.0 | 0.0 | 0.0 |
| N | 0.0 | 0.0 | 0.0 |
asdf
asdf
adsf
|   | C | E | N |
| --- | --- | --- | --- |
| C | 0.0 | 0.0 | 0.0 |
| E | 0.0 | 0.0 | 0.0 |
| N | 0.0 | 0.0 | 0.0 |
asdf
asdf
asdf
|   | C | E | N |
| --- | --- | --- | --- |
| C | 0.0 | 0.0 | 0.0 |
| E | 0.0 | 0.0 | 0.0 |
| N | 0.0 | 0.0 | 0.0 |
ALBERT(Ensemble)
GitHub / model URL:
Submission date: September 17
Score: September 17
More details:
Parameter description
Single-task fine-tuning. We start from the MNLI-optimized model for RTE, STS, and MRPC.
Diagnostic information
Diagnostic main confusion matrix
|   | C | N | E |
| --- | --- | --- | --- |
| C | 182 | 36 | 40 |
| N | 81 | 189 | 116 |
| E | 17 | 69 | 374 |
C = Contradiction
N = Neutral (not entailed)
E = Entailment
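For reference, the diagnostic confusion matrix above simply counts one labeling against another over the three classes, with rows taken here as gold labels and columns as predictions, which is the usual convention. A minimal sketch with made-up predictions rather than the data behind the table:

```python
from collections import Counter

LABELS = ["C", "N", "E"]  # contradiction, neutral (not entailed), entailment

def confusion_matrix(gold, pred):
    """Count (gold, predicted) label pairs; rows are gold, columns are predictions."""
    counts = Counter(zip(gold, pred))
    return [[counts[(g, p)] for p in LABELS] for g in LABELS]

# Toy usage with made-up labels (not the data behind the table above).
gold = ["C", "C", "N", "E", "E", "N"]
pred = ["C", "N", "N", "E", "C", "E"]
for row_label, row in zip(LABELS, confusion_matrix(gold, pred)):
    print(row_label, row)
```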