tag-based-multi-span-extraction

Code: https://github.com/eladsegal/tag-based-multi-span-extraction

Paper: A Simple and Effective Model for Answering Multi-span Questions

  • Configure environment variables to add a proxy
scp -r zhaoxiaofeng@219.216.64.175:~/.proxychains ./

Edit ~/.bashrc and append a command alias at the end:

alias proxy=/data0/zhaoxiaofeng/usr/bin/proxychains4 # on servers 77, 175, and 206, add only this line
alias proxy=/home/zhaoxiaofeng/usr/bin/proxychains4 # on server 154, add only this line
  • Download the code:
git clone https://github.com/eladsegal/tag-based-multi-span-extraction
  • Set up the environment
proxy conda create -n allennlp python=3.6.9
proxy pip install -r requirements.txt
proxy conda install pytorch torchvision torchaudio cudatoolkit=10.1 -c pytorch
pip install en_core_web_sm-2.1.0.tar.gz
Load the pretrained models from local files (see "Pretrained models: replace with local files" below).
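A quick sanity check that the environment imports correctly (a sketch; version numbers will vary):

python -c "import allennlp; print(allennlp.__version__)"
python -c "import torch; print(torch.__version__, torch.cuda.is_available())"
python -c "import spacy; spacy.load('en_core_web_sm')"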
  • Train the model

    Training can run in the background with nohup + &.

    tail -f nohup.out follows the log in real time.

    nohup command >> nohup.out 2>&1 &

    • 2>&1 redirects standard error (fd 2) into the same output file as standard output (fd 1).
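The order of the redirections matters: 2>&1 must come after the file redirection, otherwise stderr keeps pointing at the terminal. A minimal demonstration:

ls /nonexistent >> out.log 2>&1    # the error message lands in out.log
ls /nonexistent 2>&1 >> out.log    # the error message still prints to the terminal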

RoBERTa TASE_IO + SSE

allennlp train configs/drop/roberta/drop_roberta_large_TASE_IO_SSE.jsonnet -s training_directory -f --include-package src

Run on the server:

nohup allennlp train configs/drop/roberta/drop_roberta_large_TASE_IO_SSE.jsonnet -s training_directory_base -f --include-package src >> base_log.out 2>&1 &

Or:

allennlp train download_data/config.json -s training_directory --include-package src

Bert_large TASE_BIO + SSE

-f: clears the training directory and retrains from scratch.

-r: resumes from a previous training state (see the resume example after the commands below).

allennlp train configs/drop/bert/drop_bert_large_TASE_BIO_SSE.jsonnet -s training_directory_bert -f --include-package src

Run on the server:

nohup allennlp train configs/drop/bert/drop_bert_large_TASE_BIO_SSE.jsonnet -s training_directory_bert -f --include-package src >> bertlog.out 2>&1 &
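If a run is interrupted, it can be resumed from the existing serialization directory by swapping -f for -r (a sketch reusing the BERT command above):

allennlp train configs/drop/bert/drop_bert_large_TASE_BIO_SSE.jsonnet -s training_directory_bert -r --include-package src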
  • Predict with the model

    --cuda-device can only use a single GPU.

    Covered in detail below.
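Since --cuda-device takes a single GPU id, a specific physical GPU can be pinned with CUDA_VISIBLE_DEVICES, after which device 0 refers to that GPU (a sketch; the prediction command itself is detailed below):

CUDA_VISIBLE_DEVICES=2 allennlp predict training_directory/model.tar.gz drop_data/drop_dataset_dev.json --predictor machine-comprehension --cuda-device 0 --output-file predictions.jsonl --use-dataset-reader --include-package src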

  • Evaluate the model

    Covered in detail below.

Pretrained models: replace with local files

A tip for locating the relevant source files:

grep -ril "<string>" <path>
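For example, to list the transformers source files that still reference the Hugging Face S3 bucket:

grep -ril "models.huggingface.co" /data0/maqi/.conda/envs/allennlp/lib/python3.6/site-packages/transformers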

Download a file:

proxy wget https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-uncased-vocab.txt

A trick for batch downloads:

Since the proxy alias cannot be executed from inside Python (it fails with sh: 1: proxy: not found), downloads have to be issued by hand. The script below generates the wget commands; print them, then copy the printed output into a shell to download in bulk.

# Paste in the URL map from the transformers source, then print a proxied
# wget command for each entry.
BERT_PRETRAINED_MODEL_ARCHIVE_MAP = {
    'bert-base-uncased': "https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-uncased-pytorch_model.bin",
    'bert-large-uncased': "https://s3.amazonaws.com/models.huggingface.co/bert/bert-large-uncased-pytorch_model.bin",
    'bert-base-cased': "https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-cased-pytorch_model.bin",
    'bert-large-cased': "https://s3.amazonaws.com/models.huggingface.co/bert/bert-large-cased-pytorch_model.bin",
}

for url in BERT_PRETRAINED_MODEL_ARCHIVE_MAP.values():
    print('proxy wget ' + url)

Sample output (this run used the BERT config map):

proxy wget https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-uncased-config.json
proxy wget https://s3.amazonaws.com/models.huggingface.co/bert/bert-large-uncased-config.json
proxy wget https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-cased-config.json
proxy wget https://s3.amazonaws.com/models.huggingface.co/bert/bert-large-cased-config.json
proxy wget https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-multilingual-uncased-config.json
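Instead of copying and pasting, the generated commands can be piped straight into a shell. Aliases are not expanded in non-interactive shells, so the full proxychains4 path is substituted first (a sketch; gen_wget_cmds.py is a hypothetical name for the script above, and the path is the one for servers 77/175/206):

python gen_wget_cmds.py | sed 's|^proxy |/data0/zhaoxiaofeng/usr/bin/proxychains4 |' | bash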

Files involved:

/data0/maqi/.conda/envs/allennlp/lib/python3.6/site-packages/transformers/tokenization_roberta.py

/data0/maqi/.conda/envs/allennlp/lib/python3.6/site-packages/transformers/modeling_roberta.py

/data0/maqi/.conda/envs/allennlp/lib/python3.6/site-packages/transformers/configuration_roberta.py

/data0/maqi/.conda/envs/allennlp/lib/python3.6/site-packages/transformers/tokenization_utils.py

/data0/maqi/.conda/envs/allennlp/lib/python3.6/site-packages/transformers/tokenization_bert.py

/data0/maqi/.conda/envs/allennlp/lib/python3.6/site-packages/transformers/configuration_bert.py

/data0/maqi/.conda/envs/allennlp/lib/python3.6/site-packages/transformers/modeling_bert.py
  • Paths on each server

Server           Path
202.199.6.77     /data0/maqi
219.216.64.206   /data0/maqi
219.216.64.175
219.216.64.154
scp  /data0/maqi/.conda/envs/allennlp/lib/python3.6/site-packages/transformers/tokenization_roberta.py  maqi@202.199.6.77:/data0/maqi/.conda/envs/allennlp/lib/python3.6/site-packages/transformers/tokenization_roberta.py

scp /data0/maqi/.conda/envs/allennlp/lib/python3.6/site-packages/transformers/modeling_roberta.py maqi@202.199.6.77:/data0/maqi/.conda/envs/allennlp/lib/python3.6/site-packages/transformers/modeling_roberta.py

scp /data0/maqi/.conda/envs/allennlp/lib/python3.6/site-packages/transformers/configuration_roberta.py maqi@202.199.6.77:/data0/maqi/.conda/envs/allennlp/lib/python3.6/site-packages/transformers/configuration_roberta.py

scp /data0/maqi/.conda/envs/allennlp/lib/python3.6/site-packages/transformers/tokenization_utils.py maqi@202.199.6.77:/data0/maqi/.conda/envs/allennlp/lib/python3.6/site-packages/transformers/tokenization_utils.py

scp /data0/maqi/.conda/envs/allennlp/lib/python3.6/site-packages/transformers/tokenization_bert.py maqi@202.199.6.77:/data0/maqi/.conda/envs/allennlp/lib/python3.6/site-packages/transformers/tokenization_bert.py

scp /data0/maqi/.conda/envs/allennlp/lib/python3.6/site-packages/transformers/configuration_bert.py maqi@202.199.6.77:/data0/maqi/.conda/envs/allennlp/lib/python3.6/site-packages/transformers/configuration_bert.py


scp /data0/maqi/.conda/envs/allennlp/lib/python3.6/site-packages/transformers/modeling_bert.py maqi@202.199.6.77:/data0/maqi/.conda/envs/allennlp/lib/python3.6/site-packages/transformers/modeling_bert.py
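The same transfers can be scripted as one loop over the seven files (a sketch for the 202.199.6.77 target above):

for f in tokenization_roberta.py modeling_roberta.py configuration_roberta.py tokenization_utils.py tokenization_bert.py configuration_bert.py modeling_bert.py; do
  scp /data0/maqi/.conda/envs/allennlp/lib/python3.6/site-packages/transformers/$f maqi@202.199.6.77:/data0/maqi/.conda/envs/allennlp/lib/python3.6/site-packages/transformers/$f
done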

  • tokenization_roberta.py
vim /data0/maqi/.conda/envs/allennlp/lib/python3.6/site-packages/transformers/tokenization_roberta.py

Original:

PRETRAINED_VOCAB_FILES_MAP = {
'vocab_file':
{
'roberta-base': "https://s3.amazonaws.com/models.huggingface.co/bert/roberta-base-vocab.json",
'roberta-large': "https://s3.amazonaws.com/models.huggingface.co/bert/roberta-large-vocab.json",
'roberta-large-mnli': "https://s3.amazonaws.com/models.huggingface.co/bert/roberta-large-mnli-vocab.json",
'distilroberta-base': "https://s3.amazonaws.com/models.huggingface.co/bert/distilroberta-base-vocab.json",
'roberta-base-openai-detector': "https://s3.amazonaws.com/models.huggingface.co/bert/roberta-base-vocab.json",
'roberta-large-openai-detector': "https://s3.amazonaws.com/models.huggingface.co/bert/roberta-large-vocab.json",
},
'merges_file':
{
'roberta-base': "https://s3.amazonaws.com/models.huggingface.co/bert/roberta-base-merges.txt",
'roberta-large': "https://s3.amazonaws.com/models.huggingface.co/bert/roberta-large-merges.txt",
'roberta-large-mnli': "https://s3.amazonaws.com/models.huggingface.co/bert/roberta-large-mnli-merges.txt",
'distilroberta-base': "https://s3.amazonaws.com/models.huggingface.co/bert/distilroberta-base-merges.txt",
'roberta-base-openai-detector': "https://s3.amazonaws.com/models.huggingface.co/bert/roberta-base-merges.txt",
'roberta-large-openai-detector': "https://s3.amazonaws.com/models.huggingface.co/bert/roberta-large-merges.txt",
},
}

Replacement:

PRETRAINED_VOCAB_FILES_MAP = {
'vocab_file':
{
'roberta-base': "/data0/maqi/pretrained_model/tokenization_roberta/roberta-base-vocab.json",
'roberta-large': "/data0/maqi/pretrained_model/tokenization_roberta/roberta-large-vocab.json",
'roberta-large-mnli': "/data0/maqi/pretrained_model/tokenization_roberta/roberta-large-mnli-vocab.json",
'distilroberta-base': "/data0/maqi/pretrained_model/tokenization_roberta/distilroberta-base-vocab.json",
'roberta-base-openai-detector': "/data0/maqi/pretrained_model/tokenization_roberta/roberta-base-vocab.json",
'roberta-large-openai-detector': "/data0/maqi/pretrained_model/tokenization_roberta/roberta-large-vocab.json",
},
'merges_file':
{
'roberta-base': "/data0/maqi/pretrained_model/tokenization_roberta/roberta-base-merges.txt",
'roberta-large': "/data0/maqi/pretrained_model/tokenization_roberta/roberta-large-merges.txt",
'roberta-large-mnli': "/data0/maqi/pretrained_model/tokenization_roberta/roberta-large-mnli-merges.txt",
'distilroberta-base': "/data0/maqi/pretrained_model/tokenization_roberta/distilroberta-base-merges.txt",
'roberta-base-openai-detector': "/data0/maqi/pretrained_model/tokenization_roberta/roberta-base-merges.txt",
'roberta-large-openai-detector': "/data0/maqi/pretrained_model/tokenization_roberta/roberta-large-merges.txt",
},
}
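Instead of editing each map by hand in vim, the substitution can be done with sed, which rewrites every models.huggingface.co URL in the file to the corresponding local path (a sketch; back up the file first, and note it assumes the local directory layout used in the maps above):

cd /data0/maqi/.conda/envs/allennlp/lib/python3.6/site-packages/transformers
cp tokenization_roberta.py tokenization_roberta.py.bak
sed -i 's|https://s3.amazonaws.com/models.huggingface.co/bert|/data0/maqi/pretrained_model/tokenization_roberta|g' tokenization_roberta.py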
  • modeling_roberta.py
vim /data0/maqi/.conda/envs/allennlp/lib/python3.6/site-packages/transformers/modeling_roberta.py

Original:

ROBERTA_PRETRAINED_MODEL_ARCHIVE_MAP = {
'roberta-base': "https://s3.amazonaws.com/models.huggingface.co/bert/roberta-base-pytorch_model.bin",
'roberta-large': "https://s3.amazonaws.com/models.huggingface.co/bert/roberta-large-pytorch_model.bin",
'roberta-large-mnli': "https://s3.amazonaws.com/models.huggingface.co/bert/roberta-large-mnli-pytorch_model.bin",
'distilroberta-base': "https://s3.amazonaws.com/models.huggingface.co/bert/distilroberta-base-pytorch_model.bin",
'roberta-base-openai-detector': "https://s3.amazonaws.com/models.huggingface.co/bert/roberta-base-openai-detector-pytorch_model.bin",
'roberta-large-openai-detector': "https://s3.amazonaws.com/models.huggingface.co/bert/roberta-large-openai-detector-pytorch_model.bin",
}

Replacement:

ROBERTA_PRETRAINED_MODEL_ARCHIVE_MAP = {
'roberta-base': "/data0/maqi/pretrained_model/modeling_roberta/roberta-base-pytorch_model.bin",
'roberta-large': "/data0/maqi/pretrained_model/modeling_roberta/roberta-large-pytorch_model.bin",
'roberta-large-mnli': "/data0/maqi/pretrained_model/modeling_roberta/roberta-large-mnli-pytorch_model.bin",
'distilroberta-base': "/data0/maqi/pretrained_model/modeling_roberta/distilroberta-base-pytorch_model.bin",
'roberta-base-openai-detector': "/data0/maqi/pretrained_model/modeling_roberta/roberta-base-openai-detector-pytorch_model.bin",
'roberta-large-openai-detector': "/data0/maqi/pretrained_model/modeling_roberta/roberta-large-openai-detector-pytorch_model.bin",
}
  • configuration_roberta.py
vim /data0/maqi/.conda/envs/allennlp/lib/python3.6/site-packages/transformers/configuration_roberta.py

Original:

ROBERTA_PRETRAINED_CONFIG_ARCHIVE_MAP = {
'roberta-base': "https://s3.amazonaws.com/models.huggingface.co/bert/roberta-base-config.json",
'roberta-large': "https://s3.amazonaws.com/models.huggingface.co/bert/roberta-large-config.json",
'roberta-large-mnli': "https://s3.amazonaws.com/models.huggingface.co/bert/roberta-large-mnli-config.json",
'distilroberta-base': "https://s3.amazonaws.com/models.huggingface.co/bert/distilroberta-base-config.json",
'roberta-base-openai-detector': "https://s3.amazonaws.com/models.huggingface.co/bert/roberta-base-openai-detector-config.json",
'roberta-large-openai-detector': "https://s3.amazonaws.com/models.huggingface.co/bert/roberta-large-openai-detector-config.json",
}

Replacement:

ROBERTA_PRETRAINED_CONFIG_ARCHIVE_MAP = {
'roberta-base': "/data0/maqi/pretrained_model/configuration_roberta/roberta-base-config.json",
'roberta-large': "/data0/maqi/pretrained_model/configuration_roberta/roberta-large-config.json",
'roberta-large-mnli': "/data0/maqi/pretrained_model/configuration_roberta/roberta-large-mnli-config.json",
'distilroberta-base': "/data0/maqi/pretrained_model/configuration_roberta/distilroberta-base-config.json",
'roberta-base-openai-detector': "/data0/maqi/pretrained_model/configuration_roberta/roberta-base-openai-detector-config.json",
'roberta-large-openai-detector': "/data0/maqi/pretrained_model/configuration_roberta/roberta-large-openai-detector-config.json",
}
  • tokenization_bert.py

Original:

PRETRAINED_VOCAB_FILES_MAP = {
'vocab_file':
{
'bert-base-uncased': "https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-uncased-vocab.txt",
'bert-large-uncased': "https://s3.amazonaws.com/models.huggingface.co/bert/bert-large-uncased-vocab.txt",
'bert-base-cased': "https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-cased-vocab.txt",
'bert-large-cased': "https://s3.amazonaws.com/models.huggingface.co/bert/bert-large-cased-vocab.txt",
'bert-base-multilingual-uncased': "https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-multilingual-uncased-vocab.txt",
'bert-base-multilingual-cased': "https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-multilingual-cased-vocab.txt",
'bert-base-chinese': "https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-chinese-vocab.txt",
'bert-base-german-cased': "https://int-deepset-models-bert.s3.eu-central-1.amazonaws.com/pytorch/bert-base-german-cased-vocab.txt",
'bert-large-uncased-whole-word-masking': "https://s3.amazonaws.com/models.huggingface.co/bert/bert-large-uncased-whole-word-masking-vocab.txt",
'bert-large-cased-whole-word-masking': "https://s3.amazonaws.com/models.huggingface.co/bert/bert-large-cased-whole-word-masking-vocab.txt",
'bert-large-uncased-whole-word-masking-finetuned-squad': "https://s3.amazonaws.com/models.huggingface.co/bert/bert-large-uncased-whole-word-masking-finetuned-squad-vocab.txt",
'bert-large-cased-whole-word-masking-finetuned-squad': "https://s3.amazonaws.com/models.huggingface.co/bert/bert-large-cased-whole-word-masking-finetuned-squad-vocab.txt",
'bert-base-cased-finetuned-mrpc': "https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-cased-finetuned-mrpc-vocab.txt",
'bert-base-german-dbmdz-cased': "https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-german-dbmdz-cased-vocab.txt",
'bert-base-german-dbmdz-uncased': "https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-german-dbmdz-uncased-vocab.txt",
'bert-base-finnish-cased-v1': "https://s3.amazonaws.com/models.huggingface.co/bert/TurkuNLP/bert-base-finnish-cased-v1/vocab.txt",
'bert-base-finnish-uncased-v1': "https://s3.amazonaws.com/models.huggingface.co/bert/TurkuNLP/bert-base-finnish-uncased-v1/vocab.txt",
}
}

Replacement:

Local file path: /data0/maqi/pretrained_model/tokenization_bert

PRETRAINED_VOCAB_FILES_MAP = {
'vocab_file':
{
'bert-base-uncased': "/data0/maqi/pretrained_model/tokenization_bert/bert-base-uncased-vocab.txt",
'bert-large-uncased': "/data0/maqi/pretrained_model/tokenization_bert/bert-large-uncased-vocab.txt",
'bert-base-cased': "/data0/maqi/pretrained_model/tokenization_bert/bert-base-cased-vocab.txt",
'bert-large-cased': "/data0/maqi/pretrained_model/tokenization_bert/bert-large-cased-vocab.txt",
'bert-base-multilingual-uncased': "/data0/maqi/pretrained_model/tokenization_bert/bert-base-multilingual-uncased-vocab.txt",
'bert-base-multilingual-cased': "/data0/maqi/pretrained_model/tokenization_bert/bert-base-multilingual-cased-vocab.txt",
'bert-base-chinese': "/data0/maqi/pretrained_model/tokenization_bert/bert-base-chinese-vocab.txt",
'bert-base-german-cased': "https://int-deepset-models-bert.s3.eu-central-1.amazonaws.com/pytorch/bert-base-german-cased-vocab.txt",
'bert-large-uncased-whole-word-masking': "/data0/maqi/pretrained_model/tokenization_bert/bert-large-uncased-whole-word-masking-vocab.txt",
'bert-large-cased-whole-word-masking': "/data0/maqi/pretrained_model/tokenization_bert/bert-large-cased-whole-word-masking-vocab.txt",
'bert-large-uncased-whole-word-masking-finetuned-squad': "/data0/maqi/pretrained_model/tokenization_bert/bert-large-uncased-whole-word-masking-finetuned-squad-vocab.txt",
'bert-large-cased-whole-word-masking-finetuned-squad': "/data0/maqi/pretrained_model/tokenization_bert/bert-large-cased-whole-word-masking-finetuned-squad-vocab.txt",
'bert-base-cased-finetuned-mrpc': "/data0/maqi/pretrained_model/tokenization_bert/bert-base-cased-finetuned-mrpc-vocab.txt",
'bert-base-german-dbmdz-cased': "/data0/maqi/pretrained_model/tokenization_bert/bert-base-german-dbmdz-cased-vocab.txt",
'bert-base-german-dbmdz-uncased': "/data0/maqi/pretrained_model/tokenization_bert/bert-base-german-dbmdz-uncased-vocab.txt",
'bert-base-finnish-cased-v1': "/data0/maqi/pretrained_model/tokenization_bert/TurkuNLP/bert-base-finnish-cased-v1/vocab.txt",
'bert-base-finnish-uncased-v1': "/data0/maqi/pretrained_model/tokenization_bert/TurkuNLP/bert-base-finnish-uncased-v1/vocab.txt",
}
}
  • configuration_bert.py
vim /data0/maqi/.conda/envs/allennlp/lib/python3.6/site-packages/transformers/configuration_bert.py

Original:

BERT_PRETRAINED_CONFIG_ARCHIVE_MAP = {
'bert-base-uncased': "https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-uncased-config.json",
'bert-large-uncased': "https://s3.amazonaws.com/models.huggingface.co/bert/bert-large-uncased-config.json",
'bert-base-cased': "https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-cased-config.json",
'bert-large-cased': "https://s3.amazonaws.com/models.huggingface.co/bert/bert-large-cased-config.json",
'bert-base-multilingual-uncased': "https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-multilingual-uncased-config.json",
'bert-base-multilingual-cased': "https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-multilingual-cased-config.json",
'bert-base-chinese': "https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-chinese-config.json",
'bert-base-german-cased': "https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-german-cased-config.json",
'bert-large-uncased-whole-word-masking': "https://s3.amazonaws.com/models.huggingface.co/bert/bert-large-uncased-whole-word-masking-config.json",
'bert-large-cased-whole-word-masking': "https://s3.amazonaws.com/models.huggingface.co/bert/bert-large-cased-whole-word-masking-config.json",
'bert-large-uncased-whole-word-masking-finetuned-squad': "https://s3.amazonaws.com/models.huggingface.co/bert/bert-large-uncased-whole-word-masking-finetuned-squad-config.json",
'bert-large-cased-whole-word-masking-finetuned-squad': "https://s3.amazonaws.com/models.huggingface.co/bert/bert-large-cased-whole-word-masking-finetuned-squad-config.json",
'bert-base-cased-finetuned-mrpc': "https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-cased-finetuned-mrpc-config.json",
'bert-base-german-dbmdz-cased': "https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-german-dbmdz-cased-config.json",
'bert-base-german-dbmdz-uncased': "https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-german-dbmdz-uncased-config.json",
'bert-base-japanese': "https://s3.amazonaws.com/models.huggingface.co/bert/cl-tohoku/bert-base-japanese-config.json",
'bert-base-japanese-whole-word-masking': "https://s3.amazonaws.com/models.huggingface.co/bert/cl-tohoku/bert-base-japanese-whole-word-masking-config.json",
'bert-base-japanese-char': "https://s3.amazonaws.com/models.huggingface.co/bert/cl-tohoku/bert-base-japanese-char-config.json",
'bert-base-japanese-char-whole-word-masking': "https://s3.amazonaws.com/models.huggingface.co/bert/cl-tohoku/bert-base-japanese-char-whole-word-masking-config.json",
'bert-base-finnish-cased-v1': "https://s3.amazonaws.com/models.huggingface.co/bert/TurkuNLP/bert-base-finnish-cased-v1/config.json",
'bert-base-finnish-uncased-v1': "https://s3.amazonaws.com/models.huggingface.co/bert/TurkuNLP/bert-base-finnish-uncased-v1/config.json",
}

Replacement:

Local files: /data0/maqi/pretrained_model/configuration_bert

BERT_PRETRAINED_CONFIG_ARCHIVE_MAP = {
'bert-base-uncased': "/data0/maqi/pretrained_model/configuration_bert/bert-base-uncased-config.json",
'bert-large-uncased': "/data0/maqi/pretrained_model/configuration_bert/bert-large-uncased-config.json",
'bert-base-cased': "/data0/maqi/pretrained_model/configuration_bert/bert-base-cased-config.json",
'bert-large-cased': "/data0/maqi/pretrained_model/configuration_bert/bert-large-cased-config.json",
'bert-base-multilingual-uncased': "/data0/maqi/pretrained_model/configuration_bert/bert-base-multilingual-uncased-config.json",
'bert-base-multilingual-cased': "/data0/maqi/pretrained_model/configuration_bert/bert-base-multilingual-cased-config.json",
'bert-base-chinese': "/data0/maqi/pretrained_model/configuration_bert/bert-base-chinese-config.json",
'bert-base-german-cased': "/data0/maqi/pretrained_model/configuration_bert/bert-base-german-cased-config.json",
'bert-large-uncased-whole-word-masking': "/data0/maqi/pretrained_model/configuration_bert/bert-large-uncased-whole-word-masking-config.json",
'bert-large-cased-whole-word-masking': "/data0/maqi/pretrained_model/configuration_bert/bert-large-cased-whole-word-masking-config.json",
'bert-large-uncased-whole-word-masking-finetuned-squad': "/data0/maqi/pretrained_model/configuration_bert/bert-large-uncased-whole-word-masking-finetuned-squad-config.json",
'bert-large-cased-whole-word-masking-finetuned-squad': "/data0/maqi/pretrained_model/configuration_bert/bert-large-cased-whole-word-masking-finetuned-squad-config.json",
'bert-base-cased-finetuned-mrpc': "/data0/maqi/pretrained_model/configuration_bert/bert-base-cased-finetuned-mrpc-config.json",
'bert-base-german-dbmdz-cased': "/data0/maqi/pretrained_model/configuration_bert/bert-base-german-dbmdz-cased-config.json",
'bert-base-german-dbmdz-uncased': "/data0/maqi/pretrained_model/configuration_bert/bert-base-german-dbmdz-uncased-config.json",
'bert-base-japanese': "/data0/maqi/pretrained_model/configuration_bert/cl-tohoku/bert-base-japanese-config.json",
'bert-base-japanese-whole-word-masking': "/data0/maqi/pretrained_model/configuration_bert/cl-tohoku/bert-base-japanese-whole-word-masking-config.json",
'bert-base-japanese-char': "/data0/maqi/pretrained_model/configuration_bert/cl-tohoku/bert-base-japanese-char-config.json",
'bert-base-japanese-char-whole-word-masking': "/data0/maqi/pretrained_model/configuration_bert/cl-tohoku/bert-base-japanese-char-whole-word-masking-config.json",
'bert-base-finnish-cased-v1': "/data0/maqi/pretrained_model/configuration_bert/TurkuNLP/bert-base-finnish-cased-v1/config.json",
'bert-base-finnish-uncased-v1': "/data0/maqi/pretrained_model/configuration_bert/TurkuNLP/bert-base-finnish-uncased-v1/config.json",
}
  • modeling_bert.py
vim /data0/maqi/.conda/envs/allennlp/lib/python3.6/site-packages/transformers/modeling_bert.py

Original:

BERT_PRETRAINED_MODEL_ARCHIVE_MAP = {
'bert-base-uncased': "https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-uncased-pytorch_model.bin",
'bert-large-uncased': "https://s3.amazonaws.com/models.huggingface.co/bert/bert-large-uncased-pytorch_model.bin",
'bert-base-cased': "https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-cased-pytorch_model.bin",
'bert-large-cased': "https://s3.amazonaws.com/models.huggingface.co/bert/bert-large-cased-pytorch_model.bin",
'bert-base-multilingual-uncased': "https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-multilingual-uncased-pytorch_model.bin",
'bert-base-multilingual-cased': "https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-multilingual-cased-pytorch_model.bin",
'bert-base-chinese': "https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-chinese-pytorch_model.bin",
'bert-base-german-cased': "https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-german-cased-pytorch_model.bin",
'bert-large-uncased-whole-word-masking': "https://s3.amazonaws.com/models.huggingface.co/bert/bert-large-uncased-whole-word-masking-pytorch_model.bin",
'bert-large-cased-whole-word-masking': "https://s3.amazonaws.com/models.huggingface.co/bert/bert-large-cased-whole-word-masking-pytorch_model.bin",
'bert-large-uncased-whole-word-masking-finetuned-squad': "https://s3.amazonaws.com/models.huggingface.co/bert/bert-large-uncased-whole-word-masking-finetuned-squad-pytorch_model.bin",
'bert-large-cased-whole-word-masking-finetuned-squad': "https://s3.amazonaws.com/models.huggingface.co/bert/bert-large-cased-whole-word-masking-finetuned-squad-pytorch_model.bin",
'bert-base-cased-finetuned-mrpc': "https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-cased-finetuned-mrpc-pytorch_model.bin",
'bert-base-german-dbmdz-cased': "https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-german-dbmdz-cased-pytorch_model.bin",
'bert-base-german-dbmdz-uncased': "https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-german-dbmdz-uncased-pytorch_model.bin",
'bert-base-japanese': "https://s3.amazonaws.com/models.huggingface.co/bert/cl-tohoku/bert-base-japanese-pytorch_model.bin",
'bert-base-japanese-whole-word-masking': "https://s3.amazonaws.com/models.huggingface.co/bert/cl-tohoku/bert-base-japanese-whole-word-masking-pytorch_model.bin",
'bert-base-japanese-char': "https://s3.amazonaws.com/models.huggingface.co/bert/cl-tohoku/bert-base-japanese-char-pytorch_model.bin",
'bert-base-japanese-char-whole-word-masking': "https://s3.amazonaws.com/models.huggingface.co/bert/cl-tohoku/bert-base-japanese-char-whole-word-masking-pytorch_model.bin",
'bert-base-finnish-cased-v1': "https://s3.amazonaws.com/models.huggingface.co/bert/TurkuNLP/bert-base-finnish-cased-v1/pytorch_model.bin",
'bert-base-finnish-uncased-v1': "https://s3.amazonaws.com/models.huggingface.co/bert/TurkuNLP/bert-base-finnish-uncased-v1/pytorch_model.bin",
}

Replacement:

Local path: /data0/maqi/pretrained_model/modeling_bert

BERT_PRETRAINED_MODEL_ARCHIVE_MAP = {
'bert-base-uncased': "/data0/maqi/pretrained_model/modeling_bert/bert-base-uncased-pytorch_model.bin",
'bert-large-uncased': "/data0/maqi/pretrained_model/modeling_bert/bert-large-uncased-pytorch_model.bin",
'bert-base-cased': "/data0/maqi/pretrained_model/modeling_bert/bert-base-cased-pytorch_model.bin",
'bert-large-cased': "/data0/maqi/pretrained_model/modeling_bert/bert-large-cased-pytorch_model.bin",
'bert-base-multilingual-uncased': "/data0/maqi/pretrained_model/modeling_bert/bert-base-multilingual-uncased-pytorch_model.bin",
'bert-base-multilingual-cased': "/data0/maqi/pretrained_model/modeling_bert/bert-base-multilingual-cased-pytorch_model.bin",
'bert-base-chinese': "/data0/maqi/pretrained_model/modeling_bert/bert-base-chinese-pytorch_model.bin",
'bert-base-german-cased': "/data0/maqi/pretrained_model/modeling_bert/bert-base-german-cased-pytorch_model.bin",
'bert-large-uncased-whole-word-masking': "/data0/maqi/pretrained_model/modeling_bert/bert-large-uncased-whole-word-masking-pytorch_model.bin",
'bert-large-cased-whole-word-masking': "/data0/maqi/pretrained_model/modeling_bert/bert-large-cased-whole-word-masking-pytorch_model.bin",
'bert-large-uncased-whole-word-masking-finetuned-squad': "/data0/maqi/pretrained_model/modeling_bert/bert-large-uncased-whole-word-masking-finetuned-squad-pytorch_model.bin",
'bert-large-cased-whole-word-masking-finetuned-squad': "/data0/maqi/pretrained_model/modeling_bert/bert-large-cased-whole-word-masking-finetuned-squad-pytorch_model.bin",
'bert-base-cased-finetuned-mrpc': "/data0/maqi/pretrained_model/modeling_bert/bert-base-cased-finetuned-mrpc-pytorch_model.bin",
'bert-base-german-dbmdz-cased': "/data0/maqi/pretrained_model/modeling_bert/bert-base-german-dbmdz-cased-pytorch_model.bin",
'bert-base-german-dbmdz-uncased': "/data0/maqi/pretrained_model/modeling_bert/bert-base-german-dbmdz-uncased-pytorch_model.bin",
'bert-base-japanese': "/data0/maqi/pretrained_model/modeling_bert/cl-tohoku/bert-base-japanese-pytorch_model.bin",
'bert-base-japanese-whole-word-masking': "/data0/maqi/pretrained_model/modeling_bert/cl-tohoku/bert-base-japanese-whole-word-masking-pytorch_model.bin",
'bert-base-japanese-char': "/data0/maqi/pretrained_model/modeling_bert/cl-tohoku/bert-base-japanese-char-pytorch_model.bin",
'bert-base-japanese-char-whole-word-masking': "/data0/maqi/pretrained_model/modeling_bert/cl-tohoku/bert-base-japanese-char-whole-word-masking-pytorch_model.bin",
'bert-base-finnish-cased-v1': "/data0/maqi/pretrained_model/modeling_bert/TurkuNLP/bert-base-finnish-cased-v1/pytorch_model.bin",
'bert-base-finnish-uncased-v1': "/data0/maqi/pretrained_model/modeling_bert/TurkuNLP/bert-base-finnish-uncased-v1/pytorch_model.bin",
}

Execution flow

The following analyzes tag-based-multi-span-extraction/configs/drop/roberta/drop_roberta_large_TASE_IO_SSE.jsonnet.

  • dataset_reader

    "is_training": true sets training mode.

"dataset_reader": {
"type": "tbmse_drop",//选择src/data/dataset_readers/drop/drop_reader.py
"answer_field_generators": {
"arithmetic_answer": {
"type": "arithmetic_answer_generator",//选择src/data/dataset_readers/answer_field_generators/arithmetic_answer_generator.py
"special_numbers": [
100,
1
]
},
"count_answer": {
"type": "count_answer_generator"//选择src/data/dataset_readers/answer_field_generators/count_answer_generator.py
},
"passage_span_answer": {
"type": "span_answer_generator",//选择src/data/dataset_readers/answer_field_generators/span_answer_generator.py
"text_type": "passage"//参数
},
"question_span_answer": {
"type": "span_answer_generator",//选择src/data/dataset_readers/answer_field_generators/span_answer_generator.py
"text_type": "question"//参数
},
"tagged_answer": {
"type": "tagged_answer_generator",//选择src/data/dataset_readers/answer_field_generators/tagged_answer_generator.py
"ignore_question": false,
"labels": {
"I": 1,
"O": 0
}
}
},
"answer_generator_names_per_type": {//drop_reader.py的参数
"date": [
"arithmetic_answer",
"passage_span_answer",
"question_span_answer",
"tagged_answer"
],
"multiple_span": [
"tagged_answer"
],
"number": [
"arithmetic_answer",
"count_answer",
"passage_span_answer",
"question_span_answer",
"tagged_answer"
],
"single_span": [
"tagged_answer",
"passage_span_answer",
"question_span_answer"
]
},
"is_training": true,
"lazy": true,
"old_reader_behavior": true,
"pickle": {
"action": "load",
"file_name": "all_heads_IO_roberta-large",
"path": "../pickle/drop"
},
"tokenizer": {
"type": "huggingface_transformers",//选择src/data/tokenizers/huggingface_transformers_tokenizer.py
"pretrained_model": "roberta-large"//参数
}
},
  • model
"model": {
"type": "multi_head",//选择src/models/multi_head_model.py
"dataset_name": "drop",
"head_predictor": {
"activations": [
"relu",
"linear"
],
"dropout": [
0.1,
0
],
"hidden_dims": [
1024,
5
],
"input_dim": 2048,
"num_layers": 2
},
"heads": {
"arithmetic": {
"type": "arithmetic_head",//选择src/modules/heads/arithmetic_head.py
"output_layer": {
"activations": [
"relu",
"linear"
],
"dropout": [
0.1,
0
],
"hidden_dims": [
1024,
3
],
"input_dim": 2048,
"num_layers": 2
},
"special_embedding_dim": 1024,
"special_numbers": [
100,
1
],
"training_style": "soft_em"
},
"count": {
"type": "count_head",//选择src/modules/heads/count_head.py
"max_count": 10,
"output_layer": {
"activations": [
"relu",
"linear"
],
"dropout": [
0.1,
0
],
"hidden_dims": [
1024,
11
],
"input_dim": 1024,
"num_layers": 2
}
},
"multi_span": {
"type": "multi_span_head",//选择src/modules/heads/multi_span_head.py
"decoding_style": "at_least_one",
"ignore_question": false,
"labels": {
"I": 1,
"O": 0
},
"output_layer": {
"activations": [
"relu",
"linear"
],
"dropout": [
0.1,
0
],
"hidden_dims": [
1024,
2
],
"input_dim": 1024,
"num_layers": 2
},
"prediction_method": "viterbi",
"training_style": "soft_em"
},
"passage_span": {//继承了src/modules/heads/single_span_head.py
"type": "passage_span_head",//选择src/modules/heads/passage_span_head.py
"end_output_layer": {
"activations": "linear",
"hidden_dims": 1,
"input_dim": 1024,
"num_layers": 1
},
"start_output_layer": {
"activations": "linear",
"hidden_dims": 1,
"input_dim": 1024,
"num_layers": 1
},
"training_style": "soft_em"
},
"question_span": {//继承了src/modules/heads/single_span_head.py
"type": "question_span_head",//选择src/modules/heads/question_span_head.py
"end_output_layer": {
"activations": [
"relu",
"linear"
],
"dropout": [
0.1,
0
],
"hidden_dims": [
1024,
1
],
"input_dim": 2048,
"num_layers": 2
},
"start_output_layer": {
"activations": [
"relu",
"linear"
],
"dropout": [
0.1,
0
],
"hidden_dims": [
1024,
1
],
"input_dim": 2048,
"num_layers": 2
},
"training_style": "soft_em"
}
},
"passage_summary_vector_module": {
"activations": "linear",
"hidden_dims": 1,
"input_dim": 1024,
"num_layers": 1
},
"pretrained_model": "roberta-large",
"question_summary_vector_module": {
"activations": "linear",
"hidden_dims": 1,
"input_dim": 1024,
"num_layers": 1
}
},
  • Datasets
"train_data_path": "drop_data/drop_dataset_train.json",
"validation_data_path": "drop_data/drop_dataset_dev.json",
  • trainer

    "cuda_device": -1 means use the CPU.

"trainer": {
"cuda_device": 0,
"keep_serialized_model_every_num_seconds": 3600,
"num_epochs": 35,
"num_steps_to_accumulate": 6,
"optimizer": {
"type": "adamw",
"lr": 5e-06
},
"patience": 10,
"summary_interval": 100,
"validation_metric": "+f1"
},
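Settings like cuda_device can also be changed from the command line without editing the jsonnet, assuming this allennlp version supports the -o/--overrides flag (a sketch):

allennlp train configs/drop/roberta/drop_roberta_large_TASE_IO_SSE.jsonnet -s training_directory -f --include-package src -o '{"trainer": {"cuda_device": -1}}'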
  • validation_dataset_reader

    "is_training": false sets evaluation mode.

"validation_dataset_reader": {
"type": "tbmse_drop",//选择src/data/dataset_readers/drop/drop_reader.py
"answer_field_generators": {
"arithmetic_answer": {
"type": "arithmetic_answer_generator",
"special_numbers": [
100,
1
]
},
"count_answer": {
"type": "count_answer_generator"
},
"passage_span_answer": {
"type": "span_answer_generator",
"text_type": "passage"
},
"question_span_answer": {
"type": "span_answer_generator",
"text_type": "question"
},
"tagged_answer": {
"type": "tagged_answer_generator",
"ignore_question": false,
"labels": {
"I": 1,
"O": 0
}
}
},
"answer_generator_names_per_type": {
"date": [
"arithmetic_answer",
"passage_span_answer",
"question_span_answer",
"tagged_answer"
],
"multiple_span": [
"tagged_answer"
],
"number": [
"arithmetic_answer",
"count_answer",
"passage_span_answer",
"question_span_answer",
"tagged_answer"
],
"single_span": [
"tagged_answer",
"passage_span_answer",
"question_span_answer"
]
},
"is_training": false,//设置为评估
"lazy": true,
"old_reader_behavior": true,
"pickle": {
"action": "load",
"file_name": "all_heads_IO_roberta-large",
"path": "../pickle/drop"
},
"tokenizer": {
"type": "huggingface_transformers",
"pretrained_model": "roberta-large"
}
}

Prediction

Training packages the trained model as model.tar.gz:

allennlp predict training_directory/model.tar.gz drop_data/drop_dataset_dev.json --predictor machine-comprehension --cuda-device 0 --output-file predictions.jsonl --use-dataset-reader --include-package src

The predictions are saved to predictions.jsonl in the repository root.
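Each line of predictions.jsonl is one JSON object; the first prediction can be pretty-printed for a quick sanity check:

head -n 1 predictions.jsonl | python -m json.tool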

Evaluation

RoBERTa

allennlp evaluate training_directory/model.tar.gz drop_data/drop_dataset_dev.json --cuda-device 3 --output-file eval.json --include-package src

BERT

allennlp evaluate training_directory_bert/model.tar.gz drop_data/drop_dataset_dev.json --cuda-device 1 --output-file eval_bert.json --include-package src

The evaluation results are saved to eval.json in the repository root.

Evaluation results: DROP

TASE_IO+SSE

em_all_spans  f1_all_spans  em_multi_span  f1_multi_span  em_span  f1_span
80.6          87.8          60.8           82.6           84.2     89.0

TASE_IO+SSE (BLOCK)

em_all_spans  f1_all_spans  em_multi_span  f1_multi_span  em_span  f1_span
55.3          62.8          0              0              56.5     64.2

TASE_IO+SSE (BERT_large)

em_all_spans  f1_all_spans  em_multi_span  f1_multi_span  em_span  f1_span
76.4          83.9          54.5           80.1           80.7     85.2

TASE_IO+SSE (IO tagging only on sentences that contain the answer)

em_all_spans  f1_all_spans  em_multi_span  f1_multi_span  em_span  f1_span
57.8          64.5          16.7           23.3           58.1     64.2

Results reported in the paper:

[Figure: results table from the paper (image-20210212155315238)]