---
language: en
license: apache-2.0
model_name: t5-encoder-12.onnx
tags:
- validated
- text
- machine_comprehension
- t5
---
<!--- SPDX-License-Identifier: Apache-2.0 -->

# T5

## Use-cases
Transformer-based language model trained on multiple tasks, including summarization, sentiment analysis, question answering, and translation.
The implementation in this repo is an adaptation of the [onnxt5 repo](https://github.com/abelriboulot/onnxt5), which makes exporting and using T5 with ONNX easier.

## Description
[T5](https://arxiv.org/abs/1910.10683) is a transformer model that aims to provide greater flexibility and better semantic understanding by training on multiple tasks at once.

## Model

| Model | Download | Download (with sample test data) | ONNX version | Opset version |
| ----------- | ---------- | -------------- | -------------- | -------------- |
| T5-encoder | [650.6 MB](model/t5-encoder-12.onnx) | [205.0 MB](model/t5-encoder-12.tar.gz) | 1.7 | 12 |
| T5-decoder-with-lm-head | [304.9 MB](model/t5-decoder-with-lm-head-12.onnx) | [304.9 MB](model/t5-decoder-with-lm-head-12.tar.gz) | 1.7 | 12 |

### Source
Huggingface PyTorch T5 + script changes ==> ONNX T5-encoder

Huggingface PyTorch T5 + script changes ==> ONNX T5-decoder-with-lm-head

Script changes include:
- reshaping the Huggingface models to combine the lm head with the decoder, allowing for a unified model
- reshaping the encoder to output the hidden state directly

## Inference
The script for ONNX model conversion and ONNX Runtime inference is [here](dependencies/T5-export.py).
More complete utilities to export and use the models are maintained in the [onnxt5 repo](https://github.com/abelriboulot/onnxt5).

### Input to model
This implementation takes as input a prompt that begins with the task at hand. Examples of tasks include ```summarize: <PROMPT>```,
```translate English to French: <PROMPT>```, ```cola sentence: <PROMPT>```, etc.
For the full list of tasks, refer to Appendix D of the [original paper](https://arxiv.org/pdf/1910.10683.pdf).
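
To make the prompt format concrete, a task-prefixed prompt is just a plain string; the `make_prompt` helper below is hypothetical (not part of onnxt5) and only illustrates the convention:

```python
# Hypothetical helper: T5 task prompts are plain strings with a task prefix.
def make_prompt(task: str, text: str) -> str:
    """Prepend a T5 task prefix (e.g. 'summarize', 'cola sentence') to the input text."""
    return f"{task}: {text}"

print(make_prompt("translate English to French", "I was a victim of a series of accidents."))
# translate English to French: I was a victim of a series of accidents.
```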

### Preprocessing steps
The easiest way to use the model is via the onnxt5 utilities (installation: ```pip install onnxt5```).

In that case you can use the model with the following piece of code:
```python
from onnxt5 import GenerativeT5
from onnxt5.api import get_encoder_decoder_tokenizer
decoder_sess, encoder_sess, tokenizer = get_encoder_decoder_tokenizer()
generative_t5 = GenerativeT5(encoder_sess, decoder_sess, tokenizer, onnx=True)
prompt = 'translate English to French: I was a victim of a series of accidents.'
output_text, output_logits = generative_t5(prompt, max_length=100, temperature=0.)
# output_text: "J'ai été victime d'une série d'accidents."
```

Or if you wish to produce the embeddings of a sentence:
```python
from onnxt5.api import get_encoder_decoder_tokenizer, run_embeddings_text

decoder_sess, encoder_sess, tokenizer = get_encoder_decoder_tokenizer()
prompt = 'Listen, Billy Pilgrim has come unstuck in time.'
encoder_embeddings, decoder_embeddings = run_embeddings_text(encoder_sess, decoder_sess, tokenizer, prompt)
```
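
Once you have reduced such embeddings to fixed-size vectors (e.g. by mean pooling over tokens), a common use is comparing sentences by cosine similarity. A minimal, dependency-free sketch of that comparison (an illustration, not part of the onnxt5 API):

```python
import math

def cosine_similarity(u, v):
    """Cosine similarity between two equal-length vectors (plain Python, no deps)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

print(cosine_similarity([1.0, 0.0], [1.0, 0.0]))  # 1.0
print(cosine_similarity([1.0, 0.0], [0.0, 1.0]))  # 0.0
```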

Otherwise you can manually create the generative model as follows:

```python
from onnxruntime import InferenceSession
from transformers import T5Tokenizer
from .dependencies.models import GenerativeT5

tokenizer = T5Tokenizer.from_pretrained('t5-base')

# Starting from ORT 1.10, ORT requires explicitly setting the providers parameter if you want to use execution
# providers other than the default CPU provider (as opposed to the previous behavior of providers getting
# set/registered by default based on the build flags) when instantiating InferenceSession.
# For example, if an NVIDIA GPU is available and the ORT Python package is built with CUDA, call the API as follows:
# InferenceSession(path/to/model, providers=['CUDAExecutionProvider'])
# path_t5_decoder and path_t5_encoder point to the downloaded .onnx files.
decoder_sess = InferenceSession(str(path_t5_decoder))
encoder_sess = InferenceSession(str(path_t5_encoder))
generative_t5 = GenerativeT5(encoder_sess, decoder_sess, tokenizer, onnx=True)
generative_t5('translate English to French: I was a victim of a series of accidents.', 21, temperature=0.)[0]
```

### Output of model
For the T5-encoder model:

**last_hidden_state**: Sequence of hidden states at the last layer of the model. A float tensor of size (batch_size, sequence_length, hidden_size).

For the T5-decoder-with-lm-head model:

**logit_predictions**: Prediction scores of the language modeling head. A float tensor of size (batch_size, sequence_length, vocab_size).
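
To make the vocab_size axis concrete: greedy decoding simply takes the argmax over that axis at the current position. A dependency-free sketch (the toy logits below are made-up values for illustration):

```python
def greedy_next_token(logits_last_position):
    """Pick the highest-scoring vocabulary id from a (vocab_size,) list of logits."""
    return max(range(len(logits_last_position)), key=lambda i: logits_last_position[i])

# Toy logits over a 5-token vocabulary (illustrative values only).
toy_logits = [0.1, 2.3, -1.0, 0.7, 1.9]
print(greedy_next_token(toy_logits))  # 1
```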
101
+
102
+ ### Postprocessing steps
103
+ For the T5-encoder model:
104
+
105
+ ```python
106
+ last_hidden_states = model(input_ids)[0]
107
+ ```
108
+
109
+ For the T5-decoder-with-lm-head model:
110
+
111
+ ```python
112
+ # To generate the encoder's last hidden state
113
+ encoder_output = encoder_sess.run(None, {"input_ids": input_ids})[0]
114
+ # To generate the full model's embeddings
115
+ decoder_output = decoder_sess.run(None, {
116
+ "input_ids": input_ids,
117
+ "encoder_hidden_states": encoder_output
118
+ })[0]
119
+ ```
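
The `temperature` parameter used in the examples scales the logits before a softmax, with `temperature=0.` corresponding to greedy (argmax) decoding. A minimal, dependency-free sketch of that idea (an illustration, not onnxt5's actual implementation):

```python
import math

def softmax_with_temperature(logits, temperature):
    """Turn logits into probabilities; lower temperature sharpens the distribution."""
    if temperature <= 0:
        # Degenerate case: greedy decoding, all probability mass on the argmax.
        best = max(range(len(logits)), key=lambda i: logits[i])
        return [1.0 if i == best else 0.0 for i in range(len(logits))]
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

probs = softmax_with_temperature([1.0, 2.0, 3.0], 1.0)
print(round(sum(probs), 6))  # 1.0
```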

For the generative model, to generate a translation:
```python
from onnxt5 import GenerativeT5
from onnxt5.api import get_encoder_decoder_tokenizer
decoder_sess, encoder_sess, tokenizer = get_encoder_decoder_tokenizer()
generative_t5 = GenerativeT5(encoder_sess, decoder_sess, tokenizer, onnx=True)
prompt = 'translate English to French: I was a victim of a series of accidents.'
output_text, output_logits = generative_t5(prompt, max_length=100, temperature=0.)
```
<hr>

## Dataset (Train and validation)
The original model from Google Brain is pretrained on the [Colossal Clean Crawled Corpus](https://www.tensorflow.org/datasets/catalog/c4).
The pretrained model referenced in [huggingface/transformers](https://github.com/huggingface/transformers/blob/master/transformers/modeling_t5.py) is trained on the same data.
<hr>

## Validation accuracy
Benchmarking can be run with the following [script](https://github.com/abelriboulot/onnxt5/blob/master/notebooks/benchmark_performance.ipynb), with initial results in this [post](https://kta.io/posts/onnx_t5).
<hr>

## Publication/Attribution
This repo is based on the work of Colin Raffel, Noam Shazeer, Adam Roberts, Katherine Lee, Sharan Narang,
Michael Matena, Yanqi Zhou, Wei Li, and Peter J. Liu from Google, as well as the implementation of T5 from the
Huggingface team, the work of the Microsoft ONNX and onnxruntime teams, in particular Tianlei Wu, and the work of Thomas Wolf on text generation.

[Original T5 Paper](https://arxiv.org/pdf/1910.10683.pdf)
```
@article{2019t5,
  author = {Colin Raffel and Noam Shazeer and Adam Roberts and Katherine Lee and Sharan Narang and Michael Matena and Yanqi Zhou and Wei Li and Peter J. Liu},
  title = {Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer},
  journal = {arXiv e-prints},
  year = {2019},
  archivePrefix = {arXiv},
  eprint = {1910.10683},
}
```

## References
This model is converted directly from [huggingface/transformers](https://github.com/huggingface/transformers/blob/master/src/transformers/modeling_t5.py).
<hr>

## Contributors
[Abel Riboulot](https://github.com/abelriboulot)
<hr>

## License
Apache 2.0 License
<hr>