社区首页 >问答首页 >不兼容的形状：[11,768]与[1,5,768] -在生产中使用huggingface保存的模型进行推断

问不兼容的形状：[11,768]与[1,5,768] -在生产中使用huggingface保存的模型进行推断
EN

Stack Overflow用户

提问于 2020-08-29 00:47:39

回答 1查看 277关注 0票数 1

我从huggingface模型中保存了一个预训练版本的distilbert，distilbert-base-uncased-finetuned-sst-2-english，，我正试图通过Tensorflow服务和进行预测来提供它。目前所有的测试都在Colab进行。

我在通过TensorFlow Serve将预测转换为正确的模型格式时遇到了问题。Tensorflow服务已经启动并运行良好，为模型提供了服务，但是我的预测代码不正确，我需要一些帮助来理解如何通过API通过json进行预测。

# tokenize and encode a simple positive instance
instances = tokenizer.tokenize('this is the best day of my life!')
instances = tokenizer.encode(instances)
data = json.dumps({"signature_name": "serving_default", "instances": instances, })
print(data)

{"signature_name"："serving_default"，“实例”：101,2023,2003,1996,2190,2154,1997,2026,2166,999,102}

# setup json_response object
headers = {"content-type": "application/json"}
json_response = requests.post('http://localhost:8501/v1/models/my_model:predict', data=data, headers=headers)
predictions = json.loads(json_response.text)

预测

{'error': '{{function_node __inference__wrapped_model_52602}} {{function_node __inference__wrapped_model_52602}} Incompatible shapes: [11,768] vs. [1,5,768]\n\t [[{{node tf_distil_bert_for_sequence_classification_3/distilbert/embeddings/add}}]]\n\t [[StatefulPartitionedCall/StatefulPartitionedCall]]'}

这里的任何方向都将不胜感激。

tensorflow-serving

distilbert

回答 1

Stack Overflow用户

回答已采纳

发布于 2020-08-31 00:48:40

能够通过为输入形状和注意掩码设置签名来找到解决方案，如下所示。这是一个简单的实现，它为保存的模型使用固定的输入形状，并要求您将输入填充到预期的输入形状384。我已经看到了调用自定义签名和创建模型来匹配预期输入形状的实现，但是下面的简单案例适用于我希望通过TF服务实现的huggingface模型。如果任何人有任何更好的例子或方法来更好地扩展此功能，请张贴以供将来使用。

# create callable
from transformers import TFDistilBertForQuestionAnswering
distilbert = TFDistilBertForQuestionAnswering.from_pretrained('distilbert-base-cased-distilled-squad')
callable = tf.function(distilbert.call)

通过调用get_concrete_function，我们为输入签名跟踪编译模型的TensorFlow操作，该签名由两个形状为None，384的张量组成，第一个是输入ids，第二个是注意掩码。

concrete_function = callable.get_concrete_function([tf.TensorSpec([None, 384], tf.int32, name="input_ids"), tf.TensorSpec([None, 384], tf.int32, name="attention_mask")])

保存带有签名的模型：

# stored model path for TF Serve (1 = version 1) --> '/path/to/my/model/distilbert_qa/1/'
distilbert_qa_save_path = 'path_to_model'
tf.saved_model.save(distilbert, distilbert_qa_save_path, signatures=concrete_function)

检查它是否包含正确的签名：

saved_model_cli show --dir 'path_to_model' --tag_set serve --signature_def serving_default

输出应如下所示：

The given SavedModel SignatureDef contains the following input(s):
  inputs['attention_mask'] tensor_info:
      dtype: DT_INT32
      shape: (-1, 384)
      name: serving_default_attention_mask:0
  inputs['input_ids'] tensor_info:
      dtype: DT_INT32
      shape: (-1, 384)
      name: serving_default_input_ids:0
The given SavedModel SignatureDef contains the following output(s):
  outputs['output_0'] tensor_info:
      dtype: DT_FLOAT
      shape: (-1, 384)
      name: StatefulPartitionedCall:0
  outputs['output_1'] tensor_info:
      dtype: DT_FLOAT
      shape: (-1, 384)
      name: StatefulPartitionedCall:1
Method name is: tensorflow/serving/predict

测试模型：

from transformers import DistilBertTokenizer
tokenizer = DistilBertTokenizer.from_pretrained('distilbert-base-cased')

question, text = "Who was Benjamin?", "Benjamin was a silly dog."
input_dict = tokenizer(question, text, return_tensors='tf')

start_scores, end_scores = distilbert(input_dict)

all_tokens = tokenizer.convert_ids_to_tokens(input_dict["input_ids"].numpy()[0])
answer = ' '.join(all_tokens[tf.math.argmax(start_scores, 1)[0] : tf.math.argmax(end_scores, 1)[0]+1])

对于TF服务(在colab中)：(这是我的初衷)

!echo "deb http://storage.googleapis.com/tensorflow-serving-apt stable tensorflow-model-server tensorflow-model-server-universal" | tee /etc/apt/sources.list.d/tensorflow-serving.list && \
curl https://storage.googleapis.com/tensorflow-serving-apt/tensorflow-serving.release.pub.gpg | apt-key add -
!apt update

!apt-get install tensorflow-model-server

import os
# path_to_model --> versions directory --> '/path/to/my/model/distilbert_qa/'
# actual stored model path version 1 --> '/path/to/my/model/distilbert_qa/1/'
MODEL_DIR = 'path_to_model'
os.environ["MODEL_DIR"] = os.path.abspath(MODEL_DIR)

%%bash --bg
nohup tensorflow_model_server --rest_api_port=8501 --model_name=my_model --model_base_path="${MODEL_DIR}" >server.log 2>&1

!tail server.log

发出POST请求：

import json
!pip install -q requests
import requests
import numpy as np

max_length = 384  # must equal model signature expected input value
question, text = "Who was Benjamin?", "Benjamin was a good boy."

# padding='max_length' pads the input to the expected input length (else incompatible shapes error)
input_dict = tokenizer(question, text, return_tensors='tf', padding='max_length', max_length=max_length)

input_ids = input_dict["input_ids"].numpy().tolist()[0]
att_mask = input_dict["attention_mask"].numpy().tolist()[0]
features = [{'input_ids': input_ids, 'attention_mask': att_mask}]

data = json.dumps({ "signature_name": "serving_default", "instances": features})

headers = {"content-type": "application/json"}
json_response = requests.post('http://localhost:8501/v1/models/my_model:predict', data=data, headers=headers)
print(json_response)

predictions = json.loads(json_response.text)['predictions']

all_tokens = tokenizer.convert_ids_to_tokens(input_dict["input_ids"].numpy()[0])
answer = ' '.join(all_tokens[tf.math.argmax(predictions[0]['output_0']) : tf.math.argmax(predictions[0]['output_1'])+1])
print(answer)