An Introduction to the TensorFlow Estimators API
In the previous article, TensorFlow Dataset API Explained, we noted that the Datasets API and the Estimators API are two high-level APIs introduced in TensorFlow 1.3. The Estimators API provides methods for training a model, evaluating its accuracy, and generating predictions. In this article I will give a basic introduction to the Estimators API; most of the material comes from the official documentation.
First, let's look at a diagram of how the Estimators API in TensorFlow is organized:

From the figure above we can see that Estimators fall into two broad categories: pre-made Estimators and custom Estimators. tf.estimator.Estimator is the base class; pre-made Estimators are subclasses of it, while custom Estimators are instances of it.
The main difference between pre-made and custom Estimators is whether TensorFlow already provides an implementation of the model function (model function, or model_fn) they use. For the former, TensorFlow ships a ready-made model function that can simply be called; for the latter, you must write the model function yourself. Pre-made Estimators are therefore convenient to use, but narrower in scope and less flexible; custom Estimators are the exact opposite.
Your model function could implement a wide range of algorithms, defining all sorts of hidden layers and metrics. Like input functions, all model functions must accept a standard group of input parameters and return a standard group of output values. Just as input functions can leverage the Dataset API, model functions can leverage the Layers API and the Metrics API.
Overall, a model consists of three parts: input functions, model functions, and the Estimator (the controller, i.e. the main function).
- Input functions: built mainly on the Dataset API, and divided into train_input_fn and eval_input_fn. The former takes parameters and outputs training data; the latter takes parameters and outputs validation and test data.
- Model functions: composed of the model itself (the Layers API) and the monitoring module (the Metrics API); they implement training, evaluation (validation), and monitoring and display of the model's parameters.
- Estimators: play a role in the model similar to that of an operating system in a computer. They "glue" the parts together, controlling how data flows and is transformed within the model, as well as the model's various behaviors (computations).
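To make this division of labor concrete, here is a framework-free toy sketch of the pattern. All names here (ToyEstimator, my_input_fn, my_model_fn) are invented for illustration; this is plain Python, not TensorFlow code:

```python
# Toy illustration of the Estimator pattern: an "estimator" glues an
# input_fn and a model_fn together, and passes a `mode` flag implicitly
# depending on which of its methods is called.
TRAIN, EVAL, PREDICT = "train", "eval", "predict"

def my_input_fn():
    # Stands in for a Dataset-backed input function: yields (features, labels).
    return [1.0, 2.0, 3.0], [2.0, 4.0, 6.0]

def my_model_fn(features, labels, mode, params):
    # Stands in for a real model function: branches on `mode`.
    w = params["weight"]
    predictions = [w * x for x in features]
    if mode == PREDICT:
        return {"predictions": predictions}
    loss = sum((p - y) ** 2 for p, y in zip(predictions, labels))
    return {"loss": loss}

class ToyEstimator:
    def __init__(self, model_fn, params):
        # Structural hyperparameters are handed over once, at construction time.
        self.model_fn, self.params = model_fn, params

    def train(self, input_fn):
        f, l = input_fn()
        return self.model_fn(f, l, TRAIN, self.params)

    def evaluate(self, input_fn):
        f, l = input_fn()
        return self.model_fn(f, l, EVAL, self.params)

    def predict(self, input_fn):
        f, _ = input_fn()
        return self.model_fn(f, None, PREDICT, self.params)

est = ToyEstimator(my_model_fn, params={"weight": 2.0})
print(est.train(my_input_fn))    # loss is 0.0 because weight=2 fits exactly
print(est.predict(my_input_fn))  # {'predictions': [2.0, 4.0, 6.0]}
```

Note how the caller never passes `mode` directly: it is chosen by whichever method is invoked, which is exactly how the real Estimator drives a model_fn.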
Next, we will build models with a pre-made Estimator and with a custom Estimator to solve the Iris problem. The model's architecture is shown below:

1. Using Pre-made Estimators
(1) Create Input Functions
This input function builds an input pipeline that yields batches of (features, labels) pairs, where features is a dictionary of features.
# An input function for training
def train_input_fn(features, labels, batch_size):
    # Convert the inputs to a Dataset.
    dataset = tf.data.Dataset.from_tensor_slices((dict(features), labels))
    # Shuffle, repeat, and batch the examples.
    dataset = dataset.shuffle(1000).repeat().batch(batch_size)
    # Return the read end of the pipeline.
    return dataset.make_one_shot_iterator().get_next()
# An input function for evaluation or prediction
def eval_input_fn(features, labels=None, batch_size=None):
    features = dict(features)
    if labels is None:
        # No labels, use only features.
        inputs = features
    else:
        inputs = (features, labels)
    # Convert the inputs to a Dataset.
    dataset = tf.data.Dataset.from_tensor_slices(inputs)
    # Batch the examples.
    assert batch_size is not None, "batch_size must not be None"
    dataset = dataset.batch(batch_size)
    # Return the read end of the pipeline.
    return dataset.make_one_shot_iterator().get_next()
(2) Instantiate an Estimator
A pre-made Estimator builds the model using a model function that TensorFlow has already implemented. The Iris problem is a multi-class classification problem, so we choose tf.estimator.DNNClassifier as the model's Estimator. The features produced by the input function form a dictionary, but the model function can only consume the "values", so we first need to define the feature columns.
# Feature columns describe how to use the input.
my_feature_columns = []
for key in train_x.keys():
    my_feature_columns.append(tf.feature_column.numeric_column(key=key))
# Build a 2-hidden-layer DNN with 10, 10 units respectively.
classifier = tf.estimator.DNNClassifier(
    feature_columns=my_feature_columns,
    # Two hidden layers of 10 nodes each.
    hidden_units=[10, 10],
    # The model must choose between 3 classes.
    n_classes=3)
When instantiating a pre-made Estimator, we only need to pass it the hyperparameters that define the model's structure; it forwards them to the model function automatically. One point worth understanding: parameters reach the model function in two stages. The structural hyperparameters are passed when the Estimator is instantiated, while the parameters describing the task and the features are passed when the model function is called; and the model function's parameter list itself is fixed. More on this later.
(3) Train, Evaluate, and Predict
Now that we have an Estimator object, we can call methods to do the following:
- Train the model.
- Evaluate the trained model.
- Use the trained model to make predictions.
# Train the Model.
classifier.train(
    input_fn=lambda: train_input_fn(train_x, train_y, args.batch_size),
    steps=args.train_steps)
# Evaluate the model.
eval_result = classifier.evaluate(
    input_fn=lambda: eval_input_fn(test_x, test_y, args.batch_size))
print('\nTest set accuracy: {accuracy:0.3f}\n'.format(**eval_result))
# Generate predictions from the model
SPECIES = ['Setosa', 'Versicolor', 'Virginica']
expected = ['Setosa', 'Versicolor', 'Virginica']
predict_x = {
    'SepalLength': [5.1, 5.9, 6.9],
    'SepalWidth': [3.3, 3.0, 3.1],
    'PetalLength': [1.7, 4.2, 5.4],
    'PetalWidth': [0.5, 1.5, 2.1],
}
predictions = classifier.predict(
    input_fn=lambda: eval_input_fn(predict_x, batch_size=args.batch_size))
for pred_dict, expec in zip(predictions, expected):
    template = '\nPrediction is "{}" ({:.1f}%), expected "{}"'
    class_id = pred_dict['class_ids'][0]
    probability = pred_dict['probabilities'][class_id]
    print(template.format(SPECIES[class_id], 100 * probability, expec))
2. Custom Estimators
The main difference between a custom Estimator and a pre-made Estimator is that you have to build the model function yourself, so this section focuses on the model function.
(1) Write a model function
First, look at the model function's parameter list:
def my_model(
    features,  # This is batch_features from input_fn
    labels,    # This is batch_labels from input_fn
    mode,      # An instance of tf.estimator.ModeKeys
    params):   # Additional configuration
As mentioned earlier, the model function's parameters are fixed: it always takes these four, but they are handed in at different times and in different ways. params must be defined when the estimator is instantiated, and it holds the hyperparameters that define the model's structure. features and labels are passed in when the estimator instance invokes the function. mode is special: it is not passed explicitly; instead it is supplied implicitly when a method of the estimator instance is called. Concretely, suppose we first instantiate the estimator:
classifier = tf.estimator.Estimator(...)
Then, if we call the train method,
classifier.train(input_fn=lambda: train_input_fn(FILE_TRAIN, True, 500))
mode is automatically assigned the value tf.estimator.ModeKeys.TRAIN. Likewise, calling the predict and evaluate methods assigns it different values, as the table below shows:

| Estimator method | value of mode |
| --- | --- |
| train() | tf.estimator.ModeKeys.TRAIN |
| evaluate() | tf.estimator.ModeKeys.EVAL |
| predict() | tf.estimator.ModeKeys.PREDICT |
In this way, we can check the value of mode inside the model function to determine which method was called, and return different values accordingly.
Here is the complete model function:
def my_model(features, labels, mode, params):
    """DNN classifier with hidden layers defined by params['hidden_units']."""
    # Define the model structure.
    # Input layer.
    net = tf.feature_column.input_layer(features, params['feature_columns'])
    # Hidden layers, sized according to the 'hidden_units' param.
    for units in params['hidden_units']:
        net = tf.layers.dense(net, units=units, activation=tf.nn.relu)
    # Output layer: compute logits (1 per class).
    logits = tf.layers.dense(net, params['n_classes'], activation=None)
    # Compute predictions.
    predicted_classes = tf.argmax(logits, 1)
    # If the predict method was called:
    if mode == tf.estimator.ModeKeys.PREDICT:
        predictions = {
            'class_ids': predicted_classes[:, tf.newaxis],
            'probabilities': tf.nn.softmax(logits),
            'logits': logits,
        }
        return tf.estimator.EstimatorSpec(mode, predictions=predictions)
    # Compute loss.
    loss = tf.losses.sparse_softmax_cross_entropy(labels=labels, logits=logits)
    # Compute evaluation metrics.
    accuracy = tf.metrics.accuracy(labels=labels,
                                   predictions=predicted_classes,
                                   name='acc_op')
    metrics = {'accuracy': accuracy}
    tf.summary.scalar('accuracy', accuracy[1])
    # If the evaluate method was called:
    if mode == tf.estimator.ModeKeys.EVAL:
        return tf.estimator.EstimatorSpec(
            mode, loss=loss, eval_metric_ops=metrics)
    # If the train method was called:
    # Create training op.
    assert mode == tf.estimator.ModeKeys.TRAIN
    optimizer = tf.train.AdagradOptimizer(learning_rate=0.1)
    train_op = optimizer.minimize(loss, global_step=tf.train.get_global_step())
    return tf.estimator.EstimatorSpec(mode, loss=loss, train_op=train_op)
(2) Instantiation
classifier = tf.estimator.Estimator(
    model_fn=my_model,
    params={
        'feature_columns': my_feature_columns,
        # Two hidden layers of 10 nodes each.
        'hidden_units': [10, 10],
        # The model must choose between 3 classes.
        'n_classes': 3,
    })
(3) Train the model
# Train the Model.
classifier.train(
    input_fn=lambda: train_input_fn(train_x, train_y, args.batch_size),
    steps=args.train_steps)
Usually, we set the number of repeats in the input function to "infinite", i.e. we call repeat() with no argument (or with None). The number of training iterations is then controlled by the train_steps argument above.
Finally, since the model function defines monitoring of the model's accuracy, loss, and global_step/sec, we can run the following command in a terminal and then open http://localhost:6006 in a browser to watch these curves:
# Replace PATH with the actual path passed as model_dir
tensorboard --logdir=PATH
References:
- https://www.tensorflow.org/get_started/custom_estimators
- https://www.tensorflow.org/get_started/premade_estimators