An Introduction to the TensorFlow Estimators API
In the previous article, TensorFlow Dataset API Explained, we noted that the Datasets API and the Estimators API are two high-level APIs introduced in TensorFlow 1.3. The Estimators API provides methods for training a model, evaluating its accuracy, and generating predictions. In this article I will give a basic introduction to the Estimators API; most of the material comes from the official documentation.
First, let's look at a diagram of how the Estimators API in TensorFlow is organized:

From the figure above we can see that Estimators fall into two broad categories: pre-made Estimators and custom Estimators. tf.estimator.Estimator is the base class; pre-made Estimators are subclasses of it, while custom Estimators are instances of it.
The main difference between pre-made and custom Estimators is whether TensorFlow already provides an implementation of the model function (model function, or model_fn) they use. For the former, TensorFlow ships a ready-made model function that can simply be called; for the latter, you must write the model function yourself. Pre-made Estimators are therefore convenient to use, but narrower in scope and less flexible; custom Estimators are the exact opposite.
Your model function could implement a wide range of algorithms, defining all sorts of hidden layers and metrics. Like input functions, all model functions must accept a standard group of input parameters and return a standard group of output values. Just as input functions can leverage the Dataset API, model functions can leverage the Layers API and the Metrics API.
Overall, a model consists of three parts: input functions, model functions, and the Estimator (the controller, i.e. the main function).
- Input functions: built mainly on the Dataset API, and divided into train_input_fn and eval_input_fn. The former takes parameters and outputs training data; the latter takes parameters and outputs validation and test data.
- Model functions: composed of the model itself (the Layers API) and the monitoring module (the Metrics API); they implement training, evaluation (validation), and monitoring and display of the model's parameters.
- Estimators: play a role in the model similar to that of an operating system in a computer. They "glue" the parts together, controlling how data flows and is transformed within the model, as well as the model's various behaviors (computations).
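To make this division of labor concrete, here is a framework-free toy sketch of the pattern. All names here (ToyEstimator, my_input_fn, my_model_fn) are invented for illustration; this is plain Python, not TensorFlow code:

```python
# Toy illustration of the Estimator pattern: an "estimator" glues an
# input_fn and a model_fn together, and passes a `mode` flag implicitly
# depending on which of its methods is called.
TRAIN, EVAL, PREDICT = "train", "eval", "predict"

def my_input_fn():
    # Stands in for a Dataset-backed input function: yields (features, labels).
    return [1.0, 2.0, 3.0], [2.0, 4.0, 6.0]

def my_model_fn(features, labels, mode, params):
    # Stands in for a real model function: branches on `mode`.
    w = params["weight"]
    predictions = [w * x for x in features]
    if mode == PREDICT:
        return {"predictions": predictions}
    loss = sum((p - y) ** 2 for p, y in zip(predictions, labels))
    return {"loss": loss}

class ToyEstimator:
    def __init__(self, model_fn, params):
        # Structural hyperparameters are handed over once, at construction time.
        self.model_fn, self.params = model_fn, params

    def train(self, input_fn):
        f, l = input_fn()
        return self.model_fn(f, l, TRAIN, self.params)

    def evaluate(self, input_fn):
        f, l = input_fn()
        return self.model_fn(f, l, EVAL, self.params)

    def predict(self, input_fn):
        f, _ = input_fn()
        return self.model_fn(f, None, PREDICT, self.params)

est = ToyEstimator(my_model_fn, params={"weight": 2.0})
print(est.train(my_input_fn))    # loss is 0.0 because weight=2 fits exactly
print(est.predict(my_input_fn))  # {'predictions': [2.0, 4.0, 6.0]}
```

Note how the caller never passes `mode` directly: it is chosen by whichever method is invoked, which is exactly how the real Estimator drives a model_fn.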
Next, we will build models with a pre-made Estimator and with a custom Estimator to solve the Iris problem. The model's architecture is shown below:

1. Using Pre-made Estimators
(1) Create Input Functions
This input function builds an input pipeline that yields batches of (features, labels) pairs, where features is a dictionary of features.
# An input function for training
def train_input_fn(features, labels, batch_size):
    # Convert the inputs to a Dataset.
    dataset = tf.data.Dataset.from_tensor_slices((dict(features), labels))
    # Shuffle, repeat, and batch the examples.
    dataset = dataset.shuffle(1000).repeat().batch(batch_size)
    # Return the read end of the pipeline.
    return dataset.make_one_shot_iterator().get_next()
# An input function for evaluation or prediction
def eval_input_fn(features, labels=None, batch_size=None):
    features = dict(features)
    if labels is None:
        # No labels, use only features.
        inputs = features
    else:
        inputs = (features, labels)
    # Convert the inputs to a Dataset.
    dataset = tf.data.Dataset.from_tensor_slices(inputs)
    # Batch the examples.
    assert batch_size is not None, "batch_size must not be None"
    dataset = dataset.batch(batch_size)
    # Return the read end of the pipeline.
    return dataset.make_one_shot_iterator().get_next()
(2) Instantiate an Estimator
A pre-made Estimator builds the model using a model function that TensorFlow has already implemented. The Iris problem is a multi-class classification problem, so we choose tf.estimator.DNNClassifier as the model's Estimator. The features produced by the input function form a dictionary, but the model function can only consume the "values", so we first need to define the feature columns.
# Feature columns describe how to use the input.
my_feature_columns = []
for key in train_x.keys():
    my_feature_columns.append(tf.feature_column.numeric_column(key=key))
# Build a 2-hidden-layer DNN with 10, 10 units respectively.
classifier = tf.estimator.DNNClassifier(
    feature_columns=my_feature_columns,
    # Two hidden layers of 10 nodes each.
    hidden_units=[10, 10],
    # The model must choose between 3 classes.
    n_classes=3)
When instantiating a pre-made Estimator, we only need to pass it the hyperparameters that define the model's structure; it forwards them to the model function automatically. One point worth understanding: parameters reach the model function in two stages. The structural hyperparameters are passed when the Estimator is instantiated, while the parameters describing the task and the features are passed when the model function is called; and the model function's parameter list itself is fixed. More on this later.
(3) Train, Evaluate, and Predict
Now that we have an Estimator object, we can call methods to do the following:
- Train the model.
- Evaluate the trained model.
- Use the trained model to make predictions.
# Train the Model.
classifier.train(
    input_fn=lambda: train_input_fn(train_x, train_y, args.batch_size),
    steps=args.train_steps)
# Evaluate the model.
eval_result = classifier.evaluate(
    input_fn=lambda: eval_input_fn(test_x, test_y, args.batch_size))
print('\nTest set accuracy: {accuracy:0.3f}\n'.format(**eval_result))
# Generate predictions from the model
SPECIES = ['Setosa', 'Versicolor', 'Virginica']
expected = ['Setosa', 'Versicolor', 'Virginica']
predict_x = {
    'SepalLength': [5.1, 5.9, 6.9],
    'SepalWidth': [3.3, 3.0, 3.1],
    'PetalLength': [1.7, 4.2, 5.4],
    'PetalWidth': [0.5, 1.5, 2.1],
}
predictions = classifier.predict(
    input_fn=lambda: eval_input_fn(predict_x, batch_size=args.batch_size))
for pred_dict, expec in zip(predictions, expected):
    template = '\nPrediction is "{}" ({:.1f}%), expected "{}"'
    class_id = pred_dict['class_ids'][0]
    probability = pred_dict['probabilities'][class_id]
    print(template.format(SPECIES[class_id], 100 * probability, expec))
2. Custom Estimators
The main difference between a custom Estimator and a pre-made Estimator is that you have to build the model function yourself, so this section focuses on the model function.
(1) Write a model function
First, look at the model function's parameter list:
def my_model(
    features,  # This is batch_features from input_fn
    labels,    # This is batch_labels from input_fn
    mode,      # An instance of tf.estimator.ModeKeys
    params):   # Additional configuration
As mentioned earlier, the model function's parameters are fixed: it always takes these four, but they are handed in at different times and in different ways. params must be defined when the estimator is instantiated, and it holds the hyperparameters that define the model's structure. features and labels are passed in when the estimator instance invokes the function. mode is special: it is not passed explicitly; instead it is supplied implicitly when a method of the estimator instance is called. Concretely, suppose we first instantiate the estimator:
classifier = tf.estimator.Estimator(...)
Then, if we call the train method,
classifier.train(input_fn=lambda: train_input_fn(FILE_TRAIN, True, 500))
mode is automatically assigned the value tf.estimator.ModeKeys.TRAIN. Likewise, calling the predict and evaluate methods assigns it different values, as the table below shows:

| Estimator method | value of mode |
| --- | --- |
| train() | tf.estimator.ModeKeys.TRAIN |
| evaluate() | tf.estimator.ModeKeys.EVAL |
| predict() | tf.estimator.ModeKeys.PREDICT |
In this way, we can check the value of mode inside the model function to determine which method was called, and return different values accordingly.
Here is the complete model function:
def my_model(features, labels, mode, params):
    """DNN classifier with hidden layers defined by params['hidden_units']."""
    # Define the model structure.
    # Input layer.
    net = tf.feature_column.input_layer(features, params['feature_columns'])
    # Hidden layers, sized according to the 'hidden_units' param.
    for units in params['hidden_units']:
        net = tf.layers.dense(net, units=units, activation=tf.nn.relu)
    # Output layer: compute logits (1 per class).
    logits = tf.layers.dense(net, params['n_classes'], activation=None)
    # Compute predictions.
    predicted_classes = tf.argmax(logits, 1)
    # If the predict method was called:
    if mode == tf.estimator.ModeKeys.PREDICT:
        predictions = {
            'class_ids': predicted_classes[:, tf.newaxis],
            'probabilities': tf.nn.softmax(logits),
            'logits': logits,
        }
        return tf.estimator.EstimatorSpec(mode, predictions=predictions)
    # Compute loss.
    loss = tf.losses.sparse_softmax_cross_entropy(labels=labels, logits=logits)
    # Compute evaluation metrics.
    accuracy = tf.metrics.accuracy(labels=labels,
                                   predictions=predicted_classes,
                                   name='acc_op')
    metrics = {'accuracy': accuracy}
    tf.summary.scalar('accuracy', accuracy[1])
    # If the evaluate method was called:
    if mode == tf.estimator.ModeKeys.EVAL:
        return tf.estimator.EstimatorSpec(
            mode, loss=loss, eval_metric_ops=metrics)
    # If the train method was called:
    # Create training op.
    assert mode == tf.estimator.ModeKeys.TRAIN
    optimizer = tf.train.AdagradOptimizer(learning_rate=0.1)
    train_op = optimizer.minimize(loss, global_step=tf.train.get_global_step())
    return tf.estimator.EstimatorSpec(mode, loss=loss, train_op=train_op)
(2) Instantiation
classifier = tf.estimator.Estimator(
    model_fn=my_model,
    params={
        'feature_columns': my_feature_columns,
        # Two hidden layers of 10 nodes each.
        'hidden_units': [10, 10],
        # The model must choose between 3 classes.
        'n_classes': 3,
    })
(3) Train the model
# Train the Model.
classifier.train(
    input_fn=lambda: train_input_fn(train_x, train_y, args.batch_size),
    steps=args.train_steps)
Usually, we set the number of repeats in the input function to "infinite", i.e. we call repeat() with no argument (or with None). The number of training iterations is then controlled by the train_steps argument above.
Finally, since the model function defines monitoring of the model's accuracy, loss, and global_step/sec, we can run the following command in a terminal and then open http://localhost:6006 in a browser to watch these curves:
# Replace PATH with the actual path passed as model_dir
tensorboard --logdir=PATH
References:
- https://www.tensorflow.org/get_started/custom_estimators
- https://www.tensorflow.org/get_started/premade_estimators