Ray.tune Official Documentation
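Tuning hyperparameters is often the most expensive part of the machine learning workflow. Tune is built to address this problem and demonstrates an efficient, scalable solution for this pain point. Note that this example depends on TensorFlow 2.0.

Code: https://github.com/ray-project/ray/tree/master/python/ray/tune
Examples: https://github.com/ray-project/ray/tree/master/python/ray/tune/examples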
Documentation: Tune: Scalable Hyperparameter Tuning — Ray v1.6.0
Mailing List: https://groups.google.com/forum/#!forum/ray-dev
## If you are running on Google Colab, uncomment below to install the necessary dependencies
## before beginning the exercise.

# print("Setting up colab environment")
# !pip uninstall -y -q pyarrow
# !pip install -q https://s3-us-west-2.amazonaws.com/ray-wheels/latest/ray-0.8.0.dev5-cp36-cp36m-manylinux1_x86_64.whl
# !pip install -q ray[debug]

# # A hack to force the runtime to restart, needed to include the above dependencies.
# print("Done installing! Restarting via forced crash (this is not an issue).")
# import os
# os._exit(0)

## If you are running on Google Colab, please install TensorFlow 2.0 by uncommenting below.

# try:
#   # %tensorflow_version only exists in Colab.
#   %tensorflow_version 2.x
# except Exception:
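#   pass

This tutorial walks through a few key steps of hyperparameter tuning with Tune:

1. Visualize the data.
2. Create a model training procedure (using Keras).
3. Tune the model by adapting the training procedure above to use Tune.
4. Analyze the models created by Tune.

Note that this uses Tune's function-based API, which is mainly intended for prototyping. A later tutorial covers Tune's more powerful class-based Trainable API.
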
import numpy as np
np.random.seed(0)

import tensorflow as tf
try:
    tf.get_logger().setLevel("INFO")
except Exception as exc:
    print(exc)
import warnings
warnings.simplefilter("ignore")

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.optimizers import SGD, Adam
from tensorflow.keras.callbacks import ModelCheckpoint

import ray
from ray import tune
from ray.tune.examples.utils import get_iris_data

import inspect
import pandas as pd
import matplotlib.pyplot as plt
plt.style.use("ggplot")
%matplotlib inline

Visualize your data
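First, let's look at the distribution of the dataset.

The iris dataset consists of petal and sepal measurements for 3 different types of iris (Setosa, Versicolour, and Virginica), stored in a 150x4 NumPy array.

Rows are samples; columns are sepal length, sepal width, petal length, and petal width. The goal of this tutorial is to provide a model that can accurately predict the true label given a 4-tuple of sepal length, sepal width, petal length, and petal width.
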
from sklearn.datasets import load_iris

iris = load_iris()
true_data = iris["data"]
true_label = iris["target"]
names = iris["target_names"]
feature_names = iris["feature_names"]

def plot_data(X, y):
    # Visualize the data sets
    plt.figure(figsize=(16, 6))
    # Left panel: sepal length vs. sepal width, one series per class.
    plt.subplot(1, 2, 1)
    for target, target_name in enumerate(names):
        X_plot = X[y == target]
        plt.plot(X_plot[:, 0], X_plot[:, 1], linestyle="none", marker="o", label=target_name)
    plt.xlabel(feature_names[0])
    plt.ylabel(feature_names[1])
    plt.axis("equal")
    plt.legend()

    # Right panel: petal length vs. petal width, one series per class.
    plt.subplot(1, 2, 2)
    for target, target_name in enumerate(names):
        X_plot = X[y == target]
        plt.plot(X_plot[:, 2], X_plot[:, 3], linestyle="none", marker="o", label=target_name)
    plt.xlabel(feature_names[2])
    plt.ylabel(feature_names[3])
    plt.axis("equal")
    plt.legend()

plot_data(true_data, true_label)

Create a model training procedure (using Keras)
Now let's define a function that takes some hyperparameters and returns a model that can be used for training.

def create_model(learning_rate, dense_1, dense_2):
    assert learning_rate > 0 and dense_1 > 0 and dense_2 > 0, "Did you set the right configuration?"
    model = Sequential()
    # Two hidden layers whose widths are hyperparameters, then a 3-class softmax output.
    model.add(Dense(int(dense_1), input_shape=(4,), activation="relu", name="fc1"))
    model.add(Dense(int(dense_2), activation="relu", name="fc2"))
    model.add(Dense(3, activation="softmax", name="output"))
    optimizer = SGD(lr=learning_rate)
    model.compile(optimizer, loss="categorical_crossentropy", metrics=["accuracy"])
    return model

Below is a function that uses the create_model function to train a model and returns the trained model.

def train_on_iris():
    train_x, train_y, test_x, test_y = get_iris_data()
    model = create_model(learning_rate=0.1, dense_1=2, dense_2=2)
    # This saves the top model. `accuracy` is only available in TF2.0.
    checkpoint_callback = ModelCheckpoint(
        "model.h5", monitor="accuracy", save_best_only=True, save_freq=2)

    # Train the model
    model.fit(
        train_x, train_y,
        validation_data=(test_x, test_y),
        verbose=0, batch_size=10, epochs=20, callbacks=[checkpoint_callback])
    return model

Let's quickly train the model on the dataset. The accuracy should be fairly low.

original_model = train_on_iris()  # This trains the model and returns it.
train_x, train_y, test_x, test_y = get_iris_data()
original_loss, original_accuracy = original_model.evaluate(test_x, test_y, verbose=0)
print("Loss is {:0.4f}".format(original_loss))
print("Accuracy is {:0.4f}".format(original_accuracy))

Integrate with Tune
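Now let's use Tune to optimize a model that learns to classify iris. This happens in two parts: modifying the training function to support Tune, and then configuring Tune.

Let's first define a callback function that reports intermediate training progress back to Tune.
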
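Before the Keras version, the reporting pattern itself can be shown without Keras or Ray. The following is a minimal sketch, not Tune's implementation: `EpochReporter` and `fake_fit` are hypothetical stand-ins for a Keras callback and for `model.fit`, the metrics are made up, and `report` is any callable that accepts the per-epoch values.

```python
from typing import Callable, Dict, List


class EpochReporter:
    """Hypothetical stand-in for a Keras callback that forwards metrics."""

    def __init__(self, report: Callable[..., None]) -> None:
        self.iteration = 0
        self.report = report

    def on_epoch_end(self, epoch: int, logs: Dict[str, float]) -> None:
        # Count the epoch, then hand the metrics to the reporting function.
        self.iteration += 1
        self.report(
            iteration=self.iteration,
            mean_accuracy=logs.get("accuracy"),
            mean_loss=logs.get("loss"),
        )


def fake_fit(epochs: int, callback: EpochReporter) -> None:
    """Hypothetical training loop: made-up metrics, one callback call per epoch."""
    for epoch in range(epochs):
        logs = {"loss": 1.0 / (epoch + 1), "accuracy": epoch / epochs}
        callback.on_epoch_end(epoch, logs)


received: List[dict] = []
fake_fit(epochs=3, callback=EpochReporter(lambda **metrics: received.append(metrics)))
print(received[-1])
```

The training loop knows nothing about who consumes the metrics; it only calls the hook once per epoch. That separation is what lets a tuner observe progress while training is still running.
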
import tensorflow.keras as keras
from ray.tune import track


class TuneReporterCallback(keras.callbacks.Callback):
    """Tune Callback for Keras.

    The callback is invoked every epoch.
    """

    def __init__(self, logs={}):
        self.iteration = 0
        super(TuneReporterCallback, self).__init__()

    def on_epoch_end(self, batch, logs={}):
        self.iteration += 1
        # Forward the Keras metrics for this epoch to Tune.
        track.log(keras_info=logs, mean_accuracy=logs.get("accuracy"), mean_loss=logs.get("loss"))

Integration part 1: Modify the training function
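Instructions: Follow the next 2 steps to modify the train_on_iris function so that it supports Tune.

1. Change the signature of the function to receive a dictionary of hyperparameters. The function will be called on Ray.
   def tune_iris(config)
2. Pass the config values into create_model:
   model = create_model(learning_rate=config["lr"], dense_1=config["dense_1"], dense_2=config["dense_2"])
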
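The two steps amount to a general pattern: the training function takes a single dictionary and unpacks it into the model factory. A minimal sketch of that pattern follows, assuming a hypothetical `create_stub_model` that only validates and records its inputs in place of the real Keras model; `train_stub` is likewise an illustrative name.

```python
def create_stub_model(learning_rate, dense_1, dense_2):
    """Hypothetical stand-in for create_model: validates and records its inputs."""
    assert learning_rate > 0 and dense_1 > 0 and dense_2 > 0
    # Layer widths are cast with int(), as create_model does for Dense layers.
    return {"learning_rate": learning_rate, "layers": [int(dense_1), int(dense_2), 3]}


def train_stub(config):
    """Step 1: accept one `config` dict. Step 2: unpack it into the model factory."""
    return create_stub_model(
        learning_rate=config["lr"],
        dense_1=config["dense_1"],
        dense_2=config["dense_2"],
    )


model_spec = train_stub({"lr": 0.1, "dense_1": 4.7, "dense_2": 4})
print(model_spec)
```

Because layer widths are cast with `int()`, a fractional value such as 4.7 is truncated to 4. This matters once the widths are sampled from a continuous range.
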
def tune_iris():  # TODO: Change me.
    train_x, train_y, test_x, test_y = get_iris_data()
    model = create_model(learning_rate=0, dense_1=0, dense_2=0)  # TODO: Change me.
    checkpoint_callback = ModelCheckpoint(
        "model.h5", monitor="loss", save_best_only=True, save_freq=2)

    # Enable Tune to make intermediate decisions by using a Tune Callback hook. This is Keras specific.
    callbacks = [checkpoint_callback, TuneReporterCallback()]

    # Train the model
    model.fit(
        train_x, train_y,
        validation_data=(test_x, test_y),
        verbose=0, batch_size=10, epochs=20, callbacks=callbacks)

# getfullargspec replaces getargspec, which was removed in Python 3.11.
assert len(inspect.getfullargspec(tune_iris).args) == 1, "The `tune_iris` function needs to take in the arg `config`."

print("Test-running to make sure this function will run correctly.")
tune.track.init()  # For testing purposes only.
tune_iris({"lr": 0.1, "dense_1": 4, "dense_2": 4})
print("Success!")

Part 2: Configure Tune to tune the hyperparameters
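Instructions: Follow the next 2 steps to configure Tune to identify the top hyperparameters.

1. Specify the hyperparameter space.
   hyperparameter_space = {"lr": tune.loguniform(0.001, 0.1), "dense_1": tune.uniform(2, 128), "dense_2": tune.uniform(2, 128)}
2. Increase the number of samples. The more trials we evaluate, the better our chances of selecting a good model.
   num_samples = 20

FAQ: How does parallelism work in Tune?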
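Setting num_samples runs a total of 20 trials (hyperparameter configuration samples). However, not all of them can run at once. The maximum training concurrency is the number of CPU cores on the machine you are running on. On a 2-core machine, 2 models are trained concurrently. When one finishes, a new training process starts with a new hyperparameter configuration sample.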
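The arithmetic behind this can be sketched in a few lines. The helper below is illustrative only (`schedule_summary` is a hypothetical name, not a Tune function), and it assumes each trial requests one CPU and that all trials take equally long, which real trials generally do not.

```python
import math


def schedule_summary(num_samples, num_cpus, cpus_per_trial=1):
    """Return (max concurrent trials, minimum number of scheduling rounds).

    Assumes every trial requests `cpus_per_trial` CPUs and runs equally long.
    """
    concurrent = num_cpus // cpus_per_trial
    if concurrent < 1:
        raise ValueError("not enough CPUs to run a single trial")
    rounds = math.ceil(num_samples / concurrent)
    return concurrent, rounds


# 20 trials on a 2-core machine: 2 at a time, at least 10 rounds.
print(schedule_summary(num_samples=20, num_cpus=2))
```

Under these assumptions, going from 2 to 8 cores cuts the 20 trials from 10 rounds to 3.
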
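Each trial runs in a new Python process. The Python process is killed when the trial finishes.

FAQ: How do I debug things in Tune?

An error file column will show up in the output. Run the cell below with the error file path to diagnose your issue.

! cat /home/ubuntu/tune_iris/tune_iris_c66e1100_2019-10-09_17-13-24x_swb9xs/error_2019-10-09_17-13-29.txt

Launch the Tune hyperparameter search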
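Before filling in the search space, it may help to see how a log-uniform range differs from a uniform one. The sketch below uses NumPy to imitate the two distributions; it is not Tune's sampling code, and `sample_loguniform` and `sample_uniform` are hypothetical helper names.

```python
import numpy as np

rng = np.random.default_rng(5)


def sample_loguniform(low, high, size):
    """Draw values whose logarithm is uniformly distributed on [log(low), log(high))."""
    return np.exp(rng.uniform(np.log(low), np.log(high), size))


def sample_uniform(low, high, size):
    """Draw values uniformly distributed on [low, high)."""
    return rng.uniform(low, high, size)


lr = sample_loguniform(0.001, 0.1, 10_000)
units = sample_uniform(2, 128, 10_000)

# Roughly half of the log-uniform draws fall below the geometric midpoint 0.01,
# whereas a plain uniform draw on [0.001, 0.1) would put only about 9% there.
print("share of lr below 0.01:", np.mean(lr < 0.01))
print("share of units below 65:", np.mean(units < 65))
```

A log-uniform range gives each order of magnitude equal weight, which is why it is a common choice for learning rates. Layer widths vary on a linear scale, so a uniform range is a reasonable fit for them.
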
# This seeds the hyperparameter sampling.
import numpy as np; np.random.seed(5)
hyperparameter_space = {}  # TODO: Fill me out.
num_samples = 1  # TODO: Fill me out.

####################################################################################################
################ This is just a validation function for tutorial purposes only. ####################
HP_KEYS = ["lr", "dense_1", "dense_2"]
assert all(key in hyperparameter_space for key in HP_KEYS), (
    "The hyperparameter space is not fully designated. It must include all of {}".format(HP_KEYS))
######################################################################################################

ray.shutdown()  # Restart Ray defensively in case the ray connection is lost.
ray.init(log_to_driver=False)
# We clean out the logs before running for a clean visualization later.
! rm -rf ~/ray_results/tune_iris

analysis = tune.run(
    tune_iris,
    verbose=1,
    config=hyperparameter_space,
    num_samples=num_samples)

assert len(analysis.trials) == 20, "Did you set the correct number of samples?"

Analyze the best tuned model
Let's compare the true labels with the classified labels.

_, _, test_data, test_labels = get_iris_data()
plot_data(test_data, test_labels.argmax(1))

# Obtain the directory where the best model is saved.
print("You can use any of the following columns to get the best model: \n{}.".format(
    [k for k in analysis.dataframe() if k.startswith("keras_info")]))
print("=" * 10)
logdir = analysis.get_best_logdir("keras_info/val_loss", mode="min")
# We saved the model as `model.h5` in the logdir of the trial.
from tensorflow.keras.models import load_model
tuned_model = load_model(logdir + "/model.h5")

tuned_loss, tuned_accuracy = tuned_model.evaluate(test_data, test_labels, verbose=0)
print("Loss is {:0.4f}".format(tuned_loss))
print("Tuned accuracy is {:0.4f}".format(tuned_accuracy))
print("The original un-tuned model had an accuracy of {:0.4f}".format(original_accuracy))
predicted_label = tuned_model.predict(test_data)
plot_data(test_data, predicted_label.argmax(1))

We can compare the performance of the best model by visualizing its predictions against the ground truth.
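The comparison relies on turning one-hot labels and predicted class probabilities into class indices with `argmax`, then comparing them element-wise. A small sketch with made-up values for four samples (the arrays are illustrative, not the tutorial's data):

```python
import numpy as np

# Made-up one-hot labels and predicted class probabilities for four samples.
labels_one_hot = np.array([[1, 0, 0], [0, 1, 0], [0, 0, 1], [0, 1, 0]])
predicted_probs = np.array([
    [0.8, 0.1, 0.1],
    [0.2, 0.5, 0.3],
    [0.3, 0.4, 0.3],  # predicted class 1, true class 2
    [0.1, 0.7, 0.2],
])

true_class = labels_one_hot.argmax(1)        # index of the 1 in each row
predicted_class = predicted_probs.argmax(1)  # most probable class per row
is_correct = true_class == predicted_class   # boolean mask, one entry per sample

print(is_correct)
print(is_correct.mean())
```

Since `False == 0` and `True == 1`, a boolean mask like this can be matched against the indices of a two-entry list such as `["Incorrect", "Correct"]`. The mean of the mask is the accuracy.
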
def plot_comparison(X, y):
    # Visualize the data sets; y is a boolean mask (False = incorrect, True = correct).
    plt.figure(figsize=(16, 6))
    plt.subplot(1, 2, 1)
    for target, target_name in enumerate(["Incorrect", "Correct"]):
        X_plot = X[y == target]
        plt.plot(X_plot[:, 0], X_plot[:, 1], linestyle="none", marker="o", label=target_name)
    plt.xlabel(feature_names[0])
    plt.ylabel(feature_names[1])
    plt.axis("equal")
    plt.legend()

    plt.subplot(1, 2, 2)
    for target, target_name in enumerate(["Incorrect", "Correct"]):
        X_plot = X[y == target]
        plt.plot(X_plot[:, 2], X_plot[:, 3], linestyle="none", marker="o", label=target_name)
    plt.xlabel(feature_names[2])
    plt.ylabel(feature_names[3])
    plt.axis("equal")
    plt.legend()

plot_comparison(test_data, test_labels.argmax(1) == predicted_label.argmax(1))

Extra: Use TensorBoard for results
You can use TensorBoard to view trial performance. If the graphs do not load, click "Toggle All Runs".

%load_ext tensorboard