Tuesday, May 22, 2018

A Deep Learning Black Box Code

I mentioned in a previous blog post that for basic machine learning tasks such as classification or regression, we could replace traditional machine learning tools, such as SVM or random forest, with a fully-connected deep learning solution.  If we do such tasks often, we should be able to write a black-box solution based on a few simplifications: (1) all hidden layers have the same number of neurons; (2) dropout is applied only after the last hidden layer; (3) we use ReLU activations.  I have not really worked on this yet, but below is an outline of the Keras code, which automatically searches for a good NN configuration on the MNIST dataset.
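
For contrast, here is the kind of traditional baseline such a black box would replace.  This is a minimal scikit-learn sketch; the n_estimators=100 value is my arbitrary choice, not a tuned setting:

from keras.datasets import mnist
from sklearn.ensemble import RandomForestClassifier

# load MNIST and flatten the images, as the Keras script below also does
(train_images, train_labels), (test_images, test_labels) = mnist.load_data()
train_images = train_images.reshape(-1, 28*28).astype('float32')/255
test_images = test_images.reshape(-1, 28*28).astype('float32')/255

# a random forest baseline: no architecture to search, one call to fit
rf = RandomForestClassifier(n_estimators=100)
rf.fit(train_images, train_labels)
print('INFO> Random forest test accuracy:', rf.score(test_images, test_labels))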


from keras.datasets import mnist
from keras import models, layers
from keras.utils import to_categorical
import numpy as np

# load MNIST, flatten the 28x28 images into 784-dim vectors, and scale to [0, 1]
(train_images, train_labels), (test_images, test_labels) = mnist.load_data()
train_images = train_images.reshape(-1, 28*28).astype('float32')/255
# hold out the last 10,000 training samples as a validation set
validate_images = train_images[-10000:]
train_images = train_images[:-10000]
test_images = test_images.reshape(-1, 28*28).astype('float32')/255
# one-hot encode the labels, with the matching validation split
train_labels = to_categorical(train_labels)
validate_labels = train_labels[-10000:]
train_labels = train_labels[:-10000]
test_labels = to_categorical(test_labels)

def model(input_shape, n_classes, n_layer=3, nodes_in_layer=64, dropout=0.5):
    # build a fully-connected network: n_layer hidden ReLU layers of equal width,
    # optional dropout after the last hidden layer, and a softmax output layer
    net = models.Sequential()
    net.add(layers.Dense(nodes_in_layer, input_shape=input_shape, activation='relu'))
    for i in range(n_layer-1):
        net.add(layers.Dense(nodes_in_layer, activation='relu'))
    if dropout > 0:
        net.add(layers.Dropout(dropout))
    net.add(layers.Dense(n_classes, activation='softmax'))
    return net

input_shape = (28*28,)
n_classes = 10

# hyperparameter search space to sample from
n_layers = [1, 2, 3, 4]
nodes = [16, 32, 64, 128]
dropouts = [0, 0.2, 0.5]

# random search: try n_search random configurations and keep the best one
n_search = 10
best_acc = 0.0
for i in range(n_search):
    n_layer = n_layers[np.random.randint(len(n_layers))]
    nodes_in_layer = nodes[np.random.randint(len(nodes))]
    dropout = dropouts[np.random.randint(len(dropouts))]
    print("INFO> Test new model: n_layer=%d, nodes=%d, dropout=%.2f" % (n_layer, nodes_in_layer, dropout))
    net = model(input_shape, n_classes, n_layer, nodes_in_layer, dropout)
    net.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
    history = net.fit(train_images, train_labels, epochs=5, batch_size=64, validation_data=(validate_images, validate_labels))
    # note: we select on test accuracy here for simplicity; strictly speaking,
    # selection should use validation accuracy and the test set should only
    # score the final model
    loss, acc = net.evaluate(test_images, test_labels)
    if acc > best_acc:
        print('INFO> Better model found, accuracy=', acc)
        best_acc = acc
        net.save('my_best_model.h5')
    else:
        print('INFO> Keep existing model, new model has worse accuracy=', acc)

An example run of the above script gives the following output; the best model found is the one with n_layer=3, nodes=64, dropout=0.50 (test accuracy 0.9729):


INFO> Test new model: n_layer=1, nodes=16, dropout=0.50
INFO> Better model found, accuracy= 0.9193
INFO> Test new model: n_layer=3, nodes=32, dropout=0.50
INFO> Better model found, accuracy= 0.9577
INFO> Test new model: n_layer=4, nodes=64, dropout=0.50
INFO> Better model found, accuracy= 0.9707
INFO> Test new model: n_layer=2, nodes=64, dropout=0.50
INFO> Keep existing model, new model has worse accuracy= 0.9674
INFO> Test new model: n_layer=3, nodes=64, dropout=0.50
INFO> Better model found, accuracy= 0.9729
INFO> Test new model: n_layer=1, nodes=64, dropout=0.00
INFO> Keep existing model, new model has worse accuracy= 0.9674
INFO> Test new model: n_layer=2, nodes=32, dropout=0.00
INFO> Keep existing model, new model has worse accuracy= 0.9597
INFO> Test new model: n_layer=2, nodes=16, dropout=0.20
INFO> Keep existing model, new model has worse accuracy= 0.9383
INFO> Test new model: n_layer=4, nodes=64, dropout=0.20
INFO> Keep existing model, new model has worse accuracy= 0.9672
INFO> Test new model: n_layer=2, nodes=32, dropout=0.00
INFO> Keep existing model, new model has worse accuracy= 0.9583
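
Once the search finishes, the best model can be reloaded from my_best_model.h5 and used directly.  A minimal sketch, assuming test_images and test_labels are still defined as in the script above:

from keras.models import load_model

# reload the best model saved by the random search
best_net = load_model('my_best_model.h5')
loss, acc = best_net.evaluate(test_images, test_labels)
print('INFO> Reloaded model accuracy:', acc)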