This tutorial covers advanced techniques for tuning and optimizing models in Keras, including hyperparameter tuning, regularization methods, and learning rate schedulers.

Hyperparameter Tuning

Hyperparameter tuning is crucial for improving the performance of machine learning models. Keras provides several ways to perform hyperparameter tuning.

Grid search exhaustively searches through a specified subset of hyperparameters. It is straightforward but can be computationally expensive.

  from sklearn.model_selection import GridSearchCV
from keras.wrappers.scikit_learn import KerasClassifier
from keras.models import Sequential
from keras.layers import Dense

def create_model(optimizer='adam'):
    model = Sequential()
    model.add(Dense(12, input_dim=8, activation='relu'))
    model.add(Dense(1, activation='sigmoid'))
    model.compile(loss='binary_crossentropy', optimizer=optimizer, metrics=['accuracy'])
    return model

# Wrap the Keras model so scikit-learn's search utilities can use it
model = KerasClassifier(build_fn=create_model, verbose=0)
optimizers = ['rmsprop', 'adam']
epochs = [50, 100]
batches = [5, 10]
param_grid = dict(optimizer=optimizers, epochs=epochs, batch_size=batches)
grid = GridSearchCV(estimator=model, param_grid=param_grid, n_jobs=-1, cv=3)
grid_result = grid.fit(X, Y)  # X, Y: your training features and labels

print(f"Best: {grid_result.best_score_} using {grid_result.best_params_}")
  
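
Note that the keras.wrappers.scikit_learn module has been removed from recent Keras/TensorFlow releases. If the import above fails, the separate SciKeras package offers a near drop-in replacement; the sketch below assumes SciKeras is installed (pip install scikeras) and uses its model__ prefix to route the optimizer argument into create_model.

  from scikeras.wrappers import KerasClassifier

# Same create_model as above; SciKeras routes "model__*" parameters to it
model = KerasClassifier(model=create_model, verbose=0)
param_grid = dict(model__optimizer=['rmsprop', 'adam'], epochs=[50, 100], batch_size=[5, 10])
grid = GridSearchCV(estimator=model, param_grid=param_grid, n_jobs=-1, cv=3)
grid_result = grid.fit(X, Y)
  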

Random search samples hyperparameter combinations at random from specified lists or distributions. It is often more efficient than grid search, especially when only a few of the hyperparameters strongly affect performance.

  from sklearn.model_selection import RandomizedSearchCV

param_dist = dict(optimizer=optimizers, epochs=epochs, batch_size=batches)
random_search = RandomizedSearchCV(estimator=model, param_distributions=param_dist, n_iter=10, n_jobs=-1, cv=3)
random_result = random_search.fit(X, Y)

print(f"Best: {random_result.best_score_} using {random_result.best_params_}")
  
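
Random search pays off most when at least some hyperparameters are drawn from distributions rather than short fixed lists. A minimal sketch using scipy.stats (the ranges are illustrative, not recommendations):

  from scipy.stats import randint

param_dist = dict(optimizer=['rmsprop', 'adam'],
                  epochs=randint(20, 150),      # sample epochs uniformly from [20, 150)
                  batch_size=randint(5, 65))    # sample batch size uniformly from [5, 65)
random_search = RandomizedSearchCV(estimator=model, param_distributions=param_dist, n_iter=10, n_jobs=-1, cv=3)
random_result = random_search.fit(X, Y)
  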

Bayesian Optimization

Bayesian optimization models the function that maps hyperparameters to the objective value and uses this model to select the next set of hyperparameters to evaluate.

  from keras_tuner import BayesianOptimization
from keras.models import Sequential
from keras.layers import Dense

def build_model(hp):
    model = Sequential()
    model.add(Dense(units=hp.Int('units', min_value=32, max_value=512, step=32), activation='relu'))
    model.add(Dense(1, activation='sigmoid'))
    model.compile(optimizer=hp.Choice('optimizer', ['adam', 'rmsprop']), loss='binary_crossentropy', metrics=['accuracy'])
    return model

tuner = BayesianOptimization(build_model, objective='val_accuracy', max_trials=10, executions_per_trial=3)
tuner.search(X_train, y_train, epochs=50, validation_data=(X_val, y_val))

best_hps = tuner.get_best_hyperparameters(num_trials=1)[0]
print(f"Best hyperparameters: {best_hps.values}")
  
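
After the search finishes, a common follow-up step (shown below, assuming the same X_train/y_train and X_val/y_val splits as above) is to rebuild a fresh model from the best hyperparameters and train it in full.

  # Rebuild and retrain a model with the best hyperparameters found by the tuner
best_model = tuner.hypermodel.build(best_hps)
history = best_model.fit(X_train, y_train, epochs=50, validation_data=(X_val, y_val))
  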

Regularization Techniques

Regularization techniques help prevent overfitting, either by penalizing the model's complexity or by injecting noise into training.

L1 and L2 Regularization

L1 regularization adds a penalty proportional to the sum of the absolute values of the weights, which pushes many weights toward zero (sparsity), while L2 adds a penalty proportional to the sum of their squared values, which keeps weights small.

  from keras.regularizers import l1, l2
from keras.layers import Dense

# The 0.01 factor controls the strength of the penalty on each layer's weight matrix
model.add(Dense(64, input_dim=64, activation='relu', kernel_regularizer=l2(0.01)))
model.add(Dense(64, activation='relu', kernel_regularizer=l1(0.01)))
  
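
Keras also provides an l1_l2 regularizer when you want both penalties on the same layer; a minimal sketch:

  from keras.regularizers import l1_l2

# Apply both an L1 and an L2 penalty (elastic-net style) to one layer's weights
model.add(Dense(64, activation='relu', kernel_regularizer=l1_l2(l1=0.01, l2=0.01)))
  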

Dropout

Dropout randomly sets a fraction of input units to 0 at each update during training time, which helps prevent overfitting.

  from keras.layers import Dropout

# Randomly zero out 50% of the previous layer's outputs at each training step
model.add(Dropout(0.5))
  

Batch Normalization

Batch normalization normalizes a layer's inputs over each mini-batch to approximately zero mean and unit variance, then applies a learned scale and shift, which stabilizes training and often allows higher learning rates.

  from keras.layers import BatchNormalization

# Normalize the previous layer's outputs over each mini-batch
model.add(BatchNormalization())
  
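
The pieces above can be combined in a single model. A minimal sketch, assuming the same 8-feature binary-classification setup used in the earlier examples:

  from keras.models import Sequential
from keras.layers import Dense, Dropout, BatchNormalization
from keras.regularizers import l2

model = Sequential([
    Dense(64, input_dim=8, activation='relu', kernel_regularizer=l2(0.01)),
    BatchNormalization(),
    Dropout(0.5),
    Dense(64, activation='relu', kernel_regularizer=l2(0.01)),
    Dropout(0.5),
    Dense(1, activation='sigmoid'),
])
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
  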

Learning Rate Schedulers

Learning rate schedulers adjust the learning rate during training, which can help improve performance and convergence.

Step Decay

Step decay reduces the learning rate by a factor every few epochs.

  import numpy as np
from keras.callbacks import LearningRateScheduler

def step_decay(epoch):
    # Halve the learning rate every 10 epochs
    initial_lr = 0.1
    drop = 0.5
    epochs_drop = 10.0
    return initial_lr * (drop ** np.floor((1 + epoch) / epochs_drop))

lr_scheduler = LearningRateScheduler(step_decay)
model.fit(X_train, y_train, epochs=100, callbacks=[lr_scheduler])
  

Exponential Decay

Exponential decay reduces the learning rate exponentially over epochs.

  def exp_decay(epoch):
    # Decay the learning rate continuously: lr = initial_lr * exp(-k * epoch)
    initial_lr = 0.1
    k = 0.1
    return initial_lr * np.exp(-k * epoch)

lr_scheduler = LearningRateScheduler(exp_decay)
model.fit(X_train, y_train, epochs=100, callbacks=[lr_scheduler])
  
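
Keras also ships a built-in exponential-decay schedule that can be attached directly to the optimizer instead of going through a callback. Note that it decays the rate per training step rather than per epoch, so decay_steps below (an illustrative value) is typically derived from the number of batches per epoch.

  import tensorflow as tf

lr_schedule = tf.keras.optimizers.schedules.ExponentialDecay(
    initial_learning_rate=0.1,
    decay_steps=1000,   # illustrative; number of steps over which the rate decays by decay_rate
    decay_rate=0.9)
optimizer = tf.keras.optimizers.Adam(learning_rate=lr_schedule)
model.compile(optimizer=optimizer, loss='binary_crossentropy', metrics=['accuracy'])
  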

LearningRateScheduler Callback

Keras provides a LearningRateScheduler callback to implement custom learning rate schedules.

  import tensorflow as tf

def custom_lr_schedule(epoch, lr):
    # Keep the initial rate for the first 10 epochs, then decay it exponentially
    if epoch < 10:
        return lr
    else:
        return lr * tf.math.exp(-0.1)

lr_scheduler = LearningRateScheduler(custom_lr_schedule)
model.fit(X_train, y_train, epochs=100, callbacks=[lr_scheduler])
  

By applying these advanced tuning and optimization techniques, you can significantly improve the performance and efficiency of your Keras models.

Learn How To Build AI Projects

Now, if you are interested in upskilling in 2024 with AI development, check out these 6 advanced AI projects with Golang, where you will learn how to build with AI. Here’s the link.

Last updated 17 Aug 2024, 12:31 +0200