Gist lucasastorian/7419bcdda5bcd40a414eee4c1601a146 (last active March 2, 2020 12:22)
Extract new features from all the layer-wise activations of a trained autoencoder
import lightgbm as lgb
import numpy as np
import pandas as pd
import tensorflow as tf
from sklearn.preprocessing import StandardScaler

# Assumes `autoencoder` (a trained Keras model), X_train_standard, X_val_standard,
# y_train and y_val are already defined.

# collect the output tensor of every layer whose name contains 'dense'
layer_outputs = [layer.output for layer in autoencoder.layers if 'dense' in layer.name]
activation_model = tf.keras.models.Model(inputs=autoencoder.input, outputs=layer_outputs)
# compute training and validation activations (one array per dense layer)
train_activations = activation_model.predict(X_train_standard)
val_activations = activation_model.predict(X_val_standard)
# concatenate the per-layer activations column-wise into one big DataFrame
X_train_autoencoder_activations = pd.DataFrame(np.concatenate(train_activations, axis=1))
X_val_autoencoder_activations = pd.DataFrame(np.concatenate(val_activations, axis=1))
# standardize, fitting on the training data only (the scaler returns NumPy arrays)
scaler = StandardScaler()
X_train_autoencoder_activations = scaler.fit_transform(X_train_autoencoder_activations)
X_val_autoencoder_activations = scaler.transform(X_val_autoencoder_activations)
# wrap the features and labels in LightGBM Datasets
dtrain = lgb.Dataset(X_train_autoencoder_activations, y_train, free_raw_data=False).construct()
dval = lgb.Dataset(X_val_autoencoder_activations, y_val, free_raw_data=False).construct()
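The shape bookkeeping in the snippet above can be checked without Keras or LightGBM. The following is a minimal NumPy-only sketch, not the gist author's code. The layer sizes (8 -> 4 -> 2 -> 4 -> 8), the random weights, and the helper names `dense_relu` and `layer_activations` are all hypothetical stand-ins for a trained autoencoder. It shows that concatenating the per-layer activations along `axis=1` gives one feature column per hidden unit, and that the validation features are scaled with the training statistics.

```python
import numpy as np

rng = np.random.default_rng(0)

def dense_relu(x, w, b):
    """One dense layer followed by a ReLU activation."""
    return np.maximum(x @ w + b, 0.0)

def layer_activations(x, weights):
    """Return the output of every layer, like a multi-output Keras model would."""
    outputs = []
    for w, b in weights:
        x = dense_relu(x, w, b)
        outputs.append(x)
    return outputs

# Hypothetical autoencoder layout (assumption): 8 -> 4 -> 2 -> 4 -> 8
sizes = [8, 4, 2, 4, 8]
weights = [(rng.normal(size=(m, n)), np.zeros(n))
           for m, n in zip(sizes[:-1], sizes[1:])]

# Random stand-ins for the standardized training and validation inputs
X_train = rng.normal(size=(100, 8))
X_val = rng.normal(size=(20, 8))

# Concatenate per-layer activations column-wise: 4 + 2 + 4 + 8 = 18 features
train_features = np.concatenate(layer_activations(X_train, weights), axis=1)
val_features = np.concatenate(layer_activations(X_val, weights), axis=1)

# Standardize using training statistics only
mean = train_features.mean(axis=0)
std = train_features.std(axis=0)
std[std == 0] = 1.0  # leave constant columns (dead units) unscaled
train_scaled = (train_features - mean) / std
val_scaled = (val_features - mean) / std

print(train_scaled.shape, val_scaled.shape)  # (100, 18) (20, 18)
```

Fitting the mean and standard deviation on the training split alone keeps validation information out of the features, which is the same reason the snippet above calls `fit_transform` on the training activations and only `transform` on the validation activations.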