brain_age_prediction.utils package

Submodules

brain_age_prediction.utils.chek_model_type module

Function to check that the ‘model_type’ argument used by many functions has an accepted value. It must be one of the strings ‘structural’, ‘functional’, or ‘joint’.

brain_age_prediction.utils.chek_model_type.check_model_type(model_type)

Check if the specified ‘model_type’ is valid.

Parameters:

model_type (str) – String indicating the model type.

Raises:

AssertionError – If ‘model_type’ is not one of ‘structural’, ‘functional’, or ‘joint’.

This function ensures that the provided ‘model_type’ is one of the allowed values: ‘structural’, ‘functional’, or ‘joint’. If ‘model_type’ is not valid, an AssertionError is raised with a descriptive error message.
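
A minimal sketch of how such a check might look, assuming a plain assert against a tuple of allowed values. The constant name ALLOWED_MODEL_TYPES and the wording of the error message are illustrative and not taken from the package.

```python
# Hypothetical re-implementation for illustration; not the package's source.
ALLOWED_MODEL_TYPES = ("structural", "functional", "joint")  # assumed constant name


def check_model_type(model_type):
    """Raise AssertionError unless model_type is one of the allowed strings."""
    assert model_type in ALLOWED_MODEL_TYPES, (
        f"model_type must be one of {ALLOWED_MODEL_TYPES}, got {model_type!r}"
    )


check_model_type("joint")  # valid value: returns None without raising

try:
    check_model_type("anatomical")  # invalid value: raises AssertionError
except AssertionError as err:
    print(err)
```

Because the check relies on assert, it is skipped when Python runs with the -O flag.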

brain_age_prediction.utils.custom_models module

Functions needed to create the neural network (NN) models.

brain_age_prediction.utils.custom_models.create_functional_model(dropout, hidden_neurons, hidden_layers)

Create and compile a functional model.

Parameters:
  • dropout (float) – The dropout rate for regularization.

  • hidden_neurons (int) – The number of neurons in each hidden layer.

  • hidden_layers (int) – The number of hidden layers.

Returns:

The compiled model with Mean Absolute Error (MAE) as the loss function and Adam optimizer with a learning rate of 0.001.

Return type:

keras.models.Sequential

This function constructs a sequential neural network model with dropout regularization, batch normalization, and specified hidden layers and neurons. The output layer has a linear activation function, and the model is compiled using the Mean Absolute Error (MAE) loss function and the Adam optimizer.

brain_age_prediction.utils.custom_models.create_joint_model(dropout, hidden_neurons, hidden_layers, model_selection=False)

Create and compile a joint model that combines structural and functional features.

Parameters:
  • dropout (float) – The dropout rate for regularization.

  • hidden_neurons (int) – The number of neurons in each hidden layer.

  • hidden_layers (int) – The number of hidden layers.

  • model_selection (bool) – If True, the model is created for model selection purposes.

Returns:

The compiled joint model with Mean Absolute Error (MAE) as the loss function and Adam optimizer with a learning rate of 0.01.

Return type:

keras.models.Model

Create the joint model. It consists of two branches, which are essentially the structural and functional models, each using the hyperparameters individually selected for it during model selection. The two branches are joined by a concatenate layer. After the concatenate layer come ‘hidden_layers’ hidden layers, each with ‘hidden_neurons’ units, together with dropout at rate ‘dropout’ and batch normalisation.

The boolean ‘model_selection’ flag indicates whether the model is being created for model selection. This must be specified because the scikit-learn wrappers used for model selection do not support multi-input models. In that case a workaround is used: the first layer is a single input layer that takes the concatenated structural and functional features, which Lambda layers then split into the two branches. From that point on, the structure is the same as described above.

The model is compiled using the Mean Absolute Error (MAE) loss function and the Adam optimizer with a learning rate of 0.01.

brain_age_prediction.utils.custom_models.create_structural_model(dropout, hidden_neurons, hidden_layers)

Creates and compiles the structural model.

Parameters:
  • dropout (float) – The dropout rate for regularization.

  • hidden_neurons (int) – The number of neurons in each hidden layer.

  • hidden_layers (int) – The number of hidden layers.

Returns:

The compiled model with Mean Absolute Error (MAE) as the loss function and Adam optimizer with a learning rate of 0.001.

Return type:

keras.models.Sequential

This function constructs a sequential neural network model with dropout regularization, batch normalization, and specified hidden layers and neurons. The output layer has a linear activation function, and the model is compiled using the Mean Absolute Error (MAE) loss function and the Adam optimizer.

brain_age_prediction.utils.custom_models.load_best_hyperparams()

Return the best hyperparameters found for both the structural and functional models.

Returns:

Tuple containing the best hyperparameters for the structural and functional models.

Return type:

tuple

This function loads and returns the best hyperparameters found for both the structural and functional models. The hyperparameters are stored in separate pickle files.
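
The storage pattern described above can be sketched with the standard pickle module. This is only an illustration: the file names, the dictionary keys and values, the helper save_hyperparams, and the directory argument are all assumptions (the documented function takes no arguments and reads from paths defined by the package).

```python
import pickle
import tempfile
from pathlib import Path


def save_hyperparams(params, path):
    """Write a dictionary of hyperparameters to a pickle file (hypothetical helper)."""
    with open(path, "wb") as handle:
        pickle.dump(params, handle)


def load_best_hyperparams(directory):
    """Return (structural, functional) hyperparameters read from two pickle files."""
    loaded = []
    for name in ("structural", "functional"):
        # Hypothetical file naming scheme, chosen only for this example.
        with open(Path(directory) / f"{name}_hyperparams.pkl", "rb") as handle:
            loaded.append(pickle.load(handle))
    return tuple(loaded)


with tempfile.TemporaryDirectory() as tmp:
    # Placeholder values, not results from the actual model selection.
    save_hyperparams({"dropout": 0.2, "hidden_neurons": 30, "hidden_layers": 2},
                     Path(tmp) / "structural_hyperparams.pkl")
    save_hyperparams({"dropout": 0.5, "hidden_neurons": 10, "hidden_layers": 1},
                     Path(tmp) / "functional_hyperparams.pkl")
    structural_params, functional_params = load_best_hyperparams(tmp)
```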

brain_age_prediction.utils.custom_models.load_model(model_type)

Load a saved Keras model and compile it.

Parameters:

model_type (str) – The type of model to load (‘structural’, ‘functional’, or ‘joint’).

Returns:

The compiled Keras model.

Return type:

keras.models.Model

This function loads a saved Keras model and its weights based on the provided ‘model_type’ (either ‘structural’, ‘functional’, or ‘joint’). The model is then compiled using the Mean Absolute Error (MAE) loss function and the Adam optimizer with a learning rate of 0.001.

Note: Make sure that the saved model files are present in the specified paths.

brain_age_prediction.utils.loading_data module

Functions to load the datasets and preprocess the data.

brain_age_prediction.utils.loading_data.load_dataset(dataset_name)

Load the dataset as pandas dataframes and return two different dataframes:
  • One for the TD group

  • One for the ASD group

Parameters:

dataset_name (str) – The name of the dataset.

Returns:

Two pandas dataframes for the TD and ASD groups, respectively.

Return type:

tuple

This function loads a dataset from a CSV file into a pandas dataframe. It then separates the dataframe into two based on the diagnostic group (TD or ASD). The resulting dataframes are returned as a tuple.
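
The group split can be illustrated without the real data files. The sketch below uses the standard csv module instead of pandas to stay dependency-free, so it returns lists of dictionaries rather than dataframes. The column names, the ‘TD’/‘ASD’ labels, and the helper name split_by_group are hypothetical.

```python
import csv
import io

# Tiny in-memory stand-in for the CSV file; column names and group labels
# are made up for this example.
SAMPLE_CSV = """subject_id,group,age
s01,TD,12.5
s02,ASD,9.0
s03,TD,15.2
"""


def split_by_group(csv_text, group_column="group"):
    """Return (td_rows, asd_rows) as two lists of dictionaries."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    td_rows = [row for row in rows if row[group_column] == "TD"]
    asd_rows = [row for row in rows if row[group_column] == "ASD"]
    return td_rows, asd_rows


td, asd = split_by_group(SAMPLE_CSV)
```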

brain_age_prediction.utils.loading_data.load_train_test(split=0.3, seed=7)

Load both the structural and functional datasets. Apply preprocessing to input features. Split the data into train and test according to the “split” variable.

Parameters:
  • split (float) – The ratio of the dataset to include in the test split.

  • seed (int) – Seed passed as the random state of the train_test_split function (for reproducibility).

Returns:

Tuple containing training and test sets for structural and functional data.

Return type:

tuple

This function loads both structural and functional datasets, preprocesses the input features using the preprocessing function, and then splits the data into training and testing sets.
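
The role of the two arguments can be sketched with the standard library alone. This is not scikit-learn's train_test_split and will not reproduce its partitions; it only shows how a fixed seed makes the split repeatable. The helper name split_indices is hypothetical.

```python
import math
import random


def split_indices(n_samples, split=0.3, seed=7):
    """Return (train_idx, test_idx) for a seeded random split of range(n_samples)."""
    indices = list(range(n_samples))
    # A local generator keeps the global random state untouched.
    random.Random(seed).shuffle(indices)
    n_test = math.ceil(split * n_samples)
    return sorted(indices[n_test:]), sorted(indices[:n_test])


train_idx, test_idx = split_indices(10)
same_train, same_test = split_indices(10)  # same seed, same partition
```

Applying the same index lists to several feature arrays keeps their rows aligned across the train and test sets.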

brain_age_prediction.utils.loading_data.preprocessing(df)

Take a pandas dataframe as input and return a numpy array of the features used as input for the learning process. The features are also scaled with a RobustScaler.

Parameters:

df (pandas.DataFrame) – The input dataframe.

Returns:

The preprocessed numpy array of features.

Return type:

numpy.ndarray

This function extracts relevant features from the input dataframe, converts them to a numpy array, and applies RobustScaler for preprocessing to handle outliers.
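
To show why this kind of scaling copes with outliers, here is a standard-library sketch of the idea behind robust scaling: centre each feature on its median and divide by its interquartile range (IQR). It is not the scikit-learn implementation, and the data are made up.

```python
import statistics


def robust_scale(values):
    """Centre on the median and divide by the interquartile range."""
    median = statistics.median(values)
    # The "inclusive" method interpolates linearly between data points.
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1  # would be zero for a constant feature, which this sketch does not handle
    return [(value - median) / iqr for value in values]


feature = [1.0, 2.0, 3.0, 4.0, 100.0]  # one made-up feature with an outlier
scaled = robust_scale(feature)
```

The outlier changes neither the median nor the quartiles here, so the four ordinary values are scaled exactly as they would be without it.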

brain_age_prediction.utils.model_selection_utils module

Functions used to perform Model Selection

brain_age_prediction.utils.model_selection_utils.model_selection(search_space, x_train, y_train, model_type, max_epochs=300)

Perform k-fold cross-validation for model selection using grid search.

Parameters:
  • search_space (list) – A list of lists defining the combination of possible hyperparameters.

  • x_train (numpy.ndarray) – Input features.

  • y_train (numpy.ndarray) – Targets.

  • model_type (str) – The type of model to perform model selection for (‘structural’, ‘functional’, or ‘joint’).

  • max_epochs (int) – Maximum number of training epochs (default=300).

This function performs k-fold cross-validation using grid search to find the optimal hyperparameters for the specified model type. It saves the optimal hyperparameters to a file.
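
The procedure can be sketched in plain Python. This is an illustration under several assumptions: a toy scoring function stands in for training a Keras model, the folds are contiguous, k defaults to 5, and the names k_fold_indices, grid_search, and toy_fit_and_score are hypothetical. The package itself relies on scikit-learn wrappers.

```python
import itertools
import statistics


def k_fold_indices(n_samples, k):
    """Yield (train_idx, val_idx) pairs for k contiguous folds."""
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        val_idx = list(range(start, start + size))
        train_idx = [i for i in range(n_samples) if i < start or i >= start + size]
        yield train_idx, val_idx
        start += size


def grid_search(search_space, x, y, fit_and_score, k=5):
    """Return (best_params, best_score) over all combinations; lower score is better."""
    best_params, best_score = None, float("inf")
    for params in itertools.product(*search_space):
        scores = [fit_and_score(params, x, y, train, val)
                  for train, val in k_fold_indices(len(x), k)]
        mean_score = statistics.mean(scores)
        if mean_score < best_score:
            best_params, best_score = params, mean_score
    return best_params, best_score


def toy_fit_and_score(params, x, y, train_idx, val_idx):
    """Toy stand-in for a model: predict scale * mean(train targets) + offset, return MAE."""
    scale, offset = params
    prediction = scale * statistics.mean(y[i] for i in train_idx) + offset
    return statistics.mean(abs(y[i] - prediction) for i in val_idx)


x = list(range(10))          # features are unused by the toy model
y = [10.0] * 10              # constant targets make the best combination obvious
search_space = [[0.5, 1.0], [0.0, 2.0]]  # one list of candidates per hyperparameter
best_params, best_score = grid_search(search_space, x, y, toy_fit_and_score)
```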

brain_age_prediction.utils.model_selection_utils.print_grid_search_results(grid_result, filename)

Prints the results of the grid search and saves them to a file.

Parameters:
  • grid_result – The output of grid_search.fit.

  • filename (str) – The name to assign to the saved file.

This function prints the best score and parameters found during a grid search, along with the mean and standard deviation of test scores for each combination of hyperparameters. It also saves the best hyperparameters to a file.

brain_age_prediction.utils.stats_utils module

Useful statistical tools

brain_age_prediction.utils.stats_utils.correlation(x, y, permutation_number=1000)

Calculate Pearson correlation coefficient and its p-value between two arrays.

Parameters:
  • x (array-like) – First array for correlation.

  • y (array-like) – Second array for correlation.

  • permutation_number (int) – Number of permutations for computing the empirical p-value. Default is 1000.

Returns:

Tuple containing the Pearson correlation coefficient and its empirical p-value.

Return type:

tuple

This function calculates the Pearson correlation coefficient (r) between two arrays ‘x’ and ‘y’. Additionally, it performs a permutation test to estimate the empirical p-value of the correlation coefficient.
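
A minimal standard-library sketch of a permutation test for the correlation coefficient follows. Two points are assumptions, since the description above does not specify them: the test is two-sided (it compares absolute values of r), and the seed argument is added here only to make the example repeatable.

```python
import math
import random


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def correlation(x, y, permutation_number=1000, seed=0):
    """Return (r, empirical p-value); the two-sided comparison is an assumption."""
    rng = random.Random(seed)
    observed = pearson_r(x, y)
    shuffled = list(y)
    hits = 0
    for _ in range(permutation_number):
        rng.shuffle(shuffled)  # break the pairing between x and y
        if abs(pearson_r(x, shuffled)) >= abs(observed):
            hits += 1
    return observed, hits / permutation_number


x = [float(i) for i in range(10)]
y = [2.0 * value + 1.0 for value in x]  # perfectly linear, made-up data
r, p_value = correlation(x, y, permutation_number=500)
```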

brain_age_prediction.utils.stats_utils.empirical_p_value(group1, group2, num_permutations=100000)

Calculate the empirical p-value for the difference in means between two groups using permutation testing.

Parameters:
  • group1 (array-like) – Data for the first group.

  • group2 (array-like) – Data for the second group.

  • num_permutations (int) – Number of permutations to perform for the permutation test. Default is 100,000.

Returns:

Empirical p-value for the observed difference in means.

Return type:

float

This function performs a permutation test to estimate the empirical p-value for the difference in means between two groups. The observed test statistic is the difference in means between group2 and group1.

The function generates permuted test statistics by randomly permuting the data between the two groups and calculates the difference in means for each permutation. The empirical p-value is then calculated as the proportion of permuted differences in means that are greater than or equal to the observed difference in means.
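
The steps above can be sketched with the standard library as follows. The seed argument is an addition made for repeatability, and the data are made up; the example calls use fewer permutations than the documented default to keep them fast.

```python
import random


def mean(values):
    return sum(values) / len(values)


def empirical_p_value(group1, group2, num_permutations=100000, seed=0):
    """One-sided permutation test for mean(group2) - mean(group1)."""
    rng = random.Random(seed)
    observed = mean(group2) - mean(group1)
    pooled = list(group1) + list(group2)
    n1 = len(group1)
    count = 0
    for _ in range(num_permutations):
        rng.shuffle(pooled)  # reassign observations to the two groups at random
        permuted = mean(pooled[n1:]) - mean(pooled[:n1])
        if permuted >= observed:
            count += 1
    return count / num_permutations


# Clearly separated groups: few permutations reach the observed difference.
p_separated = empirical_p_value([1, 2, 3, 4, 5], [11, 12, 13, 14, 15],
                                num_permutations=2000)

# Identical groups: the observed difference is zero, so many permutations reach it.
p_identical = empirical_p_value([1, 2, 3, 4], [1, 2, 3, 4], num_permutations=2000)
```

Because only permuted differences greater than or equal to the observed one are counted, the test is one-sided: it asks whether group2 has a larger mean than group1.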

Module contents