Group 7 Project Final Report

Group

Name | Proposal Contributions
--- | ---
Yuval Mazor | Methods, Problem Definition, Model Construction
Ruohan Feng | Background and Data Description, Data Preprocessing
Jianwei Jia | Potential Results and Discussion, Data Preprocessing
Wei-Hsing Huang | Methods, Data Description, Model Construction

Problem Definition

The following study investigates how prior knowledge and expectations influence perceptual judgments in human subjects during a discrimination task. The researchers assume that prior knowledge shapes perception by imposing contextual constraints on sensory inputs, which enhances the speed and accuracy of detecting stimuli (Dunovan & Wheeler, 2018). This has been observed in fMRI studies of category-selective regions of the inferior temporal cortex (Tremel & Wheeler, 2015). Our aim in the current study is to use machine learning algorithms to examine whether prior knowledge influences people’s decision-making in response to subsequent stimuli, separately for house and face images.

Background and Literature Review

The data come from a study by Dunovan & Wheeler (2018), which investigated the same research question with a different approach. Previous research found indirect evidence for top-down predictions in the visual cortex, demonstrating that the absence of an anticipated stimulus triggered a stronger response than the anticipated stimulus itself (Kok et al., 2014). However, other studies found that expected faces elicited a larger stimulus-evoked response than unexpected ones (Bell et al., 2016; Tremel et al., 2015). Thus, the current study follows the same research question as these articles and investigates whether prior expectations enhance the response to anticipated stimuli.

Table 1

Experimental Design from Dunovan & Wheeler (2018). Each trial condition is depicted along with the breakdown of the cues in each trial.


Dataset Description

19 participants completed 600 trials (five runs of 120 trials); each run produced 787 2D medical images, which were converted into 3D datasets. The AFNI (Analysis of Functional NeuroImages) data consist of two files (per trial per participant) containing the voxel values, the spatial characteristics of each voxel, and statistical information for each sub-brick. We merge these files into a NIfTI file, which encapsulates both the metadata and the actual image data, as the final dataset for machine learning in Python.
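For reference, the merged NIfTI files can be loaded directly in Python with nibabel; the sketch below uses a placeholder file name, not an actual file from our repository.

```python
# Minimal sketch of loading one merged NIfTI dataset with nibabel.
# The file name is a placeholder, not an actual file from our repository.
import nibabel as nib

img = nib.load("sub01_run1.nii.gz")  # hypothetical merged NIfTI file
data = img.get_fdata()               # voxel array, e.g. (x, y, z, time)
print(data.shape)                    # image dimensions
print(img.affine)                    # 4x4 voxel-to-world transform (metadata)
```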

Figure 1

Experimental Design from Dunovan & Wheeler (2018). Each trial condition is depicted along with the breakdown of the cues in each trial.


Methods

  • Data Preprocessing:

    • To process the fMRI data with AFNI, we run the following pipeline on each run (run1 to run5); a minimal Python sketch of the pipeline appears after this list.

      • DICOM conversion: convert the DICOM files into AFNI’s BRIK/HEAD format with the to3d command, specifying the slice-timing information (scan time of 1500 ms, 785 images, 29 slices).

      • Deobliquing: standardize the dataset orientation with 3dWarp. Deobliquing corrects for oblique acquisition angles so that the data align with standard anatomical planes, which facilitates accurate processing and analysis.

      • Mask resampling: resample the fusiform parahippocampal mask to match the functional images with 3dresample, ensuring that the mask and the images are in the same space for accurate application.

      • Slice timing and motion correction: apply both with the 3dvolreg command, using the -tshift -Fourier options for slice-timing correction and the -base option to set the reference volume (the 111th image in each run) for motion correction, saving the motion parameters in text files. Slice-timing correction adjusts for differences in acquisition times across slices, motion correction reduces bias from subject movement, and outlier detection flags time points with abnormal signal variations.

      • Masking: apply the mask to the motion-corrected images with 3dcalc so that the analysis focuses on the regions of interest, excluding irrelevant brain regions.

      • Smoothing and export: apply spatial smoothing with the 3dmerge command, using the -1blur_fwhm 8 option for an 8 mm FWHM Gaussian blur, and convert the smoothed datasets to NIfTI format for machine learning in Python.
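    The sketch below drives these AFNI command-line tools from Python via subprocess (the actual scripts are s8_preproc.sh and s13_preproc.sh). The directory layout, dataset names, the mask file name, and the slice-timing pattern (alt+z) are illustrative assumptions, not taken from our scripts.

```python
# Minimal sketch of the AFNI preprocessing pipeline described above.
# Paths, dataset names, the mask file name, and the slice-timing pattern
# (alt+z) are assumptions; see s13_preproc.sh for the real pipeline.
import subprocess

def afni(cmd: str) -> None:
    """Run one AFNI command through the shell (for glob expansion)."""
    subprocess.run(cmd, shell=True, check=True)

for r in range(1, 6):  # run1 .. run5
    p = f"run{r}"
    # DICOM -> BRIK/HEAD: 29 slices, 785 volumes, TR = 1500 ms
    afni(f"to3d -prefix {p} -time:zt 29 785 1500 alt+z dicom/{p}/*.dcm")
    # Deoblique so the data align with standard anatomical planes
    afni(f"3dWarp -deoblique -prefix {p}_deob {p}+orig")
    # Resample the fusiform/parahippocampal mask into functional space
    afni(f"3dresample -master {p}_deob+orig -input ffa_phc_mask+orig "
         f"-prefix {p}_mask")
    # Slice timing (Fourier interpolation) and motion correction,
    # with the 111th volume (0-based index 110) as the registration base
    afni(f"3dvolreg -tshift 0 -Fourier -base 110 -1Dfile {p}_motion.1D "
         f"-prefix {p}_vr {p}_deob+orig")
    # Keep only voxels inside the mask
    afni(f"3dcalc -a {p}_vr+orig -b {p}_mask+orig -expr 'a*step(b)' "
         f"-prefix {p}_masked")
    # 8 mm FWHM Gaussian smoothing, then export to NIfTI for Python/ML
    afni(f"3dmerge -doall -1blur_fwhm 8 -prefix {p}_sm {p}_masked+orig")
    afni(f"3dAFNItoNIFTI -prefix {p}_sm.nii {p}_sm+orig")
```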

  • General Linear Model:

    • To build the event-based design matrix, we start by defining key parameters: the repetition time (TR = 1.5 s), the number of slices (29), and the total number of volumes (3925 across the five runs, i.e., 785 per run). Frame times for the fMRI scans are calculated from these parameters. We then load the event data from a CSV file (time_event.csv), which includes the onset time, duration, and trial type of each event. This event information is crucial for accurately modeling the expected brain responses during the experiment.

    • Using the nilearn.glm.first_level.make_first_level_design_matrix function, we create the design matrix with a polynomial drift model of order 3 to account for low-frequency noise and signal drift over time. Next, we initialize a first-level GLM with the FirstLevelModel class, specifying parameters such as the repetition time, the slice-time reference, and the Hemodynamic Response Function (HRF) model. The model is then fitted to the combined fMRI image data and the design matrix, estimating the GLM parameters that best describe the relationship between the observed fMRI signal and the experimental design.

    • To extend the analysis, we define the contrast matrix as an identity matrix, which gives a simple contrast for each condition. Contrasts are used to compare conditions or to isolate the effect of a specific condition. For each condition, we compute the contrast with the fitted model’s compute_contrast method, producing contrast maps. These maps show brain regions with statistically significant activation differences between the compared conditions. A condensed nilearn sketch of these steps follows.
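    The sketch below condenses the design-matrix, fitting, and contrast steps. The combined NIfTI file name and the slice_time_ref value are assumptions; time_event.csv and the order-3 polynomial drift model follow the text.

```python
# Condensed sketch of the first-level GLM described above, using nilearn.
# The combined NIfTI file name and slice_time_ref value are assumptions;
# time_event.csv and the order-3 polynomial drift model follow the text.
import numpy as np
import pandas as pd
from nilearn.glm.first_level import (FirstLevelModel,
                                     make_first_level_design_matrix)

t_r = 1.5                                  # repetition time (s)
n_scans = 5 * 785                          # 3925 volumes across the five runs
frame_times = np.arange(n_scans) * t_r     # acquisition time of each volume

# Event file with 'onset', 'duration', and 'trial_type' columns
events = pd.read_csv("time_event.csv")

design = make_first_level_design_matrix(
    frame_times, events=events,
    drift_model="polynomial", drift_order=3)   # low-frequency drift regressors

glm = FirstLevelModel(t_r=t_r, slice_time_ref=0.5, hrf_model="glover")
glm = glm.fit("runs_combined.nii.gz", design_matrices=design)

# Identity contrast matrix: one contrast (and one map) per design column
contrasts = np.eye(design.shape[1])
z_maps = [glm.compute_contrast(c, output_type="z_score") for c in contrasts]
```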


    Figure 2: Contrast map showing the activated and deactivated brain regions for condition 7.


    Table 2: fMRI Data Preprocessing Steps with Python Libraries and Functions.
    • Proposed Unsupervised and Supervised Learning Methods

      • Unsupervised Learning Methods for Data Processing:

        • K-Means, GMM, DBSCAN, etc.

      • Supervised learning method for predicting the results:

        • CNN-ResNet18: ResNet18 suits fMRI analysis because its residual connections make a deep architecture easier to train, helping it learn complex patterns in brain activity.

        • CNN-VGG11: VGG11 is useful for fMRI analysis because its simple yet deep stack of convolutional layers captures detailed spatial features in brain images.

        • CNN-MobileNetV2: MobileNetV2’s lightweight, optimized design enables efficient brain-activity analysis, even on resource-constrained devices.

    • Data Processing Method Implemented

      • 1. Using the to3d, 3dvolreg, 3dToutcount, and 3dmerge tools in AFNI for the first processing step:

        • a. To process the fMRI data using AFNI, we first converted the DICOM files from each run (run1 to run5) into AFNI’s BRIK/HEAD format with the to3d command, specifying the slice-timing information (scan time = 1500 ms, number of images = 785, number of slices = 29). Next, we applied slice timing and motion correction with the 3dvolreg command, using the -tshift -Fourier options for slice-timing correction and the -base option to set the reference volume (the 111th image in each run) for motion correction, while saving the motion parameters in text files. The 3dToutcount command then computed outlier counts for each volume in the motion-corrected data; we used -automask to create a brain mask and -fraction to output the fraction of voxels in the mask that are outliers. Finally, we applied spatial smoothing to the motion-corrected data with the 3dmerge command, using the -1blur_fwhm 4 option for a 4 mm FWHM Gaussian blur, and transferred the smoothed datasets to NIfTI format for ML data processing.

        • b. Slice timing and motion correction: as described above, 3dvolreg performs slice-timing correction (-tshift -Fourier) and motion correction relative to the reference volume (-base, the 111th image in each run), saving the motion parameters in text files.

        • c. Mask implementation for brain-area selection: a mask (fusiform parahippocampal mask) is resampled to match the images using 3dresample, ensuring that the mask and the functional images are in the same space for accurate application.


        • Code file: CS7641_fMRI_DL/data_process_code/s8_preproc.sh (Midterm)

        • Code file: CS7641_fMRI_DL/data_process_code/s13_preproc.sh (Final Project)

        • Code file: CS7641_fMRI_DL/data_process_code/fmri_data_pre.ipynb (Final Project)

        • Partial visualization of the 3D image data after processing:
      • 2. Using K-means to do the 2nd processing step (in Midterm Report):

        • K-means method:

          • In total we had 64×64×29×785 voxel values to analyze from the images (each grayscale slice is 64×64, 29 slices stack into one 3D brain image, and there are 785 such volumes in a run). We used K-means to remove background noise from these 3D brain images so that brain regions could be recognized more accurately for model training. Our reasoning was that, since K-means can cluster the different parts of the brain, with enough clusters it can also be used to strip out noise points. First, we initialized 50 clusters and ran the K-means algorithm to recognize the important structures. Second, we kept the 40% of points nearest to each cluster center within a fixed distance as the cluster members, and classified the remaining unallocated points as noise (similar to DBSCAN). This produced a cleaner dataset and improved the prediction results when training the models. A sketch of this step appears after the code-file reference below.

        • Code file: CS7641_fMRI_DL/data_process_code/project.py (Midterm)
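          A hedged sketch of this denoising step (the real implementation is in project.py). The 50 clusters and 40% threshold follow the text; the feature choice (voxel coordinates plus intensity) and the per-cluster quantile cutoff are assumptions.

```python
# Sketch of K-means background-noise removal on one 3D brain volume.
# Cluster count (50) and keep fraction (40%) follow the report; the
# (x, y, z, intensity) features and quantile cutoff are assumptions.
import numpy as np
from sklearn.cluster import KMeans

def denoise_volume(volume: np.ndarray, n_clusters: int = 50,
                   keep_frac: float = 0.40) -> np.ndarray:
    """Keep the 40% of points nearest each cluster center; zero the rest."""
    xs, ys, zs = np.indices(volume.shape)            # voxel coordinates
    points = np.column_stack([xs.ravel(), ys.ravel(), zs.ravel(),
                              volume.ravel()])       # (N, 4) features
    km = KMeans(n_clusters=n_clusters, n_init=10).fit(points)
    dists = np.linalg.norm(points - km.cluster_centers_[km.labels_], axis=1)
    keep = np.zeros(len(points), dtype=bool)
    for k in range(n_clusters):
        idx = np.where(km.labels_ == k)[0]
        cutoff = np.quantile(dists[idx], keep_frac)  # nearest 40% per cluster
        keep[idx[dists[idx] <= cutoff]] = True       # the rest is noise
    return np.where(keep.reshape(volume.shape), volume, 0.0)

# Example: denoise one (64, 64, 29) volume of the run
cleaned = denoise_volume(np.random.rand(64, 64, 29))
```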

      • ML Algorithms/Models Implemented

        • 1. Using CNN-ResNet18 as the model to train:

          • Model Definition:
            • a. Define a CNN ResNet model using PyTorch’s torchvision.models.
            • b. Include CNN layers to process 3D fMRI image data and a fully connected layer to map the CNN’s extracted features to 9 output classes.
            • c. Use resnet18 for the CNN layer in our training.
          • Initialization:
            • a. Use the pretrained model for better initialization.
            • b. Set the model to use GPU if available to speed up training and inference.
          • Data Handling:
            • a. Define a custom dataset class to handle 3D MRI data and corresponding labels, transforming them for CNN input.
            • b. Clean and preprocess the data, ensuring it is in the correct format for CNN input.
          • Training and Validation Split:
            • a. Split the dataset into training and validation sets using an 80% - 20% split ratio.
            • b. Use PyTorch's DataLoader to batch and shuffle the training data, ensuring efficient data loading during training.
          • Loss Function and Optimizer:
            • a. Use CrossEntropyLoss for multi-class classification.
            • b. Implement the Adam optimizer with a learning rate of 0.01 and a weight decay of 0.001 to mitigate overfitting.
            • c. Incorporate a learning rate scheduler to dynamically adjust the learning rate during training.
          • Training Loop (80 epochs):
            • For each epoch, perform:
              • a. A forward pass to compute predictions.
              • b. Calculate the loss using the defined loss function.
              • c. Backpropagate the loss to compute gradients.
              • d. Update the model parameters using the optimizer.
            • Track and print training loss and accuracy to monitor performance.
          • Validation Loop:
            • a. Evaluate the model on the validation set without gradient computation.
            • b. Calculate and print validation loss and accuracy for each epoch.
            • c. The model evaluation results are presented in the Results and Discussion section.


        • 2. Using CNN-VGG11 as the model to train:

          • Model Definition:
            • a. Define a CNN VGG11 model using PyTorch’s torchvision.models.
            • b. Include CNN layers to process 3D fMRI image data and a fully connected layer to map the CNN’s extracted features to 9 output classes.
            • c. Use VGG11 for the CNN layer in our training.
          • Initialization:
            • a. Use the pretrained model for better initialization.
            • b. Set the model to use GPU if available to speed up training and inference.
          • Data Handling:
            • a. Define a custom dataset class to handle 3D MRI data and corresponding labels, transforming them for CNN input.
            • b. Clean and preprocess the data, ensuring it is in the correct format for CNN input.
          • Training and Validation Split:
            • a. Split the dataset into training and validation sets using an 80% - 20% split ratio.
            • b. Use PyTorch's DataLoader to batch and shuffle the training data, ensuring efficient data loading during training.
          • Loss Function and Optimizer:
            • a. Use CrossEntropyLoss for multi-class classification.
            • b. Implement the Adam optimizer with a learning rate of 0.01 and a weight decay of 0.001 to mitigate overfitting.
            • c. Incorporate a learning rate scheduler to dynamically adjust the learning rate during training.
          • Training Loop (80 epochs):
            • For each epoch, perform:
              • a. A forward pass to compute predictions.
              • b. Calculate the loss using the defined loss function.
              • c. Backpropagate the loss to compute gradients.
              • d. Update the model parameters using the optimizer.
            • Track and print training loss and accuracy to monitor performance.
          • Validation Loop:
            • a. Evaluate the model on the validation set without gradient computation.
            • b. Calculate and print validation loss and accuracy for each epoch.
            • c. The model evaluation results are presented in the Results and Discussion section.


        • 3. Using CNN-MobileNetV2 as the model to train:

          • Model Definition:
            • a. Define a CNN MobileNetV2 model using PyTorch’s torchvision.models.
            • b. Include CNN layers to process 3D fMRI image data and a fully connected layer to map the CNN’s extracted features to 9 output classes.
            • c. Use MobileNetV2 for the CNN layer in our training.
          • Initialization:
            • a. Use the pretrained model for better initialization.
            • b. Set the model to use GPU if available to speed up training and inference.
          • Data Handling:
            • a. Define a custom dataset class to handle 3D MRI data and corresponding labels, transforming them for CNN input.
            • b. Clean and preprocess the data, ensuring it is in the correct format for CNN input.
          • Training and Validation Split:
            • a. Split the dataset into training and validation sets using an 80% - 20% split ratio.
            • b. Use PyTorch's DataLoader to batch and shuffle the training data, ensuring efficient data loading during training.
          • Loss Function and Optimizer:
            • a. Use CrossEntropyLoss for multi-class classification.
            • b. Implement the Adam optimizer with a learning rate of 0.01 and a weight decay of 0.001 to mitigate overfitting.
            • c. Incorporate a learning rate scheduler to dynamically adjust the learning rate during training.
          • Training Loop (80 epochs):
            • For each epoch, perform:
              • a. A forward pass to compute predictions.
              • b. Calculate the loss using the defined loss function.
              • c. Backpropagate the loss to compute gradients.
              • d. Update the model parameters using the optimizer.
            • Track and print training loss and accuracy to monitor performance.
          • Validation Loop:
            • a. Evaluate the model on the validation set without gradient computation.
            • b. Calculate and print validation loss and accuracy for each epoch.
            • c. The model evaluation results are presented in the Results and Discussion section.


        • Code file: CS7641_fMRI_DL/data_process_code/project_1.py (Final)

        • Code file: CS7641_fMRI_DL/data_process_code/project_Hugo_ver2.py (Final including plot)
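        The three backbones share the same training and validation procedure; the condensed sketch below illustrates it (full code in project_1.py and project_Hugo_ver2.py). The StepLR scheduler, batch size, input shape, and the random placeholder tensors standing in for the preprocessed volumes are assumptions.

```python
# Condensed sketch of the shared training setup described above for the three
# backbones. The dataset here is random placeholder data; the scheduler type
# (StepLR), batch size, and input shaping are assumptions.
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset, random_split
from torchvision import models

def build_model(name: str, n_classes: int = 9) -> nn.Module:
    """Load a pretrained backbone and remap its head to 9 output classes."""
    if name == "resnet18":
        m = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
        m.fc = nn.Linear(m.fc.in_features, n_classes)
    elif name == "vgg11":
        m = models.vgg11(weights=models.VGG11_Weights.DEFAULT)
        m.classifier[-1] = nn.Linear(m.classifier[-1].in_features, n_classes)
    else:  # "mobilenet_v2"
        m = models.mobilenet_v2(weights=models.MobileNet_V2_Weights.DEFAULT)
        m.classifier[-1] = nn.Linear(m.classifier[-1].in_features, n_classes)
    return m

device = "cuda" if torch.cuda.is_available() else "cpu"
model = build_model("resnet18").to(device)

# Placeholder tensors standing in for the preprocessed fMRI volumes and labels
x = torch.randn(100, 3, 224, 224)
y = torch.randint(0, 9, (100,))
dataset = TensorDataset(x, y)

n_train = int(0.8 * len(dataset))                    # 80/20 split
train_set, val_set = random_split(dataset, [n_train, len(dataset) - n_train])
train_loader = DataLoader(train_set, batch_size=32, shuffle=True)
val_loader = DataLoader(val_set, batch_size=32)

criterion = nn.CrossEntropyLoss()                    # multi-class loss
optimizer = torch.optim.Adam(model.parameters(), lr=0.01, weight_decay=0.001)
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=20, gamma=0.1)

for epoch in range(80):
    model.train()
    for xb, yb in train_loader:
        xb, yb = xb.to(device), yb.to(device)
        optimizer.zero_grad()
        loss = criterion(model(xb), yb)              # forward pass + loss
        loss.backward()                              # backpropagation
        optimizer.step()                             # parameter update
    scheduler.step()

    model.eval()
    correct = 0
    with torch.no_grad():                            # validation, no gradients
        for xb, yb in val_loader:
            correct += (model(xb.to(device)).argmax(1) == yb.to(device)).sum().item()
    print(f"epoch {epoch}: val acc = {correct / len(val_set):.3f}")
```

        Swapping build_model("resnet18") for "vgg11" or "mobilenet_v2" reproduces the other two setups.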

  • Results and Discussion

    Timeline

    1. Preparation and Survey
    (Project Proposal)

    May 24 - June 14, 2024

    Initial meeting to discuss fMRI project data access, goals, timeline, and responsibilities.

    2. Data Collection and Initial Data Preprocess

    June 15, 2024 - June 24, 2024

    Collecting and organizing the fMRI data from various sources, then processing and cleaning it with fMRI tools and the K-means method.

    3. Initial Model Construction and Training
    (Midpoint Report)

    June 25, 2024 - July 03, 2024

    Using a CNN (ResNet18) for model training and validation.

    4. Advanced Data Processing, Model Construction, and Training (using a mask for brain-area selection and constructing the CNN-VGG11 and MobileNetV2 models).

    July 04, 2024 - July 14, 2024

    Cleaning and preprocessing the data for analysis.

    5. Model Improvement and Evaluation (training all three models and using them for prediction)
    (Final Report)

    July 15, 2024 - July 23, 2024

    Training machine learning models on the preprocessed data.

    Gantt Chart


    References