The work of Dr. Zarella and Dr. Breen in the Advanced Pathology Imaging Laboratory (APIL) at Drexel College of Medicine focuses on the intersections of computing, digital imagery, and pathology. Work in the lab explores techniques to improve pathological diagnosis of histological images stained using hematoxylin and eosin (H&E), which has been described as "the cornerstone of anatomical pathology diagnosis."
Researchers at the University of North Carolina (UNC) produced a convolutional neural network (CNN) to obtain statistically significant predictions of the pathology of tissue samples. The findings, detailed in the paper "Image analysis with deep learning to predict breast cancer grade, ER status, histologic subtype, and intrinsic subtype," have implications for reducing the time and cost of diagnosis while increasing its accuracy.
APIL is exploring algorithms to classify pathological structures using machine learning. The research and algorithms from UNC provide a point of comparison for approaches used by APIL. The work may provide insight into the features algorithms use to make predictions. This research has several aims:
- reproduce the findings of the original programs
- save trained machine learning models
- use the models to predict the malignancy of a tumor from a new dataset
The programs used in the original research are posted on GitHub, and a fork of the project was created (GitHub ImageMIL Fork, 2019). A clean development system was provisioned and all dependencies were installed.
A successful run of the program produced a trained model as well as a confusion matrix, used for visual assessment of the program’s classification performance. The diagonal values of the matrix denote correct classifications. We reviewed the images that were misclassified.
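The role of the diagonal can be illustrated with a minimal sketch. The labels below are made-up placeholders, not results from the study, and the row/column convention (rows are true classes, columns are predicted classes) is an assumption rather than something confirmed by the original program.

```python
def confusion_matrix(true_labels, predicted_labels, classes):
    """Count predictions; rows are true classes, columns are predicted classes."""
    index = {name: i for i, name in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for truth, predicted in zip(true_labels, predicted_labels):
        matrix[index[truth]][index[predicted]] += 1
    return matrix


def accuracy(matrix):
    """Fraction of samples on the diagonal, i.e. correctly classified."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total if total else 0.0


def misclassified_indices(true_labels, predicted_labels):
    """Positions of samples whose prediction differs from the true label."""
    pairs = zip(true_labels, predicted_labels)
    return [i for i, (truth, predicted) in enumerate(pairs) if truth != predicted]


# Hypothetical labels for six images (illustrative only).
classes = ["benign", "malignant"]
truth = ["benign", "benign", "malignant", "malignant", "malignant", "benign"]
predicted = ["benign", "malignant", "malignant", "malignant", "benign", "benign"]

matrix = confusion_matrix(truth, predicted, classes)
to_review = misclassified_indices(truth, predicted)
print(matrix, accuracy(matrix), to_review)
```

The list returned by `misclassified_indices` corresponds to the images one would pull for manual review.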
The program was altered to save trained models and a standalone component was developed. This new program accepts, as input, a trained model and features extracted from histological images. A full slide histological image was selected for evaluation on a previously trained model. The image was tiled into 224x224 pixel squares and each tile was passed to the program for prediction. The resulting predictions were overlaid as green (benign) and yellow (malignant) on the original image.
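The tiling and overlay step can be sketched roughly as follows. This is not the code used in the project: `predict_tile` is a hypothetical stand-in for the trained model, the RGB values are assumed encodings of green and yellow, and dropping partial tiles at the right and bottom edges is an assumption, since the handling of edge tiles is not described here.

```python
TILE_SIZE = 224  # tile edge length in pixels, as described above

# Assumed RGB encodings for the overlay colors.
OVERLAY_COLORS = {"benign": (0, 255, 0), "malignant": (255, 255, 0)}


def tile_boxes(width, height, tile=TILE_SIZE):
    """Return (left, top, right, bottom) boxes for every full tile.

    Partial tiles along the right and bottom edges are skipped (an assumption).
    """
    return [
        (left, top, left + tile, top + tile)
        for top in range(0, height - tile + 1, tile)
        for left in range(0, width - tile + 1, tile)
    ]


def overlay_plan(width, height, predict_tile, tile=TILE_SIZE):
    """Map each tile box to the overlay color for its predicted label.

    predict_tile is a hypothetical callable: box -> "benign" or "malignant".
    """
    return {
        box: OVERLAY_COLORS[predict_tile(box)]
        for box in tile_boxes(width, height, tile)
    }


# Dummy predictor for illustration: calls the right-hand column malignant.
def dummy_predict(box):
    return "malignant" if box[0] >= TILE_SIZE else "benign"


boxes = tile_boxes(500, 448)
plan = overlay_plan(500, 448, dummy_predict)
print(len(boxes), plan)
```

In practice, each box would be used to crop the slide image, the crop would be passed through feature extraction and the saved model, and the returned color would be blended over that region of the original image.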
When overlaid on the original slide, predictions of the trained model revealed clustered areas of benign prediction and similar areas of malignant prediction. In some cases, related areas within the same cluster of tiles showed large disparity in the prediction. The causes of these classifications require further investigation.
Additionally, the Advanced Pathology Imaging Laboratory has a large collection of H&E-stained breast cancer slides at high magnification, and we would like to prepare these for training with the program for comparison to the original findings.
I am working to create a simple neural network to be combined with new lecture materials to allow students to observe the structure of such a program as well as the features of the model that influence its predictions.
- The American Heritage Dictionary of Medicine (2nd ed.). (2015). Boston: Houghton Mifflin.
- Couture, H. D., Williams, L. A., Geradts, J., Nyante, S. J., Butler, E. N., Marron, J. S., . . . Niethammer, M. (2018). Image analysis with deep learning to predict breast cancer grade, ER status, histologic subtype, and intrinsic subtype. npj Breast Cancer, 4, 30. doi:10.1038/s41523-018-0079-1
- GitHub ImageMIL Original Program. (2019, July 27). Retrieved from https://github.com/hdcouture/ImageMIL
- GitHub ImageMIL Fork. (2019, July 27). Retrieved from https://github.com/lliss/ImageMIL
10 second / tweet
To survive cancer, fast and accurate diagnosis is key. I am assessing the generalizability of machine learning techniques for cancer diagnosis.
Machine learning techniques are common in image analysis. We've seen programs that can identify animals, boats, and people in photos. These techniques can also be applied to cancer diagnosis. Recent research has found that machine learning techniques can provide accurate categorization of the pathology of histological images stained with hematoxylin and eosin (H&E), a common technique used to prepare tissue samples for visual analysis. This research was based on two data sets of H&E-stained breast cancer histological images. This project aims to assess the accuracy of the program developed for the prior research with new data sets.
Machine learning techniques are common in image analysis. We've seen programs that can identify animals, boats, and people in photos. These techniques can also be applied to cancer diagnosis. Tissue slides are commonly prepared using a staining technique called hematoxylin and eosin (H&E). This staining process makes features of tissue and cellular structures easy to identify visually. Pathologists “read” such images to assess the grade, severity, or malignancy of tumors. In some cases, RNA-based testing provides highly accurate diagnosis, but the technique is more costly in both time and resources than the use of histological images. Further, RNA testing may be unavailable or in limited supply in certain locations. Recent research has found that machine learning techniques can provide accurate categorization of the pathology of H&E-stained histological images. This can reduce cost and identify samples that would benefit from follow-up testing such as RNA-based genomic testing. The research was based on two data sets of H&E-stained breast cancer histological images. This project aims to assess the accuracy of the program developed for the prior research with new data sets.
Investigating Neural Networks — A Brief Introduction
Students will be presented with terminology and concepts surrounding the computer science topic of a neural network. Terms such as neuron, layer, hidden layer, input layer, output layer, and propagation will be explained. The lesson will conclude with a working example of a simple neural network that has been trained to make predictions on a given data set.
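As a rough idea of what such a working example might look like, the sketch below builds a tiny network with an input layer, one hidden layer, and an output layer, and trains it by forward and backward propagation. It is an illustration under stated assumptions, not the lesson's actual program: the class name `TinyNetwork`, the layer sizes, the learning rate, and the toy data set (logical OR) are all chosen here only to keep the example small.

```python
import math
import random


def sigmoid(x):
    """Activation function applied by every neuron."""
    return 1.0 / (1.0 + math.exp(-x))


class TinyNetwork:
    """Input layer -> one hidden layer -> a single output neuron."""

    def __init__(self, n_inputs=2, n_hidden=3, seed=0):
        rng = random.Random(seed)
        # Each neuron stores one weight per incoming value plus a trailing bias.
        self.w_hidden = [[rng.uniform(-1, 1) for _ in range(n_inputs + 1)]
                         for _ in range(n_hidden)]
        self.w_output = [rng.uniform(-1, 1) for _ in range(n_hidden + 1)]

    def forward(self, inputs):
        """Forward propagation: return hidden activations and the output."""
        hidden = [sigmoid(sum(w * x for w, x in zip(ws, inputs)) + ws[-1])
                  for ws in self.w_hidden]
        output = sigmoid(sum(w * h for w, h in zip(self.w_output, hidden))
                         + self.w_output[-1])
        return hidden, output

    def train_step(self, inputs, target, lr):
        """Backpropagation for one sample, using squared error."""
        hidden, output = self.forward(inputs)
        delta_out = (output - target) * output * (1 - output)
        # Hidden-layer errors are computed before the output weights change.
        delta_hidden = [delta_out * self.w_output[j] * h * (1 - h)
                        for j, h in enumerate(hidden)]
        for j, h in enumerate(hidden):
            self.w_output[j] -= lr * delta_out * h
        self.w_output[-1] -= lr * delta_out
        for j, ws in enumerate(self.w_hidden):
            for i, x in enumerate(inputs):
                ws[i] -= lr * delta_hidden[j] * x
            ws[-1] -= lr * delta_hidden[j]


def mean_squared_error(net, data):
    return sum((net.forward(x)[1] - y) ** 2 for x, y in data) / len(data)


# Toy data set (logical OR), used only for illustration.
DATA = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]

net = TinyNetwork()
loss_before = mean_squared_error(net, DATA)
for _ in range(5000):
    for x, y in DATA:
        net.train_step(x, y, lr=0.5)
loss_after = mean_squared_error(net, DATA)
print("loss before:", loss_before, "loss after:", loss_after)
```

Printing `net.w_hidden` and `net.w_output` before and after training gives students a concrete view of what the network "learned": nothing but adjusted numbers attached to each connection.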
During everyday internet and computer usage, people encounter predictive technologies. These may be recommendation engines, search algorithms, or assistive technologies. Increasingly, these technologies are also being used in retail and advertising settings. Predictive systems rely on engineers developing programs that can process data and extract patterns and meaning. As these systems become more popular, computer scientists and engineers will need some familiarity with these concepts. Further, since the algorithms and predictions can appear quite useful even when the underlying methodology is unknown, there is a real risk of implementation without proper skepticism or debate. Software errors and output inconsistencies are especially dangerous when the underlying mechanics are not understood. For this reason, it is important for students to be able to assess and analyze programs that rely on machine learning algorithms to ensure they are fair and accurate.
After this lesson, students should be able to:
- Explain how a neural network is structured.
- Define key terms relating to neural networks.