Deep learning neural networks are generally opaque, meaning that although they can make useful and skillful predictions, it is not clear how or why a given prediction was made.
Convolutional neural networks have internal structures that are designed to operate upon two-dimensional image data, and as such preserve the spatial relationships of what the model has learned. Specifically, the two-dimensional filters learned by the model can be inspected and visualized to discover the types of features the model will detect, and the activation maps output by convolutional layers can be inspected to understand exactly which features were detected for a given input image.
In this tutorial, you will discover how to develop simple visualizations for filters and feature maps in a convolutional neural network. Neural network models are generally referred to as being opaque: they are poor at explaining why a specific decision or prediction was made.
Convolutional neural networks are designed to work with image data, and their structure and function suggest that they should be less inscrutable than other types of neural networks. Specifically, the models are comprised of small linear filters and the results of applying those filters, called activation maps or, more generally, feature maps. For example, we can design and understand small filters, such as line detectors. Perhaps visualizing the filters within a learned convolutional neural network can provide insight into how the model works.
The feature maps that result from applying filters to input images and to feature maps output by prior layers could provide insight into the internal representation that the model has of a specific input at a given point in the model. We will explore both of these approaches to visualizing a convolutional neural network in this tutorial.
Instead of fitting a model from scratch, we can use a pre-fit, prior state-of-the-art image classification model. One example is the VGG model, which achieved top results in the ILSVRC 2014 image classification competition. This is a good model to use for visualization because it has a simple, uniform structure of serially ordered convolutional and pooling layers, it is deep with 16 learned layers, and it performed very well, meaning that the filters and resulting feature maps will capture useful features.
We can load and summarize the VGG16 model with just a few lines of code. Running the example will load the model weights into memory and print a summary of the loaded model. If this is the first time you have loaded the model, the weights will be downloaded from the internet and stored in your home directory.
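A minimal sketch of loading and summarizing the model with the Keras applications API (here `weights=None` builds the untrained architecture so nothing is downloaded; pass `weights="imagenet"`, the default, to fetch the pretrained filters):

```python
# Load the VGG16 architecture and print a summary of its layers.
from tensorflow.keras.applications.vgg16 import VGG16

# weights=None avoids the large download; use weights="imagenet" for real filters.
model = VGG16(weights=None)
model.summary()
```

The summary lists every layer with its name (e.g. `block1_conv1`) and output shape, which is what the following sections inspect.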
These weights are approximately 500 megabytes and may take a moment to download depending on the speed of your internet connection. We can see that the layers are well named, organized into blocks, and given integer indexes within each block. In neural network terminology, the learned filters are simply weights, yet because of the specialized two-dimensional structure of the filters, the weight values have a spatial relationship to each other, and plotting each filter as a two-dimensional image is meaningful, or could be.
The model summary printed in the previous section gives the output shape of each layer, e.g. the shape of the resulting feature maps. It does not give any idea of the shape of the filter weights in the network, only the total number of weights per layer.
Each convolutional layer has two sets of weights. One is the block of filters and the other is the block of bias values. These are accessible via the layer's get_weights() function. We can retrieve these weights and then summarize their shape.
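A sketch of retrieving and summarizing the filter shapes, assuming the Keras API described above (again using `weights=None` so the example runs without a download; the shapes are identical with the pretrained weights):

```python
from tensorflow.keras.applications.vgg16 import VGG16

model = VGG16(weights=None)  # weights="imagenet" for the trained values
for layer in model.layers:
    if "conv" not in layer.name:
        continue  # pooling, flatten, and dense layers are skipped
    filters, biases = layer.get_weights()  # filter block and bias vector
    print(layer.name, filters.shape, biases.shape)
```

For the first layer this prints `block1_conv1 (3, 3, 3, 64) (64,)`: sixty-four 3x3 filters over 3 input channels.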
Running the example prints a list of layer details including the layer name and the shape of the filters in the layer.

In Machine Learning, we always want to get insights into data: get familiar with the training samples or better understand the label distribution. To do that, we visualize the data in many different ways. Typically, we need to look into multiple characteristics of the data simultaneously.
In classic ML, for example, the data may have thousands of features. To find the right model, we first need to understand the structure of the data and the importance of these characteristics. In Deep Learning, the dimensionality gets even higher compared to classic ML. In classification, for example, we only have an image and a single corresponding class label.
On the other hand, neural nets have millions of parameters and multiple layers that do complex data processing. As we live in a 3-dimensional space, we can comprehend no more than 1-, 2-, or 3-dimensional plots.
To visualize multidimensional data in lower dimensions, there is a family of algorithms named dimensionality reduction methods. One of them, t-SNE, works well even for large datasets and thus became an industry standard in Machine Learning. Now people apply it in various ML tasks including bioinformatics, cancer detection and disease diagnosis, natural language processing, and various areas of Deep Learning such as image recognition.
The main goal of t-SNE is to project multi-dimensional points to 2- or 3-dimensional plots so that if two points were close in the initial high-dimensional space, they stay close in the resulting projection. If the points were far from each other, they should stay far in the target low-dimensional space too.
To do that, t-SNE first creates a probability distribution that captures these mutual distance relationships between the points in the initial high-dimensional space. After this, the algorithm tries to create a low-dimensional space that has similar relations between the points. As a cost function, it uses Kullback-Leibler divergence, a commonly used measure of how different two data distributions are. You can play with t-SNE visualizations for various data distributions here.
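As a quick concrete illustration of that cost function, here is a sketch of discrete KL divergence in plain NumPy (the distributions p and q are made up for the example):

```python
import numpy as np

def kl_divergence(p, q):
    """KL(p || q) for discrete distributions; zero only when p equals q."""
    p, q = np.asarray(p, dtype=float), np.asarray(q, dtype=float)
    return float(np.sum(p * np.log(p / q)))

p = [0.4, 0.4, 0.2]  # "true" distribution
q = [0.3, 0.3, 0.4]  # approximating distribution
print(kl_divergence(p, q))  # positive: the distributions differ
print(kl_divergence(p, p))  # 0.0: identical distributions
```

t-SNE minimizes exactly this kind of mismatch between the high-dimensional and low-dimensional neighbor distributions.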
We can describe a classification network at a high level as follows: it has a backbone that extracts valuable information, or features, from the image, and a classifier applied right after the backbone.
The classifier makes a final decision based on the information extracted by the backbone. During the forward pass, the backbone gradually decreases the spatial size of the data while increasing the number of its channels. This way, it extracts high-level concepts about the image contents, like notions of face or car, and stores them in the channels of the smaller feature maps. These floating-point numbers are essentially all the knowledge and concepts that the network extracted from the input image, encoded in some way.
This high-level information then goes to the classifier, which makes the final prediction. ResNet has high classification quality, which means that the information it extracts is rich and valuable. In fact, ResNet works great not only on the ImageNet dataset it was trained on but also on other datasets, which means that it can extract these concepts from many different types of images.
We need to re-implement ResNet so that we can extract the last feature map before the classifier head. In this post, we use the Animals10 dataset. It contains pictures of 10 different animals: cat, dog, chicken, cow, horse, sheep, squirrel, elephant, butterfly, and spider.
This is it: the result named tsne is the 2-dimensional projection of the high-dimensional features. Here we use the default values of all the other t-SNE hyperparameters in sklearn. First, the samples of the same classes form clearly visible clusters here. This means that the network really understands the data and its classes and is able to distinguish them.
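For reference, the projection step can be sketched with scikit-learn's TSNE (random stand-in features here; the real pipeline passes in the features extracted by the network):

```python
import numpy as np
from sklearn.manifold import TSNE

rng = np.random.default_rng(0)
features = rng.normal(size=(200, 512))  # stand-in for the extracted features
tsne = TSNE(n_components=2, random_state=0).fit_transform(features)
print(tsne.shape)  # (200, 2): one 2-D point per sample, ready to scatter-plot
```

All other hyperparameters (perplexity, learning rate, number of iterations) are left at sklearn's defaults, as in the text.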
Second, notice the relations between the clusters here. We can see clusters for big domestic animals — cow, horse, and sheep — close to each other. Dogs are not far from them, and also close to the cats cluster. Spiders and butterflies are also located close to each other.
This means that the features represent semantic relations between the objects in real life.
Create a conda environment with the required dependencies in order to run the notebooks on your computer.
Visualizing convolutional features using PyTorch: take a look at my blog post for detailed explanations.
Contributors: fg91 (Fabio M. Graetz) and dhth (Dhruv Thakur).

In the 60 Minute Blitz, we show you how to load in data, feed it through a model we define as a subclass of nn.Module, train this model on training data, and test it on test data. However, we can do much better than that: PyTorch integrates with TensorBoard, a tool designed for visualizing the results of neural network training runs.
Now you know how to use TensorBoard! This example, however, could be done in a Jupyter Notebook; where TensorBoard really excels is in creating interactive visualizations. Furthermore, these are interactive: you can click and drag to rotate the three-dimensional projection. You can now look at the scalars tab to see the running loss plotted over the 15,000 iterations of training. In addition, we can look at the predictions the model made on arbitrary batches throughout learning.
Of course, you could do everything TensorBoard does in your Jupyter Notebook, but with TensorBoard, you get visuals that are interactive by default.
In this tutorial we will: set up TensorBoard, write to TensorBoard, inspect a model architecture using TensorBoard, and use TensorBoard to create interactive versions of the visualizations we created in the last tutorial, with less code.
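The basic write path is a SummaryWriter. A minimal sketch (assuming the `tensorboard` package is installed; the log directory name here is arbitrary):

```python
import glob
import os
import tempfile
from torch.utils.tensorboard import SummaryWriter

logdir = os.path.join(tempfile.mkdtemp(), "demo")  # arbitrary log directory
writer = SummaryWriter(logdir)  # event files accumulate under logdir
for step in range(100):
    # log a fake, decaying "training loss" curve, one point per step
    writer.add_scalar("training loss", 1.0 / (step + 1), step)
writer.close()
print(glob.glob(os.path.join(logdir, "events.*")))  # files TensorBoard reads
```

You would then launch `tensorboard --logdir=<logdir>` and open the scalars tab to see the curve.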
I like this definition because it avoids the hyped discussion of whether AI is truly intelligent in the sense of our intelligence. Deep Learning research is aimed at learning rules from data to automate processes that until now were not automatable.
While that may sound less exciting, it truly is a great thing. Just one example: the emergence of deep convolutional neural networks revolutionized computer vision and pattern recognition and will allow us to introduce a vast amount of automation in fields such as medical diagnosis. This could allow humanity to quickly bring top medical diagnosis to people in poor countries that are not able to educate the many doctors and experts they would otherwise require.
Despite all the exciting news about Deep Learning, the exact way neural networks see and interpret the world remains a black box. A better understanding of how exactly they recognize specific patterns or objects, and why they work so well, might (1) allow us to improve them even further and (2) solve legal problems, since in many cases the decisions a machine makes have to be interpretable to humans.
There are two main ways to try to understand how a neural network recognizes a certain pattern. If you want to know what kind of pattern significantly activates a certain feature map, you could (1) try to find images in a dataset that result in a high average activation of this feature map, or (2) try to generate such a pattern by optimizing the pixel values in a random image.
The latter idea was proposed by Erhan et al. The article is structured as follows: First, I will show you visualizations of convolutional features in several layers of a VGG network, then we will try to understand some of those visualizations and I will show you how to quickly test a hypothesis of what kind of pattern a certain filter might detect. Finally, I will explain the code that is necessary to create the patterns presented in this article.
Neural networks learn to transform input data such as images into successive layers of increasingly meaningful and complex representations. You can think of a deep network as a multistage information-distillation operation, where information goes through successive filters and comes out increasingly purified.
After reading this article, you will know how you can generate patterns that maximize the mean activation of a chosen feature map in a certain layer of those hierarchical representations, how you might be able to interpret some of those visualizations, and finally how to test a hypothesis about what kind of pattern or texture the chosen filter might respond to. Below you find feature visualizations for filters in several layers of a VGG network.
While looking at them, I would like you to observe how the complexity of the generated patterns increases the deeper we get into the network. Those patterns really blow my mind! Starting with this one, does this remind you of something? The picture immediately reminded me of the round arches of a vaulted ceiling you would find in churches. So how could we test this hypothesis? The picture of the artificial arches was created by maximizing the mean activation of one particular feature map in the 40th layer.
We, therefore, simply apply the network to the picture and plot the average activations of the feature maps in the 40th layer. What do we see?
A strong spike at that feature map, as expected!

Pytorch implementation of convolutional neural network visualization techniques.
This repository contains a number of convolutional neural network visualization techniques implemented in PyTorch. Note: I removed cv2 dependencies and moved the repository towards PIL.
A few things might be broken (although I tested all methods); I would appreciate it if you could create an issue if something does not work. Note: the code in this repository was tested with an old 0.x version of torch.
Although it shouldn't be too much of an effort to make it work, I have no plans at the moment to make the code in this repository compatible with the latest version because I'm still using the old one. I moved the following adversarial example generation techniques to a separate repository, to keep visualizations apart from adversarial material. Some of the code also assumes that the layers in the model are separated into two sections: features, which contains the convolutional layers, and classifier, which contains the fully connected layers after flattening out the convolutions.
If you want to port this code to use it on a model that does not have such separation, you just need to edit the parts where it calls model.features and model.classifier. Every technique has its own Python file.
All images are pre-processed with the mean and std of the ImageNet dataset before being fed to the model. None of the code uses the GPU, as these operations are quite fast for a single image, except for deep dream, because the example image used for it is huge. You can make use of the GPU with very little effort.
The example pictures below include numbers in brackets after the description, like Mastiff; this number represents the class id in the ImageNet dataset. I tried to comment the code as much as possible; if you have any issues understanding it or porting it, don't hesitate to send an email or create an issue. Another technique that has been proposed is simply multiplying the gradients with the image itself.
Results obtained with the usage of multiple gradient techniques are below. Smooth grad adds some Gaussian noise to the original image, calculates gradients multiple times, and averages the results. There are two examples at the bottom which use vanilla and guided backpropagation to calculate the gradients.
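SmoothGrad can be sketched directly from that description: perturb the input with Gaussian noise n times and average the input gradients. The tiny stand-in model, noise level, and target class below are made up for the example:

```python
import torch
import torch.nn as nn

# Tiny stand-in classifier; any model mapping images to class scores works.
model = nn.Sequential(nn.Conv2d(3, 8, 3, padding=1), nn.ReLU(),
                      nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(8, 10))
image = torch.randn(1, 3, 32, 32)
target, n, sigma = 3, 8, 0.1  # hypothetical class, sample count, noise level

grads = torch.zeros_like(image)
for _ in range(n):
    noisy = (image + sigma * torch.randn_like(image)).requires_grad_(True)
    score = model(noisy)[0, target]  # class score for the target class
    score.backward()
    grads += noisy.grad
smooth_grad = grads / n  # averaged saliency map, same shape as the input
print(smooth_grad.shape)
```

Using n = 1 and sigma = 0 recovers plain vanilla-backprop saliency; the same loop works with guided backpropagation gradients instead.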
The number of images n to average over is a parameter of the method. CNN filters can be visualized by optimizing the input image with respect to the output of a specific convolution operation. For this example I used a pre-trained VGG network. Visualizations of layers start with basic color and direction filters at lower levels. As we approach the final layer, the complexity of the filters also increases. If you employ external techniques like blurring or gradient clipping, you will probably produce better images. Another way to visualize CNN layers is to visualize activations for a specific input on a specific layer and filter.
This was done in Figure 3 of the referenced paper. The method is quite similar to guided backpropagation, but instead of guiding the signal from the last layer and a specific target, it guides the signal from a specific layer and filter.
Visualization of CNN units in higher layers is important for my work, and currently I'm not aware of any library with capabilities similar to the two mentioned above written for PyTorch.
Indeed I have some experience with deep visualization toolbox, which only supports Caffe. However, it has very poor support for networks whose input size is not around xx3 standard size for ImageNet dataset, before croppingand indeed I need to visualize networks not having input of such size, such as networks trained on CIFAR, etc.
In addition, it can't support visualization techniques other than "deconvolution". Therefore, eventually, converting PyTorch models to Caffe and then hacking the code of deep visualization toolbox to make it work is probably not worthwhile. Some people have tried doing visualization in TensorFlow. However, TensorFlow has too much boilerplate, and in general I'm not familiar with it.
I believe that with the huge amount of boilerplate around TensorFlow, figuring out the usage of existing visualization code on my particular models, adapted to my particular needs, would possibly take more of my time than working on a pure PyTorch solution. It's going to be implemented mainly through forward and backward hooks of torch.nn.Module.
Since most visualization techniques focus on fiddling with ReLU layers, this means that as long as your ReLU layers, as well as the layers which contain your units of interest, are implemented using torch.nn.Module rather than torch.nn.functional, the hooks will work. While it's possible to define some modified ReLU layers, as suggested by PyTorch developers, this may break the code, as autograd assumes correct grad computation.
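The hook mechanism looks like this in practice: a guided-backprop-style sketch that clamps negative gradients at every nn.ReLU via a backward hook (a tiny made-up model; `register_full_backward_hook` is the modern PyTorch API for this):

```python
import torch
import torch.nn as nn

# Tiny stand-in network; the hook registration works on any nn.Module model.
model = nn.Sequential(nn.Conv2d(3, 4, 3), nn.ReLU(),
                      nn.Conv2d(4, 4, 3), nn.ReLU())

def guide(module, grad_input, grad_output):
    # pass back only positive gradients, as in guided backpropagation
    return tuple(g.clamp(min=0) for g in grad_input)

for m in model.modules():
    if isinstance(m, nn.ReLU):  # works because the ReLUs are nn.Module instances
        m.register_full_backward_hook(guide)

x = torch.randn(1, 3, 16, 16, requires_grad=True)
model(x).mean().backward()
print(x.grad.shape)  # the guided gradient with respect to the input image
```

This is exactly why ReLUs built with torch.nn.functional are a problem: there is no module object to attach the hook to.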