Most modern convolutional neural networks (CNNs) are compute-intensive, making them infeasible to use on mobile or embedded devices. One approach to this problem is to modify a standard deep CNN with shallow early-exit branches appended to some of its convolutional layers [1]. This modification, named BranchyNet, allows simple input samples to be processed without performing the full volume of computation, providing a speed-up on average. In this work we consider the problem of training a BranchyNet. We exploit a cascade loss function [2], which explicitly regularizes the CNN's average computation time, and modify it to use the entropy of a branch's prediction as the confidence measure. We show that on the CIFAR-10 dataset the proposed loss function increases the actual speed-up from 43% to 47% without quality degradation, compared with the original loss function.
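The early-exit rule described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function names and the entropy threshold of 0.5 are assumptions, and the branch outputs are assumed to already be softmax probability vectors.

```python
import math

def entropy(probs):
    """Shannon entropy of a probability distribution (natural log)."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def branchy_predict(branch_outputs, threshold=0.5):
    """Return the class index from the first branch whose prediction
    entropy is below the threshold; otherwise fall back to the final exit.

    branch_outputs: list of softmax probability vectors, one per exit,
    ordered from the earliest branch to the final classifier.
    """
    for probs in branch_outputs[:-1]:
        if entropy(probs) < threshold:  # confident enough: exit early
            return probs.index(max(probs))
    final = branch_outputs[-1]          # no early exit: use the last branch
    return final.index(max(final))

# A confident early branch (low entropy) exits immediately; an uncertain
# one (high entropy) falls through to the final exit.
confident = [[0.97, 0.02, 0.01], [0.4, 0.3, 0.3]]
uncertain = [[0.4, 0.35, 0.25], [0.1, 0.8, 0.1]]
```

On average, the more inputs that exit at an early branch, the less computation is performed, which is exactly the quantity the cascade loss regularizes.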