How to Engineer Your Way Out of Slow Models

By Yoel Zeldes, Algorithms Engineer at Taboola.

So you just finished designing that great neural network architecture of yours. It has a whopping 300 fully connected layers interleaved with 200 convolutional layers with 20 channels each, where the result is fed as the seed of a glorious bidirectional stacked LSTM with a pinch of attention. After training you get an accuracy of 99.99%, and you're ready to ship it to production.

But then you realize the production constraints won't allow you to run inference using this beast. You need the inference to be done in under 200 milliseconds.

In other words, you need to chop off half of the layers, give up on using convolutions, and let's not get started on the costly LSTM…

If only you could make that amazing model faster!

Sometimes you can

Here at Taboola we did it. Well, not exactly… Let me explain.

One of our models has to predict the CTR (Click-Through Rate) of an item, or in other words, the probability the user will lik…

kdnuggets.com
Related Topics: Cassandra Deep Learning