How we Improved our Neural Network Performance Four-Fold

With Shared Learning from NeurIPS 2021. 

In December 2021, we attended the Bayesian Deep Learning workshop at the NeurIPS 2021 conference to present a poster on our neural networks, titled “A Case Study on Bayesian Deep Learning for Stochastic Dynamical Systems in the Mining Industry”.

Dr. Boris Wolter, Principal Data Scientist, IntelliSense.io

During the poster session, I met Johanna Rock, who was presenting a poster titled “On Efficient Uncertainty Estimation for Resource-Constrained Mobile Applications”. Her research piqued my interest because Johanna and her team use Monte-Carlo Dropout (MCD) to estimate the uncertainty of their neural networks, which is the same Bayesian learning method we use for our predictive models.

One challenge with MCD is its relatively high computational cost: the same input has to be run through the network many times, each time with a different random dropout mask, to obtain a distribution of outputs that can be analysed statistically.
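
To make the cost concrete, here is a minimal NumPy sketch of plain Monte-Carlo Dropout. It assumes a tiny two-layer network with made-up weights `w1` and `w2` and a hypothetical helper `mc_dropout_predict`; it is an illustration only, not the model described in this post.

```python
import numpy as np

def dropout(x, rate, rng):
    """Inverted dropout: zero each unit with probability `rate`, rescale the rest."""
    keep = rng.random(x.shape) >= rate
    return x * keep / (1.0 - rate)

def mc_dropout_predict(x, w1, w2, rate=0.2, n_samples=100, seed=0):
    """Run one input through the whole network n_samples times with dropout
    active, then summarise the outputs by their mean and standard deviation."""
    rng = np.random.default_rng(seed)
    # The same input is repeated once per Monte-Carlo sample.
    xs = np.repeat(x[None, :], n_samples, axis=0)   # (n_samples, n_in)
    hidden = np.maximum(xs @ w1, 0.0)               # ReLU layer, evaluated n_samples times
    hidden = dropout(hidden, rate, rng)             # a fresh random mask per sample
    outputs = hidden @ w2                           # (n_samples, n_out)
    return outputs.mean(axis=0), outputs.std(axis=0)

# Illustrative weights only (assumption): 4 inputs, 16 hidden units, 1 output.
weight_rng = np.random.default_rng(1)
w1 = weight_rng.normal(size=(4, 16))
w2 = weight_rng.normal(size=(16, 1))
mean, std = mc_dropout_predict(np.ones(4), w1, w2)
```

The standard deviation across the samples serves as the uncertainty estimate. Note that every layer, including those that contain no randomness, is evaluated `n_samples` times.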

Johanna’s application runs on mobile phones, so her team had to find a way to reduce the resource requirements. They achieved this by splitting the network into a deterministic part and a stochastic part. Since the deterministic part gives the same output for identical inputs, it only has to be applied to a single input sample, which reduces the resources required for this part of the model to a fraction of the original cost. The Monte-Carlo sampling happens only just before the stochastic part, where it is actually needed.
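
The split can be sketched in a few lines of NumPy. This is a simplified illustration under stated assumptions: one dense ReLU layer stands in for the deterministic part, dropout plus an output layer stands in for the stochastic part, and all function names and weights are hypothetical rather than taken from either team's code.

```python
import numpy as np

def deterministic_part(x, w1):
    """Layers before the first dropout: identical inputs give identical outputs."""
    return np.maximum(x @ w1, 0.0)

def stochastic_part(h, w2, rate, rng):
    """Dropout followed by the output layer; one fresh mask per row of h."""
    keep = rng.random(h.shape) >= rate
    return (h * keep / (1.0 - rate)) @ w2

def predict_naive(x, w1, w2, rate, n_samples, rng):
    # Replicate the input first, so every layer runs n_samples times.
    xs = np.repeat(x[None, :], n_samples, axis=0)
    return stochastic_part(deterministic_part(xs, w1), w2, rate, rng)

def predict_just_in_time(x, w1, w2, rate, n_samples, rng):
    # Run the deterministic layers once on the single input ...
    h = deterministic_part(x[None, :], w1)
    # ... and replicate only right before the stochastic layers.
    hs = np.repeat(h, n_samples, axis=0)
    return stochastic_part(hs, w2, rate, rng)

# Illustrative weights only (assumption): 4 inputs, 16 hidden units, 1 output.
weight_rng = np.random.default_rng(1)
w1 = weight_rng.normal(size=(4, 16))
w2 = weight_rng.normal(size=(16, 1))
x = np.ones(4)

naive = predict_naive(x, w1, w2, 0.2, 100, np.random.default_rng(0))
jit = predict_just_in_time(x, w1, w2, 0.2, 100, np.random.default_rng(0))
```

With the same random seed, both variants draw the same dropout masks, so they produce the same set of output samples up to floating-point rounding. The only difference is how often the deterministic layers are evaluated.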

NeurIPS Shared Learning

We realised that we could adopt the same approach for our networks, whose most computationally expensive parts are in the first few layers. Running just a single sample through those layers resulted in speed improvements and reduced memory usage by a factor of four.
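
A simple cost model shows why the gain depends on where the expensive layers sit. The sketch below assumes that cost scales linearly with the number of samples pushed through each part; the function name and the numbers are illustrative assumptions, not measurements from our networks.

```python
def speedup(det_cost, sto_cost, n_samples):
    """Ratio of naive cost to just-in-time cost under a linear cost model.

    det_cost:  cost of the deterministic layers for one sample (arbitrary units)
    sto_cost:  cost of the stochastic layers for one sample (same units)
    """
    naive = n_samples * (det_cost + sto_cost)   # every layer runs n_samples times
    split = det_cost + n_samples * sto_cost     # deterministic layers run once
    return naive / split

# Hypothetical example: deterministic layers cost three times as much as the rest.
example = speedup(det_cost=3.0, sto_cost=1.0, n_samples=100)
```

Under this model, the speed-up approaches `1 + det_cost / sto_cost` as the number of samples grows, so the benefit is largest when the early, deterministic layers dominate the cost.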

The improved networks with “Just-in-Time Monte-Carlo Sampling” are now being deployed. They allow faster evaluation of our predictions and optimizations or, alternatively, the use of more Monte-Carlo samples for better uncertainty estimates.

This modification will have the most impact on our Thickener Optimization app and is available immediately.

In addition, every new network with MCD that we train will now include this feature. Other networks that will adopt it soon include the GPS filling network for our Stockpile and Inventory Optimization app and the SAG mill predictions for overload detection in our Grinding Optimization app.

The sharing of ideas in a community such as NeurIPS is one of the core reasons we attend. We are always keen to share our learnings to help improve other industries. This year we (and our customers) were the fortunate beneficiaries.

Want to know more about IntelliSense.io’s suite of mining optimization apps? We would be delighted to provide a live demo tailored to your requirements.

The IntelliSense.io Application Portfolio