It's been almost two years now since I first posted a video of Neural Amp Modeler captures running on my Raspberry Pi 4.
A lot has changed since then.
Originally, the NAM playback code was extremely CPU-intensive - so much so that a Raspberry Pi 4 could only manage to run "feather" models.
Because I wanted to run more accurate models, NAM optimization became a bit of a pet project. With some optimizations to the NAM Core codebase, we were able to increase performance by more than 2x. That, combined with the wider availability of a 64-bit OS for the Pi, made "standard" NAM captures possible - with plenty of headroom left for a cabinet IR and some light effects.
More recently, I have been working on my own implementation of WaveNet and LSTM models, with a focus on performance. You can see the result running in my Stompbox app in the image above.
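For context, the heart of LSTM-based amp modeling is a per-sample recurrence: each audio sample is fed through an LSTM cell, and the hidden state determines the output sample. The sketch below is a generic, unoptimized single-cell step in plain Python - it is not the Stompbox implementation, and all the names (`lstm_step`, the weight layout, etc.) are illustrative.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def lstm_step(x, h, c, W, U, b):
    """One LSTM time step for a single mono audio sample x.

    W: input weights, length 4*H (input size is 1 for mono audio)
    U: recurrent weights, 4*H rows of H values
    b: biases, length 4*H
    Gate order assumed here: input, forget, cell candidate, output.
    Returns the new hidden state and cell state.
    """
    H = len(h)
    h_new, c_new = [0.0] * H, [0.0] * H
    for j in range(H):
        # Pre-activations for the four gates of hidden unit j.
        zi = W[j]       * x + sum(U[j][k]       * h[k] for k in range(H)) + b[j]
        zf = W[H + j]   * x + sum(U[H + j][k]   * h[k] for k in range(H)) + b[H + j]
        zg = W[2*H + j] * x + sum(U[2*H + j][k] * h[k] for k in range(H)) + b[2*H + j]
        zo = W[3*H + j] * x + sum(U[3*H + j][k] * h[k] for k in range(H)) + b[3*H + j]
        # Standard LSTM update: gated cell state, then gated tanh output.
        c_new[j] = sigmoid(zf) * c[j] + sigmoid(zi) * math.tanh(zg)
        h_new[j] = sigmoid(zo) * math.tanh(c_new[j])
    return h_new, c_new
```

In a realtime plugin this step runs once per sample, so the per-step cost (dominated by roughly 4\*H\*H multiply-adds for the recurrent weights) is what decides whether a given hidden size fits inside the audio callback budget - which is why a performance-focused implementation matters so much on a Pi.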
Note the numbers in the bottom left - that is the realtime CPU usage as reported by JACK while running a "standard" NAM capture with an audio buffer size of 96 samples. CPU usage of "standard" models is now low enough that I can easily run two models at once, with plenty of CPU left over for IR and effects. I can even (just barely) run three models at once.