We have found that the flexibility and adaptiveness of TensorFlow lends itself to building higher level frameworks for specific purposes, and we’ve written one for quickly building neural network modules with TF. We are actively developing this codebase, but what we have so far fits our research needs well, and we’re excited to announce that today we are open sourcing it. We call this framework Sonnet.
The main principle of Sonnet is to first construct Python objects which represent some part of a neural network, and then separately connect these objects into the TensorFlow computation graph. The objects are subclasses of sonnet.AbstractModule and as such are referred to as Modules.
Modules may be connected into the graph multiple times, and any variables declared in that module will be automatically shared on subsequent connection calls. Low level aspects of TensorFlow which control variable sharing, including specifying variable scope names, and using the reuse= flag, are abstracted away from the user.
Separating configuration and connection allows easy construction of higher-order Modules, i.e., modules that wrap other modules. For instance, the BatchApply module merges a number of leading dimensions of a tensor into a single dimension, connects a provided module, and then splits the leading dimension of the result to match the input. At construction time, the inner module is passed in as an argument to the BatchApply constructor. At run time, the module first performs a reshape operation on inputs, then applies the module passed into the constructor, and then inverts the reshape operation.
An additional advantage of representing Modules by Python objects is that it allows additional methods to be defined where necessary. An example of this is a module which, after construction, may be connected in a variety of ways while maintaining weight sharing. For instance, in the case of a generative model, we may want to sample from the model, or calculate the log probability of a given observation. Having both connections simultaneously requires weight sharing, and so these methods depend on the same variables. The variables are conceptually owned by the object, and are used by different methods of the module.