Data-driven learning typically works best when we can exploit expectations from our data domain: examples include recurrent neural network architectures developed for the temporal dependence of language, geometric deep learning for 3-D problems, and physics-constrained Bayesian learning for more interpretable dependencies. Yet it is often unclear how to inject such expectations, and which specific expectations will yield better outcomes in a given domain. In seismic event processing, enforcing consistency across disparate observations of an individual event has a long history of empirical value. For example, we almost always take magnitude estimates from many individual stations, drop outliers, and average the remainder to arrive at a final event magnitude. Similarly, when we develop deep-learning-based predictive models, we can leverage the expectation that stations provide consistent predictions for any event-level attribute, such as event type. In this work we show how to formulate this expectation as a loss term during model training and give several examples of how it improves model regularization: it reduces overfitting while still outperforming other methods, yields more trustworthy decision confidence, and lets us leverage data for which no ground truth is available.
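To make the idea concrete, a station-consistency loss of the kind described above can be sketched as follows. This is a minimal illustrative example, not the paper's exact formulation: the function name, the variance-based disagreement penalty, and the weight `lam` are all assumptions introduced here for exposition.

```python
# Illustrative sketch (not the authors' exact loss): combine an event-level
# classification loss with a penalty on disagreement between stations.
import math

def event_consistency_loss(station_probs, label, lam=1.0):
    """Cross-entropy of the station-averaged prediction plus a
    consistency penalty on inter-station disagreement.

    station_probs: per-station class-probability vectors for one event.
    label: index of the true event class (e.g., event type).
    lam: assumed weight on the consistency term.
    """
    n = len(station_probs)
    k = len(station_probs[0])
    # Event-level prediction: average the per-station probabilities,
    # analogous to averaging per-station magnitude estimates.
    mean_p = [sum(p[c] for p in station_probs) / n for c in range(k)]
    ce = -math.log(max(mean_p[label], 1e-12))
    # Consistency term: mean squared deviation of each station's
    # prediction from the event-level mean.
    disagreement = sum(
        (p[c] - mean_p[c]) ** 2 for p in station_probs for c in range(k)
    ) / (n * k)
    return ce + lam * disagreement
```

When all stations agree, the penalty vanishes and the loss reduces to ordinary cross-entropy; as stations diverge, the loss grows even if the averaged prediction is unchanged, pushing the model toward event-consistent per-station outputs.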
The primary conference goal this work addresses is improving nuclear test monitoring by advancing algorithms for event discrimination. A second goal, with implications beyond event discrimination, is advancing our ability to use deep neural networks for actionable decision support.