Tom Wallis: New abstract submitted as a poster at ECVP, 27–31 August 2017, in Berlin

Title: "Matching peripheral scene appearance using deep features: Investigating image-specific variance and contributions of spatial attention" by Thomas S. A. Wallis, Christina M. Funke, Alexander S. Ecker, Leon A. Gatys, Felix A. Wichmann, Matthias Bethge


The visual system represents the periphery as a set of summary statistics. Cohen, Dennett and Kanwisher (TICS 2016) recently proposed that this influential idea can explain the discrepancy between experimental demonstrations that we can be insensitive to large peripheral changes and our rich subjective experience of the world. We present a model that summarises the information encoded by a deep neural network trained on object recognition (VGG-19; Simonyan & Zisserman, ICLR 2015) over spatial regions that increase with retinal eccentricity (see also Freeman & Simoncelli, 2011). We synthesise images that approximately match the model response to a target scene, then test whether observers can discriminate model syntheses from original scenes using a temporal oddity task. For some images, model syntheses cannot be told apart from the original despite large structural distortions, but for other images they are readily discriminable. Can focussed spatial attention overcome the limits imposed by summary statistics? We test this proposal in a pre-registered cueing experiment, finding no evidence that sensitivity is strongly affected by cueing spatial attention to areas of large pixel- or conv5-MSE between the original and synthesised image. Our results suggest that human sensitivity to summary-statistic-invariant scrambling of natural scenes depends more on the image content than on eccentricity or spatial attention. Accounting for this in a feedforward summary-statistic model would require that the model also encodes these content-specific dependencies.
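The idea of summarising features over regions that grow with eccentricity can be illustrated with a minimal NumPy sketch. This is not the authors' model: the ring-and-wedge layout, the function names (`pooling_labels`, `pooled_means`), and all parameters are hypothetical choices made for illustration, and random arrays stand in for VGG-19 feature maps. Rings are log-spaced around an assumed central fixation point, so their radial width increases with eccentricity.

```python
import numpy as np

def pooling_labels(height, width, n_rings=4, n_wedges=8, min_ecc=1.0):
    """Assign each pixel to a pooling region (hypothetical layout).

    Rings are log-spaced in eccentricity, so regions get larger
    further from the assumed central fixation point.
    """
    ys, xs = np.mgrid[0:height, 0:width]
    cy, cx = (height - 1) / 2.0, (width - 1) / 2.0
    ecc = np.hypot(ys - cy, xs - cx)          # distance from fixation, in pixels
    angle = np.arctan2(ys - cy, xs - cx)      # polar angle in [-pi, pi]
    edges = np.geomspace(min_ecc, ecc.max(), n_rings + 1)
    ring = np.clip(np.searchsorted(edges, ecc, side="right") - 1, 0, n_rings - 1)
    wedge = np.floor((angle + np.pi) / (2 * np.pi) * n_wedges).astype(int) % n_wedges
    return ring * n_wedges + wedge

def pooled_means(features, labels, n_regions):
    """Average a (channels, height, width) feature map within each region."""
    flat = labels.ravel()
    counts = np.bincount(flat, minlength=n_regions)
    out = np.zeros((features.shape[0], n_regions))
    for c in range(features.shape[0]):
        sums = np.bincount(flat, weights=features[c].ravel(), minlength=n_regions)
        out[c] = sums / np.maximum(counts, 1)  # empty regions stay at zero
    return out

# Stand-in "feature maps" for an original and a synthesised image.
rng = np.random.default_rng(0)
n_rings, n_wedges = 4, 8
labels = pooling_labels(32, 32, n_rings, n_wedges)
original = rng.normal(size=(3, 32, 32))
synthesis = rng.normal(size=(3, 32, 32))

stats_original = pooled_means(original, labels, n_rings * n_wedges)
stats_synthesis = pooled_means(synthesis, labels, n_rings * n_wedges)

# Mean squared error between the pooled summary statistics.
summary_mse = np.mean((stats_original - stats_synthesis) ** 2)
print(summary_mse)
```

In a synthesis setting, an error of this kind would be minimised with respect to the synthesised image, so that two images with very different pixel content can still yield near-identical pooled statistics.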