3D Perception

The knocking-over-coffee-mug example demonstrates a simple yet profound consequence of not being able to accurately perceive the 3D structure of the objects we interact with. Research over the past decades has shown that observers indeed cannot do so (e.g., Todd, Oomes, Koenderink, & Kappers, 2001). This finding is counterintuitive and does not conform to our daily experience. So why is this the case?

My research reveals one fundamental aspect of human experience that previous research has overlooked: observers are constantly moving. Being constantly on the move means that observers can continuously obtain different views of the same object. Using behavioral methods and computational modeling, I showed that with at most 35° to 45° of continuous perspective change through relative motion, observers can accurately perceive 3D structure via a bootstrap process.

To examine this process, I used fundamental principles of computer graphics to create a series of random-dot anaglyphs that manipulate which visual information is available in the display. The basic mechanism behind the display creation is straightforward. As the following animation shows, I start by specifying the 3D environment, including the two projection points (left and right), the screen (the vertical surface), and a 3D object. Using this setup, I draw (imaginary) connecting lines (dotted lines) between each projection point and the 3D points to be rendered on the screen. In this example, the four vertices are the points of interest. Using parameterized line equations, the intersection between each projection line and the screen can be easily identified (points on the screen surface); together, these intersections specify the projected image of the 3D object (dashed lines). Rendering the projected points and lines on the screen therefore produces a display that allows observers to perceive the 3D object. (Of course, when the 3D object becomes more complex, other issues, such as hidden line removal, emerge and require additional steps to resolve.)

Figure: the projection of a static slant object.

Figure: the projections of the slant object as it rotates.
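The projection step described above can be sketched in a few lines of Python. This is a minimal illustration, not the code used to build the actual displays: the function name `project_to_screen`, the choice of the plane z = 0 as the screen, and all coordinates (eye separation, viewing distance, vertex positions) are assumptions made for the example.

```python
import numpy as np

def project_to_screen(eye, point, screen_z=0.0):
    """Intersect the line from `eye` through `point` with the plane z = screen_z.

    The line is parameterized as eye + t * (point - eye); solving its
    z-component for z = screen_z gives t, and hence the screen point.
    """
    eye = np.asarray(eye, dtype=float)
    point = np.asarray(point, dtype=float)
    direction = point - eye
    if np.isclose(direction[2], 0.0):
        raise ValueError("projection line is parallel to the screen")
    t = (screen_z - eye[2]) / direction[2]
    return eye + t * direction

# Hypothetical layout in arbitrary units: two projection points 6.5 apart,
# 60 in front of a screen at z = 0, and a slanted square behind the screen.
left_eye = (-3.25, 0.0, -60.0)
right_eye = (3.25, 0.0, -60.0)
vertices = [(-5, -5, 10), (5, -5, 20), (5, 5, 20), (-5, 5, 10)]

# Keep only the (x, y) screen coordinates of each projected vertex.
left_image = np.array([project_to_screen(left_eye, v)[:2] for v in vertices])
right_image = np.array([project_to_screen(right_eye, v)[:2] for v in vertices])
```

Connecting the four rows of `left_image` (or `right_image`) in order gives the projected outline for that projection point; the two outlines differ because the projection points are horizontally offset.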

Using this technique, I separated different types of visual information and created three types of displays.

The Monocular Display

Only monocular structure-from-motion information is available.

Monocular

The Stereomotion Display

Only stereomotion information (i.e., change of disparity over time, or CDOT) is available. This is the most interesting visual condition, because without a pair of red-blue filter glasses all you can see is a field of noisy random dots. However, with a pair of proper glasses, the red dots will only go to your left eye, whereas the blue dots will only go to your right eye. You will be able to fuse them, and the evolving distance between them (i.e., the change of disparity) will provide information about depth, which allows you to see a rotating 3D object like the ones shown above.

Stereomotion
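To make "change of disparity over time" concrete, the sketch below computes the horizontal screen disparity of a single 3D point as the object rotates about a vertical axis. It is an illustration under assumed conventions, not the display code itself: the function names, the viewing geometry (screen at z = 0, projection points at z = -60 and 6.5 apart), and the right-minus-left sign convention are all hypothetical.

```python
import numpy as np

def screen_x(eye_x, eye_z, point, screen_z=0.0):
    """Horizontal screen coordinate where the eye-to-point line crosses the screen."""
    px, _, pz = point
    t = (screen_z - eye_z) / (pz - eye_z)
    return eye_x + t * (px - eye_x)

def disparity_over_rotation(point, center_z, angles_deg,
                            half_ipd=3.25, eye_z=-60.0):
    """Right-minus-left screen disparity of `point` at each rotation angle.

    The point rotates about a vertical axis passing through (0, y, center_z).
    """
    px, py, pz = point
    disparities = []
    for a in np.radians(angles_deg):
        # Rotate about the vertical axis, relative to the object's center.
        x = px * np.cos(a) + (pz - center_z) * np.sin(a)
        z = center_z - px * np.sin(a) + (pz - center_z) * np.cos(a)
        x_left = screen_x(-half_ipd, eye_z, (x, py, z))
        x_right = screen_x(half_ipd, eye_z, (x, py, z))
        disparities.append(x_right - x_left)
    return np.array(disparities)

# A point 5 units to the side of the rotation axis, sampled over 45 degrees.
angles = np.arange(0, 46, 5)
d = disparity_over_rotation((5.0, 0.0, 15.0), center_z=15.0, angles_deg=angles)
```

In this geometry the disparity depends only on the point's depth, so the change in `d` across `angles` is the stereomotion signal; a point lying exactly on the screen plane would have zero disparity.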

The Combined Condition

Both monocular structure-from-motion and stereomotion information are available.

Combine

Using these displays, I studied 3D slant perception (Wang et al., 2018, 2020a, 2020b) and shape perception (Wang et al., 2020c), manipulating the amount of continuous perspective change (i.e., object rotation) and examining at which point spatial judgments became accurate. I showed that 35°-45° of continuous perspective change enabled accurate 3D slant and shape perception. But why?

To account for this phenomenon, I developed a computational model that uses a stratified process to bootstrap the accurate 3D slant/shape (Wang et al., 2020b, 2020c). For more details, check out the papers linked below.

Finally, I also implemented a set of experiments in virtual reality that aim to examine the same issue, only this time participants are able to move freely in the environment. Stay tuned for more updates on this project.

VRSlant

Wang, X.M., Lind, M., & Bingham, G.P. (2020a). Symmetry mediates the bootstrapping of 3-D relief slant to metric slant. Attention, Perception, & Psychophysics, 82, 1488–1503. https://doi.org/10.3758/s13414-019-01859-5

Wang, X.M., Lind, M., & Bingham, G.P. (2020b). Bootstrapping a better slant: A stratified process for recovering 3D metric slant. Attention, Perception, & Psychophysics, 82, 1504–1519. https://doi.org/10.3758/s13414-019-01860-y

Wang, X.M., Lind, M., & Bingham, G.P. (2020c). A stratified process for the perception of objects: From optical transformations to 3D relief structure to 3D Euclidean structure to slant or aspect ratio. Vision Research, 173, 77–89. https://doi.org/10.1016/j.visres.2020.04.014

Wang, X.M., Lind, M., & Bingham, G.P. (2018). Large continuous perspective change with non-coplanar points enables accurate slant perception. Journal of Experimental Psychology: Human Perception and Performance, 44(10), 1508–1522. https://doi.org/10.1037/xhp0000553