"Everything should be made as simple as possible, but not simpler."
We propose a novel convolutional neural network architecture for estimating geospatial functions such as population density, land cover, or land use. In our approach, we combine overhead and ground-level images in an end-to-end trainable neural network, which uses kernel regression and density estimation to convert features extracted from the ground-level images into a dense feature map.
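The kernel-regression step mentioned above can be illustrated with a minimal sketch: sparse feature vectors extracted at known ground-level image locations are spread onto a dense grid with a Gaussian kernel (Nadaraya-Watson regression). The function name, the Gaussian kernel choice, and `sigma` are illustrative assumptions, not details from the paper.

```python
import numpy as np

def dense_feature_map(locs, feats, grid, sigma=1.0):
    """Nadaraya-Watson kernel regression: interpolate sparse
    ground-level feature vectors onto a dense spatial grid.
    locs: (N, 2) sample locations; feats: (N, D) feature vectors;
    grid: (M, 2) query locations. Returns an (M, D) feature map."""
    # Squared distance between every grid cell and every sample
    d2 = ((grid[:, None, :] - locs[None, :, :]) ** 2).sum(-1)
    w = np.exp(-d2 / (2 * sigma ** 2))   # Gaussian kernel weights
    w /= w.sum(axis=1, keepdims=True)    # normalize per grid cell
    return w @ feats                     # weighted average of features
```

Each grid cell receives a convex combination of nearby samples' features, so the map varies smoothly and is dominated by the closest ground-level images.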
We demonstrate that quantitative measures of scenicness can benefit semantic image understanding, content-aware image processing, and a novel application of cross-view mapping, where the sparsity of ground-level images can be addressed by incorporating unlabeled overhead images in the training and prediction steps.
We introduce a novel strategy for learning to extract semantically meaningful features from aerial imagery. Instead of manually labeling the aerial imagery, we propose to predict (noisy) semantic features automatically extracted from co-located ground imagery. Our network architecture takes an aerial image as input, extracts features using a convolutional neural network, and then applies an adaptive transformation to map these features into the ground-level perspective.
We introduce a large, realistic evaluation dataset, Horizon Lines in the Wild (HLW), containing natural images with labeled horizon lines. Using this dataset, we investigate the application of convolutional neural networks for directly estimating the horizon line, without requiring any explicit geometric constraints or other special cues.
Cloud shadows dramatically affect the appearance of outdoor scenes. We describe three approaches that use video of cloud shadows to estimate a cloudmap, a spatio-temporal function that represents the clouds passing over the scene.
We introduce a novel vanishing point detection algorithm that obtains state-of-the-art performance on three benchmark datasets. The main innovation in our method is the use of global image context to sample possible horizon lines, followed by a novel discrete-continuous procedure to score each horizon line by choosing the optimal vanishing points for the line.
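A common primitive behind this kind of sample-and-score pipeline is measuring how consistent a candidate vanishing point is with the detected line segments. The sketch below is a generic angular-consistency vote, with an assumed threshold `tau_deg`; it is not the paper's scoring function.

```python
import numpy as np

def vp_consistency(vp, segments, tau_deg=2.0):
    """Score a candidate vanishing point by angular consistency:
    a segment votes for vp if the direction from its midpoint to vp
    is within tau_deg degrees of the segment's own direction.
    segments: (N, 4) array of endpoints (x1, y1, x2, y2)."""
    p1, p2 = segments[:, :2], segments[:, 2:]
    mid = 0.5 * (p1 + p2)
    seg_dir = p2 - p1
    to_vp = vp - mid
    # |cos| makes the vote sign-agnostic: segments are undirected
    cos = np.abs((seg_dir * to_vp).sum(-1)) / (
        np.linalg.norm(seg_dir, axis=-1) * np.linalg.norm(to_vp, axis=-1) + 1e-9)
    ang = np.degrees(np.arccos(np.clip(cos, -1.0, 1.0)))
    return int((ang < tau_deg).sum())
```

Scoring a sampled horizon line then amounts to choosing the vanishing points on that line that maximize such a consistency measure.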
Given an image, we propose to use the appearance of people in the scene to estimate when the picture was taken.
This paper presents a large-scale empirical evaluation of three state-of-the-art approaches on a new dataset of roughly 100k images captured “in the wild”.
We describe a fast method for estimating transient scene attributes for a single image. Using our method, we explore applications to webcam imagery, including: 1) supporting automatic browsing and querying of large archives of webcam images, 2) constructing maps of transient attributes from webcam imagery, and 3) geolocalizing webcams.
We learn a joint feature representation for aerial and ground-level imagery and apply this representation to the problem of cross-view image geolocalization.
We introduce a method for directly estimating the focal length of a camera from a single image using a deep convolutional neural network.
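One way such a regression is commonly parameterized is via the pinhole relation between field of view and focal length; a network that predicts the horizontal field of view determines the focal length in pixels. The helper below shows only this standard geometric identity, not the network itself.

```python
import math

def focal_from_fov(image_width_px, hfov_deg):
    """Pinhole relation between horizontal field of view and focal
    length in pixels: f = (w / 2) / tan(hfov / 2)."""
    return (image_width_px / 2) / math.tan(math.radians(hfov_deg) / 2)
```

For example, a 1000-pixel-wide image with a 90-degree horizontal field of view corresponds to a focal length of 500 pixels.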
We propose a data-driven approach to the problem of image geolocalization using an image of a face.
We show that features extracted from deep convolutional neural networks are useful for problems in geospatial image analysis.
We introduce methods for estimating scene geometry in a distributed camera network using videos from partly cloudy days.
We derive constraints and demonstrate methods that allow rainbows to be used for camera geolocalization, calibration and rainbow-specific image editing.
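The geometry underlying such constraints is that the primary rainbow is a circle of roughly 42 degrees about the antisolar point, so every viewing ray on the bow makes an angle of about 138 degrees with the sun direction. The sketch below evaluates that residual for a candidate sun direction; it is an illustrative constraint check, not the paper's full calibration procedure.

```python
import numpy as np

RAINBOW_ANGLE_DEG = 42.0  # approximate radius of the primary bow

def sun_direction_residual(view_rays, sun_dir):
    """Angular residuals (degrees) for a candidate sun direction:
    each unit viewing ray on the primary rainbow should lie about
    180 - 42 = 138 degrees from the sun. view_rays: (N, 3) unit rays."""
    sun_dir = sun_dir / np.linalg.norm(sun_dir)
    cos = view_rays @ sun_dir
    ang = np.degrees(np.arccos(np.clip(cos, -1.0, 1.0)))
    return np.abs(ang - (180.0 - RAINBOW_ANGLE_DEG))
```

Minimizing such residuals over sun directions, combined with the capture time, constrains where on Earth the camera could be.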
Our work explores the little-studied dependence of facial appearance on geographic location. To support this effort, we constructed GeoFaces, a large dataset of geotagged face images.
We describe new methods for estimating the geometry of an outdoor scene using video from multiple partly cloudy days.
We propose cloud motion as a natural scene cue that enables geometric calibration of static outdoor cameras.
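The intuition can be sketched as follows: if clouds drift at a constant velocity, the intensity time series at one pixel is a time-delayed copy of the series at another, and the delay is recoverable from the peak of their cross-correlation. This is a simplified illustration of the cue, not the paper's calibration method.

```python
import numpy as np

def delay_between_pixels(a, b):
    """Estimate how many frames the intensity series b lags series a,
    as the lag of peak cross-correlation. Under constant cloud motion,
    these delays grow linearly with pixel separation along the motion
    direction, which is what makes them a geometric calibration cue."""
    a = a - a.mean()
    b = b - b.mean()
    corr = np.correlate(b, a, mode="full")   # full cross-correlation
    return int(np.argmax(corr)) - (len(a) - 1)
```

Pairwise delays across many pixels give relative positions along the cloud-motion direction, from which camera geometry can be constrained.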