Principal component analysis (PCA) is a widely used technique for dimensionality reduction that assumes the input data can be represented as a collection of fixed-length vectors. Many real-world datasets, such as those constructed from Internet photo collections, do not satisfy this assumption. A natural way to address this problem is to first coerce all input data to a fixed size and then apply standard PCA techniques. This approach is problematic because it introduces artifacts when an image must be upsampled and loses information when an image must be downsampled. We propose MPCA, an approach for estimating the PCA decomposition from multi-sized input data that avoids this initial resizing step. We demonstrate the effectiveness of this approach on simulated and real-world datasets.
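
To make the resize-then-PCA baseline described above concrete, the following is a minimal sketch in Python with NumPy. It illustrates only the baseline that the abstract argues against, not MPCA itself. The function names `resize_nearest` and `pca_after_resize`, the nearest-neighbor resizing, the square random test images, and the target size of 16 are all assumptions chosen for illustration; none of them come from the paper.

```python
import numpy as np


def resize_nearest(img, size):
    """Nearest-neighbor resize of a 2-D array to (size, size).

    Hypothetical helper: upsampling repeats pixels (artifacts),
    downsampling skips pixels (information loss).
    """
    h, w = img.shape
    rows = np.arange(size) * h // size
    cols = np.arange(size) * w // size
    return img[np.ix_(rows, cols)]


def pca_after_resize(images, size, k):
    """Baseline: coerce every image to size x size, then run standard PCA."""
    # Each resized image becomes one fixed-length row vector.
    X = np.stack([resize_nearest(im, size).ravel() for im in images])
    mean = X.mean(axis=0)
    # Rows of Vt are the principal directions of the centered data.
    _, s, Vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, Vt[:k], s[:k]


# Simulated mixed-size dataset (assumed sizes, random pixel values).
rng = np.random.default_rng(0)
images = [rng.random((n, n)) for n in (8, 12, 16, 24, 32)]
mean, components, singular_values = pca_after_resize(images, size=16, k=3)
```

In this sketch the 8x8 and 12x12 images are upsampled by repeating pixels and the 24x24 and 32x32 images are downsampled by discarding pixels, which is the trade-off the abstract identifies as the motivation for avoiding the resizing step.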




Shi, F., Zhai, M., Duncan, D., & Jacobs, N. (2014). MPCA: EM-Based PCA For Mixed-Size Image Datasets. In IEEE International Conference on Image Processing (ICIP) (pp. 1807–1811).