Publications - Lampert group (outdated)

Probabilistic Image Colorization

British Machine Vision Conference (BMVC 2017)

  • Royer
  • Kolesnikov
  • Lampert

We develop a probabilistic technique for colorizing grayscale natural images. In light of the intrinsic uncertainty of this task, the proposed probabilistic framework has numerous desirable properties. In particular, our model is able to produce multiple plausible and vivid colorizations for a given grayscale image and is one of the first colorization models to provide a proper stochastic sampling scheme. Moreover, our training procedure is supported by a rigorous theoretical framework that does not require any ad hoc heuristics and allows for efficient modeling and learning of the joint pixel color distribution. We demonstrate strong quantitative and qualitative experimental results on the CIFAR-10 dataset and the challenging ILSVRC 2012 dataset.
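
Since the model defines a per-pixel conditional color distribution, drawing diverse colorizations amounts to ancestral sampling in raster-scan order. The following minimal sketch illustrates this idea; predict_color_dist is a hypothetical stand-in for the trained network, not the authors' code.

import numpy as np

def sample_colorization(gray, predict_color_dist, num_colors=256, rng=None):
    # Ancestral sampling: each pixel's color is drawn from the model's
    # conditional distribution given the grayscale input and all
    # previously sampled pixels (raster-scan order).
    if rng is None:
        rng = np.random.default_rng()
    h, w = gray.shape
    colors = np.zeros((h, w), dtype=np.int64)  # quantized color indices
    for i in range(h):
        for j in range(w):
            p = predict_color_dist(gray, colors, i, j)  # categorical over color bins
            colors[i, j] = rng.choice(num_colors, p=p)
    return colors

# Multiple plausible colorizations of the same input:
# samples = [sample_colorization(gray, model_dist) for _ in range(5)]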

@inproceedings{royer2017probabilistic,
  title={Probabilistic Image Colorization},
  author={Royer, Amelie and Kolesnikov, Alexander and Lampert, Christoph H.},
  booktitle={British Machine Vision Conference (BMVC)},
  year={2017}
}
PixelCNN Models with Auxiliary Variables for Natural Image Modeling

International Conference on Machine Learning (ICML 2017)

  • Kolesnikov
  • Lampert

We study probabilistic models of natural images and extend the autoregressive family of PixelCNN models by incorporating auxiliary variables. Subsequently, we describe two new generative image models that exploit different image transformations as auxiliary variables: a quantized grayscale view of the image or a multi-resolution image pyramid. The proposed models tackle two known shortcomings of existing PixelCNN models: 1) their tendency to focus on low-level image details, while largely ignoring high-level image information, such as object shapes, and 2) their computationally costly procedure for image sampling. We experimentally demonstrate the benefits of our models, in particular showing that they produce much more realistic-looking image samples than previous state-of-the-art probabilistic models.
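
The auxiliary variable makes sampling a two-stage procedure: first obtain the auxiliary view, then sample the image conditioned on it. A minimal sketch of the pyramid variant, assuming a hypothetical sample_pixelcnn helper that performs raster-scan sampling at one resolution:

def sample_with_pyramid(models, rng, base_shape=(4, 4)):
    # Coarse-to-fine ancestral sampling through an image pyramid.
    # models[k] stands for a PixelCNN at resolution level k,
    # conditioned on the previous (coarser) level.
    x = sample_pixelcnn(models[0], cond=None, shape=base_shape, rng=rng)
    for k in range(1, len(models)):
        shape = (base_shape[0] * 2**k, base_shape[1] * 2**k)
        # Each level only models the detail missing from the coarser image,
        # which also shortens the costly sequential sampling chain per level.
        x = sample_pixelcnn(models[k], cond=x, shape=shape, rng=rng)
    return x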

@inproceedings{kolesnikov2017pixelcnn,
  title={{PixelCNN} Models with Auxiliary Variables for Natural Image Modeling},
  author={Alexander Kolesnikov and Christoph H. Lampert},
  booktitle={International Conference on Machine Learning (ICML)},
  year={2017}
}
iCaRL: Incremental Classifier and Representation Learning

IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2017)

  • Rebuffi
  • Kolesnikov
  • Sperl
  • Lampert

A major open problem on the road to artificial intelligence is the development of incrementally learning systems that learn about more and more concepts over time from a stream of data. In this work, we introduce a new training strategy, iCaRL, that allows learning in such a class-incremental way: only the training data for a small number of classes has to be present at the same time and new classes can be added progressively. iCaRL learns strong classifiers and a data representation simultaneously. This distinguishes it from earlier works that were fundamentally limited to fixed data representations and therefore incompatible with deep learning architectures. We show by experiments on CIFAR-100 and ImageNet ILSVRC 2012 data that iCaRL can learn many classes incrementally over a long period of time where other strategies quickly fail.
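
At test time, iCaRL classifies by the nearest-mean-of-exemplars rule: each class is represented by the mean feature vector of its stored exemplars, and an image receives the label of the closest mean. A minimal sketch of this rule (feature_fn stands for the learned representation; not the authors' code):

import numpy as np

def icarl_classify(x, exemplar_sets, feature_fn):
    # exemplar_sets: dict mapping class label -> list of stored exemplars
    phi = feature_fn(x)
    phi /= np.linalg.norm(phi)
    best_label, best_dist = None, np.inf
    for label, exemplars in exemplar_sets.items():
        feats = np.stack([feature_fn(e) for e in exemplars])
        feats /= np.linalg.norm(feats, axis=1, keepdims=True)
        mu = feats.mean(axis=0)
        mu /= np.linalg.norm(mu)  # class mean of normalized exemplar features
        d = np.linalg.norm(phi - mu)
        if d < best_dist:
            best_label, best_dist = label, d
    return best_label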

@inproceedings{rebuffi2017icarl,
  title={{iCaRL}: Incremental Classifier and Representation Learning},
  author={Rebuffi, Sylvestre-Alvise and Kolesnikov, Alexander and Sperl, Georg and Lampert, Christoph H.},
  booktitle={IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
  year={2017}
}
Seed, Expand and Constrain: Three Principles for Weakly-supervised Image Segmentation

European Conference on Computer Vision (ECCV 2016)

  • Kolesnikov
  • Lampert

We introduce a new loss function for the weakly-supervised training of semantic image segmentation models based on three guiding principles: to seed with weak location cues, to expand objects based on the information about which classes can occur, and to constrain the segmentations to coincide with image boundaries. We show experimentally that training a deep convolutional neural network using the proposed loss function leads to substantially better segmentations than previous state-of-the-art methods on the challenging PASCAL VOC 2012 dataset. We furthermore give insight into the working mechanism of our method by a detailed experimental study that illustrates how the segmentation quality is affected by each term of the proposed loss function as well as their combinations.
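
The loss is a sum of three terms, one per principle. The sketch below is a simplified rendering (in particular, it replaces the paper's global weighted rank pooling in the expansion term with plain global max pooling); all names are illustrative.

import numpy as np

def sec_loss(probs, seeds, image_labels, crf_probs, eps=1e-8):
    # probs:        (C, H, W) per-pixel class probabilities from the network
    # seeds:        dict class -> boolean (H, W) mask of weak location cues
    # image_labels: set of classes present in the image
    # crf_probs:    (C, H, W) CRF-smoothed, boundary-aware probabilities
    l_seed = -np.mean([np.log(probs[c][m] + eps).mean()
                       for c, m in seeds.items() if m.any()])
    # expand: classes present in the image should fire somewhere
    scores = probs.reshape(probs.shape[0], -1).max(axis=1)
    l_expand = -np.mean([np.log(scores[c] + eps) for c in image_labels])
    # constrain-to-boundary: KL divergence to the CRF output, per pixel
    l_constrain = np.mean(np.sum(crf_probs *
                                 np.log((crf_probs + eps) / (probs + eps)),
                                 axis=0))
    return l_seed + l_expand + l_constrain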

@inproceedings{kolesnikov2016seed,
  title={Seed, Expand and Constrain: Three Principles for Weakly-supervised Image Segmentation},
  author={Kolesnikov, Alexander and Lampert, Christoph H.},
  booktitle={European Conference on Computer Vision (ECCV)},
  year={2016}
}
Improving Weakly-Supervised Object Localization By Micro-Annotation

British Machine Vision Conference (BMVC 2016)

  • Kolesnikov
  • Lampert

Weakly-supervised object localization methods tend to fail for object classes that consistently co-occur with the same background elements, e.g. trains on tracks. We propose a method to overcome these failures by adding a very small amount of model-specific additional annotation. The main idea is to cluster a deep network's mid-level representations and assign object or distractor labels to each cluster. Experiments show substantially improved localization results on the challenging ILSVRC 2014 dataset for bounding box detection and the PASCAL VOC 2012 dataset for semantic segmentation.
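
The mechanism can be sketched in a few lines: cluster mid-level activations, obtain one object-or-distractor bit per cluster from the micro-annotation, and suppress distractor regions in the localization heatmap. The per-image setup and names below are illustrative only; in the paper the clusters are formed per class over many images.

import numpy as np
from sklearn.cluster import KMeans

def filter_heatmap(features, heatmap, cluster_is_object, k=8):
    # features: (H*W, D) mid-level activations, one row per spatial position
    # heatmap:  (H*W,) weakly-supervised localization scores
    # cluster_is_object: k booleans collected via micro-annotation
    assign = KMeans(n_clusters=k, n_init=10).fit_predict(features)
    keep = np.array([cluster_is_object[a] for a in assign])
    return np.where(keep, heatmap, 0.0)  # zero out distractor regions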

@inproceedings{kolesnikov2016improving,
  title={Improving Weakly-Supervised Object Localization By Micro-Annotation},
  author={Kolesnikov, Alexander and Lampert, Christoph H.},
  booktitle={British Machine Vision Conference (BMVC)},
  year={2016}
}
Predicting the Future Behavior of a Time-Varying Probability Distribution

IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2015)

  • Lampert

We study the problem of predicting the future, though only in the probabilistic sense of estimating a future state of a time-varying probability distribution. This is not only an interesting academic problem, but solving this extrapolation problem also has many practical applications, e.g. for training classifiers that have to operate under time-varying conditions. Our main contribution is a method for predicting the next step of the time-varying distribution from a given sequence of sample sets from earlier time steps. For this we rely on two recent machine learning techniques: embedding probability distributions into a reproducing kernel Hilbert space, and learning operators by vector-valued regression. We illustrate the working principles and the practical usefulness of our method by experiments on synthetic and real data. We also highlight an exemplary application: training a classifier in a domain adaptation setting without having access to examples from the test time distribution at training time.
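
In a finite-dimensional approximation the method becomes very concrete: embed each sample set as an (approximate) kernel mean, fit a linear operator A by ridge regression so that A mu_t ≈ mu_{t+1}, and apply A to the latest embedding. The sketch below uses random Fourier features in place of the exact RKHS embedding; all names are illustrative, not the paper's code.

import numpy as np

def rff(X, W, b):
    # random Fourier features approximating an RBF kernel embedding
    return np.sqrt(2.0 / W.shape[1]) * np.cos(X @ W + b)

def extrapolate_distribution(sample_sets, dim=500, bandwidth=1.0, lam=1e-3):
    # sample_sets: list of (n_t, d) arrays, one sample set per time step
    rng = np.random.default_rng(0)
    d = sample_sets[0].shape[1]
    W = rng.normal(scale=1.0 / bandwidth, size=(d, dim))
    b = rng.uniform(0.0, 2.0 * np.pi, size=dim)
    mus = np.stack([rff(S, W, b).mean(axis=0) for S in sample_sets])
    X, Y = mus[:-1], mus[1:]                     # consecutive embedding pairs
    A = np.linalg.solve(X.T @ X + lam * np.eye(dim), X.T @ Y)  # ridge regression
    return mus[-1] @ A  # embedding of the predicted next-step distribution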

Curriculum Learning of Multiple Tasks

IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2015)

  • Pentina
  • Sharmanska
  • Lampert

Sharing information between multiple tasks enables algorithms to achieve good generalization performance even from small amounts of training data. However, in a realistic scenario of multi-task learning not all tasks are equally related to each other, hence it could be advantageous to transfer information only between the most related tasks. In this work we propose an approach that processes multiple tasks in a sequence with sharing between subsequent tasks instead of solving all tasks jointly. Subsequently, we address the question of curriculum learning of tasks, i.e. finding the best order of tasks to be learned. Our approach is based on a generalization bound criterion for choosing the task order that optimizes the average expected classification performance over all tasks. Our experimental results show that learning multiple related tasks sequentially can be more effective than learning them jointly, that the order in which tasks are solved affects the overall performance, and that our model is able to automatically discover a favourable order of tasks.
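
The sequential transfer step itself is simple: each task is solved with a bias towards its predecessor's solution. A minimal ridge-regression sketch of this step (the paper works with adaptive SVMs and selects the order via its generalization bound; here the order is simply taken as given):

import numpy as np

def solve_with_transfer(X, y, w_prev, lam=1.0):
    # min_w ||X w - y||^2 + lam * ||w - w_prev||^2, solved in closed form
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d),
                           X.T @ y + lam * w_prev)

def learn_in_sequence(tasks, d, lam=1.0):
    # tasks: ordered list of (X, y) pairs; each task transfers from
    # the previous one instead of all tasks being solved jointly
    w, weights = np.zeros(d), []
    for X, y in tasks:
        w = solve_with_transfer(X, y, w, lam)
        weights.append(w)
    return weights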

Classifier Adaptation at Prediction Time

IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2015)

  • Royer
  • Lampert

Classifiers for object categorization are usually evaluated by their accuracy on a set of i.i.d. test examples. This provides us with an estimate of the expected error when applying the classifiers to a single new image. In real applications, however, classifiers are rarely only used for a single image and then discarded. Instead, they are applied sequentially to many images, and these are typically not i.i.d. samples from a fixed data distribution, but they carry dependencies and their class distribution varies over time. In this work, we argue that the phenomenon of correlated data at prediction time is not a nuisance, but a blessing in disguise. We describe a probabilistic method for adapting classifiers at prediction time without having to retrain them. We also introduce a framework for creating realistically distributed image sequences, which offers a way to benchmark classifier adaptation methods, such as the one we propose. Experiments on the ILSVRC2010 and ILSVRC2012 datasets show that adapting object classification systems at prediction time can significantly reduce their error rate, even without additional human feedback.
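
A stripped-down version of the idea: reweight each classifier posterior by a running estimate of the current class distribution, so that the temporal coherence of the sequence sharpens the predictions. The paper uses a full probabilistic sequence model; this sketch keeps only the prior-reweighting intuition, with illustrative names.

import numpy as np

def adapt_over_sequence(posteriors, base_prior, rate=0.05):
    # posteriors: (T, C) classifier outputs p(y|x_t) on an image sequence
    # base_prior: (C,) class distribution the classifier was trained on
    prior, preds = base_prior.copy(), []
    for p in posteriors:
        q = p * prior / base_prior           # reweight by the adapted prior
        q /= q.sum()
        preds.append(int(np.argmax(q)))
        prior = (1.0 - rate) * prior + rate * q  # track the drifting class mix
        prior /= prior.sum()
    return preds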

Closed-Form Approximate CRF Training for Scalable Image Segmentation

European Conference on Computer Vision (ECCV 2014)

  • Kolesnikov
  • Guillaumin
  • Ferrari
  • Lampert

We present LS-CRF, a new method for training cyclic Conditional Random Fields (CRFs) from large datasets that is inspired by classical closed-form expressions for the maximum likelihood parameters of a generative graphical model with tree topology. Training a CRF with LS-CRF requires only solving a set of independent regression problems, each of which can be solved efficiently in closed form or by an iterative solver. This makes LS-CRF orders of magnitude faster than classical CRF training based on probabilistic inference, and at the same time more flexible and easier to implement than other approximate techniques, such as pseudolikelihood or piecewise training. We apply LS-CRF to the task of semantic image segmentation, showing that it achieves accuracy on par with other training techniques at higher speed, thereby allowing efficient CRF training from very large training sets. For example, training a linearly parameterized pairwise CRF on 150,000 images requires less than one hour on a modern workstation.
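
The flavor of those independent regression problems can be seen from the unary case: instead of likelihood maximization with probabilistic inference, fit potentials by regressing node features onto label indicators, which has a closed-form solution. A simplified sketch under these assumptions (not the paper's exact estimator):

import numpy as np

def lscrf_unaries(features, labels, num_classes, lam=1e-3):
    # features: (N, D) per-node features; labels: (N,) ints in [0, num_classes)
    Y = np.eye(num_classes)[labels]                # one-hot regression targets
    d = features.shape[1]
    Wmat = np.linalg.solve(features.T @ features + lam * np.eye(d),
                           features.T @ Y)         # ridge regression, closed form
    return Wmat  # unary score of class c at node i: features[i] @ Wmat[:, c]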

@inproceedings{kolesnikov2014closed,
  title={Closed-Form Approximate {CRF} Training for Scalable Image Segmentation},
  author={Kolesnikov, Alexander and Guillaumin, Matthieu and Ferrari, Vittorio and Lampert, Christoph H.},
  booktitle={European Conference on Computer Vision (ECCV)},
  year={2014}
}
Deep Fisher Kernels - End to End Learning of the Fisher Kernel GMM Parameters

IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2014)

  • Sydorov
  • Sakurada
  • Lampert

Fisher Kernels and Deep Learning were two developments with significant impact on large-scale object categorization in recent years. Both approaches were shown to achieve state-of-the-art results on large-scale object categorization datasets, such as ImageNet. Conceptually, however, they are perceived as very different and it is not uncommon for heated debates to spring up when advocates of both paradigms meet at conferences or workshops. In this work, we emphasize the similarities between both architectures rather than their differences, and we argue that such a unified view allows us to transfer ideas from one domain to the other. As a concrete example we introduce a method for learning a support vector machine classifier with a Fisher kernel at the same time as a task-specific data representation. We reinterpret the setting as a multi-layer feed-forward network. Its final layer is the classifier, parameterized by a weight vector, and the two previous layers compute Fisher vectors, parameterized by the coefficients of a Gaussian mixture model. We introduce a gradient-descent-based learning algorithm that, in contrast to other feature learning techniques, is not just derived from intuition or biological analogy, but has a theoretical justification in the framework of statistical learning theory. Our experiments show that the new training procedure leads to significant improvements in classification accuracy while preserving the modularity and geometric interpretability of a support vector machine setup.
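
The Fisher-vector layers are ordinary differentiable functions of the GMM parameters, which is what makes joint gradient-descent training possible. A simplified forward pass (mean and variance parts of the improved Fisher vector, diagonal covariances; illustrative, not the authors' code):

import numpy as np

def fisher_vector(X, weights, mu, sigma):
    # X: (N, D) local descriptors; weights: (K,) mixture weights;
    # mu, sigma: (K, D) GMM means and standard deviations.
    logp = (-0.5 * (((X[:, None, :] - mu) / sigma) ** 2
                    + 2.0 * np.log(sigma) + np.log(2.0 * np.pi)).sum(axis=2)
            + np.log(weights))
    gamma = np.exp(logp - logp.max(axis=1, keepdims=True))
    gamma /= gamma.sum(axis=1, keepdims=True)      # soft assignments (N, K)
    N = X.shape[0]
    diff = (X[:, None, :] - mu) / sigma
    fv_mu = (gamma[:, :, None] * diff).sum(axis=0) / (N * np.sqrt(weights)[:, None])
    fv_sig = (gamma[:, :, None] * (diff ** 2 - 1.0)).sum(axis=0) \
             / (N * np.sqrt(2.0 * weights)[:, None])
    # In the paper's setup, a linear SVM sits on top of this output, and
    # gradients w.r.t. weights, mu, sigma update the GMM jointly with the SVM.
    return np.concatenate([fv_mu.ravel(), fv_sig.ravel()])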

@inproceedings{sydorov-cvpr2014,
  title={Deep Fisher Kernels: Jointly Learning a Fisher Kernel {SVM} and its {GMM} Parameters},
  author={Sydorov, Vladyslav and Sakurada, Mayu and Lampert, Christoph H.},
  booktitle={IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
  year={2014}
}