Mission

Researchers often face scenarios in which the training data (source domain) used to learn a model has different properties from the data to which the model is applied (target domain). This arises naturally in many applications in computer vision, speech, and language processing: recognizing objects under different illumination conditions and poses, training speech recognizers in a noise-free environment but deploying them to environments with unpredictable noise characteristics, and so on. A common theme in these scenarios is that while labels for the source domains are often readily available, collecting labels for the target domains is too expensive or time-consuming. Examples particular to visual data include: 1) in recognizing objects in images taken by a smartphone, available annotations exist only in other sources (amazon.com, PASCAL VOC, etc.); 2) in detecting and segmenting an organ of interest in MRI images, available algorithms are instead optimized for CT and X-ray images; 3) millions of Flickr photos or YouTube videos can be readily obtained using keywords, while a user interested in organizing her own multimedia collection may be reluctant to annotate it; and many others.

This challenge is commonly referred to as "covariate shift" or "data selection bias". Regardless of the cause, any distributional change that occurs after learning can degrade performance at test time, and domain adaptation (or domain-transfer learning) aims at lessening this degradation. A variety of approaches have emerged in machine learning to address the dataset bias challenge; at their core is the attempt to transfer models or classifiers learned on an existing domain to new domains with as little additional effort as possible, exploiting the limited supervision available in the new domains. The era of "big data" brings collections at the scale of millions or billions of samples, in which statistical variation and the lack of annotation become even more pronounced; consequently, the demand for adaptation between domains becomes even more imperative.
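The degradation caused by such a distributional change can be illustrated with a toy numerical sketch (purely illustrative; the data, the nearest-class-mean model, and all numbers below are hypothetical and not from any actual benchmark). A classifier fit on one-dimensional source-domain features loses accuracy when the target-domain inputs are globally shifted, as might happen under a change in illumination:

```python
import numpy as np

rng = np.random.default_rng(0)

# Source domain (toy data): class 0 ~ N(-1, 1), class 1 ~ N(+1, 1).
n = 1000
Xs = np.concatenate([rng.normal(-1, 1, n), rng.normal(1, 1, n)])
ys = np.concatenate([np.zeros(n), np.ones(n)]).astype(int)

# Nearest-class-mean classifier learned on the source domain.
m0, m1 = Xs[ys == 0].mean(), Xs[ys == 1].mean()

def predict(X):
    return (np.abs(X - m1) < np.abs(X - m0)).astype(int)

src_acc = (predict(Xs) == ys).mean()  # roughly 0.84 on the source domain

# Target domain: the same samples and labels, but every input is shifted
# by +2 (a crude stand-in for, e.g., a global brightness change). The
# decision boundary learned on the source no longer fits the target.
Xt, yt = Xs + 2, ys
tgt_acc = (predict(Xt) == yt).mean()  # drops to roughly 0.58
```

The point of the sketch is only that a model fixed at training time can fail badly under even a simple shift in input statistics; a domain adaptation method would instead try to compensate for the shift, with little or no target supervision.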

Despite the prevalence of data selection bias in computer vision and the substantial study of domain adaptation in the machine learning and natural language processing communities, the particular challenges of visual domain adaptation were brought to the attention of the computer vision community only very recently, and have since attracted significant interest. Beyond the main challenges of visual data, namely scale and the discrepancies among vision datasets, a major question rarely addressed in traditional domain adaptation research is that of adapting structured (non-vector) data representations. In machine learning and NLP, an input sample is usually represented as a vector in Euclidean space, different samples are treated as independent observations, and the task is typically classification. This is not the case in computer vision, where the representations to be adapted include shapes and contours, deformable and articulated 2-D or 3-D objects, graphs and random fields, intrinsic images, and visual dynamics, none of which is directly supported by "vectorial" domain adaptation techniques. Moreover, beyond classification and recognition, mechanisms for adapting models and algorithms for detection, segmentation, reconstruction, and tracking to emerging new domains do not yet exist. All of these challenges call for a foundation for characterizing visual domain shift and a paradigm of effective and efficient adaptation methods dedicated to visual data.
 
The 1st International Workshop on Visual Domain Adaptation and Dataset Bias (VisDA 2013), held in conjunction with ICCV 2013, solicits contributions on original research across all areas of visual domain adaptation. Given the ubiquity of dataset bias and the dearth of labeled samples in novel domains, the workshop should interest researchers working at every level, from low-level pixel manipulation to high-level semantic understanding. We expect VisDA 2013 to provide an interactive venue dedicated to the exchange of ideas and the sharing of research progress within the computer vision community.