2 Mornings Workshop: Similarity, K-NN, Dimensionality, Multimedia Databases

November 20th-21st 2014, Rennes, France


Thursday, November 20th 2014

  09:00 - 09:30 Registration and Welcome coffee
  09:30 - 10:30 Gwenaël Doërr, Technicolor R&D France

Gwenaël Doërr (M’06–SM’12) received the M.Sc. degree in telecommunications systems from Telecom Sud-Paris, Evry, France, in 2001, and the Ph.D. degree in signal and image processing from the Université de Nice Sophia-Antipolis, Nice, France, in 2005. He was a Lecturer of Digital Rights Management with the Department of Computer Science, University College London, London, U.K., from 2005 to 2009. In Spring 2008, he was a Visiting Scholar with HP Labs, Palo Alto, CA, USA, to work on the interoperability of DRM systems. In 2010, he joined the Security and Content Protection Labs, Technicolor Research and Development France, Cesson-Sévigné, France, as a Senior Research Scientist on content protection. His research interests encompass various aspects of multimedia security technologies. His recent work has focused on signal processing techniques for antipiracy, including transactional watermarking for different types of content, content fingerprinting for resynchronization, and passive forensics analysis to characterize pirate samples and piracy channels. Dr. Doërr is currently the Chair of the IEEE Signal Processing Society Technical Committee on Information Forensics and Security. He is also a Distinguished Member of Technicolor’s Fellow Network. He is an Associate Editor for the IEEE TRANSACTIONS ON INFORMATION FORENSICS AND SECURITY and the EURASIP Journal on Image and Video Processing and an Area Editor for the IEEE SIGNAL PROCESSING MAGAZINE. He co-organized Information Hiding in 2007 in St Malo, France, and the IEEE Workshop on Information Forensics and Security in 2009 in London, U.K.

Checking Up the Health of Multimedia Security

The term "multimedia security" gained new popularity in the mid-90s with the rapid rise of digital watermarking in an attempt to combat piracy of copyrighted content. This milestone marks the shift of content protection from conventional cryptography to signal processing techniques. Today, multimedia security encompasses a much wider range of techniques, such as multimedia encryption, content fingerprinting, anti-camcording, and passive forensic analysis. These various techniques share a common feature, namely operating in the presence of an adversary who aims at interfering with the nominal behavior of the system. In this talk, we will survey 20 years of research in multimedia security and highlight enduring technical challenges that remain to be solved.

  10:30 - 11:15 Shin'ichi Satoh, NII, Tokyo

Shin'ichi Satoh is a professor at the National Institute of Informatics (NII), Tokyo. He received his PhD degree in 1992 from the University of Tokyo. His research interests include image processing, video content analysis and multimedia databases. Currently he is leading the video processing project at NII, addressing video analysis, indexing, retrieval, and mining for broadcast video archives.

Observing Society via Television: Challenges towards Social Analysis by Using Large-Scale Broadcast Video Archive

Many interesting aspects of society can be observed simply by watching television, e.g., what is going on in Japan and the world, what the current trends are, and how economic activity is developing. This talk will introduce a couple of attempts to analyse such information automatically by computer. In particular, using the NII TV-RECS video archive, which contains 300,000 hours of broadcast video, we developed and deployed a couple of key technologies, including face detection and matching, fast commercial film mining, and visual object retrieval, towards social analysis tools.

  11:15 - 11:30 Coffee Break
  11:30 - 13:00 Miloš Radovanović, University of Novi Sad, Serbia

Miloš Radovanović is Assistant Professor at the Department of Mathematics and Informatics, Faculty of Sciences, University of Novi Sad, Serbia, where he received his BSc, MSc and PhD degrees. He has been a member of several international projects supported by DAAD, TEMPUS, and bilateral programs. Since 2009 he has been Managing Editor of the Computer Science and Information Systems journal. He (co)authored one programming textbook, a research monograph, and over 40 papers in data mining, machine learning, and related fields. The current focus of his research is phenomena pertaining to high-dimensional data, and their effects on various data mining and machine learning algorithms and applications.

Hubs in Nearest-Neighbor Graphs: Origins, Applications and Challenges (Tutorial)

The tendency of k-nearest neighbor graphs constructed from tabular data using some distance measure to contain hubs, i.e. points with in-degree much higher than expected, has drawn a fair amount of attention in recent years due to the observed impact on techniques used in many application domains. This talk will be organized into three parts: (1) Origins, which will discuss the causes of the emergence of hubs (and their low in-degree counterparts, the anti-hubs), and their relationships with dimensionality, neighbourhood size, distance concentration, and the notion of centrality; (2) Applications, where we will present some notable effects of (anti-)hubs on techniques for machine learning, data mining and information retrieval, identify two different approaches to handling hubs adopted by researchers – through fighting or embracing their existence – and review techniques and applications belonging to the two groups; and (3) Challenges, which will discuss work in progress, open problems, and areas with significant opportunities for hub-related research.
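As a rough illustration of the in-degree notion used in the abstract above (not material from the tutorial itself), the following minimal Python sketch counts how often each point appears among the k nearest neighbours of other points. The function names, the Gaussian test data, the Euclidean distance and the chosen dimensionalities are all assumptions made for this example.

```python
import random


def knn_in_degrees(points, k):
    """For each point, count how many other points list it among their k nearest neighbours."""
    n = len(points)
    counts = [0] * n
    for i, p in enumerate(points):
        # Squared Euclidean distance from point i to every other point (brute force).
        dists = [
            (sum((a - b) ** 2 for a, b in zip(p, q)), j)
            for j, q in enumerate(points)
            if j != i
        ]
        dists.sort()
        # Each of the k closest points gains one incoming edge.
        for _, j in dists[:k]:
            counts[j] += 1
    return counts


def random_points(n, dim, seed=0):
    """Hypothetical test data: n points drawn from a standard Gaussian in `dim` dimensions."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n)]


if __name__ == "__main__":
    k = 5
    for dim in (3, 50):
        deg = knn_in_degrees(random_points(200, dim), k)
        print(f"dim={dim}: mean in-degree={sum(deg) / len(deg):.1f}, max in-degree={max(deg)}")
```

Every point has exactly k outgoing edges, so the mean in-degree always equals k; what varies is how unevenly the incoming edges are spread. Comparing the maximum in-degree across dimensionalities is one simple way to look for the hub effect described in the abstract.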

  13:00 - 14:00 Lunch
  14:00 HDR defense of Laurent Amsaleg: "A Database Perspective on Large Scale High-Dimensional Indexing"

Friday, November 21st 2014

  09:00 - 09:30 Registration and Welcome coffee
  09:30 - 10:15 Michel Crucianu, CNAM, Paris

Michel Crucianu has been Professor of Computer Science at Conservatoire National des Arts et Métiers (Cnam, Paris) since 2005. He holds a Ph.D. in Computer Science from the University of Paris-Sud. His research concerns content-based mining and retrieval from very large multimedia databases. He has coordinated or participated in many national and international research projects in the domain of multimedia retrieval and understanding. He was the Director of the CEDRIC laboratory of Cnam from 2010 to 2014.

Multimedia information retrieval: beyond ranking

Result ranking by assumed relevance was a simple yet powerful idea whose implementation was continuously improved over several decades. The success of search engines returning lists of ranked results shows that this approach does satisfy the users to some extent. However, user needs have a broader spectrum, results can be relevant in different ways to the same query, and the structure in the set of results may itself be meaningful. Ranking alone cannot convey this complexity and becomes a bottleneck in the access to information. Unfortunately, the use of ranked lists is so entrenched that one mechanically attempts to satisfy one's information needs by crafting sequences of queries directed to ranking-based search engines. This talk will question the nature of user needs when searching for multimedia content and explore some emerging solutions, together with problems to be solved.

  10:15 - 11:00 Marcel Worring, University of Amsterdam

Marcel Worring is associate professor at the Intelligent Systems Lab Amsterdam and Full Professor at the Amsterdam Business School, both at the University of Amsterdam. He is co-initiator and associate director of the recently established Amsterdam Data Science Center bringing together the leading research institutes in Amsterdam (CWI, VU, UvA, HvA) on the whole data science chain from data acquisition, storage and distributed processing, to analysis, retrieval, and visualization. His research interests are in multimedia analytics, leveraging synergy in human-computer processes. He has published over 170 papers in refereed journals and conferences. He is general co-chair of ACM Multimedia 2016 in Amsterdam, was program co-chair for ACM Multimedia 2013 and ICMR 2013, and co-initiator and co-organizer of the VideOlympics 2007-2009. He was associate editor of Pattern Analysis and Applications and IEEE Transactions on Multimedia and currently is associate editor of ACM TOMCCAP.

Multimedia Analytics: synergy between human and machine by visualization

In this talk I present the work of Jan Zahalka, Stevan Rudinac and me on developing multimedia analytics approaches to accessing large image collections. We report on an extensive survey of over eight hundred papers, of which one hundred were identified as being most relevant to the topic. These have been used to develop a novel multimedia analytics model. In the model, the need for semantic navigation of the collection is emphasized and multimedia analytics tasks are placed on the exploration-search axis. The axis is composed of both exploration and search in a certain proportion which changes as the analyst progresses towards insight. Categorization is proposed as a suitable umbrella task realizing the exploration-search axis in the model. Finally, the pragmatic gap, defined as the difference between the tight machine categorization model and the flexible human categorization model, is identified as a crucial multimedia analytics topic. To illustrate the utility of the model we report on a first instantiation in a multi-modal recommender system.

  11:00 - 11:30 Coffee Break
  11:30 - 12:15 Arthur Zimek, LMU, Germany

Dr. Arthur Zimek is a Privatdozent in the database systems and data mining group at the Ludwig-Maximilians-Universität München (LMU), Germany. In 2012-2013 he was a postdoctoral fellow in the Department of Computing Science at the University of Alberta, Edmonton, Canada. He holds degrees in bioinformatics, philosophy, and theology, involving studies at universities in Munich, Mainz (Germany), and Innsbruck (Austria), and finished his Ph.D. thesis in informatics on "Correlation Clustering" at LMU in summer 2008. For this work, Zimek received the "SIGKDD Doctoral Dissertation Award (runner-up)" in 2009. His research interests include clustering and outlier detection, methods as well as evaluation, and high dimensional data. Zimek has published more than 50 papers at peer reviewed conferences and in international journals. Together with his co-authors, he received the "Best Paper Honorable Mention Award" at SDM 2008 and the "Best Demonstration Paper Award" at SSTD 2011. Zimek has been a member of program committees of the leading data mining conferences (e.g. SIGKDD, ECMLPKDD, CIKM, SDM) and serves as reviewer for journals like ACM TKDD, IEEE TKDE, Data Mining and Knowledge Discovery (Springer), and Machine Learning (Springer).

Challenges for Unsupervised Ensemble Learning

We discuss the use of ensemble techniques for unsupervised learning, with a focus on outlier detection. To introduce the field, we will briefly sketch the data mining task of unsupervised outlier detection and discuss some basic considerations about ensemble techniques. Then we give an overview of existing approaches to using ensemble techniques for outlier detection as well as of the challenges in doing so. Some of our recent contributions to this field will be discussed in more detail, highlighting the issue of diversity of models in building effective ensembles. Finally, we will return to the broader perspective and reason about the application of ensemble techniques in the context of unsupervised learning in general.

  12:15 - 13:00 Erich Schubert, LMU, Germany

Erich Schubert is a research and teaching assistant in the database systems and data mining group at the Ludwig-Maximilians-Universität München, Germany. He studied mathematics and computer science and finished his Ph.D. thesis on "Generalized and Efficient Outlier Detection for Spatial, Temporal, and High-Dimensional Data Mining" in 2013. He received the "Best Demonstration Paper Award" at SSTD 2011 for his work on spatial outlier detection together with his co-authors. This tutorial is closely related to his line of research and several of his recent publications.

Normalization of Scores and Distances for Ensemble Methods

In classification, ensembles can often be built on a binary decision. But in outlier detection, we need to work with the outlier detection scores, and perform a combination of the numerical values. For this, we need to make the scores of different detectors comparable, which gets difficult if we intend to combine different algorithms, where scores may have a very different meaning. We present a statistical model for interpreting and normalizing outlier detection scores based on robust estimation of distributions and Bayesian reasoning, and touch on numerical challenges arising in this context. We then outline how these normalizations can be put into the different context of data normalization, how distance functions can be modeled as ensembles of primitive similarity measures, and sketch ideas for future work on improving the analysis of high-dimensional data by careful normalization of dissimilarities and distance ensembles designed for high-dimensional data.
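To make the comparability problem in the abstract above concrete, here is a minimal Python sketch. It does not reproduce the statistical model presented in the talk; it swaps in a much simpler technique, a robust z-score based on the median and the median absolute deviation, followed by averaging. The function names and the toy scores are hypothetical.

```python
from statistics import median


def robust_standardize(scores):
    """Rescale scores to robust z-scores using the median and MAD (a stand-in technique)."""
    med = median(scores)
    mad = median([abs(s - med) for s in scores])
    # 1.4826 makes the MAD consistent with the standard deviation for Gaussian data;
    # fall back to 1.0 when all deviations are zero to avoid division by zero.
    scale = 1.4826 * mad if mad > 0 else 1.0
    return [(s - med) / scale for s in scores]


def combine(score_lists):
    """Average the standardized scores of several detectors, object by object."""
    normalized = [robust_standardize(scores) for scores in score_lists]
    return [sum(column) / len(column) for column in zip(*normalized)]


if __name__ == "__main__":
    # Two hypothetical detectors scoring the same four objects on very different scales.
    detector_a = [0.10, 0.20, 0.15, 0.90]
    detector_b = [10.0, 12.0, 11.0, 95.0]
    combined = combine([detector_a, detector_b])
    print(combined)
```

Without the rescaling step, a plain average would be dominated by whichever detector happens to use the larger numeric range, which is the difficulty the abstract points to.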

  13:00 - 14:00 Lunch


IRISA-INRIA, Campus Universitaire de Beaulieu - Salle Métivier
35042 Rennes Cedex

Additional information and directions to the venue can be found here.


Registration deadline: Monday, November 17th 2014

There is no registration fee for this workshop. However, all participants should register prior to the event by filling in this form.


HDR defense

Laurent Amsaleg will defend his HDR thesis "A Database Perspective on Large Scale High-Dimensional Indexing" on Thursday, November 20th 2014 at 14:00 in salle Métivier at IRISA-INRIA Rennes.

Jury members:
Jean-Marc Jézéquel: Président
Michel Crucianu: Rapporteur
Tamer Özsu: Rapporteur
Shin'ichi Satoh: Rapporteur
Edward Chang: Examinateur
Éric Diehl: Examinateur
Patrick Valduriez: Examinateur
Marcel Worring: Examinateur

Registration is not required for this presentation.



Laurent Amsaleg

Email: Laurent dot Amsaleg at irisa dot fr

Li Weng

Email: Li dot Weng at inria dot fr