Dmitry I. Ignatov, Sergei O. Kuznetsov, Jonas Poelmans (Eds.)

CDUD 2012 – Concept Discovery in Unstructured Data

Workshop co-located with the 10th International Conference on Formal Concept Analysis (ICFCA 2012)
May 2012, Leuven, Belgium

Volume Editors

Dmitry I. Ignatov
School of Applied Mathematics and Information Science
National Research University Higher School of Economics, Moscow, Russia

Sergei O. Kuznetsov
School of Applied Mathematics and Information Science
National Research University Higher School of Economics, Moscow, Russia

Jonas Poelmans
Faculty of Business and Economics
Katholieke Universiteit Leuven, Belgium

Printed in Belgium by the Katholieke Universiteit Leuven with ISBN 978-9-08-140991-9.

The proceedings are also published online on the CEUR-Workshop website in volume Vol-871 of a series with ISSN 1613-0073.

Copyright © 2012 for the individual papers by the papers' authors, for the Volume by the editors. All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means without the prior permission of the copyright owners.

Preface

Concept discovery is a subarea of Knowledge Discovery in Databases (KDD) in which concept models, such as Formal Concept Analysis (FCA), multimodal clustering, conceptual graphs and others, are used to gain insight into the underlying conceptual structure of data. Traditional machine learning techniques mainly focus on structured data given by object-attribute tables, whereas most data available nowadays are given in unstructured, often textual, form. Compared to traditional data mining techniques, human-centered instruments of concept discovery actively engage domain experts in the discovery process.

This volume contains the papers presented at the 2nd International Workshop on Concept Discovery in Unstructured Data (CDUD 2012) held on May 10, 2012 at the Katholieke Universiteit Leuven, Belgium.
This workshop welcomes papers describing innovative research on data discovery in complex data. Moreover, this workshop provides a forum for researchers and developers of data mining instruments working on issues associated with analyzing unstructured data. This year the committee decided to accept 11 papers for publication in the proceedings. Each submission was reviewed by three program committee members on average.

A. Mestrovic presents an application of concept lattices to semantic matching in the Croatian language. A. Chepovskiy et al. propose a method for automatic language identification for transliterated texts. X. Naidenova describes a novel neural network based data structure for inferring classification tests. A. Kravchenko et al. introduce an approach for expert search which is based on analyzing e-mail communication patterns. D. Ustalov et al. propose an ontology-based approach for text-to-picture synthesis. A. Skabin presents a computerized recognition system for hand-written historical manuscripts. A. Panchenko et al. extract semantic relations between concepts from Wikipedia using KNN algorithms. D. Fedyanin uses parameter identification methods for Markov models and applies them to influence analysis in social networks. S. Milyaev et al. discuss a new method for self-tuning semantic image segmentation. A. Vorobev proposes a probabilistic model for evaluating the quality level of projects, authors and experts in collaborative innovation platforms. D. Gnatyshak et al. present a novel pseudo-triclustering algorithm and apply it to online social network data. A. Bozhenyuk et al. discuss methods for finding the maximum flow and minimum cost flow in a fuzzy setting.

We would like to express our gratitude to all contributing authors and reviewers. We also want to thank our sponsors: Amsterdam-Amstelland police, IBM Belgium, Research Foundation Flanders, Vlerick Management School, OpenConnect Systems and Higher School of Economics (Moscow, Russia).
Finally, we thank the authors of the EasyChair system, which helped us to manage the reviewing process.

May 10, 2012, Leuven
Dmitry I. Ignatov
Sergei O. Kuznetsov
Jonas Poelmans

Organization

The 2nd International Workshop on Concept Discovery in Unstructured Data (CDUD 2012) was held on May 10, 2012 at the Katholieke Universiteit Leuven, Belgium. The workshop was co-located with the 10th International Conference on Formal Concept Analysis (ICFCA 2012). The inaugural edition of CDUD was held on June 25, 2011 at the Higher School of Economics in Moscow, Russia.

Program Chairs

Dmitry I. Ignatov, National Research University Higher School of Economics, Russia
Sergei O. Kuznetsov, National Research University Higher School of Economics, Russia
Jonas Poelmans, Katholieke Universiteit Leuven, Belgium

Program Committee

Simon Andrews, Sheffield Hallam University, United Kingdom
Guido Dedene, Katholieke Universiteit Leuven, Belgium
Florent Domenach, University of Nicosia, Cyprus
Irina Efimenko, National Research University Higher School of Economics, Russia
Paul Elzinga, Amsterdam-Amstelland Police, The Netherlands
Boris Galitsky, University of Girona, Spain
Bernhard Ganter, Technische Universität Dresden, Germany
Yury Katkov, National Research University of Information Technologies, Mechanics and Optics, Russia
Natalia Loukachevitch, Moscow State University, Russia
Dmitry Mouromtsev, National Research University of Information Technologies, Mechanics and Optics, Russia
Xenia Naidenova, Military Medical Academy, Russia
Alexey A. Neznanov, National Research University Higher School of Economics, Russia
Sergei A. Obiedkov, National Research University Higher School of Economics, Russia
Simon Polovina, Sheffield Hallam University, United Kingdom
Uta Priss, Edinburgh Napier University, United Kingdom
Dominik Slezak, University of Warsaw and Infobright, Poland
Rustam Tagiew, Technische Universität Freiberg, Germany
Stijn Viaene, Katholieke Universiteit Leuven, Belgium
Johanna Voelker, University of Mannheim, Germany
Rostislav Yavorsky, Witology, Russia

Additional Reviewers

Ekaterina Cherniak, National Research University Higher School of Economics, Russia
Alexandr Vorobev, Moscow State University and Witology, Russia

Sponsoring Institutions

Amsterdam-Amstelland police, The Netherlands
IBM, Belgium
OpenConnect Systems, USA
Research Foundation Flanders, Belgium
Vlerick Management School, Belgium
National Research University Higher School of Economics, Russia

Table of Contents

The Methods of Maximum Flow and Minimum Cost Flow Finding in Fuzzy Network ... 1
    Alexandr Bozhenyuk, Evgeniya Gerasimenko and Igor Rozenberg

Language Identification for Texts Written in Transliteration ... 13
    Andrey Chepovskiy, Sergey Gusev and Margarita Kurbatova

On Parameter Identification Methods for Markov Models Applied to Social Networks ... 21
    Denis Fedyanin

Analysing Online Social Network Data with Biclustering and Triclustering ... 30
    Dmitry Gnatyshak, Dmitry Ignatov, Alexander Semenov and Jonas Poelmans

Term Weighting in Expert Search Task: Analyzing Communication Patterns ... 40
    Anna Kravchenko and Dmitry Romanov

Semantic Matching Using Concept Lattice ... 49
    Ana Mestrovic

Self-Tuning Semantic Image Segmentation ... 59
    Sergey Milyaev and Olga Barinova

A Neural Network-Like Combinatorial Data Structure for Inferring Classification Tests ... 67
    Xenia Naidenova

Extraction of Semantic Relations between Concepts with KNN Algorithms on Wikipedia ... 78
    Alexander Panchenko, Sergey Adeykin, Alexey Romanov and Pavel Romanov

Computerized Recognition System for Historical Manuscripts ... 87
    Artem Skabin

An Ontology-Based Approach to Text-to-Picture Synthesis Systems ... 94
    Dmitry Ustalov and Aleksander Kudryavtsev

Evaluating the Quality Level of Projects, Authors and Experts ... 102
    Alexandr Vorobev

Author Index ... 107