Human-machine Collaboration for Enriching Semantic Wikis using Formal Concept Analysis

Alexandre Blansché, Hala Skaf-Molli, Pascal Molli, and Amedeo Napoli
LORIA, Nancy, France
{firstname.lastname}@loria.fr

Abstract. Semantic wikis are a new generation of collaborative tools. They allow users to embed semantic annotations in the wiki content. These annotations help to better organize and structure the wiki content. Users can thus build knowledge understandable by both humans and computers. In this way, machines can produce or update semantic wiki pages as humans do. In this paper, we propose a new smart agent based on Formal Concept Analysis. This smart agent can automatically compute category trees based on defined semantic properties. In order to reduce human-machine collaboration problems, humans only validate the changes proposed by the smart agent. A distributed version of the wiki is used to ensure consistency of the content during the validation process.

Keywords. Formal Concept Analysis, Semantic Wiki, Human-Machine Collaboration

1 Introduction

Semantic wikis are a new generation of collaborative tools [1,2,3,4]. They allow users to embed semantic annotations in the wiki content. These annotations help to better organize and structure the wiki content. Semantic wikis allow mass collaboration for the creation and emergence of ontological resources. They guide users from the informal knowledge contained in documents to more formal structures.

Semantic wikis allow users to build knowledge understandable by both humans and computers. In this way, they also allow machines to produce or update semantic wiki pages as humans do. This opens the opportunity to consider machines as new members of communities that produce and maintain knowledge. Consequently, such smart agents can significantly reduce the overhead of communities in the process of continuous knowledge building, and can correct human errors.

In [5], the authors coupled a case-based reasoner with a semantic wiki.
The case-based reasoner can enrich the wiki with new semantic pages and can thus be considered a smart agent. As pointed out in [5], human-machine collaboration can lead to an unstable system if it is not managed. For example, if humans change the category tree used by the case-based reasoner, the case-based reasoner can produce results that are incorrect from the point of view of human users.

In this paper, we propose a new smart agent based on Formal Concept Analysis (FCA) [6]. This smart agent can automatically compute category trees based on defined semantic properties. In this way, the FCA smart agent relieves humans of these tasks. In order to reduce human-machine collaboration problems, humans only validate the changes proposed by the FCA smart agent. This is achieved using the DSMW [7] Semantic MediaWiki extension.

The paper is organized as follows. Section 2 introduces the FCA framework. Section 3 shows how the FCA smart agent is used to enrich the wiki. Section 4 details the validation process. The last section concludes and points to future work.

2 Formal Concept Analysis

In this paper, we present a smart agent that enriches a wiki based on a classification method. Actually, any classification method might be used. We chose Formal Concept Analysis (FCA) because it extracts concepts organized into a lattice, which is useful for navigating the wiki. In this section, we briefly introduce FCA.

Formal Concept Analysis [6] is a classification method that builds a concept lattice in which each concept is composed of an intent, a maximal set of attributes, and an extent, a maximal set of objects sharing these attributes. A context K relies on a set of objects G, a set of attributes M and a relation between objects and attributes I ⊆ G × M. Considering an object g ∈ G and an attribute m ∈ M, (g, m) ∈ I means that g has the attribute m.

A context can be visualized as a binary table. Table 1 shows a (simple) example of a context about animals. There are five attributes that describe animals.
Animals may have hair, feathers or wings. They may breathe in air or in water. The objects are animals: bat, bird, cat and fish. In the table, a cross in a cell indicates that the animal has the corresponding attribute.

Table 1. Example of context (animals)

      Breathes in water  Breathes in air  Has feathers  Has wings  Has hair
Bat   -                  ×                -             ×          ×
Bird  -                  ×                ×             ×          -
Cat   -                  ×                -             -          ×
Fish  ×                  -                -             -          -

FCA builds concepts organized into a lattice. A concept C1 = (A1, B1) is defined by an extent A1 (a set of objects) and an intent B1 (a set of attributes that define the concept). If C2 = (A2, B2) is a subconcept of C1 (denoted by C2 ⊑ C1), then A2 ⊆ A1 and B1 ⊆ B2. The top concept ⊤ contains all the objects and its intent is usually empty (unless an attribute is shared by all objects). The bottom concept ⊥ is defined by all attributes and usually contains no objects (unless an object has all attributes).

Figure 1 shows the concept lattice of the context of Table 1. In the graph, every node is a concept. A link between two nodes indicates a subsumption relation (one concept is a subconcept of the other). The intent of a concept is written on a gray background, the extent on a white background.

Fig. 1. Galois lattice based on the context from Table 1

3 Wiki Enrichment

3.1 Principles

We developed a method that reorganizes the categories of the wiki according to the result of FCA. A new wiki is created with the same pages and properties, but with different categories, based on the concept lattice.

The new categories are created based on the previous ones and on the semantic links between pages. Useful categories that human users did not create might be discovered. It is even possible to start a wiki without creating any categories but only semantic links between pages, and then let the smart agent build the categories from these links. The new categories facilitate navigation in the wiki and provide an explicit and complete organization of the pages.

A mapping between original categories and lattice concepts is performed.
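Before the mapping is detailed, the FCA construction of Section 2 can be made concrete with a small sketch. The Python code below is not the Java application used in this work; it is a brute-force illustration, assuming the animal context of Table 1, that derives an extent from every attribute subset and then the intent shared by that extent. The names CONTEXT, extent, intent and concepts are hypothetical, and the exponential enumeration is only meant for very small contexts.

```python
from itertools import combinations

# Table 1 as a mapping from each object to the attributes it has.
CONTEXT = {
    "Bat": {"breathes in air", "has wings", "has hair"},
    "Bird": {"breathes in air", "has feathers", "has wings"},
    "Cat": {"breathes in air", "has hair"},
    "Fish": {"breathes in water"},
}


def extent(attributes, context):
    """Objects that have every attribute in `attributes`."""
    return frozenset(g for g, has in context.items() if attributes <= has)


def intent(objects, context):
    """Attributes shared by every object in `objects`."""
    shared = set().union(*context.values())  # start from all attributes M
    for g in objects:
        shared &= context[g]
    return frozenset(shared)


def concepts(context):
    """Enumerate all formal concepts (extent, intent) by brute force."""
    all_attributes = sorted(set().union(*context.values()))
    found = set()
    for size in range(len(all_attributes) + 1):
        for subset in combinations(all_attributes, size):
            a = extent(set(subset), context)
            found.add((a, intent(a, context)))  # closed pair (A, B)
    return found


# Print the concepts, largest extents first.
for a, b in sorted(concepts(CONTEXT), key=lambda c: -len(c[0])):
    print(sorted(a), sorted(b))
```

On this context the sketch finds eight concepts, including ⊤ (all four animals, empty intent), ⊥ (no animal, all five attributes) and the concept whose extent is {Bat, Bird} and whose intent is {breathes in air, has wings}.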
Each category maps to one (and only one) concept: the most general concept containing the category in its intent (the attribute concept). Each concept is matched by zero, one or several categories. If a concept matches a single category, the category is preserved. If a concept matches two or more categories, these categories are identical and should be merged (however, this case is very unlikely). If a concept does not match any category, a new category is created.

Currently, the enrichment is performed by a Java application that accesses the content of the wiki and creates an enriched version of it.

3.2 Case study

The method presented in this paper is illustrated by a wiki concerning academics. Here we present the initial content of the wiki. We have the following (user-defined) categories:

- Category:Professor;
- Category:Topic;
- Category:Course;
- Category:Level, which contains two subcategories: Category:Master 1 Level and Category:Master 2 Level.

We also defined two properties:

- Property:isTaughtBy, whose domain is a course and whose range is a professor;
- Property:isAbout, whose domain is a course and whose range is a topic.

Finally, we added pages in the wiki:

- Prof. Smith and Prof. Jones in the Professor category;
- Artificial Intelligence, Software Engineering and Networks in the Topic category;
- Knowledge Discovery, in the Course and Master 1 Level categories; this page has two semantic links, isAbout:Artificial Intelligence and isTaughtBy:Prof. Smith;
- Semantic Wiki, in the Course and Master 2 Level categories; this page has two semantic links, isAbout:Artificial Intelligence and isTaughtBy:Prof. Smith;
- Semantic Web, in the Course, Master 1 Level and Master 2 Level categories; this page has two semantic links, isAbout:Artificial Intelligence and isTaughtBy:Prof. Smith;
- Design Patterns, in the Course and Master 1 Level categories; this page has two semantic links, isAbout:Software Engineering and isTaughtBy:Prof.
Jones;
- Network Administration, in the Course and Master 1 Level categories; this page has two semantic links, isAbout:Networks and isTaughtBy:Prof. Jones;
- IPv6 Protocol, in the Course and Master 2 Level categories; this page has two semantic links, isAbout:Networks and isTaughtBy:Prof. Jones.

3.3 Formal concept analysis applied to the wiki

FCA can be applied to the content of the wiki. The objects to be classified by the FCA algorithm are the standard pages of the wiki. The description of a page is composed of two parts: the categories it belongs to and the semantic properties it has (in our first prototype, we only considered wiki properties of type Page). Each of these two parts yields a context. We can combine these two contexts by apposition.

Based on the content of the wiki, as described above, we can create the context shown in Table 2. When applied to this context, FCA returns the lattice shown in Figure 2.

Table 2. Context based on the wiki

Columns: Prof = Professor; Topic; Course; Level; M1 = Master 1 Level; M2 = Master 2 Level; AI = isAbout:Artificial Intelligence; SE = isAbout:Software Engineering; Net = isAbout:Networks; Smith = isTaughtBy:Prof. Smith; Jones = isTaughtBy:Prof. Jones.

Page                    Prof  Topic  Course  Level  M1  M2  AI  SE  Net  Smith  Jones
Prof. Smith             ×     -      -       -      -   -   -   -   -    -      -
Prof. Jones             ×     -      -       -      -   -   -   -   -    -      -
Artificial Intelligence -     ×      -       -      -   -   -   -   -    -      -
Networks                -     ×      -       -      -   -   -   -   -    -      -
Software Engineering    -     ×      -       -      -   -   -   -   -    -      -
Knowledge Discovery     -     -      ×       ×      ×   -   ×   -   -    ×      -
Semantic Web            -     -      ×       ×      ×   ×   ×   -   -    ×      -
Semantic Wiki           -     -      ×       ×      -   ×   ×   -   -    ×      -
Design Patterns         -     -      ×       ×      ×   -   -   ×   -    -      ×
IPv6 Protocol           -     -      ×       ×      -   ×   -   -   ×    -      ×
Network Administration  -     -      ×       ×      ×   -   -   -   ×    -      ×

In the case study, as one can see in Figure 2, four concepts each match one category: Professor, Topic, Master 1 Level, and Master 2 Level. One concept matches two categories: Course and Level. None of the other concepts matches any category.

How the new categories are created depends on the number of categories matched by each concept; a different method is used in each case. However, no categories are created for the two concepts ⊤ and ⊥, as ⊤ always contains all pages and ⊥ does not contain any page.

Fig. 2.
Galois lattice based on the context from Table 2

3.4 Preserving an original category

If a concept matches one and only one category, this category is simply preserved in the enriched wiki. This is the case of the category Topic, for instance. Actually, in most cases, all the original categories are preserved.

3.5 Category merging

If a concept matches two or more categories, a new category is created. This new category merges the content of the original matching categories: the texts of their pages are concatenated. A default title is given to the category.

Category merging should be rare. It only happens if two or more categories always appear in exactly the same pages. This would happen if several users use different terms for the same concept. Bit by bit, after a number of wiki edits, these different categories will appear in exactly the same pages and will then be merged by the FCA.

This is the case of the two categories Course and Level. Having these two categories is due to a naming problem. The enriched wiki now has only one category for this concept.

3.6 New categories

If a concept matches no category, a new one is created, with a default title. This might happen in two (non-exclusive) cases:

- a page belongs to two or more categories;
- several pages have some identical properties.

A category about courses on software engineering has been created, based on the semantic relation in the page Design Patterns. Also, a category about courses available to both Master 1 and Master 2 students has been created; Semantic Web is a page of this category.

3.7 Category enrichment

Whatever the creation method of a category, all the new categories are enriched with new text content, based on properties. Sentences like "The pages belonging to this category seems to have relation T with the page P." are appended to the page. This helps human users to understand the meaning of the category.
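The decisions of Sections 3.4 to 3.7 can be sketched end to end on the case study. The Python code below is a minimal illustration, not the Java application: it transcribes the pages of Section 3.2 (assuming that a page placed in a subcategory of Level also belongs to Level), obtains the concept intents as all intersections of page descriptions (a standard closure-by-intersection technique, chosen here for brevity), and then labels each concept as preserve, merge or create before generating the description sentences. All function and variable names are hypothetical.

```python
M1, M2 = "Master 1 Level", "Master 2 Level"
AI, SE, NET = "Artificial Intelligence", "Software Engineering", "Networks"


def course(levels, topic, professor):
    # Assumption: a page in a subcategory of Level also belongs to Level.
    return {"Course", "Level", *levels,
            f"isAbout:{topic}", f"isTaughtBy:{professor}"}


# Pages of the case study (Section 3.2) with their categories and links.
PAGES = {
    "Prof. Smith": {"Professor"},
    "Prof. Jones": {"Professor"},
    AI: {"Topic"}, SE: {"Topic"}, NET: {"Topic"},
    "Knowledge Discovery": course([M1], AI, "Prof. Smith"),
    "Semantic Wiki": course([M2], AI, "Prof. Smith"),
    "Semantic Web": course([M1, M2], AI, "Prof. Smith"),
    "Design Patterns": course([M1], SE, "Prof. Jones"),
    "Network Administration": course([M1], NET, "Prof. Jones"),
    "IPv6 Protocol": course([M2], NET, "Prof. Jones"),
}
CATEGORIES = {"Professor", "Topic", "Course", "Level", M1, M2}


def all_intents(pages):
    """Concept intents: every intersection of page descriptions, plus M."""
    intents = {frozenset().union(*pages.values())}
    for description in pages.values():
        intents |= {i & description for i in intents}
    return intents


def pages_with(attributes, pages):
    """Extent: the pages whose description contains all `attributes`."""
    return {p for p, description in pages.items() if attributes <= description}


def plan(pages, categories):
    """Label every concept except top and bottom and describe it."""
    everything = frozenset().union(*pages.values())
    decisions = []
    for intent in all_intents(pages):
        extent = pages_with(intent, pages)
        if len(extent) == len(pages) or intent == everything:
            continue  # no category is created for the top and bottom concepts
        # A category matches the concept that is its attribute concept.
        matched = sorted(c for c in categories & intent
                         if pages_with({c}, pages) == extent)
        if not matched:
            action = "create"
        elif len(matched) == 1:
            action = "preserve"
        else:
            action = "merge"
        # One sentence per semantic link found in the intent (Section 3.7).
        description = [
            "The pages belonging to this category seems to have relation "
            f"Property:{link.split(':', 1)[0]} with the page {link.split(':', 1)[1]}."
            for link in sorted(intent - categories)
        ]
        decisions.append({"action": action, "matched": matched,
                          "pages": extent, "description": description})
    return decisions


for d in plan(PAGES, CATEGORIES):
    print(d["action"], d["matched"], sorted(d["pages"]))
```

Under these assumptions the sketch yields 17 concepts, of which four are preserved (Professor, Topic, Master 1 Level, Master 2 Level), one merges Course and Level, and ten lead to new categories, which agrees with the counts reported for Figure 2. Unlike the single sentence quoted in the text, the sketch emits one sentence for every semantic link in the intent.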
For instance, the category of courses about software engineering will contain the sentence "The pages belonging to this category seems to have relation Property:isAbout with the page Software Engineering." as a description of the category.

4 Validation

4.1 Validation by human users

After the enrichment, the new categories need to be validated by human users. Some merged categories might be split, and some new categories removed. Also, human users should edit all the categories: default titles should be changed into more relevant ones, and the text should be refined. We present three examples of validation.

The first one concerns the two categories Course and Level that have been merged. Having these two categories was a mistake. Human users will acknowledge that and rename the merged category Course. They will also rename two of the subcategories Master 1 Course and Master 2 Course to make them more intelligible.

Another example concerns a new category that has been created, based on the semantic relation in the page Design Patterns, with a default name (Category:New Category 42, for instance). As explained previously, the new category contains a text describing some properties of the concept. A human user will understand that this category contains courses about software engineering and will rename it accordingly. The same will be done for the category about courses taught by Prof. Jones.

The last example concerns a subcategory of Master 1 Course and Prof. Jones' Course. One might consider this category to be irrelevant, or at least not useful. A human user would decide to remove this category from the wiki and update the hierarchical links accordingly.

4.2 Distributed wiki organization

Fig. 3. Man-machine collaboration process

In order to ensure consistency of the data, we used a distributed wiki. Two Semantic MediaWiki sites are synchronized with the DSMW extension¹ [7] (see Figure 3).

- The first one is the Semantic Wiki1 wiki. Humans access this wiki as usual.
- From this Semantic Wiki1, the FCA smart agent creates the lattice in the Semantic Wiki2 site.
- Human users then check the content of this second wiki site, and correct and refine it.
- Next, they can push the content of Semantic Wiki2 to a push feed.
- Finally, the administrator of Semantic Wiki1 can pull the validated modifications from Semantic Wiki2 into Semantic Wiki1.

This scenario demonstrates how the DSMW extension can be used to implement processes. In this case, a simple process allows the changes produced by the FCA smart agent to be validated and avoids the instability problem of human-machine collaboration.

¹ http://dsmw.org

4.3 Enriched wiki content

After validation, here is the content of the enriched wiki (Semantic Wiki1 in Figure 3) in the case study:

- Category:Professor, contains pages about Prof. Smith and Prof. Jones;
- Category:Topic, contains pages about Networks, Artificial Intelligence and Software Engineering;
- Category:Course;
- Category:Master 1 Course, a subcategory of Category:Course;
- Category:Master 2 Course, a subcategory of Category:Course;
- Category:Artificial Intelligence Course, a subcategory of Category:Course; the page indicates that Prof. Smith is teaching all the courses in this category;
- Category:Prof. Jones' Course, a subcategory of Category:Course;
- Category:Master 1 Artificial Intelligence Course, a subcategory of Category:Master 1 Course and Category:Artificial Intelligence Course, contains the page about Knowledge Discovery;
- Category:Master 2 Artificial Intelligence Course, a subcategory of Category:Master 2 Course and Category:Artificial Intelligence Course, contains the page about Semantic Wiki;
- Category:Master 1 and 2 Artificial Intelligence Course, a subcategory of Category:Master 1 Artificial Intelligence Course and Category:Master 2 Artificial Intelligence Course, contains the page about Semantic Web;
- Category:Networks Course, a subcategory of Category:Prof. Jones' Course;
- Category:Software Engineering Course, a subcategory of Category:Prof.
Jones' Course and Category:Master 1 Course, contains the page about Design Patterns;
- Category:Master 1 Networks Course, a subcategory of Category:Master 1 Course and Category:Networks Course, contains the page about Network Administration;
- Category:Master 2 Networks Course, a subcategory of Category:Master 2 Course and Category:Networks Course, contains the page about IPv6 Protocol.

5 Conclusion and future work

Semantic wikis allow users to build knowledge understandable by both humans and computers. In this way, they also allow machines to produce or update semantic wiki pages as humans do. This opens the opportunity to consider machines as new members of communities that produce and maintain knowledge. Consequently, such smart agents can significantly reduce the overhead of communities in the process of continuous knowledge building, and can correct human errors.

In this paper, we proposed a new smart agent based on Formal Concept Analysis. This smart agent reorganizes the wiki: new categories are computed and pages are placed into these new categories. This allows a better organization of the content and facilitates navigation in the wiki. The refactoring process needs to be validated by human users. Consistency of the wiki is ensured by the use of DSMW: a second wiki site is used to store the result of the smart agent, and this result is pulled back to the main wiki after human validation.

This paper presented early work, and more research has to be done in the future. Clearly, if applied to a real wiki, a method such as FCA would produce a large number of concepts, and it would be impossible for human users to validate each of them. Some filtering methods should be used to prevent irrelevant categories from being added, based on the number of instances in a category or on other criteria. Using Relational Concept Analysis instead of FCA should provide interesting results. Other clustering methods will also be considered.
In the current version of our method, human users get feedback from the smart agent: they take into consideration the new categories that have been created. However, the smart agent does not get feedback from the human users: if a category has been rejected during the validation process, the smart agent will create it again when the process is reiterated. To avoid this problem, the smart agent has to be history-aware and use the information about the modifications made by human users during the validation process.

6 Acknowledgments

This research was part of the CyWiki project, funded by the Université Henri Poincaré of Nancy.

References

1. Völkel, M., Krötzsch, M., Vrandečić, D., Haller, H., Studer, R.: Semantic Wikipedia. In: WWW '06: Proceedings of the 15th International Conference on World Wide Web. (2006) 585-594
2. Schaffert, S.: IkeWiki: A semantic wiki for collaborative knowledge management. In: 1st International Workshop on Semantic Technologies in Collaborative Applications (STICA06), Manchester, UK. (2006)
3. Buffa, M., Ereteo, G., Faron-Zucker, C., Gandon, F., Sander, P.: SweetWiki: A semantic wiki. Journal of Web Semantics, special issue on Web 2.0 and the Semantic Web 6(1) (2008)
4. Krötzsch, M., Vrandečić, D., Völkel, M., Haller, H., Studer, R.: Semantic Wikipedia. Journal of Web Semantics 5(4) (2007) 251-261
5. Cordier, A., Lieber, J., Molli, P., Nauer, E., Skaf-Molli, H., Toussaint, Y.: Wikitaaable: A semantic wiki as a blackboard for a textual case-based reasoning system. In: 4th Workshop on Semantic Wikis (SemWiki2009), held in the 6th European Semantic Web Conference. (2009) 18-32
6. Ganter, B., Wille, R.: Formal Concept Analysis, Mathematical Foundations. Springer (1999)
7. Rahhal, C., Skaf-Molli, H., Molli, P., Weiss, S.: Multi-synchronous collaborative semantic wikis. In: 10th International Conference on Web Information Systems - WISE 2009. Volume 5802 of Lecture Notes in Computer Science. (2009)