=Paper=
{{Paper
|id=None
|storemode=property
|title=Automated Recommendation Rule Acquisition for Two-Way Interaction-based Social Network Web Sites
|pdfUrl=https://ceur-ws.org/Vol-674/Paper51.pdf
|volume=Vol-674
|dblpUrl=https://dblp.org/rec/conf/ekaw/KimMCKWBC10
}}
==Automated Recommendation Rule Acquisition for Two-Way Interaction-based Social Network Web Sites==
Y.S.Kim, A. Mahidadia, P. Compton, A. Krzywicki, W. Wobcke, M. Bain, X. Cai
School of Computer Science and Engineering,
The University of New South Wales,
Sydney, NSW, 2052, Australia
{yskim, ashesh, compton, alfredk, wobcke, mike, xcai}@cse.unsw.edu.au
EKAW 2010, October 11–15, 2010, Lisbon, Portugal.
Copyright 2010 ACM 1-58113-000-0/00/0010

ABSTRACT
A problem with social network web sites for activities such as dating or finding new friends is that often there is little positive response from those contacted. In this research we investigated historical data from a large commercial social network site to establish which subgroups of people were most likely to respond to a particular individual. Our two-way interaction model developed a table for each attribute to determine which pair of values for sender and recipient gave the best response rate. From all the attributes, the user profile of a likely responder was created, and less significant attributes were then removed. With this simple technique we were able to demonstrate that where users had contacted people the system would have recommended, the success rate was 29.4% compared to a baseline success rate of 16.6%. This represents a very considerable increase in the likelihood of getting a favourable response. We are now planning a study that provides prospective recommendations to actual users, based on our model.

Categories and Subject Descriptors
H.2.8 [Database Management]: Database Applications - Data Mining

General Terms
Algorithms

Keywords
recommendation systems

1. INTRODUCTION
With the ever-increasing use of Web 2.0 social networking web sites, recommender systems can be used to suggest the best matching participants. In this case, it is necessary to consider a two-way interaction model, where a user, called the sender, sends a message to another user, called the recipient, and the recipient replies positively or negatively to the sender. Within this model, the recommendation method suggests a group of candidate recipients who are more likely to reply positively to the sender.

Recommendation methods for two-way interaction differ from those for the one-way interaction model, because the recipients in two-way interaction can choose their response, whereas the items in one-way interaction passively receive the user’s actions. Though many recommendation methods have been researched and commercialized based on the one-way interaction model, including Amazon [1], Google [2], and Netflix [3], it is not clear whether they can also be successfully applied to recommendation problems in two-way interaction.

In our research, three different rule-based recommendation methods, which employed different assumptions on the preferences of the sender and the recipient, were compared to a collaborative filtering method, a typical one-way recommendation method.

2. RECOMMENDATION RULE LEARNING METHOD
For a given user, our method learns recommendation rules using profiles and the history of interactions between senders and recipients. In summary, our method creates interaction look-up tables for each attribute based on past interaction data. For each attribute value of a given user, the method finds a value for the same attribute (called the best matching attribute value) of a subgroup of recipients based on three different criteria: sending activity (SA), receiving activity (RA) and success rate (SR). Sending activity (SA) is simply the number of contacts sent by the sender group to the recipient group. It suggests the senders’ interest in the recipients. Receiving activity (RA) is the number of contacts sent from the recipient group to the sender group. It suggests the recipient group’s interest in the senders. Success rate (SR) is the ratio of the number of positive responses to the number of interactions from senders to recipients. Success rate represents both the senders’ interest in the recipients and vice versa.

Once the best matching attribute values for all attributes of a given user are selected, it is necessary to find a subgroup of recipients who satisfy all these attribute values. Given that the number of attributes is large, it is possible that no recipients satisfy all attribute values. Therefore, it is necessary to select the more significant attribute values from the best matching attribute values. For this purpose, we used the weighted lift, which represents the normalized ‘interest of the sender in the recipients’ who have a specific attribute value. The weighted lift is calculated as follows. For a given attribute value of a sender ($av_s$), let its best matching attribute value be $av_r$. The interest of the sender subgroup that has attribute value $av_s$ in the recipients who have $av_r$ is:

$$I_{av_s \to av_r} = \frac{s_{av_s} \to r_{av_r}}{s_{av_s} \to R} \qquad (1)$$

where $s_{av_s} \to r_{av_r}$ and $s_{av_s} \to R$ represent the number of interactions sent from a sender subgroup defined by $av_s$ to a recipient subgroup defined by $av_r$ and to all recipients $R$, respectively. As each attribute has a different number of attribute values, the ‘interest of the sender in the recipients’ ($I_{av_s \to av_r}$) is normalized as follows:
$$\omega = n \times I_{av_s \to av_r} \qquad (2)$$

where $n$ is the number of attribute values of the particular attribute.

After calculating the weighted lift ($\omega$) of all best matching attribute values, the method adds best matching attribute values to the condition of a recommendation rule in order of decreasing weighted lift. This process is repeated until there are no more pairs of attributes or there is no training data for the current rule. Finally, the method chooses as the best rule the one that shows the highest success rate and exceeds a threshold for statistical significance.

3. EXPERIMENTAL RESULTS

3.1 Data Sets
The social network site we used provided two types of data: user profiles and user interactions. In total, 32 attributes were used for our recommendation methods. User interaction logs contain the contact history between users, identifying the types of messages sent and received. Reply messages were classified as positive or negative, and each interaction was accordingly classified as a positive or negative interaction. A failure to reply was taken as a negative interaction. The data sets are summarised in Table 1. Train I was collected for our rule learning method. Train II was collected for the CF-based method from March 2009 (one month). Preliminary data analysis using the CF method over different time periods showed that a training period of one month was appropriate. Test data were collected from the first week of April, immediately following the CF training period, to give the CF method the best chance of performing well. The collaborative filtering (CF) method is based on [1].

{| class="wikitable"
|+ Table 1. Training and Test Data Set
! Data Set !! Total Interactions !! Positive Interactions !! Positive Interactions (%) !! Negative Interactions
|-
| Train I || 3,888,034 || 689,419 || 17.7 || 3,198,615
|-
| Train II || 1,357,432 || 236,521 || 17.4 || 1,120,911
|-
| Test || 284,702 || 47,468 || 16.7 || 237,234
|}

3.2 Results
Rule acquisition results with different best matching attribute value selection criteria are summarized in Table 2. The RA method produced the largest number of rules (8,739), followed by the SA method (6,534) and the SR method (146). Note that these methods do not produce rules in the conventional sense, as a rule is constructed for each user for whom a recommendation is made. Usage indicates the number of senders covered by each rule, on average. The more rules a method produces, the fewer users each rule covers. Of more interest is the number of conditions in a rule. On average, the SA method and the RA method used more condition elements (8.62 and 7.90 per rule, respectively) than the SR method (2.71 per rule). The SR method therefore created more general rules, while the SA and RA methods created more specific rules.

Recommendation performance of each method was measured by coverage and success rate. By coverage we mean the fraction of users for which the recommender is able to make a recommendation. The SR method has the highest coverage because it has more general rules. The difference between SA and RA is interesting. The SA method has a smaller number of more specialized rules, giving it the lowest coverage, slightly lower even than the CF method. The SA, RA and SR methods all try to identify the characteristics of a recipient who is likely to respond to a particular type of sender. The problem with the SA method is that it does not take into account the recipients’ interests at all, so we end up with highly specialized rules about sender preferences. Since these highly specialized rules are constructed from features considered independently, there is a greater chance that the test data may not contain recipients who match these rules.

The success rates of the SA method and the CF method do not differ significantly; both were slightly better than the success rate over the test period. The CF method had similar limitations to the SA method, as it only considered sender preferences. The success rates of the SR method and the RA method are higher than those of SA and CF, for the obvious reason that they take into account the recipient’s interest in the sender.

{| class="wikitable"
|+ Table 2. Experimental Results
! Method !! Rules !! Rule Usage !! Avg. Condition !! Coverage (%) !! Success Rate (%)
|-
| SA || 6,534 || 3.1 || 8.62 || 67.1 || 17.9
|-
| SR || 146 || 201.1 || 2.71 || 96.6 || 29.4
|-
| RA || 8,739 || 2.9 || 7.90 || 82.4 || 21.1
|-
| CF || || || || 74.0 || 17.3
|}

4. CONCLUSIONS
Because we are dealing with the intangibles of human preferences in seeking interactions with others, the highest success rate (29.4% for SR) obtained from our experiment is still low. However, this is a considerable improvement over the baseline success rate of 16.6%, which comes from senders’ unguided choices about whom they would like to communicate with, and who is likely to respond positively. The improved success rate of 29.4% comes from the senders who happened to choose the corresponding recipients we would have recommended. This means there is enormous potential for providing actual recommendations to the current users that could significantly increase the chance of a favourable response. We plan to conduct a study that provides actual recommendations to some of the current users using our model.

5. ACKNOWLEDGMENTS
This research has been supported by the Smart Services Cooperative Research Centre.

6. REFERENCES
[1] Linden, G., B. Smith, and J. York: Amazon.com Recommendations: Item-to-Item Collaborative Filtering. Internet Computing, IEEE. 7(1), pp. 76-80 (2003).
[2] Page, L., S. Brin, R. Motwani, and T. Winograd: The PageRank Citation Ranking: Bringing Order to the Web. Technical Report, Stanford InfoLab (1999).
[3] Koren, Y.: Collaborative Filtering with Temporal Dynamics. In: 15th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 447-456. ACM, Paris, France (2009).
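
Illustrative sketch of the weighted lift. The calculation described in Section 2 (Equations 1 and 2), together with best-match selection under the SR criterion, can be sketched in Python roughly as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the tuple layout of an interaction, and the toy age-band data are all hypothetical, and only a single attribute is handled.

```python
from collections import Counter

# Hypothetical toy data for ONE attribute (an age band with 3 possible values).
# Each tuple is (sender_value, recipient_value, reply_was_positive).
toy_interactions = [
    ("20s", "20s", True),
    ("20s", "20s", False),
    ("20s", "30s", True),
    ("20s", "30s", True),
    ("20s", "40s", False),
    ("30s", "20s", False),
]


def best_match_by_success_rate(interactions, sender_value):
    """Return the recipient value with the highest success rate (SR criterion).

    Returns None when the sender value never occurs in the data.
    """
    sent = Counter()      # contacts sent to each recipient value
    positive = Counter()  # positive replies from each recipient value
    for s_val, r_val, ok in interactions:
        if s_val == sender_value:
            sent[r_val] += 1
            if ok:
                positive[r_val] += 1
    if not sent:
        return None
    return max(sent, key=lambda r: positive[r] / sent[r])


def weighted_lift(interactions, sender_value, recipient_value, n_values):
    """Weighted lift: omega = n * I, with I as in Equation (1).

    I is the share of the sender subgroup's contacts that went to the
    recipient subgroup; n_values is the number of values of the attribute.
    """
    to_all = sum(1 for s, _, _ in interactions if s == sender_value)
    to_match = sum(
        1 for s, r, _ in interactions
        if s == sender_value and r == recipient_value
    )
    if to_all == 0:
        return 0.0  # assumption: no data means no measurable interest
    return n_values * to_match / to_all


def order_by_weighted_lift(lift_by_attribute):
    """Order attribute names from high to low weighted lift."""
    return sorted(lift_by_attribute, key=lift_by_attribute.get, reverse=True)
```

On the toy data, the best match for senders in their 20s is "30s" (2 positive replies out of 2 contacts), and the weighted lift is 3 × 2/5 = 1.2. If a sender subgroup spread its contacts evenly over all n values of an attribute, I would equal 1/n and the weighted lift would be 1, so values above 1 indicate above-chance interest under this reading.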