Enhanced Visualization for Web-Based Summaries

Brent Wenerstrom (brent.wenerstrom@louisville.edu), Mehmed Kantardzic

Computer Eng. and Computer Science Dept, Duthie Center for Engineering, Louisville, Kentucky 40292


For each search result presented by a search engine, a user has a choice to click through for more information or to skip the result. We aim to improve the accuracy of this click process by introducing a color-coding scheme built upon our improved summary text selection approach called ReClose. Color-coding adds an additional level of context to the text without requiring additional screen space. Our results showed an improvement in click precision from 66% when using Google summaries to 80% when using color-coded ReClose summaries. Improvements in user click precision will lead to better user experiences, more efficient finding of search results and higher confidence levels in search engine usage.

INTRODUCTION

Search engine usage has become a part of everyday life for internet users. Every time a search is conducted on Google or Bing a list of search results is presented to the user. One of the major challenges that users face as they search for that needle of information in the Internet haystack is deciding which of the search results presented are relevant to their search needs and which are not. When conducting searches for facts and information the choices are not always obvious.

Each search result is composed of a title, a short text summary and an abbreviated URL. The title usually is revealing about the overall message of a web page. However, it is written by the web content creator and may be a slogan of a company or an advertising pitch, which can be misleading. The URL can be very helpful when one is familiar with the host contained in the URL, but many URLs encountered are not familiar to us.

The text summary is extracted from three possible locations [9,4,13]. 1) Spans of text may be taken directly from the content of a web page. 2) It may come from the HTML meta description. The meta description is embedded in the HTML of a web page. It is not displayed to users visiting a web site, but is usually a general description of a web page or web site handwritten by the content creator. 3) Lastly the text could come from the Open Directory Project (http://www.dmoz.org). The Open Directory Project is a community-built directory of websites with a number of short, human-written website summaries.

When search results are presented to users, the user has the task of deciding which results are relevant to their search and which are not. Within information science it has been found that as many as 80 factors contribute to the decision of a judge deciding which documents are relevant to a particular search [10]. Users typically make this decision in a matter of seconds. When a user decides to click on a search result there are two possible outcomes that depend on the user's expectations for that web page: 1) the user's expectations were not met, leading to disappointment, or 2) the user's expectations were met or exceeded, resulting in satisfaction.

Users may incorrectly skip relevant content, missing out on potentially important information, but it is the feeling of disappointment (possibility 1) that will most negatively affect a search experience. We aim to improve the user's accuracy in click decisions for the purpose of decreasing occurrences of disappointment. As an example of the kinds of disappointment that may be realized, consider the search result to the query closeness centrality pictured in Figure 1. Closeness centrality is a graph theory measure used for ordering nodes. The search result shown in Figure 1 has a title of "Social Network Analysis". This page is dedicated to the analysis of social networks. Closeness centrality, as is shown in the summary, is clearly mentioned. One also finds an example description of closeness centrality in a social network. One may expect that this page contains a lengthy description of closeness centrality followed by this example. However, clicking through to the result page leads to Figure 2. The web page does discuss social network analysis as would be expected by the title, but there is only a single paragraph on closeness centrality. This single paragraph only describes a brief example barely longer than the text summary given by the search result. This web page did not meet the previously detailed expectations and would lead to disappointment on the part of the searcher.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. This paper was presented at Very Large Data Search (VLDS) 2011. Copyright 2011.

The user in the previous search example would be aided by the two main features of color-coded ReClose summaries. First, keywords are highlighted with color depth to provide global context rather than just the local context of one or two sentences surrounding a keyword. This "global context" refers to the extent of discussion on a web page containing the query topic. In the previous example, the user would have been aware before clicking that there were very few occurrences of the terms "closeness" and "centrality" by visual clues of color enhanced query keyword highlighting.

Secondly, major departures from the main topics of a web search are flagged. If the main subject of a web page is different from the intent of the search user, then a topic term is shown in red. This warns the user that the keywords may be peripheral to the main subject of the web page. Both color depth and topic word flagging are shown in this paper to effectively improve user click precision and decrease user disappointment. This in turn will improve the efficiency of the user and lead to better user experiences with the search engine.

RELATED WORKS

The highlighting of keywords has been used in a number of settings where users scan documents or lists of documents. Highlighting attracts the user's attention to these keywords using bolding, reverse video or coloring the background of the text. In each case it has been shown to be useful to the scanning and examination of documents and document lists [8].

A number of useful approaches exist for highlighting keywords. Baudisch et al. [1] compress highlighted documents using Fishnet to a single screen for visual search. Byrd [2] proposed the use of different colors for each keyword within a single document; the same colors were also used to designate the location of keywords on the scrollbar.

Veerasamy and Belkin [11] proposed a table of bar charts to show term importance visually. Each row designated a single document, each column represented a word. The words selected included both query terms and terms used for relevance feedback. Graham [5] presented Reader's Helper that highlighted keywords both within a single document and document lists. Each of the keywords was given a score with a matching bar showing the strength of that score visually.

Kaugars [7] used thumbnails and zoomed views to show keywords in context for a number of documents. Initially all search results are displayed as web page thumbnails, with keyword locations highlighted. A user may zoom to a level where keywords are shown in context and other paragraphs are compressed. Users may zoom in again to view the full, scrollable contents of a document.

Hemmje et al. [6] presented Lyberworld, which displayed documents in a three dimensional sphere with keywords shown at the edge of the sphere. Documents were presented closest to the keywords contained in those documents.

Keyword highlighting has improved information retrieval result scanning for more than 25 years [8]. Highlighting has proved useful in several interfaces developed since that time [1,2]. However, no other research to the best of our knowledge has proposed the use of color depth or warning colors within summary text to provide additional context.

COLOR-CODED RECLOSE SUMMARIES

The goal of color-coded ReClose summaries is to increase the accuracy (precision) with which users click on search results to find relevant documents. Increasing accuracy will in turn lead to fewer disappointments and a better user experience. Color-coded ReClose summaries aim to improve upon current search result summaries using three main parts. First, we build upon our previous work on a text summary generation approach called ReClose [12]. Second, we highlight query keywords using variable shades of blue to show the depth of usage of those query keywords on a web page. Third, we display in red terms central to the web page's topic which potentially differ from the topic of the keywords searched for.

ReClose

The ReClose approach [12] combines two sentence rankings into a single summary with two parts. It combines the benefits of query-biased and query-independent summaries. Query-biased summaries show keywords in context, focusing the summaries on content most relevant to the search. Query-independent summaries provide an overview of a single document.

Query-independent summarization is achieved using closeness centrality [3] of graph theory to rank sentences as representative to the whole document. Closeness centrality ranks the centrality of nodes in a graph with the highest rank going to the node with the smallest average distance to all other nodes. Documents are converted to graphs by turning each sentence into a node, then comparing each sentence to each other sentence using word overlap.
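The ranking described above can be illustrated with a minimal sketch. It is not the ReClose implementation: the rule that two sentences are linked when they share at least one word, the unweighted edges, and the function names `build_sentence_graph`, `closeness` and `rank_sentences` are all assumptions made for illustration, since the text only states that sentences are compared by word overlap.

```python
from collections import deque


def build_sentence_graph(sentences):
    """Link two sentences when they share at least one word (assumed rule)."""
    words = [set(s.lower().split()) for s in sentences]
    n = len(sentences)
    return {
        i: [j for j in range(n) if j != i and words[i] & words[j]]
        for i in range(n)
    }


def closeness(graph, source):
    """Inverse of the average shortest-path distance to reachable nodes."""
    dist = {source: 0}
    queue = deque([source])
    while queue:  # breadth-first search gives hop counts in an unweighted graph
        u = queue.popleft()
        for v in graph[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    total = sum(dist.values())
    return (len(dist) - 1) / total if total else 0.0


def rank_sentences(sentences):
    """Return sentence indices ordered from most to least central."""
    graph = build_sentence_graph(sentences)
    return sorted(graph, key=lambda i: closeness(graph, i), reverse=True)


sentences = [
    "closeness centrality ranks nodes",
    "nodes connect through edges",
    "edges carry weights",
]
ranking = rank_sentences(sentences)
print(ranking[0])  # index of the most central sentence
```

In this toy document the middle sentence shares a word with each of the other two, so it has the smallest average distance and is ranked first.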

The second part of the ReClose approach involves learning from the summary generation techniques of the top ranking search engines, namely Google, Yahoo and Bing. To improve upon the query-biased summaries of current search engines, we learn from the summaries generated by all three top search engines. We generated training data by observing which sentences were chosen by each of these search engines. We trained a linear regression model to score sentences to match the sentence selection of Google, Yahoo and Bing. After training, a new document is split into sentences and each sentence is ranked by the linear regression model. The top ranking sentences are chosen to represent the query-biased portion.

In this way we now have a two part summary taking advantage of both query-biased and query-independent approaches to summary generation. Each portion of the summary is labeled so that users of the summaries are aware of the different intentions with each of the two text spans.

Color-Coded Keywords

We color-code keywords to provide additional context about the usage of keywords. The query-biased summaries of, say, Google or Bing generally provide one or two text spans that show one or two usages of the keywords searched. In this way only the context within roughly ten words on either side of the keywords is shown. Our color-coding of the keywords adds depth to each keyword just as colors can provide terrain depth on a topographical map. Many topographical maps will provide a key that shows the elevation range of the map and provide different colors for each subdivision of elevation. This "color-coding" provides users of these maps a more intuitive view than simply a set of contour lines to understand depth. Our depth refers to the frequency of query keywords on a web page. This gives a user a greater appreciation for how long discussions involving the keywords may be compared to other search results.

The key used in our surveys is shown in the "Select Color" step of Figure 3. We count the frequency of each keyword on a web page after the removal of stop words and the application of Porter stemming. Then for each possible frequency between zero and 63 a different shade of blue is used. (A keyword may be contained in a summary and not on a web page if it is contained in the meta description but not in the web page's content.) A diagram of color-coding query keywords is shown in Figure 3. Now summaries of web pages that talk at great length about, say, "canines" will be distinguishable from summaries of web pages that have very little text mentioning "canines". The exact colors used are in Table 1. We chose to use a light blue (deep sky blue) for the smallest frequency value of zero. Then to make the range between 0 and 30 more pronounced we chose an intermediate, but fairly dark blue (Egyptian blue) at a frequency of 30. A dark blue (Duke blue) was used for a frequency of 63+, which was still distinguishable from regular text in black. To calculate the RGB values for frequencies in between these specific values, one interpolates linearly: the difference in each color value between two neighboring key colors is divided by the number of frequency steps between them.
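The color scale can be sketched as a piecewise linear interpolation between the three key colors of Table 1. This is an illustrative reading of the description, not the authors' code; the names `KEY_COLORS`, `keyword_color` and `to_hex`, and the rounding to whole RGB values, are assumptions. Stop word removal and Porter stemming are taken to have happened before the frequency is passed in.

```python
# Key colors from Table 1: (frequency, (R, G, B))
KEY_COLORS = [
    (0, (0, 191, 255)),   # deep sky blue
    (30, (16, 52, 166)),  # Egyptian blue
    (63, (0, 26, 87)),    # Duke blue, used for 63 and above
]


def keyword_color(freq):
    """Interpolate an RGB shade of blue for a keyword frequency."""
    freq = max(0, min(freq, 63))  # frequencies above 63 share the darkest shade
    for (f0, c0), (f1, c1) in zip(KEY_COLORS, KEY_COLORS[1:]):
        if f0 <= freq <= f1:
            t = (freq - f0) / (f1 - f0)  # position between the two key colors
            return tuple(round(a + t * (b - a)) for a, b in zip(c0, c1))


def to_hex(rgb):
    """Format an RGB triple as an HTML color string."""
    return "#{:02x}{:02x}{:02x}".format(*rgb)


for freq in (0, 10, 30, 45, 63, 200):
    print(freq, to_hex(keyword_color(freq)))
```

A summary renderer could wrap each query keyword in a span styled with `to_hex(keyword_color(count))`, so that darker keywords indicate pages that use the term more often.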

It is unlikely that most users will be able to know exactly what color represents which frequency, but it will be obvious which summaries contain more frequent keywords. For example in the summary in Figure 3 the keyword "database" is more frequent in the document than the keyword "building". It will also be obvious which end of the scale each keyword belongs to, whether the low end of 0-20 or the top end of 60+, which is where the real value lies.

Flagged Words

The goal of the flagging module is to visually differentiate web pages in which the search keywords are the main topic from those web pages where the search keywords are peripheral to the main topic of the page.

We assume that the most frequent term(s) in a document is central to the main topic of a document. We are not concerned with presenting to the user the exact topic of a document, but instead are intent upon finding the departures of document topics from the searched topic. Generally only a single term is considered for flagging to limit the information overload of the user. A single term should allow a user to discern the potential topic of a document in addition to the summary text.

We have designed an algorithm to determine if we should flag any terms within a document summary. Often due to the nature of search the most frequent term in a document is one of the keywords. These terms should not be flagged. Additionally, many terms belong to the same topic as the query keywords and should not be flagged. Our algorithm does not flag terms highly related to the queried topic. The steps in our algorithm are diagrammed in Figure 4 and are outlined below:

1. Determine the most frequent term in a document.

2. Obtain a count of the top ranking documents also including this top term.

3. Threshold the percentage of documents containing the top term.

The algorithm begins by first determining the most frequent term in a document (step 1). This involves counting term usage within a document after the removal of stop words.

Once we have determined the most frequent term in a document, we then consider all other top ranking documents returned for the search (step 2). In our case we used the top 28 documents (not including the current document), since this is the maximum number of documents returned through Google's Web Search API (http://code.google.com/apis/websearch/).

The percentage of top ranking documents for the current search containing the most frequent term is then thresholded (step 3). We used a threshold of 60%. Terms that occur in more than half of the top documents for a search generally are highly related to the search terms. As an example consider the terms by percentage for the query algorithms. Terms above the 60% threshold include: "algorithms" at 100%, "computer" at 80% and "number" at 60% which are all related to algorithms. Examples of terms below the threshold are "privacy", "course", "heap" and "2007" with only "heap" being a term associated with algorithms. Terms found in 60% of documents are both rare and highly related.

Terms that do not meet the threshold will be displayed in the summary colored red. For example see the summary in Figure 4 where the term "JDBC" is flagged. JDBC refers to one method in Java for connecting to databases. It is distantly related to the query building a database, but clearly shows that this particular document is less focused on the building of the database, and more focused on Java related issues.
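The three steps can be summarized in a short sketch. It is a simplified illustration rather than the authors' implementation: the tiny `STOP_WORDS` set, the function name `term_to_flag`, whitespace tokenization, and the explicit check that skips query keywords are assumptions. Following the example above, a term found in exactly 60% of the other documents is treated as meeting the threshold.

```python
from collections import Counter

# Hypothetical, deliberately small stop word list for illustration only.
STOP_WORDS = {"the", "a", "an", "of", "and", "to", "in", "is", "for"}


def term_to_flag(doc_tokens, other_docs, query_terms, threshold=0.60):
    """Return the term to show in red, or None if nothing should be flagged."""
    # Step 1: most frequent term after stop word removal.
    counts = Counter(t for t in doc_tokens if t not in STOP_WORDS)
    if not counts:
        return None
    top_term = counts.most_common(1)[0][0]
    if top_term in query_terms:  # assumed shortcut: query keywords are never flagged
        return None
    # Step 2: count the other top ranking documents containing the term.
    containing = sum(1 for tokens in other_docs if top_term in tokens)
    share = containing / len(other_docs) if other_docs else 0.0
    # Step 3: flag only terms that fall below the threshold.
    return top_term if share < threshold else None


doc = "the jdbc driver connects java to a database and jdbc handles jdbc queries".split()
others = [
    {"building", "database", "tables"},
    {"database", "design", "sql"},
    {"jdbc", "java", "database"},
    {"database", "schema"},
    {"building", "database", "access"},
]
print(term_to_flag(doc, others, {"building", "database"}))
```

Here "jdbc" is the most frequent term but appears in only one of the five other documents, so it is returned for flagging.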

After we have determined that a term should be flagged for a particular summary, we must ensure that the flagged term is included in the summary. To accomplish this we filter the query-independent sentence ranking to only include sentences including the flagged terms. This ensures that the flagged term will appear in at least one sentence included in the summary.

EXPERIMENTAL RESULTS

We hypothesize that with color-coded ReClose summaries users will have more accurate expectations of the web pages summarized. To test this we created a survey that allows us to compare the accuracy of user expectations based on summaries. We mainly compare color-coded ReClose summaries against Google summaries. We additionally compare ReClose summaries with and without color-coding to ensure that the color-coding made a difference, and that text selection alone was not the main cause for improvement.

Survey Participants and Survey Design

For our survey we recruited 21 volunteers among undergraduate and graduate students in the Computer Engineering and Computer Science department at the University of Louisville. Surveys were conducted exclusively online.

The summary analysis was broken down into two parts and repeated for each of the three summary techniques under comparison. First a user would be shown 5 summaries for a randomly selected query. For each summary a user would mark if they would click on that summary. Then they would mark the amount of relevant content expected. The choices available were "None", "Sentences", "Paragraphs", "Pages" or "Book". Rather than just obtaining which results a user would click on, we obtain a finer grained understanding of the process through how much relevant content a user expected. Second, users were provided links to each destination page and viewed these pages one at a time. A user marked down the actual amount of relevant content using the same options presented for expectations. In this way rather than finding out if a user believes a page is relevant or not to their search, we can also monitor lesser disappointments, such as a user expecting to find pages and pages of relevant content but in actuality only finding a couple of sentences. In this case the document is still relevant, but the user is likely not satisfied with the results.
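The comparison of expected and actual content can be sketched by treating the five answer options as an ordered scale. The function name `outcome` and the three outcome labels are assumptions for illustration; they follow the definition used later in the analysis, where disappointment means that the actual relevant content was lower than expected.

```python
# Answer options from the survey, ordered from least to most relevant content.
SCALE = ["None", "Sentences", "Paragraphs", "Pages", "Book"]
LEVEL = {name: rank for rank, name in enumerate(SCALE)}


def outcome(expected, actual):
    """Classify one summary view by comparing actual content to expectation."""
    diff = LEVEL[actual] - LEVEL[expected]
    if diff < 0:
        return "disappointed"  # less relevant content than expected
    if diff > 0:
        return "surprised"     # more relevant content than expected
    return "satisfied"         # expectation matched exactly


# A user expecting pages of content who finds only a couple of sentences:
print(outcome("Pages", "Sentences"))
```

The document in this example is still relevant in a binary sense, but the ordered scale records it as a disappointment.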

Survey participants were shown 5 summaries per summary type for a total of 15 summaries.

Summary Data

Survey participants were randomly assigned three queries out of a pool of 15 queries. These queries were chapter titles and project titles from an introductory course in computer science so that all query topics were familiar to the survey participants. Some example queries were logic gates and creating a web page.

For each of the 15 queries, 28 search results were obtained from Google. We downloaded each linked web page in the search results, resulting in 400 successfully downloaded and parsed web pages out of 420 possible. We only used 5 search results per query. To decide which search results to use, we randomly selected web pages from two pools. The first pool was likely to have search results with flagged summaries because, when the frequencies of terms in a document were ranked, the query keywords had a low rank. The second pool contained the top 5 search results as ranked by Google.

After determining the pool of search results most likely to be flagged and the top Google search results, we randomly select 2-4 results from the pool of results likely to be flagged. Then the remaining results are taken from the second pool, starting with the top ranked Google result.

Results and Discussion

First we verify the relationship between user click behavior and the relevance markings. Figure 5 shows the distribution of expected relevance for search results clicked and skipped. This figure shows that no user would click on a result if they expected no relevant content. If a user expected only a sentence or two of relevant data, users were unlikely to click (72% or 64/89 were skipped). A natural division emerges from the expectation results. Users expecting "Sentences" or "None" would skip the result 82% (116/141) of the time, leading us to call this section "irrelevant". The other half of the relevance spectrum we labeled "relevant". Users clicked through 84% (146/174) of the time when expecting "Paragraphs" or more of relevant information. Performing a χ² test on the count data revealed by this dividing line resulted in a χ² value of 134.8 and a p-value < 0.001, clearly showing a significant difference between these two groups. Click through and expectation have a lot in common, but expectations provide more insight into the mental process of the search users.
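The test statistic can be checked with a short sketch. The 2x2 counts are derived from the fractions above (116 of 141 skipped leaves 25 clicked; 146 of 174 clicked leaves 28 skipped). The use of Yates' continuity correction is an assumption: the paper does not say which variant was used, but the corrected statistic reproduces the reported 134.8, whereas the uncorrected one comes out near 137. The function name `chi_square_2x2` is hypothetical.

```python
import math


def chi_square_2x2(a, b, c, d, yates=True):
    """Chi-square test of independence for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    diff = abs(a * d - b * c)
    if yates:
        diff = max(0.0, diff - n / 2)  # continuity correction (assumed)
    chi2 = n * diff ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Upper tail of the chi-square distribution with one degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Rows: expected irrelevant, expected relevant. Columns: clicked, skipped.
chi2, p_value = chi_square_2x2(25, 116, 146, 28)
print(f"chi2 = {chi2:.1f}, p < 0.001: {p_value < 0.001}")
```

Applied to the Table 2 counts (36 and 69 for Google, 24 and 81 for color-coded ReClose), the same function gives a statistic of about 2.8 and a p-value of about 0.09.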

The expectations of survey participants were fairly inaccurate. Only 34% (108/315) of expectations matched exactly the actual relevant content of web pages. Another 34% (108/315) of expectations resulted in actual content being opposite of expectations in terms of the relevant/irrelevant split mentioned earlier. For example there were 16 occurrences where a survey participant marked a relevant expectation of "Paragraphs" or higher only to find no relevant content.

In our survey color-coded ReClose summaries achieved a much lower percentage of disappointment at 23% than Google summaries achieved at 34%, as shown in Table 2. Disappointment was recorded when the actual relevant content was lower than the user's expectations. When we conduct a χ² test on the count data comparing Google and color-coded ReClose we obtain a χ² value of 2.8 and a p-value of 0.09. This p-value does not fall below the usual threshold value of 0.05. However, there still remains a visible difference between the results of Google summaries and color-coded ReClose summaries that would likely become more pronounced with additional survey participants.

We now look at the precision with which users chose to click on a result. Considering that a majority of users did not click when expectations were a couple of sentences or less, we label all web page views with a few sentences or less of relevant content as "irrelevant." Web page views that survey participants marked as having more than a few sentences worth of relevant content are labeled as "relevant." Dividing clicks into relevant and irrelevant allows us to calculate click precision. We define click precision as the percentage of summary views with clicks that led to relevant web pages. Click recall is the percentage of relevant documents that were clicked. The results of these calculations for each summary technique can be seen in Table 3.
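The two measures are simple ratios, sketched here with the counts reported in Table 3. The function names `click_precision` and `click_recall` and the layout of the `counts` dictionary are illustrative assumptions.

```python
def click_precision(relevant_clicks, total_clicks):
    """Share of clicks that led to relevant web pages."""
    return relevant_clicks / total_clicks


def click_recall(relevant_clicks, total_relevant):
    """Share of relevant web pages that were clicked."""
    return relevant_clicks / total_relevant


# (relevant clicks, total clicks, total relevant views), taken from Table 3
counts = {
    "Google": (38, 58, 63),
    "ReClose": (39, 52, 61),
    "Color-Coded": (49, 61, 70),
}

for name, (hits, clicks, relevant) in counts.items():
    precision = click_precision(hits, clicks)
    recall = click_recall(hits, relevant)
    print(f"{name}: precision {precision:.0%}, recall {recall:.0%}")
```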

Table 3 shows that users clicked more often (61 times) and had a higher click precision (80%) when using color-coded ReClose summaries.

In practice a higher click precision will be more noticeable to users. Users are aware of clicks to irrelevant content, experiencing disappointment. However, there is no form of feedback for click recall. Users are not aware that they have skipped over a relevant document. One of the main objectives of color-coded ReClose summaries was to improve the click precision for users. From the numbers in Table 3 it is clear that color-coded ReClose summaries improve the precision of users, both over Google summaries and ReClose summaries without color-coding. This leads to fewer disappointments in practice.

Color-Coded Results and Discussion

We now consider the effectiveness of the two color-coding features: color-coded keywords and flagged words. In this section comparisons are only made between bolded and color-coded ReClose summaries. Both outperformed Google summaries, but here the focus is just on the added color-coding features. We first consider the color-coded keywords. The scale we used allowed for usage count differentiation from 0-63. Summaries were not evenly distributed across this range. Nearly half (49% or 37/75) of the summaries used had at most a keyword with 0-9 usages on the web page summarized. We would expect that users would have low expectations for summaries that at most contained keywords on the low end of the scale. Looking at the results, there was no perceived change in behavior between summaries containing low count query keywords (0-9) and those containing medium count query keywords. Only in the case of high count query keywords (60+) was there a noticeable change in behavior.

There were 13 summaries (17%) with at least one query keyword with a usage count of 60+. For these 13 summaries, participants found the actual relevant content to be high. For example no matter the summary type, more than 50% of views led to actual relevant content at the "Pages" level. This was rarely expected when using bolded ReClose summaries (see Table 4). Bolded ReClose summaries led to "Pages" level expectations in 23% of page views. The color-coded ReClose summaries more often led to higher expectations in line with the actual content. In 44% of views, users of color-coded summaries identified an expectation in the "Pages" range. Color-coded ReClose summaries also led to the highest actual relevant content at 67%. In the case of high usage count keywords, color-coded ReClose summaries led to justifiably higher expectations.

Turning to flagged words, we first compare the effect that flagging had on expectations, which can be seen in Table 5 in the column marked "Expected Relevant". In this table documents were broken into two groups, documents that had terms flagged by color-coded ReClose (rows marked "Flaggable") and documents that did not (rows marked "Not Flaggable"). When color-coded ReClose summaries had flagged terms, the expectations were much lower (29% expected to be relevant) than for those same summaries without color-coding (40% expected to be relevant). The opposite pattern was found for color-coded summaries without flagged terms, which had higher expectations. This shows that the flagging of terms directly affected the expectations of the user. A much lower percentage of documents that had flagged terms was found to be relevant. Even in the case where flagged terms were not shown to users (bolded ReClose summaries), 45% of documents that could have been flagged were found to be relevant compared to 71% of documents that would not have had flagged terms. What is interesting is how flagging affects the click precision of users. Those that saw the flagged terms had a click precision of 57% on flagged summaries compared to 70% for those that did not see the flagging for these same summaries. However, users expected more and were more precise when color-coding was available and no flagged terms appeared in a summary, achieving a click precision of 87% compared to 78% without color-coding. Overall, with far fewer clicks among flagged summaries, the overall click precision was higher for the color-coded version of ReClose (see Table 3).

CONCLUSION

In this paper we outline color-coded ReClose summaries. Web-based summaries were visually enhanced using two techniques. The first technique was to provide global context for the query keywords by using varying color to highlight these keywords. The second technique highlighted in red terms that topically differed from the topics of a query. This provided a warning mechanism to help users avoid clicking through to results less likely to be relevant. We hypothesized that color-coded ReClose summaries would increase the accuracy of user click decisions, thus reducing disappointments and improving user experiences.

Survey results showed that color-coded ReClose summaries (80%) led to an improvement in user click precision over Google summaries (66%). This in turn led to color-coded ReClose summaries resulting in fewer disappointments (24) compared to Google summaries (36). Improved precision and decreased disappointment will result in a better user experience.

A closer look at the survey results showed that both highlighting techniques, color-coding keywords and flagging divergent topic terms, were effective. Color-coding summaries is an effective way to enhance the summary information presented to users without increasing the screen space. We plan on making further improvements to the selection algorithm for flagged terms. We also plan to enhance summaries with the use of multimedia.

Figure 1: A top 10 search result for the query closeness centrality on Google (5/11/2011).

Figure 2: Web page at http://www.orgnet.com/sna.html (5/11/2011).

Figure 3: Process of color-coding query keywords.

Figure 5: Distribution of expected relevant content divided by clicked and skipped documents.

Table 1: Colors used to create the color scale.

Color Name       Frequency   R     G     B
Duke blue        63          0     26    87
Egyptian blue    30          16    52    166
deep sky blue    0           0     191   255

Table 2: Disappointment counts and percentages for three summary techniques.

Summary Source   Disappointment   Satisfied or Surprised   Total Summaries
Google           36 (34%)         69 (66%)                 105
Color-Coded      24 (23%)         81 (77%)                 105

Table 3: Click precision and recall comparison.

Approach      Click Precision   Click Recall
Google        66% (38/58)       60% (38/63)
ReClose       75% (39/52)       64% (39/61)
Color-Coded   80% (49/61)       70% (49/70)

ReClose summaries highlighted with bold reached a click precision of 75%. When users used Google summaries they clicked through to relevant web pages only about 2/3 of the time that they clicked. With more precise clicks, users using color-coded ReClose summaries also clicked on more of the relevant content, having a click recall score of 70%. Individuals using Google and bolded ReClose summaries skipped more relevant content, having recall scores of 60% and 64% respectively.

Table 4: Expected and actual relevant content for web pages with a query keyword count of 60+.

Summary       Content    ≤Para      Pages      Book
ReClose       Expected   10 (77%)   3 (23%)    0 (0%)
ReClose       Actual     4 (31%)    8 (62%)    1 (8%)
Color-Coded   Expected   10 (56%)   8 (44%)    0 (0%)
Color-Coded   Actual     5 (28%)    12 (67%)   1 (6%)

Table 5: Expected and actual relevant content for documents that would (Flaggable, F) and would not (Not Flaggable, NF) have summaries with flagged terms.

Summary       Group   Expected Relevant   Actual Relevant   Click Prec.
ReClose       NF      30/52 (58%)         37/52 (71%)       78%
ReClose       F       21/53 (40%)         24/53 (45%)       70%
Color-Coded   F       15/52 (29%)         25/52 (48%)       57%
Color-Coded   NF      44/53 (83%)         45/53 (85%)       87%

REFERENCES

[1] P. Baudisch, B. Lee, and L. Hanna. Fishnet, a fisheye web browser with search term popouts: a comparative evaluation with overview and linear view. In Proceedings of the working conference on Advanced visual interfaces, AVI '04, New York, NY, USA. ACM, 2004.

[2] D. Byrd. A scrollbar-based visualization for document navigation. In Proceedings of the fourth ACM conference on Digital libraries, DL '99, New York, NY, USA. ACM, 1999.

[3] L. C. Freeman. Centrality in social networks conceptual clarification. Social Networks, 1(3).

[4] Google. Changing your site's title and description in search results.

[5] J. Graham. The reader's helper: a personalized document reading environment. In Proceedings of the SIGCHI conference on Human factors in computing systems: the CHI is the limit, CHI '99, New York, NY, USA. ACM, 1999.

[6] M. Hemmje, C. Kunkel, and A. Willett. Lyberworld: a visualization user interface supporting fulltext retrieval. In Proceedings of the 17th Annual International Conference on Research and Development in Information Retrieval, ACM SIGIR, Berlin. Springer, 1994.

[7] K. Kaugars. Integrated multi scale text retrieval visualization. In CHI 98 conference summary on Human factors in computing systems, CHI '98, New York, NY, USA. ACM, 1998.

[8] G. Marchionini. Information seeking in electronic environments. Cambridge University Press, 1995.

[9] Microsoft. Anatomy of a bing caption.

[10] L. Schamber. Relevance and information behavior. Annual review of information science and technology (ARIST), 29, 1994.

[11] A. Veerasamy and N. J. Belkin. Evaluation of a tool for visualization of information retrieval results. In Proceedings of the 19th Annual International Conference on Research and Development in Information Retrieval, SIGIR, 1996.

[12] B. Wenerstrom and M. Kantardzic. ReClose: Web page summarization combining summary techniques. International Journal of Web Information Systems, 2011.

[13] Yahoo!. How to change a page title or description in yahoo! search results.