Companion 50 498 42

Cocitation 58 580 62

Netscape 40 364 29

Table 3: Summary of all answers for the algorithms

0: Page was not valuable/useful 1: Page was valuable/useful {: Page could not be accessed (i.e. did not exist, or server was down)

Please ignore the order in which the pages are returned. So if a later page contains similar

content to an earlier page please rate the latter page as if you had not seen the earlier page.

This will imply that we do not measure how \happy" you are with a set of answers returned by

an algorithm. Instead we measure how many valuable answers each algorithm gives.

4.2 User Study Results

We received responses rating the answer URLs for 59 of the input URLs. These 59 input URLs

form the basis of our study. Table 3 shows how many of these queries the algorithms answered

and how many answer URLs they returned. In many cases, the algorithms returned links that our

users rated as inaccessible. The column labeled shows the number of inaccessible

pages among all the answers for each algorithm. For the purposes of our evaluation, we treat an

inaccessible link as a score of '0', since inaccessible pages are not valuable/useful.

The Cocitation algorithm returned results for all but one of the URLs. The reason why it

returned results for almost all input URLs is that when insucient connectivity was found surrounding an input URL (e.g. a.com/X/Y), the Cocitation algorithm used a chopped URL as input

a.com/X). Although we did not include this chopping feature in our implementation of the

Companion algorithm, it is directly applicable and would enable the Companion algorithm to return answers for more URLs. We have empirically observed that Netscape's algorithm also applies

a similar chopping heuristic in some cases.

Table 4 contains a listing of the 59 URLs in our study. For each URL, the three columns labeled

, , and show the URLs for which the Companion, Cocitation, and Netscape algorithms

returned results, respectively. The table also shows the number of hyperlinks pointing to the URL

in the Connectivity Server ). For the Companion algorithm, it shows the number of nodes

and edges in the vicinity graph, as well as the wall clock time in milliseconds taken to compute the

set of related pages (computed by surrounding the computation with gettimeofday system calls).

For the cocitation algorithm, it shows the number of siblings found, the number of siblings cocited

at least twice ), and the wall clock time taken to compute the answers.

The three algorithms return answers for dierent subsets of our 59 URLs. To compare these

algorithms, we can subdivide the URLs into several groups. The group consists of those

URLs where all algorithms returned at least one answer. There were 37 URLs in this group. The

group consists of the URLs where Netscape's approach did not return any answers.

It consists of 19 URLs.

To quantify the performance of the three algorithms, we now dened two metrics. The

r for a given algorithm is the total number of answers receiving a '1' score within the rst r

answers, divided by r times the number of URLs. Notice that when an algorithm does not give

any answers for a URL, this is as if it gave all non-relevant answers for that URL.

8