Massively Large Networks and Telecommunications

In this segment, we discuss massively large networks/graphs, consisting of nodes (or vertices) and links (or edges). The first example concerns the World Wide Web, whereas the second, involves telephone calls handled by AT&T.

The Diameter of the Web

Although the number of pages on the World Wide Web exceeds 800 million, according to the researchers: Albert, Jeong, and Barabasi (see the article in Nature 401 (1999), pp. 130-131), the diameter of the World Wide Web is 19. This number represents the number of links, on average, that separate any two pages on the web and also represents, given an "origin" page (or node), the number of clicks via the hypertext links to reach a desired "destination" page. In order to arrive at the number 19, the researchers did not visit every node or page on the web and traverse every link; rather, they utilized an intelligent agent in the form of a software "robot" to follow all links on an origin page, then all the links on each page from that one, and so on, and focused on their institution's (Notre Dame University) Internet domain. Information was obtained on 325,729 documents and 1,469,680 links (approximately .3 percent of the web in 1999) and then the probability that a given node has both incoming and outgoing links was calculated (and found to follow a "power law"). These results were then extrapolated to the entire web, which yielded the number 19. This number provides a measure of the average length of the path from one page to another.
 

Call Graphs at AT&T

Another example of a massively large network or graph is a call graph, which can be obtained from telephone billing records and represents the calls (within a certain period) between phone numbers. Hence, the nodes or vertices are the telephone numbers, whereas the links or edges are the calls that have taken place within that period. Note that there may be multiple calls or links connecting two nodes if multiple calls have been made between those two numbers. J. M. Abello of the AT&T Shannon Laboratories in New Jersey studied the evolution of a call graph over a period of days and in a particular 20 day period, during which the graph grew to 290 million nodes and 4 billion links!

The team of J. M. Abello, P. M. Pardalos and M. G. C. Resende then analyzed a one-day call graph consisting of 53,767,087 nodes and 170 million links. It was found to be not a connected graph (in which every node would be joined to every other node through some sequence of links) yet, fascinatingly, it had one immense connected component consisting of 44,989,297 nodes and representing more than 80% of the entire network. The diameter of this component was 20, which means that any telephone (as represented by the phone number) in that component could be linked with any other one through a sequence of no more than 20 calls. These researchers then explored this call graph for additional structures and found that there were more than 14,000 "cliques" of 30 members, that is, distinct groups of 30 individuals in which everyone communicated or talked with everyone else at least once during that day!

For additional info, see: J. Abello, P. M. Pardalos, and M. G. C. Resende, "On Maximum Clique Problems in Very Large Graphs," in External Memory Algorithms (J. Abello and J. Vitter, editors), AMS-DIMACS Series on Discrete Mathematics and Theoretical Computer Science, vol. 50.
For more information on very large graphs or networks, see the article in the American Scientist: http://www.sigmaxi.org/amsci/issues/comsci00/compsci2000-01.html, which also served as a resource for this writeup.
For additional articles on large graphs and related topics see Pardalos' site http://www.ise.ufl.edu/pardalos