Now Is The Time ForA Distributed Fish Tank. Market Research Into A Java-based Distributed Problem Solver With The Graphical Interface Of A Fish Tank.
|
|
|
Distributed computing is an important topic in the ever expanding world
of Computer Science. Most people are familiar with the increase in single processor speed, but fewer know about the vastly more
complex and potentially powerful universe of distributed computing.
Moore's Law,
named after Gordon Moore a founder of Intel, states that
chip performance doubles every 18 months. Rapid hardware
improvement has catapulted us into the world of personal computers and
desktop machines. This has caused, however, a loss of desire to explore other ways to harness the
power of the microchip; distributed computing is one of those alternatives.
With the extension of the high speed global network of the Internet to include household users,
this expanded user base not only makes distributed computing viable, but extremely powerful.
Our project is a Distributed Fish Tank. By this, we intend to create a Java application that looks visually like a fish tank. Fish will swim back and forth in the tank, but our special twist also allows fish to swim through a network to tanks running on other user's machines. By writing our program in Java and using Remote Method Invocation (RMI), we can quickly and easily create cross-platform software that is both technically interesting and visually stimulating. Additionally, through the use of a server, the fish can be assigned to solve a small portion of what was previously an intractably sized problem. By distributing the computation over a large number of client machines, while only using a tiny percentage of their unused processor power, we can quickly derive answers to highly complex problems "for free." Currently, there is no similar network applications on the market in either area, fish tank or easy-to-setup large problem solver. There are, however, precedence for each of these two applications, on a local one user scale. Some popular screen savers offer a fish tank module, and software exists for people to search through a state-space for solutions to problems. By piggy-backing on existing market penetration of those pre-existing systems and combining them with multi-user interaction and the Internet, we believe that our distributed fish tank will illicit a large following.
|
|
|
What Is Distributed Computing?
|
|
|
Distributed computing usually takes one of two major forms. The first
takes resources, such as databases and servers, and spreads them
out over a large network. Here the crux of the problem is to make
applications look identical to the clients and to manage the divided data
in an economical and efficient manner. The other type involves dividing
a problem into pieces for some massively parallel and usually high-speed
application to solve. Our project is closer to the latter in fashion,
but it also has strong ties to client-server computing.
In their introduction to the August 1991 special issue of Computer1 about Distributed Computing Systems, guest editors Mukesh Singhal and Thomas L. Casavant excerpt the five properties of a distributed computing system from a 1987 article by H. Enslow, Jr. titled "What is a 'Distributed' Data Processing System?" They are:
|
|
|
Why Is Our Project A Model Of Distributed Computing?
|
|
|
First, our major hardware resource is generic personal computers or
workstations. As of January 1997, there were over
16 million hosts on the Internet
according to Network Wizards, so
there is clearly a large pool of computers to draw from. Additionally, with over
six million of these hosts newly added over the previous year, current
trends suggest that this number will easily surpass 20 million in
the middle of 1997. With Java, almost all of these computers are
possible targets without regard to their physical location.
For a communication network, since all of those hosts are connected to the Internet, we have specifically restricted ourselves away from the total number of 150 million personal computers2 to only those who fulfill Enslow's second criteria. Additional network access points are being added to ensure low packet loss and minimal congestion, but since the majority of our computation is on the client machines, this is not a primary concern. Still, this prepares the Internet for further growth, which can only help our system. We can setup a Fish Server to act as a surrogate for the "high-speed operating system" because it will fulfill the described need to "unif[y] ... control of the distributed components." This server will administer the distribution of the problem to the fish and collect the computed results. Everything else, such as when and where the fish should perform their computation will be managed mostly by the fish themselves, due to the object oriented nature of our design. This simplifies problems from many standpoints, including design and scalability. RMI coupled with Java provides us with system transparency, Enslow's fourth criterion. We can configure our system to merely request the distribution of instances of fish objects from one tank to another. Tanks have an minimal unique identifier of an IP address and this is a perfect fit for RMI. If we need to introduce new kinds of fish, this will not be problematic, since they all inherit attributes from an abstract parent fish class. This will encapsulate, among other information, everything necessary for distribution. Therefore, since they all adhere to a standard, we can plan for the future without even knowing exactly how it will unfold. Finally, we have cooperative autonomy, a problem with is again solved with Java. All object and computer communication will be handled by the Java Interpreter without regard to the system. This only leaves the task of coordinating the collection and compilation of the data, which is something the Fish Server will handle. Since that is not unique to our specific class of problem or setup, there are preexisting algorithms and techniques, which we plan to use. By keeping the problems simple at the beginning, and by using object oriented approaches, we hope that we can manage this task in a logical manner.
|
|
|
Why Is 1997 A Breakthough Year For Distributed Computing?
|
|
|
While our application is clearly a good model for a distributed system,
it is also equally important to show that the market is poised for an
explosion in distributed computing. Distributed systems have
many applications; serious ones like code cracking, productive ones
like groupware, even frivolous ones like chess playing. Their largest
upside is that it allows processor power to expand at an exponential
rate. If people want a problem solved faster, they do not need to write
more efficient code, design a faster chip, or wait until the "next
release." They can just hook up more machines and -- bingo --
better computer. With TCP/IP and fast network connections, the
machines do not even need to be located in the same country, to say
nothing of not needing to be in the same room.
So, how come every program is not networked? After all, with those clear enhancements, why not to take advantage of them? Well, there are downsides to distributed computing, the largest of which is complexity. When portions of a program are running in many places at once, there tend to be lots of details that need to be accounted for, so things do not get confused. Likewise, data needs to be synchronized and the new data merged with the old. Also, the additional overhead of writing code necessary for inter-computer communication is a hassle all by itself. Finally, until the recent outbreak of the World Wide Web, non-power computer users tended not to have any network connections to the outside world, and if they did it was both slow speed and temporary.3 This made it difficult for program and data to be distributed and for computed information to be returned. It is only in recent years, if not months, that cable modems, ISDN lines, T-1s, and T-3s have allowed people to have "direct connections" to the Internet from their bedrooms. This last step, however, is the catalyst for distributed computing becoming ubiquitous. Personal computers have achieved market penetrations to such an extent that they now outsell even televisions in the United States; until now, however, they were virtual islands. With high speed modems now affordably priced and in widespread use, people can and will link up their computers to be available for distributed computing.
|
|
|
Why Are Java And RMI Important For Distributed Computing And This Project?
|
|
|
With the hardware in place, we must ensure that the software can
assume its equal position in this equation. This is where Java, the
language we are using for our system, has a role. Java is a
language from Sun Microsystems that is especially
suited for distributed problems, since it is platform independent. It is both
compiled and interpreted. The compiler produces "Java bytecodes,"
those bytecodes are then run through Java interpreters on any
platform. By using a meta-format, Java is not restricted to a single
platform.
Additionally, Java is object-oriented. This allows us to write code to produce fish. When we need new fish, we can merely create additional instances of our fish. The use of the term "fish" here is in the objective sense; as mentioned, a "fish" is a generic specification for how specific types of fish should act. Just as one can not go to the supermarket and buy "a fish," our tanks do not have "fish." They could have "bass" or "salmon" or even "rainbow trout," but not "fish." Each species of fish might act similarly or differently than its fellow aquatic creatures, this will vary on a object by object basis. Some differences might by solely visual, others might vary "behind the scenes." Nevertheless, they will all belong to the same parent, so we can count on their behavior; to carry the metaphor further, they will all have "fins" and "gills." Since fish are now objects, they can be actually passed from one machine to another. We will use RMI to deal with the actual fish shipping process. RMI is an abstraction above sockets that allows a Java programmer to create, quickly, inter-machine interactions. It is akin to Remote Procedure Calls (RPC) but for objects. While not as detailed and full-featured as Common Object Request Broker Architecture (CORBA), in a Java only environment, RMI is sufficiently easier and less complicated to act as a more than viable alternative to the sometimes unwieldy CORBA syntax and protocol.
|
|
Through the use of Java, our fish tank application can run cross-platform
in a Java Virtual Machine without needing to write even one line of
system specific code. Furthermore, as these virtual machines become
increasingly powerful and desktop chips faster, we automatically
gain an increase in performance for nothing, without even needing to
incorporate more efficient algorithms. With all the current
industry hype about Java, we know that many companies are
pursuing this objective. RMI is similarly cross-platform. Java's beta version of RMI is
currently being incorporated into Java 1.1, soon to be available on all platforms. It is
already available on both
the Sun Solaris and Microsoft Windows 95/NT operating system
environments. RMI, too, is being refined for increased efficiency,
although stability is more important than efficiency for us in our
application and the beta version is already highly stable.
Java and RMI will also serve as the mechanism for our high-level operating system and system transparency. All necessary servers will be written in Java; therefore, if we need to change the location of these servers we are not even bound to one specific operating platform. RMI, by hiding socket interactions from the programmer and user, allows the tanks and servers to be called directly and easily by any machine on our network. By design, our fish tanks will be almost completely autonomous units. They will only need to interact with other tanks when they wish to send or receive fish. Server interaction is minimized to starting and stopping the client and when the fish has finished its section of the state-space problem. Otherwise, no additional interaction is necessary from any outside resource. This allows for growth without overwhelming a server and makes the system far more robust.
|
|
|
What Problems Can Be Solved Using Distributed Computing?
|
|
|
Now that this global setup has occurred, it is important to examine
possible problems that our system will be well-suited to solve. There
are many, of both theoretical and practical purposes.
One popular use, which has been receiving lots of press lately, is decoding ciphers and factoring numbers. On January 28, 1997, RSA announced a $10,000 contest called The RSA Data Security Secret-Key Challenge. Less than three weeks later, two of the prizes have already been claimed for the 40 and 48-bit keys and the same group that solved the 48 bit key is also planning to attack a 56-bit DES key. Only 312 hours of processing time was needed to solve the 48-bit key using the spare cycles of up to 3,520 computers. While these numbers might seem large, consider this: every Columbia student has an Ethernet jack in their dorm room; if they all had computers, this would be about the same amount of computers used to crack the 48-bit key. The majority of these computers are unused for most of each day, with a coordinated effort, it would be trivial for someone to be able to replicate this on a regular basis. Now, what if this was extended to a large land-grant public university, like Berkeley or the University of Michigan, where the number of students and computers is an order of magnitude larger? Likewise, game playing is another popular use of distributed computing. Complex games like Chess and Go have enormously large state-spaces, which are impossible to fully explore under today's technology. Nevertheless, corporations like IBM are trying with machines like their massively parallel super computer, Deep Blue. Our fish could also be used to "play games." Each branch of the state-space would be distributed to different fish and as the server runs the alpha-beta routine, new fish would be instructed to investigate different possible outcomes. IBM also gives many applications of its technology in "real world" instances, such as "figuring the best way to schedule 570 planes of 25 different types to 150 destinations for best passenger revenue and most efficient fueling, maintenance, crew deployment, and turnaround servicing." Chess, or other game playing, also serves as an easy benchmark of progress. There are also some possible applications in the world of pure mathematics. Sometimes we might want to verify that a set of elements fulfill certain properties, like forming a group or a field. If a formula or other proof cannot be found, a brute-force method might be the easiest way of physically checking these attributes. Another goal might be to find large prime numbers, something for which no explicit algorithm is currently known. There is currently a large Internet movement, called The Great Internet Mersenne Prime Search to find Mersenne prime numbers. Mersenne primes are of the form 2p-1, where p is a prime number. Last November, the 35th Mersenne prime, 21,398,269-1, was found by Joel Armengaud as part of the project; it has 420,921 digits and a text file containing it takes up 411k. It is especially interesting to note that all of the discovers of the previous 14 largest prime numbers now work of Silicon Graphics, a manufacturer of supercomputers and found their primes on similarly high-octane machines. Joel Armengaud used a 90 Mhz Pentium. Clearly, as more fish tanks come on-line, we will not only have available more processor power, but also a logical increase in the numbers of fish used for computation. By use of threads, each fish will encapsulate a small segment of the search-space. As the number of fish increase, the search-space will be divided up faster and the problem solved quicker.
|
|
|
Why Is Our Fish Tank Especially Attractive To Users?
|
|
|
The final and most attractive feature of our Distributed Fish Tank is the fish tank
itself. By building on the already existing fish tank market spearheaded
by Berkeley System's After Dark Screen Saver,
we know that people
already find the idea of having a computer fish tank attractive. In
fact, After Dark, the number one selling screen saver on the market with
a 65% to 70% share, has a fish tank as one of its premier screen saver modules.
In fact, about 4.5 million people use screen savers already, clearly
showing the existing popularity of this product line.
While merely creating another fish tank will not cause people to switch from their existing screen savers and fish tanks, the additional networking options in ours should. We will implement interactive features that people will find exciting to watch and fun to play with. They include, at a minimum, the ability to "attach" a message on a fish and have it show up when that fish moves to other tanks. Another possible option would be to allow people to store a local directory of friends and allow them to send fish and messages directly to their friends, in an email-like fashion. While our initial version of the fish tank will not allow users to modify the characteristics and personalities of the fish, later versions might have a fish editor, where people could make their own unique fish. Due to limitations in RMI, we are still working on technical details of allowing people to use their own GIF images with their new breed of fish. However, we are planning to build in from the beginning ways to vary the attributes of the fish. For example, certain fish may move quicker across the screen, tend to change directions more frequently, be more "afraid" of other fish, and so on. By basing our fish design around this idea at start, we hope that creating "new" fish with unique behavior will be nothing more than calling a more generic constructor with different values. This personalization should further enhance our fish tank's appeal. So, what does this all mean? It is clear that the time is right for our software system. It combines the newest and trendiest of technologies, by using Java, an Object Oriented language, and the Internet. It also captures the attention and imagination of people because the fish are interactive and allow users to communicate with others, something that people clearly enjoy. Finally, there is the excitement generated by the distributed problem solver. Who would not want the thrill of working collaboratively on a discovery never before reached by mankind? By combining these three attributes in an unique and exciting fashion, our Networked Fish Tank is a sure success.
|
|
|
Bibliography
|
|
|
1 Mukesh Singhal and Thomas L. Casavant, "Distributed Computing Systems." Computer, Vol 24., No 8. August 1991, pp. 12-15. 2 Michael Meyer, "Chipping at Intel." Newsweek, February 21, 1994, p. 70. 3 One of the authors remembers using a 1200 baud Hayes modem on a 8088 IBM PC to connect to CompuServe to play "Hunt The Wumpus."
|
|