Following Data - Life Out of Sequence

and “ls” (list, which provides a list of all the files and other directories in the current directory). On almost all computers, files are accessed by means of a hierarchical tree of directories, which can be navigated by traveling up and down the branches using the “cd” command. In keeping with the tree metaphor, the topmost directory is usually known as “root” (labeled “/” in Unix). However, almost anything useful in bioinformatics will require accessing files on other computers; these might be the computers of colleagues sitting next to you, they might be networked hard drives sitting in the corner of the lab or the basement of the building, or they might be servers at some distant location. In many cases, the physical location of the machine being accessed makes little or no difference and is unknown to the user. The most common way of connecting to another computer is to use “ssh” or “secure shell”: this is a system of communication that allows you to log into another computer remotely using a username and password. For example, to log into a computer called “tulip” at the EBI, I might type “ssh tulip.ebi.ac.uk.” Tulip will prompt me for a username and password. If I am authorized to access tulip, the command prompt will reappear; the screen and the prompt may look exactly the same as if I were using my own computer, and I can now navigate around the tulip machine using exactly the same commands.
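To make the navigation described above concrete, here is a minimal sketch of such a session. It assumes a Unix-like shell; the directory and file names (“project”, “sequences”, “notes.txt”, “sample.fasta”) are hypothetical and exist only for illustration, and the ssh line is left as a comment because it needs a real account (“username” is a placeholder).

```shell
# Build a small throwaway directory tree to explore.
# All names below are hypothetical, chosen only for illustration.
workdir=$(mktemp -d)
mkdir -p "$workdir/project/sequences"
touch "$workdir/project/notes.txt" "$workdir/project/sequences/sample.fasta"

cd "$workdir/project"   # travel down a branch into "project"
ls                      # list what is here: notes.txt and sequences
cd sequences            # go one level further down
ls                      # list what is here: sample.fasta
cd ..                   # travel back up one level
cd /                    # jump to the topmost directory, "root"
pwd                     # print the current location in the tree

# Logging into a remote machine is a single command; once connected,
# the same cd and ls commands apply there. Not executed in this sketch:
#   ssh username@tulip.ebi.ac.uk
```

After a successful login, the same “cd” and “ls” steps would walk the remote machine's directory tree rather than the local one.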

This sounds straightforward enough. However, access to such virtual spaces is highly regulated. At the EBI, access to all physical spaces is controlled by RFID cards: when I arrived, I was provided with an ID on the first day. Access to virtual spaces is limited by username and password combinations; by contrast, it took over a week to arrange my access to all the commonly used computers. Moreover, not all computers are directly accessible: sometimes a series of logins is required, ssh-ing from one's own computer to computer A and then from A to B. Some computers can be accessed only by programs (those usually used for intensive calculation), some are dedicated to particular kinds of data (databases, for instance), some are for everyday use, some for use by particular users or groups, some for long-term storage, some for backup, and some for hosting publicly accessible websites. Figure 4.3 gives a sense of the variety of machines involved and the complicated way in which they can be connected to one another. Bioinformaticians use metaphors of space in talking about how they move around these extended networks: tunnels, firewalls, routers, shells, and transfers all suggest a space in which both the user and the data can move around. As I learned how to log into various machines and find my way around the network, my questions would invariably be answered with
