Guide to Parallel Operating Systems Windows 10 Ch7 Review
7.2 Operating System Support for Parallelism
Although parallel programs can be quite complex, many applications can be made parallel in a simple way to take advantage of the power of Beowulf clusters. In this section we describe how to write simple programs using features of the Linux operating system that you are probably already familiar with. We begin with a discussion of processes themselves (the primary unit of parallelism) and the ways they can be created in Unix environments such as Linux. A good reference on this material is [111].
7.2.1 Programs and Processes
First we review terminology. A program is a set of computer instructions. A computer fetches from its memory the instruction at the address contained in its program counter and executes that instruction. Execution of the instruction sets the program counter for the next instruction. This is the basic von Neumann model of computation. A process consists of a program, an area of computer memory called an address space, and a program counter. (If there are multiple program counters for a single address space, the process is called a multithreaded process.) Processes are isolated from one another in the sense that no single instruction from the program associated with one process can access the address space of another process. Data can be moved from the address space of one process to that of another process by methods that we will describe in this and succeeding chapters. For the sake of simplicity, we will discuss single-threaded processes here, so we may think of a process as an (address space, program, program counter) triple.
7.2.2 Local Processes
Where do processes come from? In Unix-based operating systems such as Linux, new processes are created by the fork system call. This is an efficient and lightweight mechanism that duplicates the process by copying the address space and creating a new process with the same program. The only difference between the process that executed the fork (called the parent process) and the new process (called the child process) is that the fork call returns 0 in the child and the child's process id in the parent. Based on this different return code from fork, the parent and child processes, now executing independently, can do different things.
One thing the child process often does is an exec system call. This call changes the program for the process, sets the program counter to the beginning of the program, and reinitializes the address space. The fork-exec combination, therefore, is the mechanism by which a process creates a new, completely different one. The new process is executing on the same machine and competing for CPU cycles with the original process through the process scheduler in the machine's operating system.
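The fork-exec combination can be tried directly from Python, whose os module wraps the same Unix system calls. The following is a minimal sketch, assuming a Unix system and Python 3; the function name fork_exec and the exit code 127 for a failed exec are choices made for this illustration.

```python
import os

def fork_exec(argv):
    """Run argv as a new process via fork + exec; return its exit status."""
    pid = os.fork()                  # duplicate the calling process
    if pid == 0:
        # Child: fork returned 0. Replace this process's program with argv[0].
        try:
            os.execvp(argv[0], argv)
        finally:
            # Reached only if exec failed; exit without running parent cleanup.
            os._exit(127)
    # Parent: fork returned the child's process id. Wait for it, as a shell does.
    _, status = os.waitpid(pid, 0)
    if os.WIFEXITED(status):
        return os.WEXITSTATUS(status)
    return -os.WTERMSIG(status)      # child was terminated by a signal
```

Calling fork_exec(["ls", "/tmp"]) runs ls as a child process and returns its exit status to the parent once the child has finished.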
You have experienced this mechanism many times. When you are logged into a Unix system, you are interacting with a shell, which is just a normal Unix process that prompts you, reads your input commands, and processes them. The default program for this process is /bin/bash; but depending on the shell specified for your user name in '/etc/passwd', you may be using another shell. Whenever you run a Unix command, such as grep, the shell forks and execs the program associated with the command. The command ps shows you all the processes you are running under the current shell, including the ps process itself (strictly speaking, the process executing the ps program).
Normally, when you execute a command from the shell, the shell process waits for the child process to complete before prompting you for another command, so that only one process of yours at a time is actually executing. By "executing" we mean that it is in the list of processes that the operating system will schedule for execution according to its time-slicing algorithm. If your machine has only one CPU, of course only one instruction from one process can be executing at a time. By time-slicing the CPU among processes, however, the illusion of simultaneously executing processes on a single machine, even a single CPU, is presented.
The easiest way to cause multiple processes to be scheduled for execution at the same time is to append the '&' character to a command that you execute in the shell. When you do this, the shell starts the new process (using the fork-exec mechanism) but then immediately prompts for another command without waiting for the new one to complete. This is called "running a process in the background." Multiple background processes can be executing at the same time. This situation provides us with our first example of parallel processes.
To determine whether a file contains a specific string, you can use the Unix command grep. To look in a directory containing mail files in order to find a message about the Boyer-Moore string-matching algorithm, you can cd to that directory and do
    grep Boyer *
If your mail is divided into directories by year, you can consider searching all those directories in parallel. You can use background processes to do this search in a shell script:
    #! /bin/bash
    echo searching for $1
    for i in 20* ; do ( cd $i; grep $1 * > $1.out ) & done
    wait
    cat 20*/$1.out > $1.all
and invoke this with Boyer as an argument.
This simple parallel program matches our definition of a manager/worker algorithm, in which the master process executes this script and the worker processes execute grep. We can compare its properties with the list in Section 7.1:
- The subtasks, each of which is to run grep over all the files in one directory, are independent.
- The workers are started by this shell script, which acts as the master.
- The subtask specifications (arguments to grep) are communicated to the workers on their respective command lines.
- The results are written to the file system, one result file in each directory.
- The wait causes the shell script to wait for all background processes to finish, so that the results can be collected by the manager (using cat) into one place.
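The same manager/worker structure can be sketched in Python. This is an illustration under stated assumptions rather than a translation of the script: the function name parallel_grep is invented here, the workers return their results through pipes instead of result files, and a grep that accepts the -H and -e options is assumed to be installed.

```python
import subprocess
from pathlib import Path

def parallel_grep(pattern, directories):
    """Manager: run one grep worker per directory, then gather all matches."""
    workers = []
    for directory in directories:
        files = sorted(str(p) for p in Path(directory).iterdir() if p.is_file())
        if not files:
            continue
        # Popen returns at once, like '&' in the shell; the worker keeps running.
        # -H prefixes each match with its file name even for a single file.
        workers.append(subprocess.Popen(
            ["grep", "-H", "-e", pattern, *files],
            stdout=subprocess.PIPE, text=True))
    matches = []
    for worker in workers:
        # communicate() reads the worker's output and waits for it to exit,
        # playing the role of 'wait' followed by 'cat' in the shell script.
        output, _ = worker.communicate()
        matches.extend(output.splitlines())
    return matches
```

All workers are started before any result is collected, so the searches over the different directories overlap in time.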
One can make a few further observations about this example:
- The first line of the script tells the system which program to use to interpret the script. Here we have used the default shell for Linux systems, called bash. Other shells may be installed on your system, such as csh, tcsh, or zsh. Each of these has a slightly different syntax and different advanced features, but for the most part they provide the same basic functionality.
- We could have made the size of the subtask smaller by running each invocation of grep on a single file. This would have led to more parallelism, but it is of dubious value on a single machine, and we would have been creating potentially thousands of processes at once.
- We could time this script by putting date commands at the beginning and end, or by running it under the shell's time command:
    time grepmail boyer
  where grepmail is the name of this script and boyer is the argument.
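Wall-clock timing of a command can also be done from a program. The sketch below assumes Python 3; the function name time_command is invented for this illustration, and it reports only elapsed time, not the user and system CPU times that the shell's time command also prints.

```python
import subprocess
import time

def time_command(argv):
    """Run argv to completion and return (exit status, elapsed seconds)."""
    start = time.perf_counter()
    completed = subprocess.run(argv)     # waits for the child to finish
    elapsed = time.perf_counter() - start
    return completed.returncode, elapsed
```

For example, time_command(["./grepmail", "boyer"]) would time the script above, assuming it is saved as an executable file named grepmail in the current directory.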
7.2.3 Remote Processes
Recall that the way a process is created on a Unix system is with the fork mechanism. Only one process is not forked by another process, namely the single init process that is the root of the tree of all processes running at any one time.
Thus, if we want to create a new process on another machine, we must contact some existing process and cause it to fork our new process for us. There are many ways to do this, but all of them use this same basic mechanism. They differ only in which program they contact to make a fork request to. The contact is usually made over a TCP socket. We describe sockets in detail in Section 7.2.5.
rsh
The rsh command contacts the rshd process if it is running on the remote machine and asks it to execute a program or script. To see the contents of the '/tmp' directory on the machine foo.bar.edu, you would do
    rsh foo.bar.edu ls /tmp
The standard input and output of the remote command are routed through the standard input and output of the rsh command, so that the output of the ls comes back to the user on the local machine. Chapter 5 describes how to set up rsh on your cluster.
ssh
The ssh (secure shell) program behaves much like rsh but has a more secure authentication mechanism based on public key encryption and encrypts all traffic between the local and remote host. It is now the most commonly used mechanism for starting remote processes. However, rsh is substantially faster than ssh, and is used when security is not a critical issue. A common example of this situation occurs when the cluster is behind a firewall and rsh is enabled just within the cluster. Setting up ssh is described in Chapter 5, and a book on ssh has recently appeared [11].
Here is a simple example. Suppose that we have a file called 'hosts' with the names of all the hosts in our cluster. We want to run a command (in parallel) on all those hosts. We can do so with a simple shell script as follows:
    #! /bin/bash
    for i in $(cat hosts) ; do (ssh -x $i hostname & ) ; done
If everything is working correctly and ssh has been configured so that it does not require a password on every invocation, then we should get back the names of the hosts in our cluster, although not necessarily in the same order as they appear in the file.
(What is that -x doing there? In this example, since the remotely executed program (hostname) does not use any X windowing facilities, we turn off X forwarding by using the -x option. To run a program that does use X, the X option must be turned on by the sshd server at each remote machine and the user should set the DISPLAY environment variable. Then, the connection to the X display is automatically forwarded in such a way that any X programs started from the shell will go through the encrypted channel, and the connection to the real X server will be made from the local machine. We note that if you run several X programs at several different hosts, they will each create a file named '.Xauthority' in your home directory on each of the machines. If the machines all have the same home directory, for example mounted via NFS, the '.Xauthority' files will conflict with each other.)
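The run-everywhere pattern can also be written in Python, which makes it easy to keep each host's output separate. This is a sketch under assumptions: the function names and the launcher parameter are invented for this illustration, and password-free ssh access to every host is assumed.

```python
import subprocess

def read_hosts(path="hosts"):
    """Read one host name per line, skipping blank lines."""
    with open(path) as f:
        return [line.strip() for line in f if line.strip()]

def run_on_hosts(hosts, command, launcher=("ssh", "-x")):
    """Start `command` on every host at once; return {host: output}."""
    procs = {}
    for host in hosts:
        # One local launcher process per host; each contacts the remote daemon.
        procs[host] = subprocess.Popen(
            [*launcher, host, *command],
            stdout=subprocess.PIPE, text=True)
    # Collect output only after every launcher has been started.
    return {host: proc.communicate()[0].strip()
            for host, proc in procs.items()}
```

A call such as run_on_hosts(read_hosts(), ["hostname"]) corresponds to the shell script above. Because the launcher is a parameter, rsh could be substituted by passing launcher=("rsh",).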
Other Process Managers
Programs such as the ones rsh and ssh contact to fork processes on their behalf are often called daemons. These processes are started when the system is booted and run forever, waiting for connections. You can see whether the ssh daemon is installed and running on a particular host by logging into that host and doing ps auxw | grep sshd. Other daemons, either run as root by the system or run by a particular user, can be used to start processes. Two examples are the daemons used to start processes in resource managers and the mpd's that can be used to start MPI jobs quickly (see Chapter 8).
7.2.4 Files
Having discussed how processes are started, we next turn to the topic of remote files, files that are local to a remote machine. Often we need to move files from one host to another, to prepare for remote execution, to communicate results, or even to notify remote processes of events.
Moving files is not always necessary, of course. On some clusters, certain file systems are accessible on all the hosts through a system like NFS (Network File System) or PVFS (Parallel Virtual File System). (Chapter 19 describes PVFS in detail.) However, direct remote access can sometimes be slower than local access. In this section we discuss some mechanisms for moving files from one host to another, on the assumption that the programs and at least some of the files they use are desired to be staged to a local file system on each host, such as '/tmp'.
rcp
The simplest mechanism is the remote copy command rcp. It has the same syntax as the standard local file copy command cp but can accept user name and host information from the file name arguments. For example,
    rcp thisfile jeeves.uw.edu:/home/jones/thatfile
copies a local file to a specific location on the host specified by the prefix before the ':'. A remote user can also be added:
    rcp smith@jeeves.uw.edu:/home/jones/thatfile .
The rcp command uses the same authentication mechanism as rsh does, so it will either ask for a password or not depending on how rsh has been set up. Indeed, rcp can be thought of as a companion program to rsh. The rcp command can handle "third party" transfers, in which neither the source nor destination file is on the local machine.
scp
Just as ssh is replacing rsh for security reasons, scp is replacing rcp. The scp command is the ssh version of rcp and has a number of other convenient features, such as a progress indicator, which is handy when large files are being transferred.
The syntax of scp is similar to that for rcp. For example,
    scp jones@fronk.cs.jx.edu:bazz .
will log in to machine fronk.cs.jx.edu as user jones (prompting for a password for jones if necessary) and then copy the file 'bazz' in user jones's home directory to the file 'bazz' in the current directory on the local machine.
ftp and sftp
Both ftp and sftp are interactive programs, usually used to browse directories and transfer files from "very" remote hosts rather than within a cluster. If you are not already familiar with ftp, the man page will teach you how to work this basic program. The sftp program is the more secure, ssh-based version of ftp.
rdist
One can use rdist to maintain identical copies of a set of files across a set of hosts. A flexible 'distfile' controls exactly what files are updated. This is a useful utility when one wants to update a master copy and then have the changes reflected in local copies on other hosts in a cluster. Either rsh-style (the default) or ssh-style security can be specified.
rsync
An efficient replacement for rcp is rsync, particularly when an earlier version of a file or directory to be copied already exists on the remote machine. The idea is to detect the differences between the files and then only transfer the differences over the network. This is especially effective for backing up large directory trees; the whole directory is specified in the command, but only (portions of) the changed files are actually copied.
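The benefit of copying only what has changed can be illustrated with a much cruder stand-in. The sketch below is not rsync's algorithm: it compares whole files by checksum between two local directories and copies a file only when the destination copy is missing or different, whereas rsync works across the network and can transfer just the changed portions of a file. The function names are invented for this illustration.

```python
import hashlib
import shutil
from pathlib import Path

def file_digest(path):
    """Return the SHA-256 digest of a file's contents."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()

def stage_changed(src_dir, dst_dir):
    """Copy files from src_dir to dst_dir only if missing or different."""
    src_dir, dst_dir = Path(src_dir), Path(dst_dir)
    copied = []
    for src in sorted(p for p in src_dir.rglob("*") if p.is_file()):
        rel = src.relative_to(src_dir)
        dst = dst_dir / rel
        if dst.is_file() and file_digest(dst) == file_digest(src):
            continue                     # unchanged: nothing to transfer
        dst.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dst)
        copied.append(str(rel))
    return copied
```

Running stage_changed twice in a row on the same pair of directories copies nothing the second time, which is the property that makes repeated staging of a large directory tree cheap.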
7.2.5 Interprocess Communication with Sockets
The most common and flexible way for two processes on different hosts in a cluster to communicate is through sockets. A socket between two processes is a bidirectional channel that is accessed by the processes using the same read and write functions that processes use for file I/O. In this section we show how a process connects to another process, establishing a socket, and then uses it for communication. An excellent reference for the deep topic of sockets and TCP/IP in general is [111]. Here we just scratch the surface, but the examples we present here should enable you to write some useful programs using sockets. Since sockets are typically accessed from programming and scripting languages, we give examples in C, Perl, and Python, all of which are common languages for programming clusters.
Although once a socket is established, it is symmetric in the sense that communication is bidirectional, the initial setup process is asymmetric: one process connects; the other one "listens" for a connection and then accepts it. Because this situation occurs in many client/server applications, we call the process that waits for a connection the server and the process that connects to it the client, even though they may play different roles after the socket has been established.
We present essentially the same example in three languages. In the example, the server runs forever in the background, waiting for a socket connection. It advertises its location by announcing its host and "port" (more on ports below), on which it can be contacted. Then any client program that knows the host and port can set up a connection with the server. In our simple example, when the server gets the connection request, it accepts the request, reads and processes the message that the client sends it, and then sends a reply.
Client and Server in C
The server is shown in Figure 7.2. Let us walk through this example, which may appear more complex than it really is. Most of the complexity surrounds the sockaddr_in data structure, which is used for two-way communication with the kernel.
    #include <stdio.h>
    #include <strings.h>      /* bzero */
    #include <unistd.h>       /* read, write, close */
    #include <sys/types.h>
    #include <sys/socket.h>
    #include <netinet/in.h>
    int main(int argc, char *argv[])
    {
        int n, listen_socket, talk_socket;
        socklen_t len;
        char buf[1024];
        struct sockaddr_in sin, from;
        listen_socket = socket(AF_INET, SOCK_STREAM, 0);
        bzero(&sin, sizeof(sin));
        sin.sin_family = AF_INET;
        sin.sin_addr.s_addr = INADDR_ANY;
        sin.sin_port = htons(0);          /* let the system choose a port */
        bind(listen_socket, (struct sockaddr *) &sin, sizeof(sin));
        listen(listen_socket, 5);
        len = sizeof(sin);                /* in: buffer size; out: address size */
        getsockname(listen_socket, (struct sockaddr *) &sin, &len);
        printf("listening on port = %d\n", ntohs(sin.sin_port));
        while (1) {
            len = sizeof(from);
            talk_socket = accept(listen_socket, (struct sockaddr *) &from, &len);
            n = read(talk_socket, buf, 1024);
            write(talk_socket, buf, n);   /* echo */
            close(talk_socket);
        }
    }
Figure 7.2: A simple server in C
First, we acquire a socket with the socket system call. Note that we use the word "socket" both for the connection between the two processes, as we have used it up to now, and for a single "end" of the socket as it appears inside a program, as here. Here a socket is a small integer, a file descriptor just like the ones used to represent open files. Our call creates an Internet (AF_INET) stream (SOCK_STREAM) socket, which is how one specifies a TCP socket. (The third argument is relevant only to "raw" sockets, which we are not interested in here. It is usually set to zero.) This is our "listening socket," on which we will receive connection requests. We then initialize the sockaddr_in data structure, setting its field sin_port to 0 to indicate that we want the system to select a port for us. A port is an operating system resource that can be made visible to other hosts on the network. We bind our listening socket to this port with the bind system call and notify the kernel that we wish it to accept incoming connections from clients on this port with the listen call. (The second argument to listen is the number of queued connection requests we want the kernel to maintain for us. In most Unix systems this will be 5.) At this point clients can connect to this port but not yet to our actual server process. Also, at this point no one knows what port we have been assigned.
We now publish the address of the port on which we can be contacted. Many standard daemons listen on "well known" ports, but we have not asked for a specific port, so our listening socket has been assigned a port number that no one yet knows. We ourselves find out what it is with the getsockname system call and, in this case, just print it on stdout.
At this point we enter an infinite loop, waiting for connections. The accept system call blocks until there is a connection request from a client. Then it returns a new socket on which we are connected to the client, so that it can continue listening on the original socket. Our server simply reads some data from the client on the new socket (talk_socket), echoes it back to the client, closes the new socket, and goes back to listening for another connection.
This example is extremely simple. We have not checked for failures of any kind (by checking the return codes from our system calls), and of course our server does not provide much service. However, this example does illustrate how to code a common sequence of system calls (the socket-bind-listen sequence) that is used in nearly all socket setup code.
The corresponding client is shown in Figure 7.3. In order to connect to the server, it must know the name of the host where the server is running and the number of the port on which it is listening. We supply these here as command-line arguments.
    #include <stdio.h>
    #include <stdlib.h>       /* atoi */
    #include <string.h>       /* strlen */
    #include <strings.h>      /* bzero, bcopy */
    #include <unistd.h>       /* read, write */
    #include <sys/types.h>
    #include <sys/socket.h>
    #include <netdb.h>
    #include <netinet/in.h>
    int main(int argc, char *argv[])
    {
        int n, talk_socket;
        char buf[1024] = "test msg";
        struct sockaddr_in sin;
        struct hostent *hp;
        talk_socket = socket(AF_INET, SOCK_STREAM, 0);
        hp = gethostbyname(argv[1]);             /* host name of the server */
        bzero((void *) &sin, sizeof(sin));
        bcopy((void *) hp->h_addr, (void *) &sin.sin_addr, hp->h_length);
        sin.sin_family = hp->h_addrtype;
        sin.sin_port = htons(atoi(argv[2]));     /* port printed by the server */
        connect(talk_socket, (struct sockaddr *) &sin, sizeof(sin));
        n = write(talk_socket, buf, strlen(buf) + 1);
        buf[0] = '\0';                           /* empty the buffer */
        n = read(talk_socket, buf, 1024);
        printf("received from server: %s \n", buf);
        return 0;
    }
Figure 7.3: A simple client in C
Again we acquire a socket with the socket system call. We then fill in the sockaddr_in structure with the host and port (first calling gethostbyname to fill in the hostent structure needed to be placed in sin). Next we call connect to create the socket. When connect returns, the accept has taken place in the server, and we can write to and read from the socket as a way of communicating with the server. Here we send a message and read a response, which we print.
Client and Server in Python
Python is an object-oriented scripting language. Implementations exist for Unix and Windows; see www.python.org for details. It provides an extensive set of modules for interfacing with the operating system. One interesting feature of Python is that the block structure of the code is given by the indentation of the code, rather than explicit "begin"/"end" or enclosing braces.
Much of the complexity of dealing with sockets has to do with properly managing the sockaddr data structure. Higher-level languages like Python and Perl make socket programming more convenient by hiding this data structure. A number of good books on Python exist that include details of the socket module; see, for example, [14] and [70]. Python uses an exception handling model (not illustrated here) for error conditions, leading to very clean code that does not ignore errors. The Python version of the server code is shown in Figure 7.4. Here we use the "well-known port" approach: rather than ask for a port, we specify the one we want to use. One can see the same socket-bind-listen sequence as in the C example, where now a socket object (s) is returned by the socket call and bind, listen, and accept are methods belonging to the socket object. The accept method returns two objects, a socket (conn) and information (addr) on the host and port on the other (connecting) end of the socket. The methods send and recv are methods on the socket object conn, and so this server accomplishes the same thing as the one in Figure 7.2.
    #!/usr/bin/env python3
    # Echo server program
    from socket import *
    HOST = ''        # symbolic name for local host
    PORT = 50007     # arbitrary port
    s = socket(AF_INET, SOCK_STREAM)
    s.bind((HOST, PORT))
    s.listen(1)
    conn, addr = s.accept()
    print('connected to by', addr)
    while 1:
        data = conn.recv(1024)
        if not data:
            break
        conn.send(data)
    conn.close()
Figure 7.4: A simple server in Python
The Python code for the corresponding client is shown in Figure 7.5. It has simply hard-coded the well-known location of the server.
    #!/usr/bin/env python3
    # Echo client program
    from socket import *
    HOST = 'donner.mcs.anl.gov'    # the remote host
    PORT = 50007
    s = socket(AF_INET, SOCK_STREAM)
    s.connect((HOST, PORT))
    s.send(b'Hello, world')        # sockets carry bytes, not text
    data = s.recv(1024)
    s.close()
    print('Received', repr(data))
Figure 7.5: A simple client in Python
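The exception handling model mentioned above, which the figures leave out, might look as follows in a client. This is a minimal sketch assuming Python 3; the function name echo_once and the two-second timeout are choices made for this illustration. Socket failures such as a refused connection or a timeout are raised as OSError (or a subclass of it), so a single except clause covers them.

```python
import socket

def echo_once(host, port, message, timeout=2.0):
    """Send message to an echo server; return the reply, or None on failure."""
    try:
        # create_connection performs the socket + connect steps in one call.
        with socket.create_connection((host, port), timeout=timeout) as s:
            s.sendall(message)
            return s.recv(1024)
    except OSError as err:
        # Covers refused connections, unknown hosts, and timeouts.
        print("could not talk to %s:%d: %s" % (host, port, err))
        return None
```

A caller can then test the result, for instance reply = echo_once('localhost', 50007, b'Hello, world'), and decide whether to retry or give up when reply is None.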
Client and Server in Perl
Perl [124] is a powerful and popular scripting language. Versions exist for Unix and for Windows; see www.perl.com for more information. Perl provides a powerful set of string matching and manipulation operations, combined with access to many of the fundamental system calls. The man page perlipc has samples of clients and servers that use sockets for communication.
The code for a "time server" in Perl is shown in Figure 7.6. It follows the same pattern as our other servers. The code for the corresponding client is shown in Figure 7.7.
    #!/usr/bin/perl
    use strict;
    use Socket;
    use FileHandle;
    my $port = shift || 12345;
    my $proto = getprotobyname('tcp');
    socket(SOCK, PF_INET, SOCK_STREAM, $proto) || die "socket: $!";
    SOCK->autoflush();
    setsockopt(SOCK, SOL_SOCKET, SO_REUSEADDR, pack("l", 1)) || die "setsockopt: $!";
    bind(SOCK, sockaddr_in($port, INADDR_ANY)) || die "bind: $!";
    listen(SOCK,SOMAXCONN) || die "listen: $!";
    print "server started on port $port\n";
    while (1)
    {
        my $paddr = accept(CLIENT,SOCK);
        CLIENT->autoflush();
        my $msg = <CLIENT>;
        print "server: recvd from client: $msg \n";
        print CLIENT "Hello there, it's now ", scalar localtime, "\n";
        close(CLIENT);
    }
Figure 7.6: A simple server in Perl
    #!/usr/bin/perl -w
    use strict;
    use Socket;
    use FileHandle;
    my ($host,$port, $iaddr, $paddr, $proto, $line);
    $host = shift || 'localhost';
    $port = shift || 12345;
    $iaddr = inet_aton($host) || die "no valid host specified: $host";
    $paddr = sockaddr_in($port, $iaddr);   # packed addr
    $proto = getprotobyname('tcp');
    socket(SOCK, PF_INET, SOCK_STREAM, $proto) || die "socket failed: $!";
    SOCK->autoflush();                     # from FileHandle
    connect(SOCK, $paddr) || die "connect failed: $!";
    print SOCK "hello from client\n";
    $line = <SOCK>;
    print "client: recvd from server: $line \n";
Figure 7.7: A simple client in Perl
7.2.6 Managing Multiple Sockets with Select
So far our example socket code has involved only one socket open by the server at a time (not counting the listening socket). Further, the connections have been short lived: after accepting a connection request, the server handled that request and then terminated the connection. This is a typical pattern for a classical server but may not be efficient for manager/worker algorithms in which we might want to keep the connections to the workers open rather than reestablish them each time. Unlike the clients in the examples above, the workers are persistent, so it makes sense to make their connections persistent as well.
What is needed by the manager in this case is a mechanism to wait for communication from any of a set of workers simultaneously. Unix provides this capability with the select system call. The use of select allows a process to block, waiting for a change of state on any of a set of sockets. It then "wakes up" the process and presents it with a list of sockets on which there is activity, such as a connection request or a message to be read. We will not cover all of the many aspects of select here, but the code in Figure 7.8 illustrates the features most needed for manager/worker algorithms. For compactness, we show this in Python. A C version would have the same logic. See the select man page or [111] for the details of how to use select in C. It is also available, of course, in Perl.
    #!/usr/bin/env python3
    from socket import socket, AF_INET, SOCK_STREAM
    from select import select
    lsock = socket(AF_INET, SOCK_STREAM)
    lsock.bind(('', 0))    # this host, anonymous port
    lsock.listen(5)
    lport = lsock.getsockname()[1]
    print('listening on port =', lport)
    sockets = [lsock]
    while 1:
        # only the list of readable sockets is of interest here
        (inReadySockets, _, _) = select(sockets, [], [])
        for sock in inReadySockets:
            if sock == lsock:
                # activity on the listening socket: a new connection request
                (tsock, taddr) = lsock.accept()
                sockets.append(tsock)
            else:
                msg = sock.recv(1024)
                if msg:
                    print('recvd msg=', msg.decode())
                else:
                    # empty read: the other end has closed the socket
                    sockets.remove(sock)
                    sock.close()
Figure 7.8: A Python server that uses select
The first part of the code in Figure 7.8 is familiar. We acquire a socket, bind it to a port, and listen on it. Then, instead of doing an accept on this socket directly, we put it into a list (sockets). Initially it is the only member of this list, but eventually the list will grow. Then we call select. The arguments to select are three lists of sockets we are interested in for reading, writing, or other events. The select call blocks until activity occurs on one of the sockets we have given to it. When select returns, it returns three lists, each a sublist of the corresponding input lists. Each of the returned sockets has changed state, and one can take some action on it with the knowledge that the action will not block.
In our case, we loop through the returned sockets, which are now active. We process activity on the listening socket by accepting the connection request and then adding the new connection to the list of sockets we are interested in. Otherwise we read and print the message that the client has sent us. If our read attempt yields an empty message, we interpret this as meaning that the worker has closed its end of the socket (or exited, which will close the socket), and we remove this socket from the list.
We can test this server with the client in Figure 7.9.
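The two behaviors described here, select returning only the sockets with activity and an empty read signaling a closed connection, can be observed without any network setup. The sketch below assumes Python 3 on a Unix system and uses socket.socketpair to stand in for manager/worker connections; the function and variable names are invented for this illustration.

```python
import select
import socket

def select_demo():
    """Show which sockets select reports as readable, and why."""
    a_mgr, a_wrk = socket.socketpair()   # manager and worker ends, worker a
    b_mgr, b_wrk = socket.socketpair()   # manager and worker ends, worker b
    try:
        # Worker b sends a result; worker a stays silent.
        b_wrk.sendall(b"done")
        ready, _, _ = select.select([a_mgr, b_mgr], [], [], 1.0)
        first = (ready == [b_mgr], b_mgr.recv(1024))
        # Worker a closes its end; its manager socket becomes readable,
        # and the read returns an empty message.
        a_wrk.close()
        ready, _, _ = select.select([a_mgr, b_mgr], [], [], 1.0)
        second = (ready == [a_mgr], a_mgr.recv(1024))
        return first, second
    finally:
        for s in (a_mgr, a_wrk, b_mgr, b_wrk):
            s.close()
```

The fourth argument to select is a timeout in seconds, used here so that the demonstration cannot block indefinitely; the server in Figure 7.8 omits it and so waits as long as necessary.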
    #!/usr/bin/env python3
    from sys import argv, stdin
    from socket import socket, AF_INET, SOCK_STREAM
    sock = socket(AF_INET, SOCK_STREAM)
    sock.connect((argv[1], int(argv[2])))    # host and port of the server
    print('sock=', sock)
    while 1:
        print('enter something:')
        msg = stdin.readline()
        if msg:
            sock.sendall(msg.strip().encode())    # strip nl
        else:
            break                                 # end of input
Figure 7.9: A Python client
Source: http://etutorials.org/Linux+systems/cluster+computing+with+linux/Part+II+Parallel+Programming/Chapter+7+An+Introduction+to+Writing+Parallel+Programs+for+Clusters/7.2+Operating+System+Support+for+Parallelism/