Let us now look at a very basic primitive in distributed parallelism, referred to as point-to-point communication. The core primitives here are send and receive, for sending and receiving messages. We've seen what a distributed memory computer looks like: in a cluster there are nodes, and suppose we have two nodes with rank 0 and rank 1. In pseudocode, what we'd like to happen is for rank 0 to initialize a local string S to, let's say, "ABCD", and send S to rank 1. In rank 1, we'd like to allocate a buffer, receive into that buffer a message from rank 0, and then print the buffer. So if the string S was initialized to "ABCD" in this distributed memory computer, where, as you recall, the nodes are all connected by an interconnect, we'd like to send the string "ABCD" through the interconnect, have it appear in the local buffer in rank 1, and then print it out.

Now, how do we really write code to do this, keeping in mind that we are following the SPMD model, so we have the same program running on all nodes? We would have a main program, and in the main program we'd first have to initialize the runtime that we're using for communication: MPI, the Message Passing Interface. We then have to do different things in this program depending on the rank, and we cannot write different programs. Well, that's easy, because we have conditionals. If the rank is 0, we can do the logic for rank 0: assign the string "ABCD" to S, and then perform the send operation. This gets fairly detailed. We'd do something like MPI.SEND, passing S, which is the string we want to send. In general, as we'll see, we can send a substring, so we have to specify the starting location of the string we want to send, which is 0 here, and also the number of characters we want to send; again, that matters if it is a substring, but here we can just say S.LENGTH. We also have to specify the data type: since we're communicating between two different computers, the network support needs to be aware of the data type, and here it is character, which is what each element of the string is. Finally, we have to specify the destination, which is rank 1, and then a tag. As we'll see later, tags can be used to differentiate among different messages sent from the same sender to the same receiver; let's just call the tag 99. So 1 is the destination rank and 99 is the tag for the message.

Then we also have to take care of the receive part, getting back to this if-then-else. In the ELSE branch, we have to allocate a buffer to receive the message. Remember, the string S was allocated in rank 0's local memory and is pushed through the interconnect; for it to show up in rank 1, we have to allocate some space in rank 1 to receive the message. So we can say buf is a new character array, and in this case we know the string is of length four. In general, if you don't know the exact message length, you can allocate a buffer bigger than you need, so as to avoid an array index out of bounds exception. Once we've allocated the buffer, we can do a receive, which is similar in structure to the send as far as the parameters go: we specify the buffer, which is where the receive operation will place the data; the starting position, zero; the number of elements we want to receive, where we could put four or buf.LENGTH; and the type, character. Then we have to specify the rank of the sender, which is 0 here, and the tag, 99, for it to match the send.
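To make this walkthrough concrete, here is a minimal sketch of the complete program. It assumes the mpiJava-style bindings, where Send and Recv take exactly the array, offset, count, datatype, rank, and tag parameters just described; the class name PointToPoint is hypothetical, and exact method names and capitalization vary across MPI Java bindings.

    import mpi.*;

    public class PointToPoint {
        public static void main(String[] args) throws MPIException {
            MPI.Init(args);                      // initialize the MPI runtime
            int rank = MPI.COMM_WORLD.Rank();    // which rank is this process?

            if (rank == 0) {
                // MPI communicates arrays of primitives, so use char[] rather than String
                char[] s = "ABCD".toCharArray();
                // send s[0..s.length-1] to destination rank 1 with tag 99
                MPI.COMM_WORLD.Send(s, 0, s.length, MPI.CHAR, 1, 99);
            } else {
                char[] buf = new char[4];        // space in rank 1 for the incoming message
                // receive up to buf.length chars from sender rank 0, matching tag 99
                MPI.COMM_WORLD.Recv(buf, 0, buf.length, MPI.CHAR, 0, 99);
                System.out.println(new String(buf));
            }

            MPI.Finalize();
        }
    }

Such a program would typically be launched with something like mpirun -np 2 java PointToPoint, so that the same class runs as both rank 0 and rank 1, in keeping with the SPMD model.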
After we receive this, rank 1 can print buf. Then we are done with the if-then-else, and both rank 0 and rank 1 can call MPI.FINALIZE and exit. So I think you're starting to get the idea now: even though we need different ranks to do different things, the SPMD model starts with the same program for both ranks, but by using conditionals, we can have each rank do something different. This is a very simple example with two ranks, but in general you can have conditionals, for example, to do one thing for odd-numbered ranks, another for even-numbered ranks, and so on.

The other thing to keep in mind is that communication through send and receive is based on the notion of communicating arrays of primitives. So if this is the sending array, then you have a receiving array, or buffer. In our example we sent the complete array, and it completely filled the buffer we provided for receiving the message. But it's possible to send a substring: for example, the start could be at position ten and the length could still be four, and those elements could end up at some other relative location in the receive buffer, say, starting at a different offset. So it needn't be the case that the offsets in local memory are the same for the sender and the receiver; the offset parameters in the sketch above are exactly what make this possible.

The other thing to keep in mind, as I've mentioned, is that these arrays can be arrays of primitives, but in Java we can have all kinds of data structures and objects. Well, as you've seen earlier, we can use serialization. If you're already working with arrays of primitives, then nothing more needs to be done, because MPI accepts arrays of primitives. But if you have objects, then you have to use the serialization approach that we've studied earlier to convert them to, typically, an array of bytes. You can then use the same send and receive idea for the communication to happen for objects, and use deserialization on the receiving side to reconstruct the objects; a sketch of this appears at the end of this section.

So, with that, you've learned the basics of how to write an SPMD program with communication, where one rank can communicate with another. Essentially, it is the matching of the send and receive operations that causes the communication to occur. And as we'll see, we can use this basic primitive to build very interesting and complex parallel programs to run on distributed parallel machines.
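To make the serialization point concrete, here is a minimal sketch of sending an object between two ranks, under the same assumed mpiJava-style API as before. The class name ObjectSendRecv and the MAX_MSG_SIZE bound are hypothetical, and the int array is just a stand-in for any Serializable object.

    import java.io.*;
    import mpi.*;

    public class ObjectSendRecv {
        public static void main(String[] args) throws Exception {
            MPI.Init(args);
            int rank = MPI.COMM_WORLD.Rank();
            final int MAX_MSG_SIZE = 4096;  // hypothetical upper bound on the serialized size

            if (rank == 0) {
                // serialize the object into an array of bytes
                ByteArrayOutputStream bos = new ByteArrayOutputStream();
                ObjectOutputStream oos = new ObjectOutputStream(bos);
                oos.writeObject(new int[] {1, 2, 3, 4});  // any Serializable object
                oos.flush();
                byte[] bytes = bos.toByteArray();
                // the byte array travels through the same send primitive as before
                MPI.COMM_WORLD.Send(bytes, 0, bytes.length, MPI.BYTE, 1, 99);
            } else {
                // over-allocate the receive buffer, since the exact size isn't known here
                byte[] bytes = new byte[MAX_MSG_SIZE];
                MPI.COMM_WORLD.Recv(bytes, 0, bytes.length, MPI.BYTE, 0, 99);
                // deserialize to reconstruct the object on the receiving side
                ObjectInputStream ois = new ObjectInputStream(new ByteArrayInputStream(bytes));
                int[] received = (int[]) ois.readObject();
                System.out.println(java.util.Arrays.toString(received));
            }

            MPI.Finalize();
        }
    }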