Socket Programming

Last time we talked about Inter Process Communications. While it was a good list, all the methods we discussed were mainly for communication between two processes running on the same machine. This time we will dip our toe into Socket Programming.

Wikipedia defines sockets as, «A software structure within a network node of a computer network that serves as an endpoint for sending and receiving data across the network. The structure and properties of a socket are defined by an application programming interface (API) for the networking architecture. Sockets are created only during the lifetime of a process of an application running in the node.»

Before moving forward, let's paraphrase what Wikipedia says in simpler terms. A socket is a way to communicate between two different nodes in a network. Sockets define the Communication Protocol, IP address, and Port number of a node.

Communication Protocol refers to «how» we will communicate with the other nodes. It defines the structure of the socket and how packets will be handled in the lower levels of the network.

The IP address refers to «where» the node we want to talk to is. In the vastness of the internet, IP addresses are how we find each other.

The port number refers to «which» process we are connecting to. On a single machine, multiple programs that use the network can run concurrently, and each one communicates through the port assigned to it.

Now that we know what a socket is, let's create one. For this endeavor, we will go all the way down to the C level. The functions and definitions require several header files. If you are on a Linux machine, refer to the man pages. If you are on a Windows machine, ask your IDE or Google. I suggest making a header containing the includes and some definitions for future use.
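A header like the one suggested above might look roughly like this on Linux. This is only a sketch: the file name socket_common.h and both macro values are my own assumptions, so adjust them to your setup.

```c
/* socket_common.h: hypothetical file name, values below are illustrative. */
#ifndef SOCKET_COMMON_H
#define SOCKET_COMMON_H

#include <stdio.h>      /* printf, perror */
#include <string.h>     /* memset, memcpy */
#include <unistd.h>     /* gethostname, close */
#include <netdb.h>      /* gethostbyname, struct hostent */
#include <sys/types.h>
#include <sys/socket.h> /* socket, connect, socklen_t */
#include <netinet/in.h> /* struct sockaddr_in, htons */
#include <arpa/inet.h>  /* inet_addr, inet_ntoa */

#define MAX_HOST_NAME 256  /* assumed buffer size for a host name */
#define SOCKET_PORT   8080 /* assumed port, use the one your server listens on */

#endif /* SOCKET_COMMON_H */
```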

The local variables we will use are:
char host_name[MAX_HOST_NAME];
struct hostent host_entity;
int socket_descriptor;
struct sockaddr_in socket_address;
socklen_t socket_size;

MAX_HOST_NAME is a macro defined by us. It doesn’t need to be big; a few hundred bytes is plenty. To get our own host name, we use «int gethostname(char*, size_t);».

gethostname(host_name, (size_t)MAX_HOST_NAME);

If you want to connect to another service, put the host name of that service into host_name instead. Use the bare host name, not a full URL.
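The call above ignores the return value. Here is a small sketch that checks it. The helper name read_own_host_name and the value of MAX_HOST_NAME are assumptions of mine, not part of any library.

```c
#include <stdio.h>
#include <string.h>
#include <unistd.h>

#define MAX_HOST_NAME 256 /* assumed size, pick your own */

/* Fills buffer with this machine's host name.
   Returns 0 on success and -1 on failure. */
int read_own_host_name(char *buffer, size_t size)
{
    if (buffer == NULL || size == 0) {
        return -1;
    }
    if (gethostname(buffer, size) != 0) {
        perror("gethostname");
        return -1;
    }
    /* A truncated name is not guaranteed to be terminated, so terminate it. */
    buffer[size - 1] = '\0';
    return 0;
}
```

You would call it as read_own_host_name(host_name, sizeof(host_name)) and only continue if it returns 0.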

Now we want to get the host entity. It is returned by «struct hostent *gethostbyname(const char *);», which gives us a pointer to the host entity of that node, or NULL if the lookup fails. We store the structure it points to in host_entity. The host entity includes a list of addresses for host_name. «Why?», you might ask. Because services like Google or YouTube don't run on a single server. They are scattered around the globe, and we just got a list of server addresses. We will connect to one of them.
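To see that list for yourself, something like the sketch below should work. The function name print_host_addresses is made up for this post. I am also assuming an IPv4-only lookup, which is what gethostbyname normally gives you.

```c
#include <stdio.h>
#include <string.h>
#include <netdb.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/* Prints every IPv4 address found for name.
   Returns how many were printed, or -1 if the lookup failed. */
int print_host_addresses(const char *name)
{
    struct hostent *entity = gethostbyname(name);
    if (entity == NULL || entity->h_addrtype != AF_INET) {
        fprintf(stderr, "could not resolve %s\n", name);
        return -1;
    }

    int count = 0;
    /* h_addr_list is terminated by a NULL entry. */
    for (char **entry = entity->h_addr_list; *entry != NULL; entry++) {
        struct in_addr address;
        /* Each entry holds raw address bytes, not text. */
        memcpy(&address, *entry, sizeof(address));
        printf("%s -> %s\n", name, inet_ntoa(address));
        count++;
    }
    return count;
}
```

The pointer returned by gethostbyname refers to storage owned by the library, so a later lookup may overwrite it. Copy out whatever you need before calling it again.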

We know who we want to connect to, but we still haven't created the socket. Let's do that now. We will use «int socket(int, int, int);». The return value is an integer that identifies the socket, called a descriptor, or -1 if the socket could not be created. Most likely, there are multiple sockets open on the machine at the same time, and the descriptor is how we keep track of which one is ours.

socket_descriptor = socket(AF_INET, SOCK_STREAM, 0);

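There is nothing user-defined here. AF_INET is short for Address Family INET, which means IPv4. On Linux it is defined as PF_INET, short for Protocol Family INET, which in turn is defined as 2. You could put 2 in there, but the name is more readable once you get used to it and does not depend on the value a particular system uses.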

SOCK_STREAM means that this socket will communicate through TCP. If you want to communicate through UDP, you need to use SOCK_DGRAM, which is short for SOCKet_DataGRAM. Yeah, it looks horrendous when written out like this. Maybe that's why they shortened it... Probably not, but one can hope.

Lastly, we pass 0 for the protocol. You can name one explicitly, but 0 tells the OS to choose the default for the socket type (TCP for SOCK_STREAM), so you don't have to worry about it. There is a niche problem with creating custom protocol sockets on a Virtual Machine on an old Windows machine. If you face this problem update Windows to at least Founders Update.
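Here is a minimal sketch of creating a socket with an error check, plus a way to ask the OS which type it recorded. Both helper names, open_ipv4_socket and socket_type_of, are my own inventions for illustration.

```c
#include <stdio.h>
#include <unistd.h>
#include <sys/socket.h>
#include <netinet/in.h>

/* Opens an IPv4 socket of the given type (SOCK_STREAM or SOCK_DGRAM).
   Returns the descriptor, or -1 on failure. */
int open_ipv4_socket(int type)
{
    int descriptor = socket(AF_INET, type, 0); /* 0 = default protocol */
    if (descriptor < 0) {
        perror("socket");
    }
    return descriptor;
}

/* Returns the socket type stored for descriptor, or -1 on error. */
int socket_type_of(int descriptor)
{
    int type = 0;
    socklen_t length = sizeof(type);
    if (getsockopt(descriptor, SOL_SOCKET, SO_TYPE, &type, &length) != 0) {
        return -1;
    }
    return type;
}
```

Remember to close() the descriptor when you are done with it, just like a file.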

Next on the list is socket_address. A local variable is not zeroed automatically in C, so its fields start out holding whatever happened to be in memory. We clear the structure first just to be safe.

memset(&socket_address, 0, sizeof(socket_address));

Now we can fill the fields.

socket_address.sin_family = AF_INET;

Yes, the same value we created the socket with.

socket_address.sin_port = htons(SOCKET_PORT);

SOCKET_PORT is user-defined. I used my own port because I know which port my server listens on. For other services this will differ: you need to find out which port the server listens on and use that.

The function we used is short for Host TO Network Short. There is a long version too, htonl, for four-byte values. Your machine and the network may not use the same byte order for data; these functions fix that problem for you.
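If you want to see what the conversion does, the sketch below dumps the bytes as they sit in memory after htons. The helper name port_to_network_bytes is an assumption of mine, and 8080 is just an example port.

```c
#include <stdint.h>
#include <string.h>
#include <arpa/inet.h>

/* Copies the in-memory bytes of htons(port) into out[0] and out[1]. */
void port_to_network_bytes(uint16_t port, unsigned char out[2])
{
    uint16_t network_order = htons(port);
    memcpy(out, &network_order, sizeof(network_order));
}
```

Port 8080 is 0x1F90 in hexadecimal. After the call, out[0] is 0x1F and out[1] is 0x90 on any machine, because network byte order always puts the most significant byte first. ntohs converts back in the other direction.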

Port numbers go up to 65535, but the first 1024 (0-1023) are reserved. They are also called well-known ports. A server of your own usually cannot listen on them without elevated privileges, so be mindful of that.

memcpy(&socket_address.sin_addr, host_entity.h_addr_list[0], sizeof(socket_address.sin_addr));

We said that the host entity includes addresses of servers. h_addr_list is a char** variable, a NULL-terminated list of server addresses. We took the first one in the list, but should the connection fail, you can iterate through this list and try the others.

Although each entry has the type char*, it is not text. It holds the raw bytes of the address, already in network byte order, so we copy them with memcpy. «inet_addr();» is for the other case, where you have a dotted-decimal string such as «127.0.0.1» and need to turn it into an address.

The sin_addr structure only contains s_addr, a 32-bit value, so copying the four address bytes into sin_addr fills s_addr.
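Putting the address steps together, a helper could look like the sketch below. The entries of h_addr_list hold raw address bytes that are already in network byte order, so this version copies them with memcpy instead of parsing text. The name fill_socket_address is hypothetical, and I am assuming IPv4 throughout.

```c
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/* Fills address from four raw IPv4 bytes (one h_addr_list entry)
   and a port given in host byte order. */
void fill_socket_address(struct sockaddr_in *address,
                         const char *raw_ipv4, uint16_t port)
{
    memset(address, 0, sizeof(*address));   /* clear every field first */
    address->sin_family = AF_INET;          /* same family as the socket */
    address->sin_port = htons(port);        /* port in network byte order */
    memcpy(&address->sin_addr, raw_ipv4, sizeof(address->sin_addr));
}
```

With a resolved host entity you would pass one entry of h_addr_list as raw_ipv4 and SOCKET_PORT as port.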

Lastly, we record the size of the address structure. Different address families use structures of different sizes, and the functions that receive our address only see a generic pointer, so we have to tell them the size.

socket_size = sizeof(socket_address);

We made all of this preparation, but where is the connection, you might ask. We will finally use «int connect(int, const struct sockaddr*, socklen_t);».

connect(socket_descriptor, (struct sockaddr*) &socket_address, socket_size);

If the connection fails, which is possible for multiple reasons, you can iterate over other addresses and repeat this process until you succeed or run out of addresses.
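The retry loop described above could be sketched like this. Treat it as one possible shape rather than the definitive one: connect_to_host is a name I made up, it assumes IPv4 and TCP, and it opens a fresh socket for every attempt because a socket is not reliably reusable after a failed connect.

```c
#include <stdio.h>
#include <stdint.h>
#include <string.h>
#include <unistd.h>
#include <netdb.h>
#include <sys/socket.h>
#include <netinet/in.h>

/* Tries each address of host_name in turn.
   Returns a connected TCP descriptor, or -1 if every attempt failed. */
int connect_to_host(const char *host_name, uint16_t port)
{
    struct hostent *entity = gethostbyname(host_name);
    if (entity == NULL || entity->h_addrtype != AF_INET) {
        fprintf(stderr, "could not resolve %s\n", host_name);
        return -1;
    }

    for (char **entry = entity->h_addr_list; *entry != NULL; entry++) {
        struct sockaddr_in address;
        memset(&address, 0, sizeof(address));
        address.sin_family = AF_INET;
        address.sin_port = htons(port);
        memcpy(&address.sin_addr, *entry, sizeof(address.sin_addr));

        int descriptor = socket(AF_INET, SOCK_STREAM, 0);
        if (descriptor < 0) {
            perror("socket");
            return -1;
        }
        if (connect(descriptor, (struct sockaddr *)&address,
                    sizeof(address)) == 0) {
            return descriptor; /* success, the caller closes it later */
        }
        perror("connect");
        close(descriptor);     /* discard and try the next address */
    }
    return -1;
}
```

A caller would do something like connect_to_host(host_name, SOCKET_PORT), check for -1, and then read from and write to the returned descriptor.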

Servers that use sockets are built with the same building blocks and mindset but a little differently. We will talk about them in a future blog post. If you don't want to miss it and would like to connect with me, come find me on Twitter.