C LANG

Let’s slow down and unpack the machinery before we write any more code. When you build a network tool in C on Linux, you are standing at the exact intersection where software meets the operating system, and the operating system meets the network hardware. Here is a deep dive into the C-language mechanics and the networking concepts you need to understand before Step 3.

1. The Preprocessor and Headers (C-Language)

In higher-level languages, you import modules that contain pre-written code. C doesn’t work like that. When you write an #include directive, you are talking to a tool called the C preprocessor.

Info Box: The Why and How of Headers
* The Fact: C requires you to declare every function before you use it.
* The Why: The C compiler reads your file top to bottom. If it reaches a call to printf() or socket() without having seen a declaration, it doesn’t know what arguments the function takes or what it returns. Depending on the compiler and language standard, that is either a hard error or a warning about an implicit declaration.
* The How: Files ending in .h (header files) don’t usually contain the actual code for the functions; they just contain the blueprints (declarations, also called prototypes). The preprocessor literally copies and pastes those blueprints into your file before the compiler proper even starts. Later, a program called a “linker” connects your compiled code to the system libraries that contain the actual function code.
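To make the declaration-versus-definition split concrete, here is a minimal sketch. The names is_valid_port() and report_port() are hypothetical helpers invented for this illustration; they are not part of any system header.

```c
#include <stdio.h>   /* the preprocessor pastes in the declaration of printf() */

/* Declaration (prototype): name, parameters, and return type only.
   This is the kind of line a header file contains. */
int is_valid_port(int port);

/* The compiler can check this call because the prototype above
   has already been seen, even though the body comes later. */
void report_port(int port)
{
    if (is_valid_port(port))
        printf("port %d is in range\n", port);
    else
        printf("port %d is out of range\n", port);
}

/* Definition: the actual code. In a real project this could live in
   another .c file or a library, and the linker would connect the
   call above to it. */
int is_valid_port(int port)
{
    return port >= 1 && port <= 65535;
}
```

If you delete the prototype line and keep everything else, the call inside report_port() is the first time the compiler meets is_valid_port(), which is exactly the situation a header is meant to prevent.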

2. Sockets and File Descriptors (Networking & OS)

To talk to another computer, you need a connection point. In networking, this is called a socket. But in C on a Linux system, a socket is treated as a file.

Info Box: The Why and How of File Descriptors
* The Fact: In Linux, “everything is a file.” A text document is a file, a keyboard is a file, and a network connection is a file.
* The Why: It simplifies the operating system’s design. If everything is treated as a file, you can use the exact same functions (read() and write()) to print to a console, save to a hard drive, or send data across the globe.
* The How: When you call socket(AF_INET, SOCK_STREAM, 0), the Linux kernel creates a data structure in the background to handle the TCP connection. It then hands your C program a simple integer (like 3 or 4; descriptors 0, 1, and 2 are normally already taken by standard input, output, and error). This integer is the file descriptor. Whenever you want to send data, you just tell the OS, “Write this data to file descriptor 3,” and, once the socket is connected, the OS routes it through the network stack to the network card.
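Here is a small sketch of what “the kernel hands you an integer” looks like in practice. The wrapper names open_tcp_socket() and close_socket() are made up for this example; socket(), close(), and perror() are the real calls.

```c
#include <stdio.h>       /* printf(), perror() */
#include <sys/socket.h>  /* socket(), AF_INET, SOCK_STREAM */
#include <unistd.h>      /* close() */

/* Ask the kernel for a TCP/IPv4 socket.
   Returns the file descriptor, or -1 if the kernel refused. */
int open_tcp_socket(void)
{
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd == -1) {
        perror("socket");   /* prints the reason, e.g. a resource limit */
        return -1;
    }
    printf("kernel handed back file descriptor %d\n", fd);
    return fd;
}

/* Give the descriptor back, exactly as you would for a regular file.
   Returns 0 on success, -1 on error. */
int close_socket(int fd)
{
    return close(fd);
}
```

Call open_tcp_socket() from your main() and keep the returned integer; every later call (connect(), write(), read(), close()) takes that same number as its first argument. Nothing is sent over the network at this point, because the socket is not connected yet.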

3. Structs and Memory Layout (C-Language)

When you connect to an IP address and a port, the operating system’s socket API requires that information to be packaged in a very specific format in memory. C doesn’t have objects or classes to handle this. It has structs.

Info Box: The Why and How of Structs
* The Fact: A struct (structure) allows you to group different data types (ints, chars, arrays) into a single, contiguous block of memory.
* The Why: The sockets API was designed decades ago. It doesn’t know how to read high-level objects. For an IPv4 address it expects to look at a specific memory address and find 2 bytes for the address family, then 2 bytes for the port, then 4 bytes for the IP address (followed by padding).
* The How: The system headers already define a struct called sockaddr_in; you declare a variable of that type. When you set target.sin_port = htons(22), C goes to the exact byte offset of sin_port within that struct’s memory block and writes the port’s two bytes there. When you pass this struct to the connect() function, you are actually passing a pointer to the very first byte of that memory block (cast to struct sockaddr *), together with the size of the block.
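The sketch below fills in a struct sockaddr_in and prints where each field sits in memory. fill_target() and print_layout() are hypothetical helper names for this illustration; the struct, offsetof, htons(), and inet_pton() come from the standard headers. The port goes through htons() because the API expects it in network byte order.

```c
#include <stddef.h>      /* offsetof */
#include <stdio.h>       /* printf() */
#include <string.h>      /* memset() */
#include <sys/socket.h>  /* AF_INET */
#include <netinet/in.h>  /* struct sockaddr_in */
#include <arpa/inet.h>   /* htons(), inet_pton() */

/* Fill in an IPv4 target address.
   Returns inet_pton()'s result: 1 on success, 0 if ip_text is not a
   valid dotted-quad address, -1 on other errors. */
int fill_target(struct sockaddr_in *target, const char *ip_text,
                unsigned short port)
{
    memset(target, 0, sizeof *target);   /* zero the padding bytes too */
    target->sin_family = AF_INET;        /* "this is an IPv4 address" */
    target->sin_port   = htons(port);    /* port in network byte order */
    return inet_pton(AF_INET, ip_text, &target->sin_addr);
}

/* Show the byte offset of each field inside the struct. */
void print_layout(void)
{
    printf("sin_family: offset %zu\n", offsetof(struct sockaddr_in, sin_family));
    printf("sin_port:   offset %zu\n", offsetof(struct sockaddr_in, sin_port));
    printf("sin_addr:   offset %zu\n", offsetof(struct sockaddr_in, sin_addr));
    printf("whole struct: %zu bytes\n", sizeof(struct sockaddr_in));
}
```

Once the struct is filled in, the usual call shape is connect(fd, (struct sockaddr *)&target, sizeof target): a pointer to the first byte plus the length, which is all the kernel needs to read the fields at their fixed offsets.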

4. The Endianness Problem (Networking + C-Language)

This is where low-level programming gets notoriously tricky. A standard IP address (IPv4) is 4 bytes of data. A port number is 2 bytes. But in what order do those bytes go? Most significant first, or least significant first?

Info Box: The Why and How of Network Byte Order
* The Fact: Your computer’s processor and the internet usually store multi-byte numbers in opposite byte orders.
* The Why: Most modern processors (like Intel/AMD, or the ARM chips in your cloud droplet and microcontrollers as they are normally configured) use little-endian format. They store the least significant byte first. The internet protocols (TCP/IP), however, require these values to be sent in big-endian format (the most significant byte first).
* The How: If you store the integer 22 (your SSH port) directly in sin_port on a little-endian machine, the network stack reads the two bytes in the opposite order and sees port 5632 (0x1600) instead of 22 (0x0016), so your connection attempt goes to the wrong port. You must use the C function htons() (Host TO Network Short). It converts a 16-bit value from your machine’s byte order to network byte order. On a little-endian machine that means swapping the two bytes; on a big-endian machine it does nothing.
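To see the byte flip for yourself, here is a minimal sketch. port_wire_bytes() and swap16() are hypothetical helpers written for this example; htons() and ntohs() are the real library functions.

```c
#include <stdint.h>     /* uint16_t */
#include <string.h>     /* memcpy() */
#include <arpa/inet.h>  /* htons(), ntohs() */

/* Copy the two bytes of a port, in the order they would go on the
   wire, into out[0] and out[1]. */
void port_wire_bytes(uint16_t port, unsigned char out[2])
{
    uint16_t net = htons(port);      /* host order -> network order */
    memcpy(out, &net, sizeof net);   /* look at the raw bytes in memory */
}

/* A plain 16-bit byte swap: what the port number turns into when its
   two bytes are read in the opposite order. */
uint16_t swap16(uint16_t v)
{
    return (uint16_t)((v << 8) | (v >> 8));
}
```

For port 22, port_wire_bytes() yields 0x00 followed by 0x16 on any machine, because htons() always produces big-endian order. swap16(22) returns 5632, which is the port a little-endian machine would end up targeting if you skipped htons(). On the receiving side, ntohs() performs the reverse conversion.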

This bridges the gap between the concept of a port scanner and the raw, unpolished reality of writing one in C. Does the concept of treating a network connection as a “file descriptor” make sense, or would you like to see a quick code snippet showing how you actually read and write to one?