Research Context
As development of my ICMP-based Network Communication Project continues at full throttle, today I want to talk about the most “diplomatic” part of the operation: the checksum. If you don’t stamp this seal correctly on the packet you’re sending, the target host’s operating system treats the packet as malformed and dumps it in the trash before it even gets through the door.
So, how exactly is this “seal” calculated in a low-level language? Let’s examine it step-by-step through the very algorithm I wrote and currently use in my project.
🛠️ The Heart of the Algorithm: perform_checksum
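ICMP uses the Internet checksum to detect corrupted messages: the 16-bit one’s complement of the one’s complement sum of the whole ICMP message (header plus data), computed while the checksum field itself is set to zero. In practice, this means you have to add up the entire message in 16-bit (2-byte) chunks.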
Here is what this mathematical operation looks like in the x64 Assembly realm:
Note: In this context, rdi holds the starting address of our data buffer, r14 is the starting offset, and r15 is the ending offset (exclusive). The result comes back in ax, and r10, r11, and r12 are used as scratch registers.
perform_checksum:
; RFC 1071 standard 16-bit one's complement sum algorithm
xor eax, eax ; Clear eax (Accumulator for the sum)
mov r10, r14 ; r10 = Current offset
.loop:
mov r11, r15 ; r11 = End offset
sub r11, r10 ; Remaining bytes to process
cmp r11, 1 ; Check if only 1 byte is left (odd length)
jle .last ; If <= 1 byte left, jump to final block
movzx r12d, word [rdi + r10] ; Read 2 bytes (1 word) zero-extended
add eax, r12d ; Add to accumulator
add r10, 2 ; Move offset forward by 2 bytes
jmp .loop ; Repeat
🧩 Part 1: Gathering the Pieces
We are essentially telling the CPU: “Fetch me a 16-bit (word) chunk from memory, add it to the eax register, and move to the next 2 bytes.” This loop runs smoothly until we hit the end of the packet.
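For cross-checking the loop outside of the assembler, here is a minimal Python sketch of the same word-gathering step. It is not the project's code: the name sum_words and the sample buffer are hypothetical, and it assumes little-endian 16-bit reads, which is what a word load does on x86-64.

```python
import struct

def sum_words(buf: bytes, start: int, end: int) -> int:
    """Add up the complete 16-bit words in buf[start:end] (end is exclusive)."""
    total = 0
    offset = start
    while end - offset > 1:  # mirrors: cmp r11, 1 / jle .last
        (word,) = struct.unpack_from("<H", buf, offset)  # little-endian word read
        total += word        # mirrors: add eax, r12d
        offset += 2          # mirrors: add r10, 2
    return total

# Hypothetical sample buffer: three complete words.
data = bytes([0x08, 0x00, 0x00, 0x00, 0x12, 0x34])
print(hex(sum_words(data, 0, len(data))))
```

A trailing odd byte is deliberately left out here, just as the loop leaves it for the next block.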
⚖️ Part 2: The “Odd Byte” Paradox
If the total length of the packet is an odd number (e.g., 11 bytes), the very last byte won’t have a pair to form a 16-bit word. In this scenario, our algorithm elegantly dives into the .final block:
.last:
je .final ; Flags still come from cmp r11, 1: exactly 1 byte left
jmp .wrap ; If 0 bytes left, finalize calculation
.final:
movzx r12d, byte [rdi + r10] ; Read the last remaining single byte
add eax, r12d ; Add it to the accumulator
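Why is it fine to add the last byte as-is, without shifting it? RFC 1071 pads an odd-length message with a zero byte, and with little-endian word reads that padded word has the same value as the lone byte. The sketch below illustrates this under that little-endian assumption; sum_with_tail and the sample bytes are hypothetical names and values used only for the demonstration.

```python
import struct

def sum_with_tail(buf: bytes) -> int:
    """Sum little-endian 16-bit words, then add a trailing odd byte unshifted."""
    total = 0
    even_len = len(buf) & ~1                 # length rounded down to a multiple of 2
    for offset in range(0, even_len, 2):
        total += struct.unpack_from("<H", buf, offset)[0]
    if len(buf) % 2:                         # mirrors the .final block
        total += buf[-1]                     # movzx r12d, byte [...] / add eax, r12d
    return total

odd = bytes([0x01, 0x02, 0x03])              # 3 bytes: one word plus one leftover byte
padded = odd + b"\x00"                       # the same data with RFC 1071 zero padding
print(sum_with_tail(odd) == sum_with_tail(padded))
```

On a big-endian word read the leftover byte would have to be shifted left by 8 instead, so this shortcut is tied to the byte order of the loads.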
🔄 Part 3: The Wrap and Carry
Because eax is a 32-bit accumulator, the running sum will usually spill past the 16-bit boundary. This is where the most critical aspect of RFC 1071 comes into play: adding the overflowing bits (the carry) back into the low 16 bits of the sum, also known as the end-around carry.
.wrap:
mov r11d, eax ; Copy sum to r11d
shr r11d, 16 ; Shift right to isolate the carry bits
and eax, 0xFFFF ; Mask eax to keep only the lower 16 bits
add ax, r11w ; Add the carry bits back to the sum
adc ax, 0 ; Add any final carry (add with carry)
not ax ; One's complement (invert bits) for final checksum
ret
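The fold can be traced by hand with a small Python sketch. This is an illustration rather than the project's code: fold_and_invert is a hypothetical name, and it assumes the accumulated sum fits in 32 bits, as it does in eax for any realistic ICMP message.

```python
def fold_and_invert(total: int) -> int:
    """Fold a 32-bit sum into 16 bits with end-around carry, then invert it."""
    low = total & 0xFFFF                         # and eax, 0xFFFF
    carry = total >> 16                          # shr r11d, 16
    folded = low + carry                         # add ax, r11w (may overflow 16 bits)
    folded = (folded & 0xFFFF) + (folded >> 16)  # adc ax, 0 (re-add that overflow)
    return ~folded & 0xFFFF                      # not ax

# 0xFFFE + 2 overflows to 0x0000 with a carry; the carry makes it 0x0001,
# and inverting 0x0001 gives 0xFFFE.
print(hex(fold_and_invert(0x2FFFE)))  # 0xfffe
```

The second fold matters: without the adc step, the carry produced by the first addition would be silently lost.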
🎯 Why not ax?
The not instruction at the very end is the final requirement of the one’s complement logic. By inverting the bits (0 -> 1, 1 -> 0), we ensure that when the receiving end adds up our packet the same way, this time with the checksum field filled in, the folded sum comes out as 0xFFFF (so its complement is 0). If it does, the data is clean, and our seal is valid!
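The receiver-side check can be tried end to end with a short Python sketch. Everything here is illustrative: the helper names, the identifier 0x1234, the sequence number, and the payload are assumptions, and the sums use little-endian word reads to match the assembly. The checksum is written back in the same byte order it was computed in, so no byte swap is needed.

```python
import struct

def ones_complement_sum(buf: bytes) -> int:
    """Folded 16-bit one's complement sum over little-endian words."""
    total = 0
    for offset in range(0, len(buf) & ~1, 2):
        total += struct.unpack_from("<H", buf, offset)[0]
    if len(buf) % 2:
        total += buf[-1]                      # trailing odd byte, unshifted
    while total >> 16:                        # end-around carry
        total = (total & 0xFFFF) + (total >> 16)
    return total

def checksum(buf: bytes) -> int:
    """Internet checksum: complement of the folded sum."""
    return ~ones_complement_sum(buf) & 0xFFFF

# Echo Request: type 8, code 0, checksum field zeroed, hypothetical id and sequence.
header = struct.pack("!BBHHH", 8, 0, 0, 0x1234, 1)
packet = bytearray(header + b"ping!")         # odd-length payload on purpose

# Store the result as a plain little-endian word, like mov word [...], ax would.
struct.pack_into("<H", packet, 2, checksum(bytes(packet)))

print(hex(ones_complement_sum(bytes(packet))))  # 0xffff
```

Running the sum again over the finished packet gives 0xFFFF, which is the condition the receiver checks.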
Conclusion
Writing this algorithm in Assembly is a fantastic exercise to truly understand how data is laid out in memory and how the CPU crunches bytes. Thanks to this algorithm, our custom ICMP packets pass the receiving kernel’s checksum validation instead of being dropped, and can roam the network like “official documents”.
When I integrate dynamic targeting and fileless execution (memfd_create) into my Distributed management architecture, this checksum engine will remain the most reliable gear in the machine.
Stay Coded!