Why Assembly Language is Vitally Important (A Trivial Example) by eokuwwy

Suppose you’re at work one day, and during your lunch break, a disturbing thought creeps into your mind. “What am I thinking? Clearly the most efficient way to flip a bit is to do it mathematically! It requires no boolean evaluations; it is only a calculation. You simply take the absolute value of b-1, where b is a whole number with a value of either 1 or 0. That is, b is a bit. No matter what the value of b is, it will be reversed by this mathematical computation. This will be a new red letter date in the history of science!” And with that, you gulp down the rest of your lunch in one big bite and sprint accross town back to your desk.

First, you search the intertubes to make sure that no one else has already made this discovery. Then you do research on the subject of flipping bits, and discover something that looks far more elegant than your idea. b = b ^ 1. That is b = b XOR 1. (XOR is exclusive OR) If b is 0, the result of b ^ 1 evaluates to 1. If b is 1, the result of b ^ 1 evaluates to 0. But aha! That’s just it. There is an evaluation taking place! Surely any evaluation requires extra time in computing the result!

So naturally, you set out to test your hypothesis and ultimately to prove to everyone that you have discovered the cure to all wordly ills. The first step, of course, is to write a C program and test the speed of your computation. You call it flipper.c. The program basically flips a bit a set number of times using two methods (which are contained within C functions): your new discovery, and the old-and-soon-to-be-abandoned-forever bitwise XOR method. It records and prints the time (in millesconds) that it takes to complete this set number of bit flips for each method. You also add a parameter that can be passed which will change the order of which method runs first. Passing an ‘m’ will make your math-based function run first, and passing a ‘b’ will make the crappy bitwise function run first. Ready, set, go! (into godliness)


root@slackware-vm:/home/nerdterm/c/random# gcc flipper.c
root@slackware-vm:/home/nerdterm/c/random# ./a.out m
Time to flip math: 68.138000
Time to flip bitwise: 39.914000
root@slackware-vm:/home/nerdterm/c/random# ./a.out m
Time to flip math: 56.697000
Time to flip bitwise: 36.105000
root@slackware-vm:/home/nerdterm/c/random# ./a.out m
Time to flip math: 60.908000
Time to flip bitwise: 37.965000
root@slackware-vm:/home/nerdterm/c/random# ./a.out m
Time to flip math: 60.767000
Time to flip bitwise: 40.302000
root@slackware-vm:/home/nerdterm/c/random# ./a.out m
Time to flip math: 51.322000
Time to flip bitwise: 39.456000
root@slackware-vm:/home/nerdterm/c/random# ./a.out b
Time to flip bitwise: 62.424000
Time to flip math: 56.711000
root@slackware-vm:/home/nerdterm/c/random# ./a.out b
Time to flip bitwise: 45.753000
Time to flip math: 49.112000
root@slackware-vm:/home/nerdterm/c/random# ./a.out b
Time to flip bitwise: 45.508000
Time to flip math: 50.846000
root@slackware-vm:/home/nerdterm/c/random# ./a.out b
Time to flip bitwise: 48.888000
Time to flip math: 55.403000
root@slackware-vm:/home/nerdterm/c/random# ./a.out b
Time to flip bitwise: 40.489000
Time to flip math: 54.205000

Hmmm…clearly the program must be written wrong. You’ve tried it both ways several times, and the bitwise function seems to be faster. That can’t be right! There is one instance where the math function ran faster, so that one must be correct. All the other times are clearly wrong. But you’ve looked at the code over and over again. You’ve correctly used pointers and made sure that no unnecessary assignments or extra steps are taking place. “What is the problem? This discovery is supposed to ease the suffering of all human-kind, and you’re telling me that it is slower than the crappy bitwise method?”

So what do you do now? If only you could see a mapping of your program’s code to the actual instructions being carried out by the computer, you might be able to figure out what kind of BS is being carried out by the C compiler, and what is preventing your world domination.

You use gcc -S to generate an assembly code file from your C program code.


root@slackware-vm:/home/nerdterm/c/random# gcc -S flipper.c


Now you can open up flipper.s and take a look at what is really happening at the lowest level (lowest level that is still “human readable”). One single instruction in assembly represents exactly one single machine code instruction.


In flipper.c, the new godly math-based function is named flipBitMath, and the old, boring, and inefficient bitwise function is named flipBitEval. Here are their assembly code representations in flipper.s.


.globl flipBitEval
	.type	flipBitEval, @function
flipBitEval:
.LFB1:
	.cfi_startproc
	pushq	%rbp
	.cfi_def_cfa_offset 16
	movq	%rsp, %rbp
	.cfi_offset 6, -16
	.cfi_def_cfa_register 6
	movq	%rdi, -8(%rbp)
	movq	-8(%rbp), %rax
	movzwl	(%rax), %eax
	movl	%eax, %edx
	xorl	$1, %edx
	movq	-8(%rbp), %rax
	movw	%dx, (%rax)
	leave
	.cfi_def_cfa 7, 8
	ret
	.cfi_endproc
.LFE1:
	.size	flipBitEval, .-flipBitEval
.globl flipBitMath
	.type	flipBitMath, @function
flipBitMath:
.LFB2:
	.cfi_startproc
	pushq	%rbp
	.cfi_def_cfa_offset 16
	movq	%rsp, %rbp
	.cfi_offset 6, -16
	.cfi_def_cfa_register 6
	movq	%rdi, -8(%rbp)
	movq	-8(%rbp), %rax
	movzwl	(%rax), %eax
	cwtl
	subl	$1, %eax
	movl	%eax, %edx
	sarl	$31, %edx
	xorl	%edx, %eax
	subl	%edx, %eax
	movl	%eax, %edx
	movq	-8(%rbp), %rax
	movw	%dx, (%rax)
	leave
	.cfi_def_cfa 7, 8
	ret
	.cfi_endproc
.LFE2:
	.size	flipBitMath, .-flipBitMath


Clearly flipBitMath has more lines of code in it. But maybe it’s because of the pointers and your amazing way of programming. After all, all of those mov instructions are just moving things around in memory. If you can get some of those out of the way, maybe you’ll have a better understanding of the actual computation instructions that are being carried out. You then write a much simpler and more useless version of the program, and call it tinyflip.c, shown below in its entirety.


#include <stdio.h>
#include <math.h>

int main(){
  int i = flipBitwise(0);
  int j = flipMath(0);

  printf("i,j = %d,%d\n", i, j);

  return 0;
}

int flipBitwise(int i) {
  return i^1;
}

int flipMath(int i) {
  return abs(i-1);
}

Now you generate an assembly file once again, using gcc -S tinyflip.c, and copy the body of each function’s assembly code into two separate files. flip_bitwise.asm contains the instructions within the flipBitwise function, and flip_math.asm contains the instructions within the flipMath function. Now you compare them.


diff flip_bitwise.asm flip_math.asm

9c9,13
< 	xorl	$1, %eax
---
> 	subl	$1, %eax
> 	movl	%eax, %edx
> 	sarl	$31, %edx
> 	xorl	%edx, %eax
> 	subl	%edx, %eax

Wow. flipMath really does have a lot more going on than you had hoped. In addition to the b-1 subtraction that is part of your mathematical formula, there are several more instructions that need to be carried out. movl moves the result of the b-1 subtraction into another part of memory. sarl is a bitshift to the right. Then a XOR evaluation is performed, followed by another subtraction. These steps together carry out the abs() function. If the integer is negative, that sequence of instructions will convert it to its positive integer representation. Not only are there extra instructions, but the process of computing the absolute value contains an evaluation, a XOR evaluation!

The top part of that diff output contains the entire computation of the bitwise XOR function. Yup, that’s it, just one instruction. xorl performs the bitwise evaluation and stores the result in %eax. Then the function returns. That’s it!

Well, that proves it. You feel like an idiot. If you had only dug out that old electronics kit, built yourself any digital circuit with a single bit output and fed it to a simple two input XOR gate, you could have saved yourself time, pain, anxiety, perhaps even money, and most importantly self-esteem.

Here’s flipper.c for your veiwing pleezure.


#include <stdio.h> #include <stdlib.h> #include <sys/time.h> #include <math.h>

struct timeval tm;
const long NUM_ITER=10000000;

double getTimeInMillisecs() {
gettimeofday(&tm, NULL);
double d = tm.tv_sec + ((long)tm.tv_usec/1000.0);
return d;
}

void flipBitEval(short *b) {
*b = *b^1;
}

void flipBitMath(short b) {
*b = abs(
b-1);
}

void flipBitsEval() {
double t1 = getTimeInMillisecs();
double t2 = t1;

short bit = 0; int count = 0; while(count < NUM_ITER) { flipBitEval(&bit); count += 1; } t2 = getTimeInMillisecs(); printf(“Time to flip bitwise: %f\n”, t2-t1);

}

void flipBitsMath() {
double t1 = getTimeInMillisecs();
double t2 = t1;

short bit = 0; int count = 0; while(count < NUM_ITER) { flipBitMath(&bit); count += 1; } t2 = getTimeInMillisecs(); printf(“Time to flip math: %f\n”, t2-t1);

}

int main(int argc, char *argv[]) {
if(argv == NULL || argv1 == NULL) {
printf(“You must supply either m or b\n”);
return 1;
}

if(*argv1 == ‘m’) { flipBitsMath(); flipBitsEval(); } else if(*argv1 == ‘b’) { flipBitsEval(); flipBitsMath(); } return 0;

}

NOTE: This is partially based on a true story about an individual that shall not be named.

blog comments powered by Disqus