Software Development

From source code to executable - what's involved?

Editing -> Preprocessing -> Compiling -> Linking

- Editing is done with a software program that performs basic text processing tasks
- This program is typically referred to as an editor
- User (programmer) enters instructions in a programming language
- User saves instructions or program code as a file on disk
- Disk files of program code are referred to as source files
- User makes necessary modifications to file
- To help identify source files, suffixes are appended to the file name
- Commonly used editors in Unix:  emacs, xemacs, vi, pico

Some commonly used file suffixes

.c                      C language source file
.C  .cpp  .cxx  .cc     C++ language source file
.h                      C header file
.h  .hpp  .hh           C++ header file


- In a shell window, start your Unix-based editor (e.g. emacs) on a new, empty file


user@machine %  emacs --no-windows ex1.c


- Using your editor, enter the following simple C program code
- Save the program to a disk file


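/* Simple C exercise */

#include <stdio.h>
#include <math.h>

#define NUM    10

int main( void )
{
    int i;

    /* Print the loop counter NUM times */
    for( i = 0; i < NUM; ++i )
        printf( "i: %d\n", i );

    /* Call a function from the math library */
    printf( "asin(0.0): %f\n", asin(0.0) );

    return 0;
}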


- The preprocessor is a software program called automatically before compilation
- It is invoked by the compiler
- It carries out special commands specified in source files
- These commands usually consist of
    - including commands/text from other files into source file
    - substituting or expanding symbols in source file
- Preprocessor instructions in code identified by # sign
- These instructions are also referred to as preprocessor directives
- Allows programmer to develop code that is easier to...
    - read
    - modify
    - port to other systems

- Compile your C code only through the preprocessor phase by executing


user@machine %  cc -E ex1.c


- With the -E compiler option, compilation will terminate after the preprocessor phase
- What is the output of this step?  What do you notice?

- Notice how the preprocessor has taken the contents of stdio.h and math.h and included them in the code.

- Where do you suppose math.h and stdio.h are located in the Unix operating system? Look in /usr/include.

- Note also how NUM in the output has been substituted with the value specified (10)


- A compiler translates a program from one language to another
- The compiler we’ll use translates C/C++ source code to machine code
- Compilers first translate source code to assembly language code

- Compile your C code into assembly language code by executing


user@machine %  cc -S ex1.c


- This will create an assembly language equivalent file with a .s extension
- Take a look at this file and note the assembly language instructions

- Programs called assemblers then translate assembly language to machine code
- Compiler calls assembler automatically during compilation
- Machine language code is also called object code
- Object code is in binary format (not human-readable)
- Object code files are identified with the suffix .o

- Compile your C code into object code by executing


user@machine %  cc -c ex1.c


- This will create an object code equivalent file with a .o extension
- Remember, this is a binary file (not ASCII text), so it cannot be read in a text editor


- Compilers examine code statements and check for syntax/semantics errors
- Compilers report any errors to the user and terminate compilation
- Users correct the errors back in the editing step and compile again


- Introduce an error into your C code by editing your file as follows

- Change the statement

    printf( "i: %d\n", i );

  to (remove the semi-colon at the end of the statement)

    printf( "i: %d\n", i )


- Re-compile your C code into object code by executing


    user@machine %  cc -c ex1.c


- What did the compiler report?
- Correct the error back in your editor

- A program called the linker (ld) combines, or "links", object code files
- Functions that the program calls but that are not defined in its source file need to be located and combined with the program object code
- These functions might be located in other object files or in collections of objects (libraries)
- The linker, often invoked by the compiler, needs to know which object files or libraries to link
- Successful linking phase results in the final binary program, or executable image

- Compile your C code through the linking phase by executing


user@machine %  cc ex1.c


- What is the result?
- Depending on the compiler, this may have created an executable program named a.out.

- With some compilers, this linking step may have generated an error
- Linking errors can occur if the linker cannot find a particular function in your program

- For example, if the linker cannot find the math function (asin) we are using, it will generate an error
- In these instances, you can instruct the linker to look in a math library for the function by executing

user@machine %  cc ex1.c -lm


- The -lm option instructs the linker to look in the math library (libm.a) for undefined functions
- Note the convention used (i.e. -l(name) => link library named lib(name).a, or a shared version such as lib(name).so)

- To name our executable something other than a.out, use the -o option
- Compile your C code a final time with an output name by executing

user@machine %  cc -o ex1 ex1.c -lm


- Where do you suppose the math library (libm.a) is located in the Unix operating system? (look in /usr/lib)

- Code in these system libraries is linked and combined with the program's object code to form the executable

- Linking occurs either at compile time (static linking) or at run time, when the program is executed (dynamic linking)



    - Before execution, the program must be transferred from disk to machine memory
    - A program called the loader automatically places the program in memory
    - This is done automatically when a user executes the program (by typing its name on the command line)

    - Once the program is loaded in memory, the CPU executes each instruction
    - The instructions are executed sequentially, one after another

    - If the program does not execute properly, the programmer must correct it
    - Source code must be re-examined and analyzed to track down problems
    - Modifications are made in the editing phase and the process continues

Final Notes
    - By default, most compilers will invoke the various phases automatically
    - Specifically, this involves calling the preprocessor, assembler, and linker
    - This is evident in our last exercise above when the final program was generated
    - We'll be writing C/C++ code in multiple source files as well as using library functions
    - What compiler option do you suppose we'll be using the most?  Why?
    - Our compiler commands will start to become longer and more complex
    - We'll learn that special files called Makefiles help us organize our compilation tasks