Software Development

From source code to executable - what's involved?
 

Editing -> Preprocessing -> Compiling -> Linking


Editing
- Software program to perform basic text processing tasks
- Software program typically referred to as an editor
- User (programmer) enters instructions in a programming language
- User saves instructions or program code as a file on disk
- Disk files of program code are referred to as source files
- User makes necessary modifications to file
- To help identify source files, suffixes are appended to the file name
- Commonly used editors in Unix:  emacs, xemacs, vi, pico
 
                   
Some commonly used file suffixes

.c

C language source file

.C  .cpp  .cxx  .cc

C++ language source file

.h

C header file

.hh 

C++ header file


 

Exercise
- In a shell window, start up your Unix-based editor (e.g. emacs) on a new, empty file

 

user@machine %  emacs --no-windows ex1.c

 

- Using your editor, enter the following simple C program code
- Save the program to a disk file

 

/* Simple C exercise */

#include <stdio.h>
#include <math.h>

#define NUM    10

int main(void)
{
    int i;

    for( i = 0; i < NUM; ++i )
        printf( "i: %d\n", i );

    printf( "asin(0.0): %f\n", asin(0.0) );

    return 0;
}

 

Preprocessing
- Software program called automatically before compilation
- This program, called a preprocessor, is invoked by the compiler
- Program performs special commands specified in source files
- Commands usually consist of
    - including commands/text from other files into source file
    - substituting or expanding symbols in source file
- Preprocessor instructions in code identified by # sign
- These instructions are also referred to as preprocessor directives
- Allows programmer to develop code that is easier to...
    - read
    - modify
    - port to other systems

Exercise
- Compile your C code only through the preprocessor phase by executing

 

user@machine %  cc -E ex1.c

 

- With the -E compiler option, compilation will terminate after the preprocessor phase
- What is the output of this step?  What do you notice?



- Notice how the preprocessor has taken the contents of stdio.h and math.h and included them in the code.

- Where do you suppose math.h and stdio.h are located in the Unix operating system? Look in /usr/include!

- Note also how NUM in the output has been substituted with the value specified (10)

 

Compiling
- Translates a program from one language to another
- The compiler we'll use translates C/C++ language code to machine code
- Compilers first translate source code to assembly language code

Exercise
- Compile your C code into assembly language code by executing

 

user@machine %  cc -S ex1.c

 

- This will create an assembly language equivalent file with a .s extension
- Take a look at this file and note the assembly language instructions


- Programs called assemblers then translate assembly language to machine code
- Compiler calls assembler automatically during compilation
- Machine language code is also called object code
- Object code is in binary (non-readable) format
- Object code files are identified with the suffix .o

Exercise
- Compile your C code into object code by executing

 

user@machine %  cc -c ex1.c

 

- This will create an object code equivalent file with a .o extension
- Remember, this is a binary file (not ASCII), so it is not readable in a text editor

 

- Compilers examine code statements and check for syntax/semantics errors
- Compilers report any mistakes or errors to user and terminate compilation
- Users correct any errors back in the editing step and continue the process

 

Exercise
- Introduce an error into your C code by editing your file as follows

- Change the statement


    printf( "i: %d\n", i );

  to (remove the semi-colon at the end of the statement)

    printf( "i: %d\n", i )

 

- Re-compile your C code into object code by executing

 

    user@machine %  cc -c ex1.c

 

- What did the compiler report?
- Correct the error back in your editor


Linking
- A program called the linker (ld) combines, or "links", the program's object code with other object code
- Program functions not defined in source file need to be located and combined with the program object code
- Other functions might be located in other object files or in collections of objects (libraries)
- Linker, often invoked from the compiler, needs to know what object files or libraries to link
- Successful linking phase results in the final binary program, or executable image

Exercise
- Compile your C code through the linking phase by executing

 

user@machine %  cc ex1.c

 

- What is the result?
- Depending on the compiler, this may have created an executable program named a.out.

- With some compilers, this linking step may have generated an error
- Linking errors can occur if the linker cannot find a particular function in your program

- For example, if the linker cannot find the math function (asin) we are using, it will generate an error
- In these instances, you can instruct the linker to look in a math library for the function by executing

user@machine %  cc ex1.c -lm

 

- The -lm option instructs the linker to look in the math library (libm.a, or the shared library libm.so) for undefined functions
- Note the convention used (i.e. -l(name) => link library named lib(name).a or lib(name).so)

- To name our executable something other than a.out, use the -o option
- Compile your C code a final time with an output name by executing

user@machine %  cc -o ex1 ex1.c -lm

 

- Where do you suppose the math library (libm.a) is located in the Unix operating system? (look in /usr/lib)

- Code in these system libraries is linked and combined with executable code

- Linking occurs either when the program is built (static linking) or at run time, when the program is executed (dynamic linking)

 

 

Loading
    - Before execution, the program must be transferred from disk to machine memory
    - A program called the loader automatically places the program in memory
    - This is done automatically when a user executes the program (by typing its name on the command line, e.g. ./ex1)
 

Execution
    - Once the program is loaded in memory, the CPU executes each instruction
    - Instructions are executed in sequence, except where loops, branches, or function calls redirect the flow
 

Debugging
    - Improper program execution necessitates correction by the programmer
    - Source code must be re-examined and analyzed to track down problems
    - Modifications are made in the editing phase and the process continues
 

Final Notes
    - By default, most compilers will invoke the various phases automatically
    - Specifically this involves calling the preprocessor, assembler, and linker
    - This is evident in our last exercise above when the final program was generated
    - We'll be writing C/C++ code in multiple source files as well as using library functions
    - What compiler option do you suppose we'll be using the most?  Why?
    - Our compiler commands will start to become longer and more complex
    - We'll learn that special files called Makefiles help us organize our compilation tasks