| Chapter 3:  About Memory Analysis 
	The Basic Objective

              Memory analysis is the practice of monitoring the behavior of a program
              during its execution to determine if there are errors in the program's
              use of memory. Understanding the basic principles of memory analysis
              is very useful in interpreting ZeroFault's output.  There
              are two broad categories of errors that memory analysis detects:
              the first is reading or writing to memory that is not available
              for use by the program, and the second is reading from memory available
              to the program but which has not been initialized. Either type of
              error can be benign or catastrophic depending on the circumstances
              of the program's execution. Errors involving the
              use of allocated but uninitialized memory are more likely to be
              benign than use of memory not allocated to the process. Writing
              to unallocated memory is more likely to be harmful than reading
              from unallocated memory. Any of these errors can have catastrophic
              results and often will only manifest themselves intermittently.
              
             These
              two types of errors result in many of the more serious errors that
              can plague the software development process. These errors can be
              very difficult to detect and even more difficult to resolve. This
              is because the symptom often seems completely unrelated to the cause.
              For instance, a memory overwrite (writing more data than was actually
              allocated) may overwrite non-essential data or, when the overwritten
              data is critical, that data may not be accessed until much later. As a
              result the program either appears to work or does not begin to
              misbehave (or crash) until long after the overwrite occurred. 
             In
              order to detect errors ZeroFault monitors every memory allocation,
              reallocation and free that a process performs and also monitors
              every instruction and system call that the process executes. The
              data that ZeroFault collects when a memory management function is
              called (typically malloc, realloc or free)
              is used to verify the legality of each instruction as it is executed.
              For instance, when a process allocates 10 bytes of memory and then
              uses 11 or 12 bytes instead of just the 10 that were allocated, ZeroFault
              detects this and generates an error message indicating the instruction
              and line of code where the error occurred. This is an example of
              the basic function of memory analysis: the examination of the execution
              of the program to detect instances where the program's actions are
              not legal within the context of the current memory allocation and
              initialization state. 
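
              The bookkeeping described above can be sketched in C. This is only
              a minimal illustration of the idea, not ZeroFault's implementation:
              the names tracked_malloc, tracked_free, and access_is_legal, and the
              fixed-size table, are all hypothetical. Each allocation is recorded,
              and an access is treated as legal only if it lies entirely inside
              one live block.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical bookkeeping record: one entry per live heap block. */
struct block {
    uintptr_t start;   /* address returned by malloc */
    size_t    size;    /* number of bytes requested  */
    int       live;    /* 1 while the block is allocated */
};

#define MAX_BLOCKS 64
static struct block table[MAX_BLOCKS];

/* Wrapper around malloc that records the block it returns. */
void *tracked_malloc(size_t size)
{
    void *p = malloc(size);
    if (p == NULL)
        return NULL;
    for (int i = 0; i < MAX_BLOCKS; i++) {
        if (!table[i].live) {
            table[i] = (struct block){ (uintptr_t)p, size, 1 };
            return p;
        }
    }
    free(p);           /* table full: refuse rather than lose track */
    return NULL;
}

/* Wrapper around free that retires the matching record. */
void tracked_free(void *p)
{
    for (int i = 0; i < MAX_BLOCKS; i++) {
        if (table[i].live && table[i].start == (uintptr_t)p) {
            table[i].live = 0;
            break;
        }
    }
    free(p);
}

/* Returns 1 if the len bytes starting at addr lie inside one live block. */
int access_is_legal(const void *addr, size_t len)
{
    uintptr_t a = (uintptr_t)addr;
    for (int i = 0; i < MAX_BLOCKS; i++) {
        if (table[i].live && a >= table[i].start &&
            a - table[i].start <= table[i].size &&
            len <= table[i].size - (a - table[i].start))
            return 1;
    }
    return 0;
}
```

              With a block obtained from tracked_malloc(10), a 10-byte access at
              the start of the block is accepted, while an 11-byte access, or a
              1-byte access at offset 10, is rejected.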
             In
              order for ZeroFault or any other memory analysis tool to detect
              illegal use of memory it must understand the semantics of the memory
              allocation scheme in use. ZeroFault understands the semantics of
              the standard C library functions malloc, realloc,
              and free. Since the C++ operators new and delete
              resolve to the standard C library functions they are also covered.
              Fortran and Pascal programs also manage memory using the standard
              C library functions on AIX so they are also covered by default.
              If a program uses any form of memory allocation which eventually
              resolves to the standard C library functions it will also be fully
              covered. 
             If
              a program uses non-standard memory management ZeroFault will not
              be able to detect as many errors because ZeroFault will have no
              way of knowing when a region of memory is or is not available. 
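
              To see why non-standard memory management hides errors, consider
              the following sketch of a pool allocator. It is a hypothetical
              example (the pool structure, SLOT_SIZE, SLOT_COUNT, and the
              function names are invented for illustration). The pool calls malloc
              once and then hands out fixed-size slots itself, so a tool that
              only understands malloc sees a single 64-byte block and knows
              nothing about the slot boundaries inside it.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical fixed-size pool: one malloc call, many sub-blocks. */
#define SLOT_SIZE  16
#define SLOT_COUNT 4

struct pool {
    char  *base;   /* start of the single malloc'd region */
    size_t next;   /* index of the next unused slot */
};

/* Returns 1 on success, 0 if the underlying malloc failed. */
int pool_init(struct pool *p)
{
    p->base = malloc(SLOT_SIZE * SLOT_COUNT);  /* the only allocation a malloc-aware tool sees */
    p->next = 0;
    return p->base != NULL;
}

/* Hands out the next slot, or NULL when the pool is exhausted. */
void *pool_alloc(struct pool *p)
{
    if (p->next == SLOT_COUNT)
        return NULL;
    return p->base + SLOT_SIZE * p->next++;
}

void pool_destroy(struct pool *p)
{
    free(p->base);
    p->base = NULL;
}

/* Writes n bytes into a slot without checking n against SLOT_SIZE (the bug). */
void fill_slot(char *slot, char value, size_t n)
{
    memset(slot, value, n);
}
```

              Calling fill_slot on the first slot with n set to SLOT_SIZE + 4
              silently overwrites the first four bytes of the second slot. Every
              byte written is still inside the region returned by malloc, so an
              analysis based only on malloc, realloc, and free has no grounds to
              report it.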
             How Memory is Organized on AIX

              There are four basic memory regions in an AIX program: Stack, Data, BSS,
              and Heap. Sometimes the Data, BSS, and Heap areas are collectively
              referred to as the "data segment".

              For a normal (non-large data model) AIX program the Data segment begins
              at the bottom of segment 2 (0x20000000) and ends at the beginning
              of the BSS segment. The Data segment contains global and static data
              that the program initializes to values other than zero. For instance the
              array defined at file scope by char s[] = "hello world"; would exist in the
              Data segment. 
             The
              BSS segment starts at the end of the Data segment and contains all
              global and static variables that are initialized to zero. For instance
              a variable declared static int i; would be contained in the BSS segment.
              
             The
              Data and BSS segments are fixed in size at link time and do not
              grow or change during program execution. All data items in both
              segments are considered readable and writable because the data is
              initialized and available for program use. Each shared object module
              or shared library will have its own Data and BSS segments. Shared
              library data segments are stored in segment 2 (0x20000000) on AIX
              3.2 and in segment F (0xF0000000) on AIX 4.x. 
             The
              Heap area begins at the end of the BSS segment and grows to larger
              addresses from there. The Heap area is managed by malloc,
              realloc, and free, which use the brk
              and sbrk system calls to adjust its size. The Heap area
              is shared by all shared libraries and dynamic load modules in a
              process. 
             The
              Heap area is the primary source of memory management problems and
              is the main focus of the analysis that ZeroFault performs on a process.
              When a program allocates memory via malloc, it is returned
              a pointer to a region of memory of the appropriate size. Just before
              that region of memory there is information that malloc,
              realloc, and free will use to manage the Heap.
              If that management information gets overwritten it will often cause a
              segmentation violation in a function called malloc_y, free_y,
              realloc_y, or one of their child functions. If a program
              does have a segmentation violation in one of those functions it
              is highly probable that the problem is a memory overwrite error.
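
              The effect of overwriting heap management information can be
              illustrated with a toy allocator. The sketch below is an assumption
              made for illustration only: the text does not describe the actual
              layout used by the AIX malloc, and struct header, toy_alloc, and
              toy_size are hypothetical. It places a small header just before
              each region it returns, carved from a static arena, and the
              returned regions are intended for byte data only.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical management record stored just before each returned region. */
struct header {
    size_t size;   /* number of bytes the caller asked for */
};

static unsigned char arena[256];
static size_t arena_used;

/* Rounds n up to a multiple of the header size so headers stay evenly spaced. */
static size_t round_up(size_t n)
{
    size_t unit = sizeof(struct header);
    return (n + unit - 1) / unit * unit;
}

/* Returns a region of 'size' bytes, preceded in memory by its header. */
void *toy_alloc(size_t size)
{
    if (size > sizeof arena)
        return NULL;
    size_t total = round_up(sizeof(struct header) + size);
    if (total > sizeof arena - arena_used)
        return NULL;

    unsigned char *raw = arena + arena_used;
    struct header h = { size };
    memcpy(raw, &h, sizeof h);      /* header goes first ...            */
    arena_used += total;
    return raw + sizeof h;          /* ... caller's region starts after */
}

/* Reads back the size recorded in the header just before p. */
size_t toy_size(const void *p)
{
    struct header h;
    memcpy(&h, (const unsigned char *)p - sizeof h, sizeof h);
    return h.size;
}
```

              If two 8-byte regions are allocated one after the other, writing
              8 + sizeof(struct header) bytes into the first one overwrites the
              header of the second. The first region still looks healthy, but
              toy_size on the second now returns garbage. A real allocator that
              trusts such a header would fail later, inside its own code, far
              from the statement that caused the damage.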
              
             In
              large data model programs the data segment begins at the beginning
              of segment 3 (0x30000000) and grows upward from there. It is followed
              by the BSS segment for the primary executable and then by the Heap
              area shared by all load modules. 
             The
              fourth region of memory in AIX is the stack. In AIX the stack starts
              at the top of segment 2 (0x2FFFFFFC) and grows to lower memory addresses
              from there. The stack pointer (register 1) points at the lowest
              point on the stack that is valid for access. The stack region contains
              automatic, or local, variables (for instance a variable declared
              as int i; within a function). Except in certain special
              circumstances a stack frame is created each time a function is called.
              Information such as saved register values, parameters, and the return
              address is stored in the stack frame in addition to local variables.
              When a function returns to its caller the stack is popped
              and the region of stack memory that was used by that function is
              no longer available to the process. 
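
              As a brief illustration of the four regions, the following sketch
              annotates where each object in a small C fragment would be expected
              to live under the layout described above. The variable and function
              names are hypothetical, and the placement comments follow the
              description in this chapter rather than anything the C language
              itself guarantees.

```c
#include <stdlib.h>

char greeting[] = "hello world";   /* explicit non-zero initializer: Data */
static int call_count;             /* implicitly zero: BSS */

/* Touches one object in each region and returns a value derived from them. */
int regions_demo(void)
{
    int local = 5;                            /* automatic variable: Stack */
    int *dynamic = malloc(sizeof *dynamic);   /* obtained from malloc: Heap */
    if (dynamic == NULL)
        return -1;
    *dynamic = 7;                             /* heap memory must be written before it is read */

    call_count++;                             /* BSS data persists across calls */
    int sum = local + *dynamic + call_count;

    free(dynamic);                            /* heap region returned to the allocator */
    return sum;                               /* 'local' ceases to exist when the frame is popped */
}
```

              The first call returns 13 and the second returns 14, because
              call_count lives in BSS for the life of the process while local
              and the heap block are recreated on every call.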
             Stack
              memory has no known initial state. The initial value of a stack
              variable is not defined unless it is explicitly initialized. ZeroFault
              checks every operation on a stack variable to ensure that all stack
              memory is properly initialized before it is read. Reading the value of
              a stack variable before it has been initialized will generate a USTKR
              error (Uninitialized Stack Read). 
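
              A typical source of this kind of error is a local variable that is
              assigned on only some paths through a function. The fragment below
              is a hypothetical example (the function names are invented). The
              first version contains the bug deliberately and the second shows
              the usual fix.

```c
/* BUG (deliberate): 'value' is assigned only when arg starts with '-'. */
int flag_value_buggy(const char *arg)
{
    int value;                 /* stack variable: no defined initial value */
    if (arg[0] == '-')
        value = 1;
    return value;              /* uninitialized stack read when arg[0] != '-' */
}

/* Fixed: every path reads a value that has been written. */
int flag_value_fixed(const char *arg)
{
    int value = 0;             /* explicit initialization */
    if (arg[0] == '-')
        value = 1;
    return value;
}
```

              The buggy version behaves correctly whenever the argument begins
              with '-', which is why such errors can survive testing. Only a call
              with some other argument reads the uninitialized variable.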
             Loads and Stores

              A compiled program consists of a sequence of machine instructions. These
              instructions can be classified into three groups on Power and PowerPC CPUs:

                instructions that load memory into CPU registers (read or load instructions)

                instructions that store CPU registers into memory (write or store instructions)

                instructions that operate on the contents of CPU registers

              Monitoring each load and store is the core of advanced memory analysis. When
              a store operation is performed ZeroFault checks that the region of
              memory to be written to is available to the process for writing. Further,
              the region of memory that was just stored to is marked as now having
              a known state. It is now available for reading if it was not already
              available for reading.

              When a load instruction is performed ZeroFault checks the address that
              is being read from to ensure that it is allocated to the process
              and that it is available to be read from. In short it means that
              it must be both allocated and initialized. Initialized means
              that it must have come from a source that has a known initial state
              (sbrk, Data, or BSS) or that it must have been written
              to since it was made available to the process (malloc,
              realloc, or stack memory). 
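
              The load and store rules above amount to keeping a small state for
              every byte of memory. The following sketch models that idea for a
              32-byte region. It is a simplified illustration under assumed names
              (enum state, shadow, mark_allocated, check_store, check_load) and
              is not a description of how ZeroFault stores its data.

```c
#include <stddef.h>

/* Per-byte state, in the order a byte normally moves through them. */
enum state { UNALLOCATED, ALLOCATED, INITIALIZED };

#define REGION 32
static enum state shadow[REGION];   /* zero-initialized: every byte UNALLOCATED */

static int in_range(size_t off, size_t len)
{
    return off <= REGION && len <= REGION - off;
}

/* Models malloc: bytes become usable but their contents are unknown. */
int mark_allocated(size_t off, size_t len)
{
    if (!in_range(off, len))
        return 0;
    for (size_t i = 0; i < len; i++)
        shadow[off + i] = ALLOCATED;
    return 1;
}

/* A store is legal if every byte is allocated; afterwards the bytes are known. */
int check_store(size_t off, size_t len)
{
    if (!in_range(off, len))
        return 0;
    for (size_t i = 0; i < len; i++)
        if (shadow[off + i] == UNALLOCATED)
            return 0;
    for (size_t i = 0; i < len; i++)
        shadow[off + i] = INITIALIZED;
    return 1;
}

/* A load is legal only if every byte is both allocated and initialized. */
int check_load(size_t off, size_t len)
{
    if (!in_range(off, len))
        return 0;
    for (size_t i = 0; i < len; i++)
        if (shadow[off + i] != INITIALIZED)
            return 0;
    return 1;
}
```

              After mark_allocated(0, 10), a load of bytes 0 to 3 is still
              rejected because nothing has been written there. Once
              check_store(0, 4) succeeds the same load is accepted, while a
              store of 4 bytes at offset 8 is rejected because it runs past the
              end of the 10-byte block.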
             System Calls

              On AIX
            (and most operating systems) the kernel is a "black box", and while
            the interfaces to that black box are well-defined, what goes on inside
            is hidden from the application program. Many system calls operate
            on application program memory; for instance read will read
            some data from a file descriptor and write that data into a buffer
            in the application program's memory. Similarly write will
            read some data from an application memory buffer and write that data
            to a file descriptor.

              When
              doing memory analysis it is necessary to validate the parameters
              to each system call. For instance when the read system call is invoked
              it is necessary to ensure that the buffer passed to read is available
              to the application program and is large enough to contain the amount
              of data that read may fill in. Further, if that region
              of memory was not previously marked as initialized the bytes written
              to by read must now be marked as initialized. 
             ZeroFault
              knows the semantics of all system calls defined on AIX and validates
              the parameters passed to each of them to make sure that they are
              available to the process and initialized if necessary. When a system
              call updates application memory ZeroFault marks that memory as initialized.
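
              The two steps described for read, validating the buffer before the
              call and marking the bytes written afterwards, can be sketched as
              a wrapper. This is a hypothetical illustration: checked_read, the
              fixed buffer, the initialized array, and the return value -2 for a
              rejected request are all invented here and are not part of any
              real interface.

```c
#include <stddef.h>
#include <unistd.h>

/* Hypothetical application buffer with one "initialized" flag per byte. */
#define BUF_SIZE 16
static char buf[BUF_SIZE];
static unsigned char initialized[BUF_SIZE];   /* 1 = byte holds known data */

/*
 * Reads up to 'count' bytes from fd into buf starting at offset 'off'.
 * Returns -2 (an invented code) if the request does not fit in the buffer,
 * otherwise whatever read returns.
 */
ssize_t checked_read(int fd, size_t off, size_t count)
{
    /* Before the call: the buffer must be able to hold everything read may store. */
    if (off > BUF_SIZE || count > BUF_SIZE - off)
        return -2;

    ssize_t n = read(fd, buf + off, count);

    /* After the call: only the bytes the kernel actually wrote become initialized. */
    for (ssize_t i = 0; i < n; i++)
        initialized[off + (size_t)i] = 1;

    return n;
}
```

              Note that the check uses the requested count, because that is how
              much the kernel is permitted to write, while the marking uses the
              returned count, because that is how much it actually wrote. If
              read returns 3 for a request of 8, bytes 3 through 7 of the buffer
              remain uninitialized.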
              
             If
              there are system calls that have been added to the system that ZeroFault
              does not know about ZeroFault will not be able to validate the parameters
              to them or mark the regions that they write to as initialized. As a
              result, when non-standard system calls are in use, some errors may
              be missed and some uninitialized memory read errors may be falsely
              reported. 
             A Word About Signal Handlers

              A signal
            handler is a function that is invoked when a signal is delivered to
            a process. Signal handlers are often used to process IO (SIGIO), handle
            messages from other processes (SIGUSR1, SIGUSR2), catch timeout conditions
            (SIGALRM), etc. When a signal is delivered the thread that is currently
            executing stops and the signal handler is invoked on that thread's
            stack. The signal handler must run to completion before that thread
            can continue. In a single-threaded process this means that the process
            stops all execution until the signal handler completes. In a multi-threaded
            process other threads can be executed while the signal handler is
            running, but the thread that received the signal cannot resume execution
            until the signal handler completes.

              Signal
              handlers can be invoked at any time. This poses a problem when a
              signal handler tries to update a shared resource. For example, if
              a thread receives a signal while it is in the midst of updating
              a linked list, and the signal handler examines or updates that same
              list, this could result in serious consequences (such as a segmentation
              violation). Because of this potential problem only a very limited
              number of things can be safely done from a signal handler.
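
              One common way to stay within those limits is to have the handler
              do nothing except set a flag, and to do the real work later from
              normal program flow. The sketch below shows that pattern. The
              function names are hypothetical, and the choice of SIGUSR1 and of
              the signal function (rather than sigaction) is made only to keep
              the example short.

```c
#include <signal.h>

/* The only state the handler touches: a flag that can be written atomically. */
static volatile sig_atomic_t got_signal;

/* Handler: records that the signal arrived and returns immediately. */
static void on_signal(int signo)
{
    (void)signo;
    got_signal = 1;    /* no list updates, no malloc, no stdio here */
}

/* Installs the handler for SIGUSR1; returns 0 on success, -1 on failure. */
int install_handler(void)
{
    return signal(SIGUSR1, on_signal) == SIG_ERR ? -1 : 0;
}

/*
 * Called from the main loop, outside any handler.
 * Returns 1 if a signal arrived since the last call, 0 otherwise.
 */
int poll_signal(void)
{
    if (!got_signal)
        return 0;
    got_signal = 0;
    /* Safe place to examine or update shared structures such as a linked list. */
    return 1;
}
```

              Because the handler never examines or updates the shared data, it
              cannot observe a linked list in a half-updated state, regardless
              of when the signal is delivered.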
              
           |