Contents

Legal Notices
Chapter 1: Quick Start
Chapter 2: Introduction and Installation
Chapter 3: About Memory Analysis
Chapter 4: Finding Memory Leaks
Chapter 5: Finding Memory Errors
Chapter 6: Startup Options
Chapter 7: Viewing Error Messages
Chapter 8: Viewing Source Code
Chapter 9: Tips and Techniques
Chapter 10: Troubleshooting
Chapter 11: Obtaining Support

Chapter 10:  Troubleshooting


MAL_FAIL or "Unable to malloc" message

This message indicates that the program is running out of available heap memory. Because ZeroFault uses memory for its own purposes, a program may run out of memory under ZeroFault even though it runs fine without it. First, check that your data ulimit is set high enough. You can display the current value by running:

	$ ulimit -d

(Note that ulimit displays and accepts values in kilobytes.) The default data ulimit is often set to 128 MB, which may be too small. For example, you can raise it to 250 MB (256000 KB) by running:

	$ ulimit -d 256000
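
Because ulimit works in kilobytes, the conversion is easy to get wrong. The following is a minimal sketch of the arithmetic, assuming a POSIX-style shell. The function names mb_to_kb and kb_to_mb are illustrative only and are not part of ZeroFault or AIX.

```shell
# Hypothetical helpers (not part of ZeroFault or AIX): convert between
# megabytes and the kilobyte units that ulimit -d displays and accepts.
mb_to_kb() { echo $(( $1 * 1024 )); }
kb_to_mb() { echo $(( $1 / 1024 )); }

mb_to_kb 250      # prints 256000, the value to pass to ulimit -d
kb_to_mb 256000   # prints 250, the same limit expressed in megabytes
```

With these defined, a 250 MB limit could be requested with ulimit -d "$(mb_to_kb 250)". A soft limit can only be raised as far as the hard limit allows.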

If increasing your ulimit doesn't solve the problem, you can build or patch the target program to use a large address space. Normal AIX executables use only one segment, or 256 megabytes, of address space for data. With the large address space feature, a program can use up to 8 segments, or 2 gigabytes, for data.

You can build your program with the large address space by passing the -bmaxdata flag to the compiler/linker. For instance, the executable created by the following command will use up to 4 segments (1 gigabyte) for its data:

	$ cc -o foo foo.o -bmaxdata:0x40000000
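
The value given to -bmaxdata is a byte count, so each 256-megabyte segment contributes 0x10000000. As a rough sketch of that arithmetic, assuming a POSIX-style shell with 64-bit arithmetic, the hypothetical helper below (maxdata_for_segments is an illustrative name, not an AIX or ZeroFault command) prints the value for a given number of segments.

```shell
# Hypothetical helper: compute a -bmaxdata value for a given number of
# 256 MB segments. The 1-to-8 range follows the limits described above.
maxdata_for_segments() {
    if [ "$1" -lt 1 ] || [ "$1" -gt 8 ]; then
        echo "segment count must be between 1 and 8" >&2
        return 1
    fi
    # One segment is 256 MB = 268435456 bytes = 0x10000000.
    printf '0x%X\n' $(( $1 * 268435456 ))
}

maxdata_for_segments 4   # prints 0x40000000, as in the example above
maxdata_for_segments 8   # prints 0x80000000, the 2 gigabyte maximum
```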

You can also patch an already linked executable to use the large address space by using a simple command. The AIX General Programming Concepts guide gives more details about the large address space model and how to patch executables to use it. If you have the documentation engine for AIX, search for "Large Program Support."

Other possible causes of running out of memory are:

  • Running out of paging space. Use the lsps -a command to check paging space while the process is running.
  • Corruption of the heap data structures, caused by program errors. Resolving the errors indicated by ZeroFault, especially the Bad Memory Writes, should fix this.

A program that runs fine without ZeroFault fails when run under ZeroFault

ZeroFault is designed not to affect a program's behavior, so your program should act the same as when it is run from the command line without ZeroFault. However, timing-dependent programs may behave differently under ZeroFault, because it makes the program run slower. Since normal execution speed already varies with factors such as system load, it is generally not a good idea for a program to depend on timing this closely in any case.

Programs that have memory errors may act differently when run under ZeroFault, because ZeroFault changes the location of allocated memory and the contents of uninitialized memory may differ. Such programs may exhibit different failure characteristics under ZeroFault, or may even fail under ZeroFault but not when run normally. Resolving all the errors reported by ZeroFault should eliminate these problems.

A setuid program doesn't run under ZeroFault

ZeroFault will run setuid/setgid programs, but it must run them under the uid/gid of the user invoking ZeroFault. This is due to a security restriction in the AIX facility that ZeroFault uses to launch the program. If you need the program to run under the uid/gid that it is set to, use the su command to set your uid/gid to that of the program before you run ZeroFault.
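
To confirm whether a program is affected, you can check its setuid and setgid bits before running it. The following is a minimal sketch using the standard test -u and test -g operators, assuming a POSIX-style shell. The function name special_bits is illustrative and not part of ZeroFault.

```shell
# Hypothetical helper: report whether a file has the setuid and/or
# setgid bit set, using the standard test -u and test -g operators.
special_bits() {
    if [ -u "$1" ] && [ -g "$1" ]; then
        echo "setuid+setgid"
    elif [ -u "$1" ]; then
        echo "setuid"
    elif [ -g "$1" ]; then
        echo "setgid"
    else
        echo "none"
    fi
}

# Check a program; /bin/sh is only a placeholder path here.
special_bits /bin/sh
```

If the result is anything other than "none", ls -l on the program shows the owner and group that it is set to run as.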

© Copyright 2013 The ZeroFault Group, LLC. All rights reserved. All logos and trademarks are property of their respective owners.