Feb 25 2012
 

Valgrind – a cool tool to check memory related problems

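Valgrind is a very useful tool for checking memory-related problems such as memory leaks, double frees, and invalid memory accesses. Most segmentation faults can be tracked down with it. I just found a very nasty bug, and along the way I learned that Valgrind is extremely helpful, but that using it well takes a bit of technique.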

First, the problem:

My program kept dying with a Segmentation Fault. I started by running it under Valgrind; the important part of the output was:


==11060== Invalid write of size 8
==11060== at 0x44FF07: FileReader::FileReader() (in /net/nfsb/dumbo/home/zhanxw/smallTool/BamPileup)
==11060== by 0x410D05: BufferedReader::BufferedReader(char const*, int) (IO.h:232)
==11060== by 0x41126C: LineReader::LineReader(char const*) (IO.h:332)
==11060== by 0x41047A: RangeList::addRangeFile(char const*) (RangeList.cpp:128)
==11060== by 0x405B28: main (BamPileup.cpp:258)
==11060== Address 0x75fc2e8 is 0 bytes after a block of size 40 alloc'd
==11060== at 0x4C27CC1: operator new(unsigned long) (vg_replace_malloc.c:261)
==11060== by 0x411252: LineReader::LineReader(char const*) (IO.h:332)
==11060== by 0x41047A: RangeList::addRangeFile(char const*) (RangeList.cpp:128)
==11060== by 0x405B28: main (BamPileup.cpp:258)
==11060==

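Because my BufferedReader contains a FileReader, my first ideas were:
1. A bug in my own code
I wrote both BufferedReader and FileReader myself and had used them many times without trouble. Since Valgrind reported the error at IO.h:232, I went over that piece of code again and again.
2. A problem with a linked library
I recompiled the whole codebase several times.

The problem was still there. Later I passed Valgrind the options --show-reachable=yes --leak-check=full and ran it again: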

==11908== Invalid write of size 8
==11908== at 0x4584BB: FileReader::FileReader() (BgzfFileTypeRecovery.cpp:239)
==11908== by 0x4106B5: BufferedReader::BufferedReader(char const*, int) (IO.h:232)
==11908== by 0x410C28: LineReader::LineReader(char const*) (IO.h:332)
==11908== by 0x40FE2A: RangeList::addRangeFile(char const*) (RangeList.cpp:128)
==11908== by 0x4054D8: main (BamPileup.cpp:258)
==11908== Address 0x75fc2e8 is 0 bytes after a block of size 40 alloc'd
==11908== at 0x4C27CC1: operator new(unsigned long) (vg_replace_malloc.c:261)
==11908== by 0x410C0E: LineReader::LineReader(char const*) (IO.h:332)
==11908== by 0x40FE2A: RangeList::addRangeFile(char const*) (RangeList.cpp:128)
==11908== by 0x4054D8: main (BamPileup.cpp:258)
==11908==

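This time the cause was obvious at once. The top frame, FileReader::FileReader(), now resolved to BgzfFileTypeRecovery.cpp:239, which is in the code I was linking against and not in my IO.h. Both that code and mine define a class called FileReader, and the linker had picked the wrong FileReader for my program. Its constructor wrote 8 bytes just past the 40-byte block allocated in my LineReader constructor, and that is what crashed the program.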
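One common way to avoid this kind of clash is to put each FileReader in its own namespace, so the two classes end up with different linker symbols. The following is only a minimal sketch under my own assumptions: the namespaces mytool and otherlib, the member layouts, and the helper functions are all hypothetical stand-ins, not the real classes from IO.h or BgzfFileTypeRecovery.cpp.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>

// Hypothetical stand-in for my own reader: a single pointer-sized member.
namespace mytool {
class FileReader {
 public:
  FileReader() : fp_(nullptr) {}
  std::FILE* handle() const { return fp_; }

 private:
  std::FILE* fp_;
};
}  // namespace mytool

// Hypothetical stand-in for the other library's reader: same class name,
// but a larger layout.
namespace otherlib {
class FileReader {
 public:
  FileReader() : fp_(nullptr), buffer_(nullptr), size_(0) {}
  std::size_t size() const { return size_; }

 private:
  std::FILE* fp_;
  char* buffer_;
  std::size_t size_;
};
}  // namespace otherlib

// True when the two layouts differ in size. If both classes lived in the
// global namespace, running the larger constructor on storage sized for the
// smaller class would write past the end of the block.
constexpr bool layouts_differ() {
  return sizeof(mytool::FileReader) != sizeof(otherlib::FileReader);
}

// With namespaces, both classes can be used side by side in one program.
bool both_readers_start_empty() {
  mytool::FileReader mine;
  otherlib::FileReader theirs;
  return mine.handle() == nullptr && theirs.size() == 0;
}

// Small self-check that can be called from main().
void self_check() {
  assert(layouts_differ());
  assert(both_readers_start_empty());
}
```

Without the namespaces, both constructors would share the symbol FileReader::FileReader(), and the linker would be free to keep either one without a diagnostic. Calling self_check() from main() is enough to try the sketch.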

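To sum up, if only:
1. I had used these Valgrind options right from the start
2. I had tested my program before linking in someone else's code, so that the problem could be pinned on the newly added code

Sadly, there are not that many "if only"s. This post is in memory of the three hours that just went by.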