Tom Copeland's Recent Posts


rcov crashing with [BUG] rb_gc_mark()

While working on some Rails apps for RollStream and using Mauricio Fernandez's excellent rcov plugin, we started to encounter the "[BUG] rb_gc_mark(): unknown data type" problem. We only saw this when we ran our controller tests; just running the unit tests wouldn't trigger it. It was a bummer, though, because we couldn't see where we were coverage-wise.

I poked around rcov for a while using Valgrind - there's no Mac OS X port, but I had a Linux VM in VMware Fusion handy. After some flailing around I finally hit paydirt. This Valgrind invocation:

valgrind --tool=memcheck --error-limit=no --leak-check=no \
--leak-resolution=low \
--log-file=valgrind.out /usr/local/bin/rcov --rails \
--aggregate coverage.data --text-summary -Ilib --html \
[... lots of controller names here ...]

turned up this problem report:

==13390== Invalid write of size 4
==13390==    at 0x784BE8E: coverage_event_coverage_hook (rcovrt.c:103)
==13390==    by 0x416E85: rb_eval (eval.c:4127)
[... stack elided ...]
==13390==  Address 0x7e419e8 is not stack'd, malloc'd or (recently) free'd

rcovrt.c line 103 involves usage of a cov_array struct; I added some bounds checking like so:

$ diff -Naur rcovrt.c ~/new.rcovrt.c 
--- rcovrt.c 2008-08-28 17:50:16.000000000 -0400
+++ /Users/tom/new.rcovrt.c 2008-08-28 17:52:15.000000000 -0400
@@ -64,7 +64,9 @@
           if(!carray->ptr[sourceline])
                   carray->ptr[sourceline] = 1;
   } else {
+   if (carray && carray->len > sourceline) {
          carray->ptr[sourceline]++;
+   }
   }

   return carray;
@@ -98,7 +100,7 @@
static void
coverage_increase_counter_cached(char *sourcefile, int sourceline)
{
- if(cached_file == sourcefile && cached_array) {
+ if(cached_file == sourcefile && cached_array && cached_array->len > sourceline) {
          cached_array->ptr[sourceline]++;
          return;
  }


I rebuilt the gem, reran the coverage task, and huzzah!  It completes!

This isn't a great fix, of course - I'd much rather figure out what's wrong with the allocation of cached_array.  Perhaps someone cleverer than I can come up with a better fix.

Updated 8/27/08: Modified to document a better fix - check the cached_array->len attribute and compare it to the sourceline. 
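
For anyone who wants to see the idea outside of a diff, here's a minimal sketch of that bounds check pulled out into a standalone function. This is not the actual rcov code: the struct below is a hypothetical stand-in that only borrows the ptr and len field names from the patch, the field types are my assumption, and safe_increment is a made-up helper name. It also checks for a NULL ptr and a negative line number, which the patch above doesn't do.

```c
#include <stddef.h>

/* Hypothetical stand-in for rcov's coverage array. Only the field names
 * ptr and len come from the patch; the types are assumed. */
struct cov_array {
    unsigned int *ptr;  /* per-line hit counters */
    long len;           /* number of counters allocated in ptr */
};

/* Bump the counter for sourceline only when the index is inside the
 * allocated array. Returns 1 if the hit was counted, 0 if it was skipped. */
static int safe_increment(struct cov_array *carray, long sourceline)
{
    if (carray == NULL || carray->ptr == NULL)
        return 0;  /* nothing to write into */
    if (sourceline < 0 || sourceline >= carray->len)
        return 0;  /* out of bounds: skip instead of scribbling on memory */
    carray->ptr[sourceline]++;
    return 1;
}
```

The tradeoff is the same as in the patch: an out-of-range line is silently not counted, so coverage numbers for that line may come out low, but the process doesn't write past the end of the array.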


Comments

Wow. Nicely done! I salute you!

@ryan, many thanks!

Just because cached_array is true doesn't mean cached_array->ptr is true, nor cached_array->ptr[sourceline]. I'm guessing one of those is invalid for one reason or another.

I think it should probably be:

if(cached_file == sourcefile && cached_array->ptr)

You could always inspect the ptr and sourceline individually to make sure they're valid, too.

@dan, yeah, I should email Mauricio, he could probably fix whatever the real problem is in about 5 minutes... the change I made seems to skip using a cached version of the source file contents, so it probably slows things down considerably. I need to dig into that extension a little more.

@dan, yup, that turned up in my Googlings, but seems inconclusive... kind of peters out with everyone saying "yeah, rb_gc_mark()" here too.

@tom - nice find. The crashes were completely random, works now, doesn't in 5 minutes, works again.

Unfortunately I've been too deep into work and severely lacking in C knowledge to try to track it down.

Hopefully a fix gets rolled into the master repo so we can all be anal about our coverage again :-)

Thanks,
Michael

Thanks for digging into this, Tom. Looks like Scott Barron has been busy too:

http://github.com/spicycode/rcov/commit/66909fb17cce40e3cf2e1312b16f7ab97b9fe559

I hope we'll see a new rcov gem soon...

@aslak - hm, I tried Scott's change but am still seeing segfaults; this time with just a "[BUG] Segmentation fault". I need to fire up Valgrind again and try to get to the bottom of those double frees...

@all, I've updated this post with a better fix that does some bounds checking... still not great, but better.

I applied your patch on a rcov fork on GitHub. You can install it as a gem.

Hope this helps:

http://mergulhao.info/2008/8/29/rcov-with-segfault-bug-patched
http://github.com/mergulhao/rcov/tree/master

@sylvestre, cool!

Thank you so much for this!
Just ran into this problem, and had no idea how to fix it.

Applied the noted github gem and worked perfectly.

I installed the mergulhao-rcov gem and ran 'rake spec:rcov' again. It worked well. Thanks guys!
