Caught by a five-and-a-half-year-old Perl bug

Perl can be a lot of fun, and it can also leave you tearing your hair out; not that other languages are any different. For the second time in a week I've been caught out by a nasty bug, and after an hour or so of investigation I was surprised to find that it appears to be a five-and-a-half-year-old Perl bug.

We use Locale::Maketext and have used it successfully for years. The project I'm working on has a number of daemon processes, which naturally run in Perl's taint mode because a) it is a good idea and b) they take data that is initially tainted and use it to create files. The daemons also do a lot of database interaction, but we don't use DBI's TaintIn or TaintOut attributes.

We also moved these daemons from Perl 5.8.8 to 5.10.0 on Ubuntu in the last few weeks and suddenly found something was not working as expected. After much head scratching I was reduced to running the broken daemon process in debug, and this is where the problems started. When run in debug the daemon appeared to loop in some low-level Perl code (long before it reached the problem I was looking for), and I finally discovered the loop was in Locale::Maketext. The string being converted contained ordinary parentheses (), so at first I suspected Locale::Maketext's Bracket Notation, but that uses square brackets []. I tried to reproduce the problem in a small, separate piece of code, but the looping did not happen. By sheer chance I decided to inspect the data with Devel::Peek and saw that it was tainted. Once I made sure my test code used tainted data too, the looping was reproduced.
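For anyone wanting to check the same thing in their own test script, here is a minimal sketch of how a string can be made tainted on purpose and then tested. It uses Scalar::Util's tainted() rather than the Devel::Peek dump I actually looked at, and the helper name make_tainted and the sample string are made up for illustration. It relies on $0 and $^X being tainted when perl runs with -T, so appending a zero-length slice of them taints the result without changing its content.

```perl
use strict;
use warnings;
use Scalar::Util qw(tainted);

# Hypothetical helper: return a copy of $text that carries the taint flag
# when perl is running in taint mode. Without -T nothing is tainted and the
# string is returned unchanged.
sub make_tainted {
    my ($text) = @_;
    return $text . substr("$0$^X", 0, 0);
}

# Illustrative string only, not the one from the daemon.
my $sample = make_tainted('File saved (copy)');

# ${^TAINT} is true when taint checking is enabled for this process.
printf "taint mode: %s, sample tainted: %s\n",
    (${^TAINT}        ? 'on'  : 'off'),
    (tainted($sample) ? 'yes' : 'no');
```

Run it once as plain `perl` and once as `perl -T` to see the difference. A small test case that feeds Locale::Maketext clean literals will not exercise the same code path as a daemon fed tainted input.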

I had looked through the Locale::Maketext bugs on RT, but only quickly, and I missed the most important one: "infinite loop in Locale::Maketext::Guts::_compile() when working with tainted values". Further reading on that bug leads you to "pos() does not get updated when running in taint mode", which is a five-and-a-half-year-old Perl bug. Now, I'm not going to pass judgment on why a five-and-a-half-year-old Perl bug has not been fixed, but it greatly surprised me.

The temporary workaround was straightforward: untaint the scalar passed to maketext. But this scalar came from the database via DBI, which still left me puzzled as to why my database data was tainted when I was not using TaintOut. However, I'll leave that for another post.
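As a rough sketch of that workaround (not the exact code from our daemons), untainting can be done by capturing the value through a regex. The helper name untaint_text and the sample string are hypothetical, and the catch-all pattern does no validation whatsoever, so it is only reasonable for data you already trust.

```perl
use strict;
use warnings;
use Scalar::Util qw(tainted);

# Hypothetical helper: return an untainted copy of $text.
# A regex capture is the standard way to clear the taint flag. Matching
# everything, as here, accepts any content, so it is a blanket untaint
# and not a validation step.
sub untaint_text {
    my ($text) = @_;
    return unless defined $text;
    my ($clean) = $text =~ /\A(.*)\z/s;
    return $clean;
}

# Stand-in for a string fetched from the database; it is tainted only
# when perl runs with -T.
my $from_db = 'Report (draft) saved' . substr("$0$^X", 0, 0);
my $clean   = untaint_text($from_db);

printf "before: %s, after: %s\n",
    (tainted($from_db) ? 'tainted' : 'clean'),
    (tainted($clean)   ? 'tainted' : 'clean');
```

The cleaned copy is then what gets passed to maketext in place of the original scalar. A stricter pattern that only allows the characters you expect would be the safer long-term choice.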