Today someone posted a problem using the SQL Server XML datatype with DBI/DBD::ODBC on the dbi-users mailing list. I sorted their problem pretty quickly but noticed his code using length() on scalars which were bound with bound_col was not reporting the correct length. The example and output are below:
use strict
;
use DBI
qw(:sql_types
);
use Devel
::Peek;
use bytes
;
no bytes
;
my ($txt_de, $txt_ru);
{
use utf8
;
$txt_de = 'Käse';
$txt_ru = 'Москва';
}
binmode STDOUT, ':utf8';
my @dsn = qw/DBI:ODBC:xx xx xx/;
my %opt = (PrintError
=> 0
, RaiseError
=> 1
, AutoCommit
=> 1
, ChopBlanks
=> 1
);
my $dbh = DBI
->connect( @dsn, \%opt );
# Table was created with
# CREATE TABLE T2 (a VARCHAR(99), u NVARCHAR(99), x XML);
my $sth_ins = $dbh->prepare(
'INSERT INTO T2 (a, u, x) VALUES (?, ?, CAST( ? AS XML) )' );
foreach my $row ([$txt_de, $txt_de, "<d>$txt_de</d>"],
[$txt_ru, $txt_ru, "<r>$txt_ru</r>"]) {
$sth_ins->bind_param(1
, $row->[0
]);
$sth_ins->bind_param(2
, $row->[1
]);
$sth_ins->bind_param(3
, $row->[2
], {TYPE
=> SQL_WCHAR
});
$sth_ins->execute;
}
my $sth_sel = $dbh->prepare( 'SELECT u, x FROM T2' );
$sth_sel->execute;
$sth_sel->bind_col(1
, \
my $txt, {TYPE
=> SQL_WCHAR
});
$sth_sel->bind_col(2
, \
my $xml, {TYPE
=> SQL_WCHAR
});
my $i = 0;
while ( $sth_sel->fetch ) {
printf "%3u %3u %3u %s [%s] [%s]\n",
++$i, length($txt), bytes
::length($txt),
(utf8
::is_utf8($txt) ? ' utf8' : '!utf8'), $txt, $xml;
# NOTE, if I don't reset $txt each iteration the length() call returns
# the wrong answer.
#$txt = ''; #this line fixes it
#Dump($txt);
}
$dbh->disconnect;
which outputs:
1 4 5 utf8 [Käse] [<d>Käse</d>]
2 4 12 utf8 [Москва]
Notice the length (second column) is 4 for Käse and Москва even though the latter should obviously be 6. I noticed that by resetting $txt to '' in each fetch iteration it fixed the problem and then I added Devel::Peek and Dump for those two bound scalars I got:
1 4 5 utf8 [Käse] [<d>Käse</d>]
SV = PVMG(0x853b6fc) at 0x856d570
REFCNT = 2
FLAGS = (PADMY,SMG,POK,pPOK,UTF8)
IV = 0
NV = 0
PV = 0x8674698 "K\303\244se"\0 [UTF8 "K\x{e4}se"]
CUR = 5
LEN = 8
MAGIC = 0x86746c0
MG_VIRTUAL = &PL_vtbl_utf8
MG_TYPE = PERL_MAGIC_utf8(w)
MG_LEN = 4
2 4 12 utf8 [Москва] [<r>Москва</r>]
SV = PVMG(0x853b6fc) at 0x856d570
REFCNT = 2
FLAGS = (PADMY,SMG,POK,pPOK,UTF8)
IV = 0
NV = 0
PV = 0x8671098 "\320\234\320\276\321\201\320\272\320\262\320\260"\0 [UTF8 "\x{41c}\x{43e}\x{441}\x{43a}\x{432}\x{430}"]
CUR = 12
LEN = 16
MAGIC = 0x86746c0
MG_VIRTUAL = &PL_vtbl_utf8
MG_TYPE = PERL_MAGIC_utf8(w)
MG_LEN = 4
Those MG_LEN=4 lines were really suspicious but I was still stumped. The really great people at Perl Monks soon identified the problem as there was set magic and so probably SvSETMAGIC should have been called "somewhere" but hadn't been and ikegami even posted a worked example (see length() returns wrong result - suspicious magic). After this it was not too hard to track down a sv_setpvn that should have been a sv_setpvn_mg although I am still a little mystified why I need to know there is magic attached on the bound scalars when setting them.
Thanks to Perl Monks (again) and in particular ikegami.
Trackback URL for this post:
http://www.martin-evans.me.uk/trackback/79
Recent comments
32 weeks 38 min ago
34 weeks 2 days ago
35 weeks 6 days ago
36 weeks 22 hours ago
44 weeks 1 day ago
45 weeks 2 days ago
46 weeks 5 days ago
49 weeks 3 days ago
1 year 1 week ago
1 year 4 weeks ago