ODBC - The Connection Process

A great deal of issues arise in the dbi-users mailing list and to me personally concerning ODBC connection issues so I thought I'd describe the ODBC connection process here (with a slight Perl angle since I look after DBD::ODBC). The information here should be of general interest to anyone using ODBC.

The first thing to note is that in ODBC 3 there are two connection APIs; SQLConnect (the first and older API) and SQLDriverConnect (the new ODBC 3 API). DBD::ODBC uses both but more on that later.

Applications link to an ODBC Driver Manager and some of the driver manager's responsibilities are to manage the connect APIs, locate the ODBC Driver to load and pass information to the ODBC Driver to enable it to connect you to a database usually as a specific user. In addition the ODBC Driver Manager provides APIs to the ODBC Driver to retrieve data source attributes (e.g., SQLPrivateProfileString).

First we have to cover some background information. If you think you know this or just want to skip it scroll down to The Connection Process.

Data Source Names (DSNs)

The key to the SQLConnect/SQLDriverConnect APIs is the Data Source Name (DSN). A DSN is a way of defining various attributes for an ODBC database connection under one name. Typically a DSN defines the name of the ODBC Driver to use and other attributes like the database name, host name, database username/password etc but in the main it only really has to define the ODBC Driver and all the other attributes are optional.

In Windows the DSN is stored in a registry key and ODBC Drivers in another registry key and the ODBC Driver Manager provides the GUI ODBC Administrator to create/edit DSNs. When ODBC Drivers are installed an entry is added to the registry key defining ODBC Drivers and another key defines the DLL which is used to create DSNs for that ODBC Driver.

On non-Windows platforms DSNs are stored in the odbc.ini (or ~/.odbc.ini file for user DSNs) and ODBC Drivers in the odbcinst.ini file and although ODBC Driver Managers like unixODBC provide a GUI application (and the odbcinst command) to create/edit installed ODBC Drivers and DSNs most people simply edit the files. Actually, unixODBC works just like the ODBC Driver Manager in Windows in that when an ODBC Driver is installed it can add a Setup attribute to the driver entry in the odbcinst.ini file and then unixODBC can load that shared object to create/edit DSNs but the GUI part of unixODBC is written in QT and there are few drivers which provide a QT shared object for creating/editing DSNs.

A common problem for people on non-Windows problems arises from incorrect permission on the odbc.ini and odbcinst.ini files. You can use odbcinst -j in unixODBC to find which files it is using and then you need to ensure the user running the ODBC enabled application has read access on those files.

$ odbcinst -j
unixODBC 2.2.11
DRIVERS............: /etc/odbcinst.ini
SYSTEM DATA SOURCES: /etc/odbc.ini
USER DATA SOURCES..: /home/martin/.odbc.ini

USER/SYSTEM DSNs

The first thing you need to realise is there are USER and SYSTEM DSNs. USER DSNs are only available to the logged in user and SYSTEM DSNs are available to anyone. In reality, in Windows this is achieved by putting USER/SYSTEM DSNs in different registry keys and in UNIX by putting them in different files (usually /etc/odbc.ini or /usr/local/odbc.ini for SYSTEM DSNs and ~/.odbc.ini for USER DSNs). This in itself is a common source of problems since people define USER DSNs when logged in as user "fred" then run their application (e.g., Apache, IAS etc) as user "dave" and wonder why they get a DSN not found error.

SQLConnect vs SQLDriverConnect

The process in the driver manager is pretty much the same whether you call SQLConnect or SQLDriverConnect (with some exceptions).

The SQLConnect API takes ServerName, UserName and Authentication strings. Generally these are the DSN name, database username and database password. Beyond that it is fairly inflexible and even the ServerName argument has a maximum length which is usually 32 characters.

The SQLDriverConnect API is far more flexible in that it takes arguments of a WindowHandle, InConnectionString, OutConnectionString, and DriverCompletion. The InConnectionString is a ';' separated list of attributes which may include DSN, UID (username), PWD (password), DRIVER (driver name), FILEDSN (pointer to a file containing a DSN), SAVEDSN (file to write the completed connection string) and any others the ODBC Driver cares to define. The OutConnectionString (see odbc_out_connect_string in DBD::ODBC) returns a completed connection string which may be used to connect to the database again (more later). The WindowHandle allows the ODBC Driver Manager to use the ODBC Driver Setup DLL/shared object to display a GUI dialogue to create/edit DSNs (this is another area where unixODBC and QT have some issues). The DriverCompletion argument allows the application to say how it would like the ODBC Driver to behave when the attributes passed in InConnectionString are insufficient (or not) to connect to a specific database:

SQL_DRIVER_PROMPT
SQL_DRIVER_COMPLETE
SQL_DRIVER_COMPLETE_REQUIRED
SQL_DRIVER_NOPROMPT

(See below for the meanings of these attributes).

For the sake of this discussion I am assuming the ODBC Driver manager maps calls to SQLConnect to SQLDriverConnect where ServerName, Username and Authentication arguments are mapped to SQLDriverConnect's InConnectionString as DSN/UID/PWD attributes in a ';' delimited string and DriverCompletion is SQL_DRIVER_COMPLETE. i.e., if someone calls:

SQLConnect("mydsnname", "myusername", "mypassword")

it ends up in SQLDriverConnect with an InConnectionString of:

DSN=mydsnname;UID=myusername;PWD=mypassword

and DriverCompletion of SQL_DRIVER_COMPLETE.

From here on the discussion is SQLDriverConnect focussed.

ODBC defined connection attributes

The SQLDriverConnect API has an InConnectionString argument which is a ';' delimited string of attributes. ODBC defines some attributes:

DRIVER - the name of the ODBC Driver
There are some extra rules for defining the location and name of the DLL or shared object for the driver which require the filespec to be enclosed in {}.

DSN - a data source name

UID - a database username

PWD - UID's password

FILEDSN - the name of a file containing the attributes for the DSN
Few non-Windows drivers support FILEDSN although all Easysoft drivers do.

SAVEFILE - file to save completed connection string
Few non-Windows drivers support SAVEFILE although all Easysoft drivers do.

For DRIVER see "DSN-less Connections".

If a DSN is supplied it is simply a name used to lookup the other attributes in the DSN (in the registry or odbc.ini file).

If UID/PWD are supplied they override any UID/PWD in the DSN.

FILEDSN defines the name of a file containing all the other attributes a driver may need to connect to the database. FILEDSN is rarely used but is useful if you want to distribute a DSN with your application.

ODBC Driver defined attributes

An ODBC Driver may define whatever attributes it requires. Often these are attributes like "database" (database to connect to), "servername" (server where database resides), port (port to connect to), but there is no universal standard.

If an ODBC Driver has a GUI setup DLL/shared object to create/edit DSNs it will allow you to enter the values for these attributes and store them in the DSN. Otherwise, you just have to consult the ODBC Driver documentation to find out what attributes it requires/defines.

DSN-less Connections

DSN-less connections are obviously those which do not specify a DSN. That is possible because instead you specify the DRIVER attribute and optionally all the other attributes the driver needs to connect to the database (or the driver prompts for the missing attributes). The DRIVER attribute tells the ODBC Driver Manager which driver to load and pass the InConnectionString on to. After that it is up to the ODBC Driver to find any other attributes it needs in the InConnectionString or by prompting the user for additional attributes (this depends on the DriverCompletion argument).

When specifying the driver in DSN-less connections you usually delimit the name with curly brackets { and } in case the driver contains spaces or filespec characters e.g., "DRIVER={Easysoft SQL Server Driver}".

The DriverCompletion argument

The DriverCompletion argument may be:

SQL_DRIVER_PROMPT

The driver displays a dialog box, using the values from the connection string and system information (if any) as initial values. When the user exits the dialog box, the driver connects to the data source. It also constructs a connection string from the value of the DSN or DRIVER keyword in InConnectionString and the information returned from the dialog box. It places this connection string in the OutConnectionString buffer.

There are very few ODBC Drivers on non-Windows platforms that can handle this - the Easysoft ODBC-ODBC Bridge is the only one I know of.

From DBD::ODBC 1.32_2 you can pass odbc_driver_complete to the connect method but this is unlikely to work on non-Windows platforms as it requires a window handle and a compatible driver.

SQL_DRIVER_COMPLETE
SQL_DRIVER_COMPLETE_REQUIRED

If the connection string contains enough information, and that information is correct, the driver connects to the data source and copies InConnectionString to OutConnectionString. If any information is missing or incorrect, the driver takes the same actions as it does when DriverCompletion is SQL_DRIVER_PROMPT, except that if DriverCompletion is SQL_DRIVER_COMPLETE_REQUIRED, the driver disables the controls for any information not required to connect to the data source.

This is the most flexible option on non-Windows platforms or on Windows if the application is not a GUI as it tells the ODBC driver to fill in the missing pieces as it can.

SQL_DRIVER_NOPROMPT

If the connection string contains enough information, the driver connects to the data source and copies InConnectionString to OutConnectionString. Otherwise, the driver returns SQL_ERROR for SQLDriverConnect.

This is pretty inflexible in all environments as it means that if insufficient attributes are found for a connect the SQLDriverConnect will fail (even if it could find the needed attributes in a DSN).

DEFAULT DSNs

ODBC defines a DEFAULT DSN which will be used when no DRIVER or DSN attribute is defined. The DEFAULT DSN is named "DEFAULT" and works like any other DSN except the ODBC Driver Manager knows to look for it if no Driver/DSN attributes are specified.

How do drivers "complete" a connection string?

When an ODBC Driver is handed a connection string it parses it into separate attributes and then determines if it has sufficient attributes to connect to the database. Depending on what you pass in the DriverCompletion argument the driver may consult the DSN to obtain other attributes it needs to connect, throw a dialogue to obtain the remaining attributes or error. Attributes passed in the InConnectionString override those in any DSN or dialogue (greyed out in dialogues if SQL_DRIVER_COMPLETE_REQUIRED).

Assuming the ODBC Driver has insufficient attributes to connect to the database and a DSN is named (and SQL_DRIVER_COMPLETE|SQL_DRIVER_COMPLETE_REQUIRED is specified) it can use the ODBC Driver Manager API SQLGetPrivateProfileString to obtain any remaining attributes. It sets the DSN name then requests attributes by name and the driver manager will look them up in the DSN and return them to the driver. In this way the ODBC Driver does not need to know where the additional attributes are held, only their name. It uses this method to "complete" the connection string and on connection the full OutConnectionString to reconnect to this database will be returned (applications like MS Access use this to store the full connection string so it does not need the driver to prompt the next time you connect).

This works well in Windows but falls down on non-Windows platforms where some drivers do not use the SQLGetPrivateProfileString API or store connection information in other files (e.g., freeTDS can store database and protocol information in the freetds.conf file). Worse, some drivers do use SQLPrivateProfileString but also have a driver-specific configuration file outside of ODBC and use different names for the attributes in each and define different orders for where they look first.

If you are working on an ODBC Driver or have input to the development of one then I strongly suggest you encourage the use of SQLGetPrivateProfileString and linking with an ODBC Driver Manager.

Example ini/registry files/entries

In Windows Drivers and DSNs are held in the registry. In unix, DSNs and Drivers are held in ini files. Here is an example driver entry (usually /etc/odbcinst.ini in UNIX):

[ODBC]
Trace=yes
TraceFile=/tmp/unixodbc.log

[Easysoft ODBC-SQL Server]
Driver=/usr/local/easysoft/sqlserver/lib/libessqlsrv.so
Setup=/usr/local/easysoft/sqlserver/lib/libessqlsrvS.so
Threading=0
FileUsage=1
DontDLClose=1
UsageCount=1

This defines the ODBC section which enables tracing and names the trace file. The driver manager will write an ODBC API trace to that file. This also defines a driver called "Easysoft ODBC-SQL Server", the driver shared object, a setup library and various other attributes for the driver manager uses.

As with driver definition above DSNs are defined like the example below (/etc/odbc.ini in UNIX):

[SQLSERVER_SAMPLE]
Driver=Easysoft ODBC-SQL Server
Description=Easysoft SQL Server ODBC driver
Server=server.domain.com
Port=1433
Database=northwind
User=sa
Password=password
Mars_Connection=No
Logging=0
LogFile=
QuotedId=Yes
AnsiNPW=Yes
Language=
Version7=No
ClientLB=No
Failover_Partner=
VarMaxAsLong=No
DisguiseWide=No
DisguiseLong=No
DisguiseGuid=No
Trusted_Connection=No
Trusted_Domain=
IPv6=No

This DSN is called "SQLSERVER_SAMPLE" and uses the "Easysoft ODBC-SQL Server" driver. Only the Driver attribute is absolutely necessary and the other driver manager attribute is Description which may be shown in some GUI administrators. All the other attributes are specific to this ODBC Driver.

The Connection Process

If you have not read the above I hope you know your ODBC API.

This discussion assumes the application is linked with an ODBC Driver Manager which provides SQLDriverConnect (or SQLDriverConnectW) and SQLGetPrivateProfileString.

The application calls SQLDriverConnect with a connection string and DriverCompletion. The connection string needs to define an attribute which allows the driver manager to locate the ODBC Driver (DRIVER or DSN). If the connection string names a DRIVER the driver manager loads it. If the InConnectionString contains a DSN/FILEDSN the driver manager looks up the DSN to find the driver and loads the driver. If the driver cannot be found this way you end up up with an error saying something like "Datasource not found and no DEFAULT defined" (although see DEFAULT DSNs above).

At this stage the ODBC Driver manager needs to allocate an environment handle (SQLAllocHandle for SQL_HANDLE_ENV), set the ODBC version (SQLSetConnectAttr for SQL_ODBC_VERSION) and allocate a connection handle (SQLAllocHandle for SQL_HANDLE_DBC). In particular the SQLAllocHandle for an environment can fail at this stage if the ODBC Driver requires an environment variable to be set (e.g., DB2INSTANCE, ORACLE_HOME). After this the InConnectionString is passed to the ODBC Drivers SQLDriverConnect.

From this point on what happens is down to the ODBC Driver. It parses the InConnectionString and extracts all attributes and values. It then needs to ascertain if it has sufficient attributes to connect to the database although if SQL_DRIVER_PROMPT is defined it must throw a dialogue anyway. If it is short of attributes to connect it may look up additional attributes via the FILEDSN/DSN attributes and SQLPrivateProfileString so long as DriverCompletion allows. As explained above this is where things deviate for non-Windows as some drivers do not use SQLPrivateProfileString and/or use a separate file to define connections to databases. At this stage and depending on DriverCompletion the ODBC Driver will either complete the missing attributes (see "How do drivers "complete" a connection string" above), throw a dialogue or error.

Perl DBD::ODBC

Perl's DBD::ODBC module has a long history and attempts to be compatible with as many ODBC Drivers as possible. This means it aims to support ODBC 1/2 drivers and more modern ODBC 3 drivers. As a result, any call to DBI's connect method is first attempted with SQLDriverConnect and if that fails with SQLConnect. This is to continue to support older Perl DBD::ODBC code which was written for SQLConnect usage but to always try SQLDriverConnect first because it supports more features (like the returning of the completed out connection string). This is important to note since if you do:

DBI->connect("dbi:ODBC:fred", "dbuser", "dbpass");

DBD::ODBC will first call:

SQLDriverConnect(InConnectionString="fred;UID=dbuser;PWD=dbpass")

which will fail since there is no DSN or DRIVER then it will fallback to:

SQLConnect(ServerName="fred", UserName="dbuser", Authentication="dbpass");

which will work if there is a DSN called "fred". The upshot of this is that if you are passing a DSN name to DBI's connect then prefix it with "DSN=" especially if you've added other attributes or:

  • you could waste time calling SQLDriverConnect only for it to fail and fallback on calling SQLConnect
  • you will be unable to get the completed odbc_out_connect_string
  • the UNICODE support is better in SQLDriverConnect
  • odbc_driver_complete will be ignored

Additionally there is a small conflict in DBD::ODBC with respect to DBI's connect method providing a connection string and optionally a username/password (the 2nd and 3rd arguments to DBI's connect) as the connection string could also contain UID/PWD attributes. DBD::ODBC resolves this by only appending the username/password passed to DBI's connect to the connection string if the connection string does not already contain the UID/PWD attributes. However, it only appends DBI's username/password if the connection string is already longer than SQL_MAX_DSN_LENGTH (the largest string SQLConnect can handle) or if the connection string already contains a DSN or DRIVER attribute as either of these would imply we need to call SQLDriverConnect.

So:

DBI->connect("dbi:ODBC:DSN=fred","dbuser","dbpass")
results in
SQLDriverConnect(InConnectionString="DSN=fred;UID=dbuser;PWD=dbpass")
as DBD::ODBC added the dbuser/dbpass because it saw you used "DSN" but did not use the UID/PWD attributes.

DBI->connect("dbi:ODBC:DSN=fred;UID=username;PWD=password", "dbuser", "dbpass")
results in
SQLDriverConnect(InConnectionString="DSB=fred;UID=username;PWD=password")
because connection string attributes override those passed as 2nd/3rd arguments to DBI's connect.

DBI->connect("dbi:ODBC:DSN=fred;attribute1=value")
results in:
SQLDriverConnect(InConnectionString="DSN=fred;attribute1=value");

DBI->connect("dbi:ODBC:DRIVER={fred};attribute1=value")
results in
SQLDriverConnect(InConnectionString="DRIVER={fred};attribute1=value");

and in addition UID=x and PWD=y will be added if you passed a username/password to DBI->Connect.

Note that if DBD::ODBC is built with UNICODE support (see the -u argument to perl Makefile.PL and odbc_has_unicode and you have a newer DBI (after 1.607) then DBD::ODBC uses the new DBD interface dbd_db_login6_sv (accepting scalars rather than C strings as the older one did) and this really calls SQLDriverConnectW (the so called Wide version of SQLDriverConnect) which supports a subset of Unicode. The scalars passed to DBI's connect are converted to UTF16 and passed to SQLDriverConnectW. The ODBC Driver Manager needs to determine if the ODBC driver supports the wide version and if it does not it maps the call the SQLDriverConnectA (ANSI version) translating the arguments back (with possible lossage) on the way.

If you have an older DBI (1.607 of older) or have not build DBD::ODBC with Unicode support then the older DBD interface dbd_db_login6 is used and this only supports 8bit characters.

If you need Unicode support I strongly recommend you read Unicode in the DBD::ODBC documentation.

TODO

o more on SQL_DRIVER_PROMPT

Change History

19-May-2010
Some corrections to Perl section including better Unicode description
Added HTML links for a lot of stuff
pasted from emacs so hard newlines were inserted - removed

20-May-2010
Added file permission problem for ini files.

17-June-2010
Add link to odbc_out_connect_string
Add link to Easysoft ODBC Drivers
Add link to ODBC-ODBC Bridge
Add some internal links

15-July-2010
Change unixODBC text to HTML links
Added odbcinst -j example
change QT text to HTML links

30-September-2012
Refomatted some text
Add odbc_driver_complete
Added example ini files

Comments

When is DBD::ODBC needed?

If I recall correctly a long, long time ago I used DBD::ODBC to connect to and Access database. Back then I had trouble understanding the whole issue of DBD::ODBC vs DBD::SomeRealDatabase.
If I understood correctly ODBC is another abstraction over the actual access to the database. So I can use

My code - DBI - DBD::Pg - PostgreSQL

or I can use

My code - DBI - DBD::ODBC - (the ODBC - PostgreSQL driver) - PostgreSQL

For MS Access I understand the need for ODBC as there is no DBD::Access but when do I need the additional abstraction for other databases? I thought ODBC is some Windows only thing but from your article I understand it is also used on Linux?

Then there was ADO and Jet. What are those? How are they related to Perl and ODBC?

Now I see there are DBD::ODBC, Win32::ODBC, DBD::ADO and maybe a few other related modules (but no DBD::Jet) and I have no idea what is the difference between them and when each one of those is needed or preferred?

There are also JDBC and DBD::JDBC that just further confuse me. I think they are somehow parallel to ODBC but in the Java world but I am really not sure.

It would be great to see some clarification on these issues as I guess I am not the only one confused.

When is DBD::ODBC needed?

Thanks szabgab, you raise an interesting set of questions which are not that easy to answer. I'll briefly tackle the main ones here and will probably write this up more fully later.

As you say ODBC is another abstraction over the actual database access library. ODBC is a huge specification that attempts to make it easier to access any database without knowing much about it beforehand. The very large number of ODBC aware applications that work with hundreds of ODBC drivers is a good illustration that although ODBC is often not that straight forward an API it is one which people have repeatedly used successfully. ODBC is still very popular due to the large number of drivers that are available.

JDBC is the equivalent of ODBC for Java. Some databases only have a JDBC driver and that is mostly why DBD::JDBC exists although I believe at some stage Tim Bunce was looking at some of JDBC for inspiration for DBI 2. Interestingly, when JDBC was first released there were virtually no JDBC Drivers so Sun released a JDBC-ODBC Bridge so Java users could access their existing databases from Java through their ODBC drivers.

JET (Joint Engine Technology) is a database engine Microsoft Access and Visual Basic use. The original (and often still used) ODBC Driver for MS Access is built over JET.

So for your example of postgres there is a DBD::Pg so you can use:

Perl - DBI - DBD::Pg - postgres client library - PostgreSQL

but as the postgres people have provided an ODBC driver you can also do:

Perl - DBI - DBD::ODBC - Postgres ODBC Driver - postgres client library - PostGreSQL

In the above case you probably would not want to use DBD::ODBC because it adds another layer. However, if you were writing common code for multiple databases then you might as they would all use DBD::ODBC and different ODBC Drivers for each database but they should all work with the ODBC specification e.g., the date/time formats are the same if you use ODBC syntax. Also, may be you have an ODBC driver for all the platforms you are supporting but the DBD::mydb driver does not exist or work on all of them.

ODBC is far from Windows only. On non-Windows platforms there are now at least 3 ODBC Driver managers I know of (unixODBC being the most popular) and there are a huge number of ODBC Drivers for UNIX and Linux from IBM, Oracle, 3rd party ISVs like Easysoft (where I work) and many open source ODBC Drivers for PostgreSQL, MySQL etc. The unixODBC Driver manager offers a complete and accurate implementation of the ODBC specification so much so that code written for Windows ODBC is ported to non-Windows with no effort in the ODBC area. Most Linux distributions provide a packaged unixODBC.

Why would you use Win32::ODBC over DBD::ODBC over DBD::ADO? Well I'm not a user of Win32::ODBC or DBD::ADO but here is my quick summary:

Win32::ODBC

Started life in 1996 and is currently at release 0.034 released 14-June-2008.

Win32::ODBC is a standalone Perl module not part of the DBI family (although I believe an experimental DBD layer for Win32::ODBC once existed before DBD::ODBC). Like DBD::ODBC it links with an ODBC Driver Manager but it is limited to Windows only. As such, if you write code using Win32::ODBC you are stuck with Windows and if you need a non-Windows platform you'll have to rewrite your code using something else.

The only thin I am aware of that you can do in Win32::ODBC you cannot do with DBD::ODBC is configure/create a DSN (with its ConfigDSN).

Win32::ODBC uses the ODBC 2.0 specification only and I'm unsure of its support for Unicode or for 64 bit Windows.

DBD::ADO

Started life some time in 1999 and is currently at release 2.97 released 23-Mar-2009.

Is Windows only.

Requires the Win32::OLE module and ADO (a set of COM objects for accessing data sources) from Microsoft. ADO is a layer between a programming language and OLE DB that allows programs to access data without knowing how the database is implemented.

Mostly used with Microsoft IIS to access a database from inside an ASP page.

Has virtually no passes on CPAN testers!

DBD::ODBC

Started life some time around 1994 but the Changes file's earliest recorded date is 1998. The current release is 1.24 released 14-May-2010.

DBD::ODBC is a DBI driver for ODBC supporting ODBC from versions 1 to 3.52. You need to install DBI, then build DBD::ODBC against an ODBC Driver Manager then you can use and ODBC Drivers your Driver Manager knows about. DBD::ODBC gains all the advantages of being a DBD under DBI as:

o DBI defines a common set of methods and attributes in a specification for access to multiple databases via many different DBDs.

o code written for DBI should be fairly portable across DBDs with the possible exception of the SQL used (depends on database support for SQL and how careful you are) and any DBD-specific methods.

o DBI provides some methods itself where DBDs cannot or have not supplied them.

o DBI is universally accepted as the database access module for Perl.

o platform independent.

DBD::ODBC is not just for Windows and is not just for 32 bit platforms.

Advantages of DBD::ODBC:

o platform independent module

o supports at least 4 ODBC Driver managers and no reason why it should not work with others if they provide reasonable ODBC support

o works on 32 bit and 64 bit platforms

o Supports Unicode

o in active development

o ODBC 3.52 but will work fine with ODBC drivers developed for earlier versions of the specification

o tested with literally dozens of ODBC Drivers