Introduction

Sometimes it becomes necessary to work with unsupported database types in our applications. For instance, I recently dealt with a situation where an ecommerce site I was working on needed to talk to a legacy COBOL-based ERP system. Luckily, this ERP system had an ODBC interface that I could use from my application. In this article we will cover the basics of talking to databases via ODBC. Note that this article is a bit different from most, since you can't easily follow along unless you have access to your own ODBC data source. However, I will do the best I can to break things down step by step. Let's get started.

Rails Application Setup

Ruby ODBC is the library that lets us talk to ODBC data sources from Ruby. To install it, add the ruby-odbc gem to your Gemfile:

Gemfile:

gem 'ruby-odbc', '~> 0.99995'

Now run a bundle install to install the gem.

Terminal Commands:

bundle install

Once installed, we can use it directly from Ruby, or within Rails via rake tasks or other methods. Unfortunately, ruby-odbc doesn't come with an ActiveRecord adapter, so we can't query the ERP through ActiveRecord. For this example we will use a simple rake task: it will pull data out of the ERP over ODBC and then use ActiveRecord to find or update the corresponding products in the ecommerce site.
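
Before wiring anything into Rails, it can be worth confirming that the gem and your ODBC data source can actually talk to each other. Here is a quick sanity-check sketch, run as a plain Ruby script; it reuses the ecommERP data source name and the PRODUCT_MASTER table described later in this article, and assumes the ERP's SQL dialect supports COUNT(*).

check_odbc.rb:

require 'odbc' # or 'odbc_utf8'; more on this choice below

client = ODBC.connect("ecommERP")
begin
  # Run a trivial query just to prove the connection and driver work.
  statement = client.run("SELECT COUNT(*) FROM PUBLIC.PRODUCT_MASTER")
  row = statement.fetch
  puts "Products in the ERP: #{row[0]}"
  statement.drop
ensure
  client.disconnect
end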

To start, I create a file called get_data_from_erp.rake in the lib/tasks folder. Inside this file, I add the following code:

lib/tasks/get_data_from_erp.rake:

task get_data_from_erp: :environment do
  puts "Starting download of products..."
end

This tells Rails that we wish to create a task, which we can run with the following command:

Terminal Commands:

rake get_data_from_erp

In a production environment we would run the following command:

Terminal Commands:

rake get_data_from_erp RAILS_ENV=production

Now that I have a basic rake task, I need to add the code that will actually talk to the ERP. First, I need to add a require statement to the top of my task file. Typically I will add the line require 'odbc_utf8', which pulls in the UTF-8 variant of the library. However, this can sometimes cause problems with older systems, in which case I fall back to require 'odbc'. For example:

lib/tasks/get_data_from_erp.rake:

require 'odbc_utf8'

task get_data_from_erp: :environment do
  puts "Starting download of products..."
end

The next thing I need to do is connect to the ERP. Since I already have an ODBC data source (DSN) configured on my host system, I can connect to it with the following line:


client = ODBC.connect("ecommERP")

You can optionally specify a username and password:


client = ODBC.connect("ecommERP", "myUser", "myPass")

Now to construct and execute the SQL query itself. In this example we have a table called PRODUCT_MASTER in the PUBLIC schema. This table has a number of fields, but the ones that interest us are PART_NUMBER, NAME, PRICE, and ACTIVE. The SQL query for this particular ERP system therefore looks something like this:


SELECT PUBLIC.PRODUCT_MASTER.PART_NUMBER, PUBLIC.PRODUCT_MASTER.NAME, PUBLIC.PRODUCT_MASTER.PRICE, PUBLIC.PRODUCT_MASTER.ACTIVE FROM PUBLIC.PRODUCT_MASTER

To execute the SQL statement, I simply call two methods, prepare and execute:


sql = "SELECT PUBLIC.PRODUCT_MASTER.PART_NUMBER, PUBLIC.PRODUCT_MASTER.NAME, PUBLIC.PRODUCT_MASTER.PRICE, PUBLIC.PRODUCT_MASTER.ACTIVE FROM PUBLIC.PRODUCT_MASTER"

statement = client.prepare(sql)
statement.execute

I can then loop through the results with a while loop. Each call to statement.fetch returns the next row as an array of column values (in the same order as the SELECT list) and returns nil once the results are exhausted, which ends the loop:


while row = statement.fetch
  # ...
end

Once the loop finishes, I call statement.drop, which frees up the resources held by the query.
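
If the ERP connection is at all unreliable, it's also worth making sure this cleanup always runs. Here is a minimal sketch of that idea (I'll keep the simpler version in the recap below): it wraps the fetch loop in begin/ensure so that the statement is dropped and the connection is closed via client.disconnect even if an exception is raised mid-loop.

statement = client.prepare(sql)
begin
  statement.execute
  while row = statement.fetch
    # process each row here
  end
ensure
  statement.drop      # free the resources held by the query
  client.disconnect   # close the ODBC connection as well
end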

To recap, my task would now look something like this:

lib/tasks/get_data_from_erp.rake:

require 'odbc_utf8'
task get_data_from_erp: :environment do
  puts "Starting download of products..."

  client = ODBC.connect("ecommERP")

  sql = "SELECT PUBLIC.PRODUCT_MASTER.PART_NUMBER, PUBLIC.PRODUCT_MASTER.NAME, PUBLIC.PRODUCT_MASTER.PRICE, PUBLIC.PRODUCT_MASTER.ACTIVE FROM PUBLIC.PRODUCT_MASTER"
  
  statement = client.prepare(sql)
  statement.execute
  
  while row = statement.fetch
    # we'll fill in the per-row processing below
  end
  statement.drop
end

That's great, but how do we get the data into our Rails app? Well, luckily this part is easy. We can either write the data into the ecommerce database directly with raw SQL, or we can use ActiveRecord. In this example we will use ActiveRecord and its first_or_initialize method, which finds a record if it exists or initializes a new one if it doesn't. In our system the part_number field is unique, so we can look up products by that field.

lib/tasks/get_data_from_erp.rake:

p = Product.where(part_number: row[0]).first_or_initialize

p.name = row[1]
p.price = row[2]
p.active = row[3]

if p.persisted?
  puts "Updating #{row[0]}..."
else
  puts "Creating #{row[0]}..."
end

p.save!

Thus our rake task would end up looking something like this:

lib/tasks/get_data_from_erp.rake:

require 'odbc_utf8'
task get_data_from_erp: :environment do
  puts "Starting download of products..."

  client = ODBC.connect("ecommERP")

  sql = "SELECT PUBLIC.PRODUCT_MASTER.PART_NUMBER, PUBLIC.PRODUCT_MASTER.NAME, PUBLIC.PRODUCT_MASTER.PRICE, PUBLIC.PRODUCT_MASTER.ACTIVE FROM PUBLIC.PRODUCT_MASTER"
  
  statement = client.prepare(sql)
  statement.execute
  
  while row = statement.fetch
    p = Product.where(part_number: row[0]).first_or_initialize

    p.name = row[1]
    p.price = row[2]
    p.active = row[3]

    if p.persisted?
      puts "Updating #{row[0]}..."
    else
      puts "Creating #{row[0]}..."
    end

    p.save!
  end
  statement.drop
end

That's all there is to it. However, there are a few things to keep in mind. First, in a real-life scenario the ERP would likely hold tens or hundreds of thousands of records, so it can be useful to filter them by a date or timestamp column. In my case there was no easy way to do this, since the date and timestamp fields were nonexistent. Also, if you are dealing with large numbers of records, it might be worth bypassing ActiveRecord and using the MySQL or PostgreSQL libraries directly, since instantiating an ActiveRecord object per row can really slow things down. It may also be worth moving the logic for updating the Product model into the model itself if that logic is complex. Finally, it's probably worth moving the name of the ODBC data source into an environment variable.
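
As a rough sketch of those last two suggestions, the data source lookup and the per-row update could end up looking something like this. Note that the ERP_ODBC_DSN variable and the Product.apply_erp_row method are illustrative names I'm inventing here, not anything ruby-odbc or Rails provides; adapt them to your own application (and inherit from ApplicationRecord on Rails 5+).

lib/tasks/get_data_from_erp.rake:

client = ODBC.connect(ENV.fetch("ERP_ODBC_DSN", "ecommERP"))

app/models/product.rb:

class Product < ActiveRecord::Base
  # Finds or initializes a product by its unique part number from a single
  # ERP result row ([part_number, name, price, active]) and saves it.
  def self.apply_erp_row(row)
    product = where(part_number: row[0]).first_or_initialize
    product.assign_attributes(name: row[1], price: row[2], active: row[3])
    puts "#{product.persisted? ? 'Updating' : 'Creating'} #{row[0]}..."
    product.save!
  end
end

With that in place, the body of the fetch loop in the rake task shrinks to a single Product.apply_erp_row(row) call.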

That's it! Thanks for reading!