Monday, July 2, 2012

Moving large data between Oracle database and CSV

I'm currently working on a project that involves importing and exporting large volumes of data between spreadsheet-friendly comma-separated-value (CSV) files and an Oracle database. Although import/export of data can be done using a GUI tool like SQLDeveloper, it is not practical for large volumes or when such jobs need to be scripted. This blog describes how this can be done using BASH scripts and the command-line tools sqlplus and sqlldr.


Suppose I run an online shop and maintain a list of purchase transactions with fields such as item_id, customer_id, delivery_status, etc. On my Oracle database, I have PURCHASE_TRANSACTIONS, ITEMS and CUSTOMERS tables with the appropriate joins.

Exporting from Oracle DB to CSV

To export data from the Oracle database table (normally with additional joins and filtering criteria), create a SQL file with:

  • a query to write out the header row of the CSV
  • a query to get the data and concatenate them together in a CSV format

Something like this to export a list of items pending delivery:

set head off;
set feed off;
set trimspool on;
set linesize 32767;
set pagesize 32767;
set echo off;
set termout off;

spool &1;
select pt.item_id
|| ',"' ||
|| '",' || itm.price
|| ',' || pt.customer_id
|| ',"' ||
|| '","' || c.address
|| '"'
join CUSTOMERS c on pt.customer_id =
join ITEMS itm on pt.item_id =
where pt.delivery_status = 'PENDING'
order by pt.customer_id;

spool off;

Note the double-quotes around textual data. To execute this, save it to a file named export-pending.sql and write a simple BASH script like so:

sqlplus -S MY_DB_USER/MY_DB_PASSWORD@MY_DB_SERVER:1521/DB_SID @export-pending.sql pending-delivery-report.csv

Running this script will write out the file pending-delivery-report.csv with contents like:

432,"Super Mouse Pad",23.5,11,"John Smith","45 Some Street, Sydney, NSW 2000"
877,"USB Fan",14,11,"John Smith","45 Some Street, Sydney, NSW 2000"
344,"NeverDie UPS",125.3,67,"Alex Jones","101 Scary Road, Melbourne, VIC 3000"

This export functionality is typically used to generate spreadsheet friendly reports.

Importing from CSV to Oracle DB

Suppose I have a CSV file items-data.csv containing a new list of items that I want to ADD to the ITEMS table, in the following format:

1002,'Car battery',98
1003,'Super cool camera',450.50

To load this into the database, create a control file load-items.ctl with the following:

load data
into table ITEMS
fields terminated by ',' optionally enclosed by '\''

Note that in order to REPLACE the entire table's contents with the CSV data, simply change the append instruction (second line) to replace. Finally, create a simple BASH script with the following to run sqlldr:

sqlldr MY_DB_USER/MY_DB_PASSWORD@MY_DB_SERVER:1521/DB_SID control=load-items.ctl data=items-data.csv direct=true rows=1000 skip=1 errors=1 log=load-items.log bad=load-items.bad discard=load-items.discard

After executing this script, check the items-data.log for the progress of the data load. The load-items.bad and load-items.discard files contain the data with errors.

Sunday, July 1, 2012

Mandriva 2011 : the painful way

Last week, I started setting up a new home desktop computer. As all my other home machines were running Mandriva, I decided to install the latest Mandriva 2011 on this new one. Should only take a couple of days to get everything up and running, or so I thought. This blog describes the funpain of actually getting everything to work.


The new system came with an Asus P8H61 motherboard that had a Realtek RTL8111/8168B Ethernet controller. Sounded like very ordinary hardware that would definitely just work with Mandriva, if not for the fact that the kernel picked the wrong driver! It wrongly loaded the r8169 kernel module, instead of the required r8168. It took me 3 DAYS of messing about :- checking all my cables, reconfiguring my router, updated my kernel version, trying various network settings, etc, before I realized it was all because of this stupid kernel bug.

To fix it, I first had to get the right r8168 driver from the Realtek website. The file I got was r8168-8.031.00.tar.gz2. After unpacking everything and making sure I had the kernel source and headers, I ran:

$ cd r8168-8.031.00
$ ./

which gave me the following error:

WARNING: Symbol version dump /usr/src/devel/
           is missing; modules will have no dependencies and modversions.

Why happened here? Why would the kernel source not have this file that was required for building kernel modules? I had no choice but to rebuild the whole kernel just to get this file, that I then copied to where the Realtek driver expected to find it:

$ cd /usr/src/linux-
$ make config
$ make
$ cp Module.symvers /usr/src/devel/

After that, the Realtek script worked as expected. It unloaded r8169 and swapped in r8168. Finally, I had network connectivity. To make sure my changes stuck, I modified /etc/modprobe.conf to say:

alias pci:v000010ECd00008168sv00001043sd00008432bc02sc00i00 r8168
alias eth0 r8168

and then added r8169 to the module blacklist files :

$ echo 'blacklist r8169' >> /etc/modprobe.d/blacklist-brcm80211.conf
$ echo 'blacklist r8169' >> /etc/modprobe.d/blacklist-compat
$ echo 'blacklist r8169' >> /etc/modprobe.d/blacklist-mdv

Graphics card

Next challenge:- getting hardware direct rendering working. This system had a low-end (for work, not gaming) 1GB Gigabyte GT210 graphics card running on nVidia GeForce 210.

Understandably, Mandriva had installed the free Nouveau driver by default. What I didn't expect was that installing the nVidia driver didn't actually cause it to replace the nouveau driver. Instead, running nvidia-xconfig simply broke X (thankfully it also backed up the original xorg.conf so I could revert back). Wasted several hours of my life before figuring out I first had to uninstall nouveau, and then re-installed the nVidia rpms below:


Finally, I ran XFdrake one more time and it prompted me to use the nVidia driver.