Modify MySQL Tables

After setting up my first small databases the next learning steps concern basic operation on existing tables in a MySQL database, such as deleting or creating additional variables/columns in a table. I found tech-recipes.com very helpful:

Delete a variable/column

To delete the column MyCol from the table MyTable, use the following SQL command:

ALTER TABLE MyTable DROP MyCol

Comment: Tech-Recipes.com used single-quotes around MyCol and MyTable which I found not necessary.

Add a column/variable

To add a column called email […] with a datatype of VARCHAR(60) to the table contacts, use the following SQL statement:

ALTER TABLE contacts ADD email VARCHAR(60);

This first statement will add the email column to the end of the table. To insert the new column after a specific column, such as name, use this statement:

ALTER TABLE contacts ADD email VARCHAR(60) AFTER name;

If you want the new column to be first, use this statement:

ALTER TABLE contacts ADD email VARCHAR(60) FIRST;

Update/Change the values in a column

One would use the UPDATE statement so I looked it up in the reference

UPDATE [LOW_PRIORITY] [IGNORE] table_reference
SET col_name1={expr1|DEFAULT} [, col_name2={expr2|DEFAULT}] ...
[WHERE where_condition]
[ORDER BY ...]
[LIMIT row_count]

In order to update the column intensity in the table measures from “20%” to simply “20” one would
UPDATE measures SET intensity="20" WHERE intensity="20%";

MS SQL calculating the number of days in a the month of a given date

I was looking for MS Access SQL to calculate the numbers of days in a certain month, specified by a date record. I did not find the solution but the neccessary hint in the right direction on Eggheadcafe (Hattip: John W. Vinson [MVP]).

SQL functions needed are:

  • DateDiff( "d" ; date1 ; date2)
    calculates the difference in days between data1 and date2, and
  • DateSerial( year ; month ; day )
    creates a datevariable representing the date given by a number for the year, the month, and the day.
  • Year( date )
    gets the number of the year out of a date variable. Year(#01.01.2010#) produces the number 2010. The functions Month() and Day() work the same way and are needed.

To get the number of days in the month given by the date “startTime” the following code worked for me with MS Access 2003:

DateDiff("d";DateSerial(Year([startTime]);Month([startTime]);1);DateSerial(Year([startTime]);Month([startTime])+1;1))

Convert Character to Numeric in MS Access

Several times I had to correct database tables with a character primary key (which is ok, if not all other tables had numeric …). In order to get the necessary numeric format I inserted another variable with the correct format and transformed the character variable to numeric using the CDbl() Function.

The syntax for the CDbl function is:
CDbl([MyString])
I did forget it again and could not find any documentation. Googling the topic gave a lot of fruitless hits, so here it is.

Open Access .mdb Files with RODBC

Getting data into R can be done by reading colon separated files (.csv) via the read.table() function. It is also possible to access databases directly and send SQL queries directly from R to the database. This has some advantages: Using Sweave the queries get documented in the analysis report, variable formats are retained.

To install the RODBC package:
install.packages("RODBC")

Open a database connection to an Microsoft Access database file, e.g. “MyDataBase.mdb” sitting in the Folder “C:\ MyPath\MyDataBase.mdb”:
channel <- odbcConnectAccess("C:/MyPath/MyDataBase")
note that the Windows backslashes “\” become slashes “/” in R and the extension “.mdb” is omitted.

Getting the database table “MyTable” into the R dataframe “R.Table”
R.Table <- sqlQuery( channel , paste ("select * from MyTable"))
MyTable can also be a sql query in the Access database.

Command line instructions to create a database in MySQL

Still working on a way to port a sloppy Access database to MySQL. In the end it would be lovely to have a shell script doing all steps in a row:

  1. Extract the tables of the database as .csv
  2. Create a new (empty) MySQL database
  3. Create all tables in the new MySQL db
  4. Import the .csvs into those tables

One of the necessary steps is to create a new MySQL database:

$ mysql -u ''adminusername'' -p
mysql> CREATE DATABASE ''databasename'';
mysql> GRANT ALL PRIVILEGES ON ''databasename''.* TO "''username''"@"''hostname''" IDENTIFIED BY "''password''";
mysql> FLUSH PRIVILEGES;
mysql> EXIT

I just collected the code, did not try it out. Just not to forget….

Accessing an Access Database with mdbtools

Basically, what I wanted to do is accessing an MS Access database directly from R. This is probably possible and subject of another post.

Halfway from MS Access to R is mdbtools which can be added to Ubuntu in a terminal with:
sudo aptitude install mdbtools

MDB Tools comes with a set of command line utilties that allow mdb files to be used in shell scripting, extraction to another database, and similar functions. Each program is documented in its man page.

In order to extract the tables of the database into colon separated value (.csv) format create a file, say mdb-explode with the following content:

#! /bin/bash
mkdir $(echo $1|awk -F "." {'print $1'}).csvs
for i in $(mdb-tables $1); do
echo $i
mdb-export $1 $i > $(echo $1|awk -F "." {'print $1'}).csvs/$i.csv
done

Save the the file, open a terminal and make it executable:
chmod +x mdb-explode

If your MS Access database file (.mdb) is in the same directory you would
./mdb-explode YourDatabaseFile.mdb
and all tables of the database end up in a folder called YourDatabaseFile.csvs as .csv files which can be accessed by R via the read.table() function.

The idea is posted on Ubuntuforums by Skrynesaver but the script posted has some typos, which are corrected above. The script posted here worked with Ubuntu 9.04, mdbtools 0.5.99 and an Access 2003 database file.