In the early stages of an organization, it is common for most data to be stored in spreadsheets. Spreadsheet applications are common and easily understood by most small business people. Unfortunately, spreadsheets become burdensome when tasks move beyond calculations and into querying (Churcher, 2007). Fortunately, databases are designed with querying in mind. To make the leap from spreadsheet to database, an organization must find a method to efficiently export data from the spreadsheet and import it into the database without data corruption.
The best possible solution for this problem is to use a database system that is designed by the same software company that built the spreadsheet application. For example, Microsoft is the creator of the Excel spreadsheet application and two different database management systems: Access and SQL Server. Microsoft has created an Import / Export Wizard that allows spreadsheet users to quickly and easily import the data from their spreadsheets into one of Microsoft's database platforms. Additionally, Microsoft provides development tools to automate this procedure (Bluttman & Freeze, 2007).
It is not always feasible to use the same vendor for spreadsheet and database applications. Not only is there no guarantee that the spreadsheet vendor makes a database product, but this also ties the organization down to decisions that were made when the organization was just forming. In this case, it is often best to export the spreadsheet to a standardized format that can be imported into many different relational database management systems (RDBMS). Request For Comment (RFC) 4180 officially defines a file format that has been in production for many years: Comma-Separated Values (CSV). CSV files are plain text files that include one row of data per line. The data is separated by commas, and the RFC defines several guidelines for the data in the fields between commas (Shafranovich, 2005).
Once the data from the spreadsheet has been exported into a common format such as a CSV file, it is ready to be imported into the RDBMS. Since the CSV file format is so common, most database systems support importing data from this format. Each individual RDBMS has its own rules for importing the data. Some systems may require the table structure to be defined before the data from the file can be imported. Other systems may read the file and then automatically create table structures to fit the data in the file.
References
Bluttman, K. & Freeze, W. S. (2007). Access data analysis cookbook. Sebastopol, CA: O’Reilly.
Churcher, C. (2007). Beginning database design: From novice to professional. Berkeley, CA: Apress.
Shafranovich, Y. (2005). Request For Comment 4180: Common format and MIME type for Comma-Separated Values (CSV) files. Retrieved August 9, 2009.