Creating database tables
Database tables are SQL tables that are stored within the workspace database. Any tabular data, e.g. a CSV file, can be converted to a database table.
There are multiple benefits to doing this. Metadata can be added to a table to ensure that it follows the FAIR data principles. Built-in tools are also available for working with the datasets, such as a point and click SQL for joining tables, rapid data visualisation functionality and a table editor. To find out more about these tools, see articles in the Managing data section.
Creating a new database table
To create a new table from scratch within your workspace, go to the Database menu and select ‘New database table’; a new tab will open showing the “New database table” wizard.
New table
To create a table from scratch, set the name of your new table and click ’Next’. Note: We recommend that tables only contain lowercase letters, numbers and underscores and names do not start with a number. It is possible to use other names (such as those with spaces in them) but you’ll need to compensate for this if you reference the table in a SQL script by using quotes. You can also choose to select a CSV to convert by selecting the “Select a CSV file” option on the right hand side of the screen; this will open a file selector where you can find and select your CSV file.
Adding metadata
The next stage of the wizard allows you to add metadata to your table. You can add a title, description and source URL or leave all of these blank and continue. Alternatively, you could select a TDF metadata file from your workspace file list.
Adding columns
After selecting Next, you will be able to add columns to your new table. Fill out the column information (only Name and Type are mandatory) and use the “+” button to add another column. Repeat this until you have added all the columns to your table then click Next. Note: if you’ve selected to convert a CSV to a table, this screen should show you the columns from your CSV.
Confirm
The next screen will show you a summary of your table and columns. If it looks good then click Next to create your table; otherwise click Back and fix anything needed.
Converting a CSV to a database table
Database tables can be easily created within the workspace by converting a CSV file. From the Files tab, choose a CSV file and select the option 'Convert to table' in the option menu. A New database table tab will open in the workspace.
New table
This screen shows you the name of the file you’re going to convert to a table. It also names your table based on the name of your file; you can change this if needed. Click Next to move to the next step in the wizard.
Adding metadata
This step is described above. Note that if you have a TDF (metadata file) which matches the name of your CSV, this should be automatically selected and will show on the right hand side of the page. If the wrong file is selected or you wish to change it, click the cross in the file selector and then select to find the correct file.
Editing the columns
The next screen shows the potential columns and their metadata for your new table which have been picked up from your CSV file. You can check and edit this information as needed. Some sample data from each column is also shown.
The following information can be changed:
- Column names must satisfy the restrictions outlined below, otherwise, the field will be highlighted in red, indicating an error. Column name requirements:
- must start with a letter
- can only contain lowercase letters, numbers and underscores
- must be less than 65 characters
- cannot be a reserved SQL word (e.g. GROUP)
- Data types of the columns (selected from a dropdown list). Data tables currently support the following data types: text, integer, decimal, boolean, date, datetime, datetime with time zone, time, time with time zone.
- The primary key of the table. The Primary Key can easily be changed at this stage by selecting a column which has no duplicate values in the first 10GB.
- Which columns to exclude from the table (done by clicking the checkbox beside the column name).
Note: for large files the type guesser may be running in the background. This will be indicated by a spinner running on screen.
If the file hasn’t been parsed correctly or you want to change a parameter, click Advanced at the top of the screen, this will open the additional options. The following input parameters can be changed:
- Separator: the character used as the delimiter to separate data items in the file.
- Other Separator: can be used if there is more than one separator used in the file
- Encoding: the character encoding used in the file. The default is UTF-8.
- Text qualifier: the character used to distinguish where the text begins and ends. The default is a double quote.
- Null qualifier: the string used to represent null values in the file. Such data items will appear as no value fields in the table. Example null qualifiers: NULL, null, 0
- Include header row: specifies whether the data file includes a header row. If this checkbox is selected, the items in the first line of the file will be used as column names in the data table.
When happy with the data preview, you can select the ‘Next’ button which moves you to the next step in the wizard. If you haven’t selected a primary key then you will be prompted to do so at this stage.
Confirm
The next screen will show you a summary of your table and columns, if it looks good then click ‘Next’ to create your table; otherwise click ‘Back’ and fix anything needed.
Summary
The next screen will display a summary of the new table as well as a spinner while the table is being created. Once your table is ready, you can navigate to it by clicking ‘Open Database Table’ . Remember to close the tab using the x on the tab header.
Error report
For all file to data table conversions, a report is generated. This report is accessible from the Activity tab as well as from the Note section for the table, which can be found at the top right of the table summary page.
If the workspace could not create the data table, a report is generated and shown in the Activity tab. This report contains a list of the errors encountered while trying to create the data table, including a reference to the line or character whenever possible.
The image below shows an example report which indicates that the table creation failed because the primary key column contained duplicate values.
Troubleshooting
I can't find the 'Convert to table' button
This action is only available for files ending in .csv (or .CSV). Make sure that you are trying to make a conversion from a CSV file.
My data table is taking a long time to show up
Large tables can take quite a few minutes to convert. This is longer when you choose to exclude certain columns from their table or when multiple users are carrying out the action at once.
Do I have too much data?
Probably not!
The maximum number of columns allowed in a DB table is 1600 although this is less if all of the columns are type varchar (here the max would be about 450). The row limit is well above 100,000,000 but again depends on the data types.