Guidance for uploading files

Uploading a lot of files

Your project may have thousands of existing files of diverse types. If you are planning to move your research into the workspace, understanding how the upload process works will help you do this effectively. For example, consider the following approaches:

  • The simplest way to upload files is through the workspace web interface. If you want to upload data programmatically, you can do so using an API.
  • We recommend uploading no more than 500 files at a time (just as you probably would not copy more than that in one go between your desktop and a shared file server).
  • If your workspace has a Virtual Machine add-on, you can bundle your data into .zip files and upload it in manageable chunks. This makes the upload easier, and you can unpack the archives later in the workspace.
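The batching advice above could be sketched roughly as follows, using only the Python standard library. This is an illustration, not a workspace tool: the function names, the `batch_NNNN.zip` naming scheme and the folder names in the usage note are hypothetical, and the only figure taken from this article is the 500-file recommendation.

```python
from pathlib import Path
from zipfile import ZipFile, ZIP_DEFLATED

# From the recommendation above: no more than 500 files per upload.
MAX_FILES_PER_BATCH = 500


def chunk(items, size):
    """Yield successive lists of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def zip_in_batches(source_dir, output_dir, batch_size=MAX_FILES_PER_BATCH):
    """Pack the files under source_dir into numbered .zip archives.

    Each archive holds at most `batch_size` files. Returns the list of
    archive paths that were created. The archive naming scheme is an
    assumption made for this sketch.
    """
    source = Path(source_dir)
    output = Path(output_dir)
    output.mkdir(parents=True, exist_ok=True)

    # Collect the file list before writing anything, in a stable order.
    files = sorted(p for p in source.rglob("*") if p.is_file())

    archives = []
    for index, batch in enumerate(chunk(files, batch_size), start=1):
        archive_path = output / f"batch_{index:04d}.zip"
        with ZipFile(archive_path, "w", compression=ZIP_DEFLATED) as archive:
            for path in batch:
                # Store paths relative to source_dir so the folder layout
                # is preserved when the archive is unpacked.
                archive.write(path, arcname=path.relative_to(source).as_posix())
        archives.append(archive_path)
    return archives
```

For example, `zip_in_batches("my_project_data", "to_upload")` would write `to_upload/batch_0001.zip`, `to_upload/batch_0002.zip` and so on, each of which can then be uploaded separately. Keep the output folder outside the source folder so that archives from an earlier run are not packed again.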

Upload limits

  • Uploads of up to 10 GB to Files and up to 100 GB to Blobs have been tested. Larger uploads may be possible, but any network interruption may cause the upload to fail.
  • Generally, files larger than 250 GB should not be uploaded into the workspace using the methods described. If you have files over 250 GB, please contact your Data Steward Team, who can help plan the data migration.
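As a rough illustration, the limits above could be checked locally before starting an upload. The sketch below is hypothetical and not part of the workspace: the function names and the returned labels are invented for this example, and it assumes decimal gigabytes (1 GB = 10^9 bytes), which this article does not specify.

```python
import os

# Assumption: decimal gigabytes. The article does not state the unit precisely.
GB = 10**9

# Largest upload sizes reported as tested, per workspace folder.
TESTED_LIMIT = {"Files": 10 * GB, "Blobs": 100 * GB}

# Above this size, the article advises contacting the Data Steward Team.
STEWARD_LIMIT = 250 * GB


def check_upload_size(size_bytes, folder):
    """Classify an upload of `size_bytes` into `folder` ("Files" or "Blobs").

    Returns one of three labels (hypothetical names for this sketch):
      "tested"               - within the tested size for that folder
      "untested"             - larger than tested; may fail on network interruption
      "contact-data-steward" - larger than 250 GB
    """
    if folder not in TESTED_LIMIT:
        raise ValueError(f"unknown folder: {folder!r}")
    if size_bytes > STEWARD_LIMIT:
        return "contact-data-steward"
    if size_bytes > TESTED_LIMIT[folder]:
        return "untested"
    return "tested"


def check_file(path, folder):
    """Convenience wrapper that reads the size of a local file."""
    return check_upload_size(os.path.getsize(path), folder)
```

Running `check_file` over a folder before migrating makes it easy to spot the few files that need to be split, sent to Blobs, or discussed with the Data Steward Team.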

Guidance

See the table below for a summary of different types of source data, guidance on where to store it, and how you can access the data once it has been uploaded into the workspace.

| Source data to be uploaded | File extensions | Typical size per file | Purpose | Workspace folder | Data mapping applied? | Accessed from Web interface | Accessed from Virtual machine |
|---|---|---|---|---|---|---|---|
| Tabular data | .csv | 1000s of rows and columns | Database analysis | Files | Workspace database | Yes | Yes |
| Analysis scripts | .r, .sql | 100 – 500 kB | Reproducible statistics | Files | Workspace file system | Yes | Yes |
| Text, pdf documents and small images | .txt, .doc, .pdf, .png, .jpg | 2 MB | Project communication and reports | Files | Workspace file system | Yes | Yes |
| Large image files, image series, genomic data, executable files for tool installation, other non-structured data | .png, .jpg, .vcf, .exe | 10 GB – 250 GB | Raw data for analysis and information extraction | Blobs | Workspace file system | Yes | Yes (needs to be enabled by Service Desk) |

Updated on August 31, 2023
