Data Access

If your notebook needs external data to run, this section explains how to provide it.

Attention
Please be aware that the data will be publicly accessible, so no private or restricted data should be used in the notebooks.

Current storage allowance: each group space has a capacity of 10 GB.

In the current setup, data needs to be added to a specific directory and accessed inside the notebook via a URL. The following steps explain where and how to place the data correctly and how to create the URL.

Adding Data

These steps will show where to place the data and how to name it correctly.

Installation of LFS

First, install Git LFS by following these instructions: https://about.gitlab.com/blog/getting-started-with-git-lfs-tutorial/#installing-lfs

Then complete the installation with the command:

git lfs install

Create a namespace for your group

Inside the TU Cookbooks directory, go to the ‘data’ subgroup:

image

In case your group is not listed there, click on ‘New Project’:

image

Then click on ‘Create blank project’.

On the new page, fill in both the ‘Project name’ and ‘Project slug’ fields with the name of your Cookbook. For example, if your Cookbook group is called “Remote Sensing Cookbook”, you should fill both fields with “Remote Sensing Cookbook”.

Regarding the ‘Visibility Level’, please choose ‘Public’.

Under ‘Project Configuration’, do not select any of the options.

The page should look like this:

image

Tracking files

The next step is to clone your newly created project and add your data to it.

To track a file in LFS, you need to explicitly tell LFS to track it. Do this on the command line with the “git lfs track” command:

git lfs track "data/summary.txt"

This will also update your .gitattributes file.

Lastly, add the .gitattributes file and the data file to Git, commit them, and push the commit so that the data is uploaded to GitLab:

git add .gitattributes
git add data/summary.txt
git commit -m "add data ... "
git push

Accessing the data inside the notebook

To build the URL for your data, you need the project ID. To obtain it, go to your newly created data project and click on the three dots in the right corner:

image

Then fill in the link with your project ID and data file name, following this structure:

"https://gitlab.tuwien.ac.at/api/v4/projects/<Your project ID>/repository/files/<Your data name>/raw?ref=main&lfs=true"

For example, if your project ID is 12345 and your data file is called ‘results.csv’, the URL will be:

"https://gitlab.tuwien.ac.at/api/v4/projects/12345/repository/files/results.csv/raw?ref=main&lfs=true"
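The URL pattern above can also be assembled in code, which helps when the data file sits in a subdirectory (such as data/summary.txt from the tracking step). The following is a minimal sketch: `build_data_url` is a hypothetical helper name, and it assumes that the GitLab files API expects the file path to be URL-encoded, so that "/" becomes "%2F".

```python
from urllib.parse import quote

# Base URL taken from the URL structure shown above.
API_BASE = "https://gitlab.tuwien.ac.at/api/v4/projects"


def build_data_url(project_id: int, file_path: str, ref: str = "main") -> str:
    """Build the raw-file URL for a file in a data project.

    Hypothetical helper for illustration; it is not part of any library.
    """
    # safe="" makes quote() encode "/" as well, so a nested path such as
    # "data/summary.txt" becomes "data%2Fsummary.txt" (assumption: the
    # API expects the full file path to be URL-encoded).
    encoded_path = quote(file_path, safe="")
    encoded_ref = quote(ref, safe="")
    return (
        f"{API_BASE}/{project_id}/repository/files/"
        f"{encoded_path}/raw?ref={encoded_ref}&lfs=true"
    )


# Placeholder values from the example above.
url = build_data_url(12345, "results.csv")
print(url)
```

For a file in the repository root, such as results.csv, the helper reproduces the example URL unchanged; only paths containing special characters are altered by the encoding.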

You can pass this URL directly to pandas.read_csv(), or read it with another library of your choice.
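As a rough illustration of reading the data inside a notebook, the sketch below wraps pandas.read_csv() in a small function. `load_csv` and `DATA_URL` are hypothetical names, and the project ID 12345 and file name results.csv are the placeholders from the example above, so the call is left commented out until you substitute your own values.

```python
def load_csv(url: str):
    """Read a CSV file from a URL into a pandas DataFrame.

    Hypothetical helper for illustration.
    """
    # pandas is imported inside the function so that the helper can be
    # defined even where pandas is not installed; it is needed only
    # when the function is actually called.
    import pandas as pd

    return pd.read_csv(url)


# Placeholder URL from the example above; replace the project ID and
# the file name with your own.
DATA_URL = (
    "https://gitlab.tuwien.ac.at/api/v4/projects/12345"
    "/repository/files/results.csv/raw?ref=main&lfs=true"
)

# Uncomment once DATA_URL points to your own public data file:
# df = load_csv(DATA_URL)
# df.head()
```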

Upcoming solution

The current solution will eventually be replaced by an object storage service, such as S3. This should improve both the tracking of file versions and the amount of available storage.