Globus is a service that provides reliable data transfer between sites, including HPC systems, workstations, and users' devices (e.g. laptops). Globus also enables data sharing with external institutions. This wiki article focuses on how data can be shared from ODU HPC using the Globus service. For general usage of Globus for data transfer between HPC and other computers, please see the Globus Connect documentation.
Example use cases:
A research group at ODU wants to share a dataset with collaborators at the XYZ University and ABC National Lab.
A researcher at ODU wants to make a set of files publicly available to the U.S. research community via Globus.
Before enabling sharing, please create a directory somewhere inside your home directory or the relevant /RC/home or /RC/group storage location to contain the data you want to share. Let's call this the shared-directory. Place and organize all the data files in this directory (or its subdirectories). All files and subdirectories in this directory will be readable by the recipient of the data sharing.
Example case
Let's present an example where an ODU user with ID wpurwant wants to share research data with his collaboration team, "DeapSECURE CyberTraining", at "XYZ University". We will follow this example in the step-by-step how-to below.
In this example, the user prepares the files in the folder called /home/wpurwant/Testbeds/Globus/folder01.
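The staging step can also be scripted. The following is a minimal sketch, not an official procedure: the helper name `stage_files` and the file `results.csv` are hypothetical, and the demonstration runs in a temporary directory rather than on the real /home path. It copies files into the shared-directory instead of linking to them, so that the shared tree contains only real files.

```python
import shutil
import tempfile
from pathlib import Path


def stage_files(sources, shared_dir):
    """Copy each source file into shared_dir, creating the directory if needed.

    Returns the list of staged paths (hypothetical helper for illustration).
    """
    shared_dir = Path(shared_dir)
    shared_dir.mkdir(parents=True, exist_ok=True)
    staged = []
    for src in map(Path, sources):
        dest = shared_dir / src.name
        shutil.copy2(src, dest)  # copy2 also preserves timestamps
        staged.append(dest)
    return staged


# Demonstration in a throwaway directory. On the cluster, shared_dir would be
# a path such as /home/wpurwant/Testbeds/Globus/folder01 from the example.
with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "results.csv"  # hypothetical data file
    src.write_text("sample,value\nS1,0.42\n")
    staged = stage_files([src], Path(tmp) / "Testbeds" / "Globus" / "folder01")
    staged_names = [p.name for p in staged]
    print(staged_names)
```

Copying keeps the original files untouched; whether to copy or move is a judgment call that depends on available storage.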
CAUTION: Please read this section thoroughly, including the Security Considerations below, in order to share data properly while avoiding unintended data exposure!
Once the data to be shared is staged, it is time to share it with your collaborators! Please open Globus File Manager, located at https://app.globus.org/file-manager/ . Log in with your ODU MIDAS credentials (choose "Old Dominion University" under "Use your existing organizational login"). Once logged in, navigate to the location where you can see the shared-directory. Click that directory to select it, but do not enter it.
Click the Share icon located on the right menu bar to open the "Guest Collections" page:
In Globus terminology, a shared folder is called a guest collection. Next, click the Add a Guest Collection button to open a form for creating a new guest collection:
Directory: enter the full path of the shared-directory.
Display Name: enter a name for this collection. This name will be visible on the Globus system (although people won't be able to see what's inside the collection unless they have been given access to it).
Description: text that will help people understand what this collection is all about (for example: "Assembled genes from Drosophila melanogaster, samples obtained from Zimbabwe site S20-3A").
Directory and Display Name are mandatory. Additional fields can be set, in particular:
Contact details (Contact Email, Organization, etc.) to let other people contact you regarding the shared data.
Force encryption on transfers to and from this collection, if you want the data to be encrypted while being transferred.
Apparently you can create two or more collections with exactly the same Display Name, but this is generally not a good idea as it makes it hard for us to distinguish what's what. So please do your best to create collections with distinct display names in order to avoid confusion.
After finishing the settings, press the Create Collection button. Once completed, the data is ready to share via Globus, but there is one more step to perform before the data is truly shared: you have to specify the recipients of the data share.
Once a collection is created, you can still add and delete files from the shared-directory as needed. These changes will also be reflected in real time on the recipient's side.
Once the collection is created, you will be directed to the "Permissions" tab of this collection. By default, the collection is not shared with anyone until you add a permission to a person or a group.
To add a collaborator (i.e. a recipient of the data sharing), please click the Add Permissions - Share With button.
Once the sharing form is opened, you can specify the details of the sharing:
Path: Refers to the subdirectory relative to the shared-directory. In the illustration above, / refers to the shared-directory's root (/isilon/home/wpurwant/Testbeds/Globus/folder01 in this example). Specifying a subdirectory name here will grant the recipient access only to that subdirectory inside the real folder01 directory.
Share With: You can share with an individual person, a group of users, or all Globus users. (Editor's note: we do not fully understand the "public (anonymous)" option. Apparently, even if the sharing level is set to "public", the recipient still has to log in to Globus.)
Username or Email: A person is identified by the email address associated with his/her Globus account. Please contact the recipient to get the correct email address to use. Alternatively, a Globus user name can be used.
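To make the meaning of the Path field concrete, here is a minimal sketch of how a path inside the guest collection corresponds to a real path on storage, together with a dictionary that mirrors the choices made in the form. The helper names, the `samples` subdirectory, and the all-zero identity ID are hypothetical. The dictionary field names follow our understanding of the access rule document in the Globus Transfer API (the kind of document accepted by `TransferClient.add_endpoint_acl` in the Globus Python SDK); treat them as assumptions and check the official documentation before relying on them.

```python
import posixpath


def collection_path_to_real(shared_dir, collection_path):
    """Map a path inside the guest collection to the real path on storage."""
    shared_dir = posixpath.normpath(shared_dir)
    # "/" inside the collection corresponds to the shared-directory itself.
    relative = collection_path.strip("/")
    real = posixpath.normpath(posixpath.join(shared_dir, relative))
    # Reject paths that would climb out of the shared-directory.
    if real != shared_dir and not real.startswith(shared_dir + "/"):
        raise ValueError(f"{collection_path!r} is outside the shared-directory")
    return real


def build_access_rule(principal, collection_path="/", permissions="r",
                      principal_type="identity"):
    """Build a dictionary describing one sharing permission.

    Field names are assumed from the Globus Transfer API access document.
    "r" is read-only (the default in the web form); "rw" adds write access.
    """
    if permissions not in ("r", "rw"):
        raise ValueError("permissions must be 'r' (read-only) or 'rw'")
    # Assumption: rule paths begin and end with a slash.
    path = "/" + collection_path.strip("/")
    if not path.endswith("/"):
        path += "/"
    return {
        "DATA_TYPE": "access",
        "principal_type": principal_type,
        "principal": principal,
        "path": path,
        "permissions": permissions,
    }


SHARED_DIR = "/isilon/home/wpurwant/Testbeds/Globus/folder01"
root_real = collection_path_to_real(SHARED_DIR, "/")
sub_real = collection_path_to_real(SHARED_DIR, "/samples/")  # hypothetical subdirectory
# Placeholder identity ID; a real rule needs the recipient's Globus identity.
rule = build_access_rule("00000000-0000-0000-0000-000000000000", "/samples")
print(root_real)
print(sub_real)
print(rule)
```

In this sketch, sharing the path `/samples/` would expose only `folder01/samples` and what is below it, which matches the behavior described for the Path field.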
An email can be sent to the recipient (recommended), containing the sharing link and an optional message from you.
Here's an example of the email received by the recipient, with the link to access the shared directory:
After adding the recipient, here's what the permission list looks like:
To access the files, the recipient must sign in to Globus with his/her own credentials.
The recipient does not have to remember this link. In case they forget, they can always log in to Globus and look up all the shares they have access to at https://app.globus.org/endpoints?scope=shared-with-me
By default, the recipient will have read-only access to the data. If the situation warrants it, you can also grant write access to the shared folder, but be aware that the recipient can modify and delete any file in the shared folder as a result.
Your guest collection may be visible to all Globus users when they try to search collections by name; however, unless they have been explicitly added to the permission list, or the data is made available to all Globus users, they cannot access the data.
The Overview tab of the share shows the key information about the guest collection. From there you can edit the attributes, manage the files in the collection, and delete the collection.
If you choose to delete a collection, all the people who used to have access to this collection will no longer have access to it.
While Globus provides a convenient way to share files and data with collaborators and the general scientific community, please be aware of the security implications listed below. Otherwise, your sharing may lead to an undesirable leak of the data:
All files, including those residing in the subdirectories of the shared-directory, are readable and downloadable by the recipients. Globus does not take into account the UNIX permissions of the files and directories in the underlying storage (i.e. on the HPC side). Therefore, be very careful to place only files that you want to share with the recipients in this directory tree!
Symbolic links: UNIX symbolic links work when they point to files residing in the same directory tree as the shared-directory. Otherwise, they will be ignored. This is a security feature to avoid exposing files not residing in this directory tree.
Be very careful when granting someone write access to your guest collection. Not only can the recipient add new files and folders, but all existing files and folders can also be overwritten and deleted by the recipient. It is wise to use this feature only as a "drop-box" to receive data, or to work collaboratively with trusted parties.
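Before sharing, it can help to review exactly what sits in the shared-directory tree. The following is a minimal sketch of such a review using only the Python standard library; it is not a Globus tool, and the helper name `audit_shared_directory` and the demonstration files are hypothetical. It lists every file under the shared-directory and separately reports symbolic links whose targets lie outside the tree (which, as noted above, will be ignored on the recipient's side).

```python
import os
import tempfile
from pathlib import Path


def audit_shared_directory(shared_dir):
    """Return (exposed_files, outside_links) for the tree under shared_dir."""
    root = Path(shared_dir).resolve()
    exposed, outside_links = [], []
    # os.walk does not descend into symlinked directories by default.
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = Path(dirpath) / name
            if path.is_symlink():
                target = path.resolve()
                if target != root and root not in target.parents:
                    # The link points outside the shared tree.
                    outside_links.append(path.relative_to(root).as_posix())
                    continue
            if path.is_file():
                exposed.append(path.relative_to(root).as_posix())
    return sorted(exposed), sorted(outside_links)


# Demonstration in a throwaway directory standing in for the shared-directory.
with tempfile.TemporaryDirectory() as tmp:
    shared = Path(tmp) / "folder01"
    (shared / "data").mkdir(parents=True)
    (shared / "data" / "genes.txt").write_text("ACGT\n")
    secret = Path(tmp) / "private_notes.txt"  # lives outside the shared tree
    secret.write_text("do not share\n")
    os.symlink(secret, shared / "notes_link")
    exposed_files, outside_links = audit_shared_directory(shared)
    print("Files in the shared tree:", exposed_files)
    print("Links leaving the tree:", outside_links)
```

Any file in the first list that should not be shared is a candidate to move out of the shared-directory before recipients are added.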