Google Drive is a popular cloud storage platform for backing up and sharing files. This article provides step-by-step guidance for enabling access to your Drive and transferring data between the Drive and ODU HPC via the rclone command-line program. By using rclone, you will be able to automate data transfer and synchronization between the Drive and the cluster storage.
This article assumes that rclone is installed (or otherwise available) on your system. Refer to the rclone downloads page if you need to download and install rclone.
On Wahab HPC, you will use
module load rclone
to make rclone available to your shell environment.
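As a quick sanity check before starting the configuration, a small helper along the following lines can confirm that rclone is actually on your PATH. This is only a sketch: the function name check_rclone is hypothetical and not part of rclone or the cluster environment.

```shell
# Hypothetical helper (not part of rclone): reports whether rclone is on PATH.
check_rclone() {
    if command -v rclone >/dev/null 2>&1; then
        echo "rclone found at: $(command -v rclone)"
    else
        # On Wahab, running "module load rclone" first should fix this.
        echo "rclone not found; on Wahab, try 'module load rclone' first" >&2
        return 1
    fi
}
```

On Wahab, you would run module load rclone first and then call check_rclone; on a desktop, the function simply reports whether an installed rclone can be found.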
This guide can also be used to enable access to Google Drive from Linux, Mac, and Windows desktops.
Because one of the steps requires a web browser, it is best to carry out these steps from a remote desktop or virtual desktop session.
The first step is to issue the rclone config command. This will guide you through a series of questions, which are broken up and commented on below because of their length. First, we need to create a new remote, which is simply rclone's term for a user-defined name for a particular Google Drive storage area. In the following instructions, we will use my-gdrive as the name, but please feel free to choose a name that best describes your data (it must not contain whitespace or begin with a dash [-]).
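The two naming rules just mentioned can be checked before running rclone config. The helper below is a minimal sketch that tests only those two rules (no whitespace, no leading dash); the name valid_remote_name is hypothetical, and rclone itself may apply additional restrictions that are not covered here.

```shell
# Hypothetical helper: checks only the two naming rules stated in this guide.
# rclone may enforce further restrictions that this sketch does not know about.
valid_remote_name() {
    case "$1" in
        "" | -* | *[[:space:]]*) return 1 ;;   # empty, leading dash, or whitespace
        *) return 0 ;;
    esac
}

if valid_remote_name "my-gdrive"; then
    echo "my-gdrive looks like a usable remote name"
fi
```

A name such as "my drive" or "-gdrive" would be rejected by this check.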
No remotes found - make a new one
n) New remote
s) Set configuration password
q) Quit config
n/s/q> n
name> my-gdrive
Rclone will prompt for your response after the > character. Here, n and my-gdrive are the responses to the questions. In the illustration above, no remotes have been created yet, so there are only a few options. If you have existing remote(s), you will see more choices of actions.
Next, we need to specify the storage type. Type in drive for Google Drive.
Type of storage to configure.
Enter a string value. Press Enter for the default ("").
Choose a number from below, or type in your own value
1 / 1Fichier
\ "fichier"
2 / Alias for an existing remote
\ "alias"
3 / Amazon Drive
\ "amazon cloud drive"
...
12 / Google Cloud Storage (this is not Google Drive)
\ "google cloud storage"
13 / Google Drive
\ "drive"
14 / Google Photos
\ "google photos"
...
Storage> drive
The following steps will ask for a "client ID" and a "client secret". It is highly recommended that you use ODU's client ID so that your rclone sessions perform better (i.e., faster):
605919805393-odnfmddo2v24ffodmg80j6ht4oi4kftn.apps.googleusercontent.com
The matching client secret begins with GOCSPX. If you are on the cluster, please see the two ID strings in this text file: /cm/shared/applications/rclone/ODU-gdrive-client-id.txt
Google Application Client Id
Setting your own is recommended.
See https://rclone.org/drive/#making-your-own-client-id for how to create your own.
If you leave this blank, it will use an internal key which is low performance.
Enter a string value. Press Enter for the default ("").
client_id> 605919805393-odnfmddo2v24ffodmg80j6ht4oi4kftn.apps.googleusercontent.com
OAuth Client Secret
Leave blank normally.
Enter a string value. Press Enter for the default ("").
client_secret> ### SEND EMAIL TO ITSResearchAndCloudComputing@odu.edu for this value
The next prompt will ask what kind of access you want. In more than 99% of cases, you will want to use the first option (drive), which gives rclone full read-write access to your data stored in the Drive (you can limit the write/modify access later when using rclone).
Scope that rclone should use when requesting access from drive.
Enter a string value. Press Enter for the default ("").
Choose a number from below, or type in your own value
1 / Full access all files, excluding Application Data Folder.
\ "drive"
2 / Read-only access to file metadata and file contents.
\ "drive.readonly"
/ Access to files created by rclone only.
3 | These are visible in the drive website.
| File authorization is revoked when the user deauthorizes the app.
\ "drive.file"
/ Allows read and write access to the Application Data folder.
4 | This is not visible in the drive website.
\ "drive.appfolder"
/ Allows read-only access to file metadata but
5 | does not allow any access to read or download file content.
\ "drive.metadata.readonly"
scope> drive
Root folder: Do you want to allow access to the entire Drive, or just to a specific subfolder in your Drive? This is where you can specify it. If you leave it blank, rclone will use the root folder of the Drive.
ID of the root folder
Leave blank normally.
Fill in to access "Computers" folders (see docs), or for rclone to use
a non root folder as its starting point.
Enter a string value. Press Enter for the default ("").
root_folder_id>
What is my folder ID?
The Google folder ID is shown as a series of letters and digits in the URL of the corresponding folder in the web interface. You can use the "Get link" submenu (or button), which will return a URL like this:
https://drive.google.com/drive/folders/16hY6ZurF09Ax1GzsxqJDJNNxsv-P8ihe?usp=share_link
In the example URL above, the string 16hY6ZurF09Ax1GzsxqJDJNNxsv-P8ihe is the root folder ID. (FYI, this is a demo folder on Research Computing's Google Drive; it is safe but most likely does not contain anything useful to you.)
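If you prefer not to pick the ID out of the link by hand, it can be extracted with plain shell parameter expansion. The sketch below assumes the link has the same shape as the example above (an ID following /folders/, optionally followed by a query string); the function name folder_id_from_url is hypothetical.

```shell
# Hypothetical helper: extracts the folder ID from a Drive "Get link" URL.
# Assumes the URL contains ".../folders/<ID>" as in the example above.
folder_id_from_url() {
    id=${1##*/folders/}   # drop everything up to and including "/folders/"
    id=${id%%\?*}         # drop a trailing query string such as "?usp=share_link"
    id=${id%%/*}          # drop any trailing path component
    printf '%s\n' "$id"
}

folder_id_from_url 'https://drive.google.com/drive/folders/16hY6ZurF09Ax1GzsxqJDJNNxsv-P8ihe?usp=share_link'
# prints 16hY6ZurF09Ax1GzsxqJDJNNxsv-P8ihe
```

The printed string is what you would paste at the root_folder_id> prompt. A URL that does not contain /folders/ is not handled by this sketch.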
The next prompt asks for a service account. Skip this by leaving it blank.
Service Account Credentials JSON file path
Leave blank normally.
Needed only if you want use SA instead of interactive login.
Leading `~` will be expanded in the file name as will environment variables such as `${RCLONE_CONFIG_DIR}`.
Enter a string value. Press Enter for the default ("").
service_account_file>
The next series of prompts are important. Use the auto-config to launch the web browser on the same machine to give rclone permission to access your data stored in the Drive storage. This will allow rclone to access your data from this machine only.
Edit advanced config? (y/n)
y) Yes
n) No (default)
y/n> n
Remote config
Use auto config?
* Say Y if not sure
* Say N if you are working on a remote or headless machine
y) Yes (default)
n) No
y/n> y
If your browser doesn't open automatically go to the following link: http://127.0.0.1:53682/auth?state=SOME_RANDOM_STRING
Log in and authorize rclone for access
Waiting for code...
You will need to authorize access from the browser. If you have not logged in to your ODU Google Drive account, please do so now and authorize the access request.
At this point, the browser will show a prompt like this:
rclone for ODU research computing wants access to your Google Account.
...
This will allow rclone for ODU research computing to: See, edit, create, and delete all of your Google Drive files.
Make sure you trust rclone for ODU research computing.
Despite the scary-sounding warning, you need to allow access. This is what connects the rclone program to your data so that it can manipulate them. It is your own invocation of the rclone program against the remote you specify that will "see, edit, create, and delete" the data on your Drive. You can always revoke rclone's Drive access from your Google Account settings.
The next steps are the finalization of the config. Here, you need to know whether you need to access a personal or shared (team) Drive.
A personal Drive is a storage space associated only with you (though you can share a portion of it, or individual files, with others); this is the most common use case.
A shared Drive (sometimes called a team Drive) is owned by an institution (e.g., ODU, another university, or a lab), to which individuals with Google IDs can be granted certain types of access (read-only, read-write, etc.).
In the first prompt below, "Configure this as a team drive?", you will need to respond "n" to access a personal Drive, or "y" to access a shared/team Drive.
Got code
Configure this as a team drive?
y) Yes
n) No (default)
y/n> n
--------------------
[my-gdrive]
type = drive
client_id = 605919805393-odnfmddo2v24ffodmg80j6ht4oi4kftn.apps.googleusercontent.com
client_secret = GOCSPX*******
scope = drive
token = {"access_token":"###REDACTED###","token_type":"Bearer","refresh_token":"###REDACTED###","expiry":"2022-11-17T05:07:15.32276879Z"}
--------------------
y) Yes this is OK (default)
e) Edit this remote
d) Delete this remote
y/e/d> y
Current remotes:
Name Type
==== ====
my-gdrive drive
Voila! Your Drive setup is good to go.
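To double-check that the new remote was actually saved, you can look for it in the output of rclone listremotes, which prints one configured remote per line with a trailing colon. The wrapper below is a small sketch; the function name remote_is_configured is hypothetical, and my-gdrive is the remote name used throughout this guide.

```shell
# Hypothetical helper: succeeds if the named remote appears in "rclone listremotes".
# listremotes prints each remote followed by a colon, e.g. "my-gdrive:".
remote_is_configured() {
    rclone listremotes | grep -Fxq "$1:"
}

# Example use (requires rclone on PATH):
#   remote_is_configured my-gdrive && echo "my-gdrive is configured"
```

This only confirms that the configuration entry exists; it does not verify that the stored token still grants access to the Drive.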
Let us now test whether this access works correctly by listing the contents of the root folder. From the terminal, type (do not include the $ shell prompt):
$ rclone ls --max-depth=1 my-gdrive:
If all is well, you should see a listing of all the files in the root directory (folders are not shown).
Here is an example from one of the staff members' listing (redacted):
$ rclone ls --max-depth=1 wpurwant-gdrive:
-1 BLANK - Old Dominion University, Norfolk Maturity/Capabilities Model Assessment.xlsx
129915 Position Statements and Bios_2020.pdf
67430 NSF_RFI_Response_final.pdf
22627 DEAPSECURE 2.0 brainstorming
-1 DataUp response.docx
20318 DeapSECURE-module-3-MachineLearning
-1 Fabric Benchmarking 2017.docx
-1 ODU Training.docx
-1 ODU Zoom meetings.docx
-1 PEARC19 Champion Related Activities.docx
-1 Research Computing Strategy brainstorming doc.docx
-1 Restricted-data-computing-platforms-ODU-2022.d20220407.pptx
The first number on every row is the file size in bytes. A size of -1 indicates a native Google document (Docs, Sheets, Slides); all other files show their actual sizes.
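Because the size column distinguishes native Google documents from regular files, the listing can be summarized with a short awk filter. The following is a sketch only: the function name summarize_listing is hypothetical, and it assumes the "size, then path" layout shown in the listing above.

```shell
# Hypothetical helper: summarizes "rclone ls" output read from standard input.
# Assumes each line starts with the size (or -1), followed by the file path.
summarize_listing() {
    awk '
        $1 == -1        { native++; next }          # native Google document
        $1 ~ /^[0-9]+$/ { files++; bytes += $1 }    # regular file with a real size
        END { printf "native_docs=%d regular_files=%d total_bytes=%d\n", native, files, bytes }
    '
}

# Example use (requires rclone and the remote configured above):
#   rclone ls --max-depth=1 my-gdrive: | summarize_listing
```

Lines whose first field is neither -1 nor a non-negative integer are ignored by this filter.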