Getting Started

  • Follow the steps below to start using DataFlow at http://dataflow.ornl.gov.
  • Currently, Globus is the only data adapter that DataFlow supports, so the Globus-related steps below are required to use DataFlow. In the future, the instructions will vary by data adapter.
  • Additional steps may be necessary to use facility-local deployments of DataFlow.

Accessing DataFlow

  1. Get a UCAMS / XCAMS account:

    • ORNL Staff:

      • To activate your UCAMS ID as your XCAMS CADES user ID, visit https://xcams.ornl.gov and click on "I need an account."
      • In Step 2 of the "XCAMS New User Account Registration", look in the tan box on the right of the screen under "ORNL UCAMS Users", and click on “activate your XCAMS account” and complete the required steps.
    • Non-ORNL Staff:

      • Visit https://xcams.ornl.gov, click on "I need an account", and complete all steps of the registration process.

        Note

        This request is reviewed and may take up to five business days for approval. Remote users who are neither ORNL interns nor U.S. citizens will need an ORNL host to provide them with a Cyber Access PAS from the Visitor’s Center before they can be approved for access to the external login node. ORNL interns who already have a PAS will not need a second one for this.

      • To get access to the external visitor login nodes, go to https://xcams.ornl.gov/xcams/groups/cades-misc/ and follow the instructions to request an account.

Data adapters

Follow the instructions below to set up the prerequisites for your data adapter.

Globus

  1. Get a CADES account:
    • Navigate to https://xcams.ornl.gov/xcams/groups/cades-birthright.
    • Enter your email address (use your ORNL email if available) and click continue.
    • Review the XCAMS user agreement, and select "Agree".
    • Enter your USER ID (UCAMS ID if you are ORNL Staff; XCAMS ID if you are not ORNL staff).
    • Enter your password.
    • Click "Submit" to complete the XCAMS request. An activation notice will be dispatched to your email address when your resources are ready for use.
  2. Get a Globus account:
  3. Get a Globus ID and ensure that your Globus ID is linked with your institutional ID in your Globus account:
    1. Log in to globus.org.
    2. Click on Account in the left-hand pane.
    3. Select the Identities tab in the window that opens.
    4. You should see (at least these) two identities:
      1. One from your home institution (that is listed as primary with a crown)
      2. Globus ID (your_username@globusid.org)
    5. If you do not see the Globus ID, click on Link another identity. Select Globus ID and link this ID.

Managing and sharing data:

Please use https://globus.org to organize, manage, move, and share data.

Downloading data to workstations:

  1. Globus Connect Personal - We recommend that most users use Globus Connect Personal to download data (uploaded via DataFlow or any data on a high performance computing facility) to their personal computers (laptops or desktops). Please follow these instructions to install Globus Connect Personal and set up your own Globus endpoint on your computer. Additional documentation is available here.
  2. Please see these instructions for alternative methods to download data.

Programming interfaces

Users interested in using either the REST API or the Python wrapper to the REST API need to:

  1. Get an API Key
  2. Get passwords encrypted

In order to get both these elements, users need to:

  1. Log in to the appropriate DataFlow deployment (the central deployment at https://dataflow.ornl.gov or a facility-local server) with your UCAMS/XCAMS credentials.
  2. Click on your name at the top right of the window. This should present a few options such as "API Access", "Settings", and "Log out".
  3. Select "API Access".
  4. Follow the sections below to get both the API key and the encrypted password(s).

1. API Key

An API Key is an encrypted string that substitutes for your username and password when you authenticate to DataFlow. In other words, it is how you tell DataFlow who is contacting it. Unlike a username and password, this string is encrypted and means nothing outside the scope of DataFlow. However, anyone who obtains your API Key can view datasets you created using DataFlow and change some default configurations. If they also have the encrypted password (more below) for your Globus endpoint, they may be able to upload data to the destination storage as you. Keep the API Key safe, for example in a text file separate from your scripts. Consider limiting the lifetime of the API Key if the script or key might become visible to anyone else after a certain point.

  1. If you don't already have an API Key, create one using the "+ Create API Key" button at the top right of the page.
  2. Select how long you would like the API Key to live.
  3. Copy the key for use with your scripts.
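One way to keep the key in a location separate from your scripts is to read it at run time from an environment variable or a small text file. The sketch below uses only the Python standard library. The variable name DATAFLOW_API_KEY and the file path ~/.dataflow_api_key are hypothetical choices made for this example; DataFlow itself does not define them.

```python
import os
from pathlib import Path


def load_api_key(env_var="DATAFLOW_API_KEY", key_file="~/.dataflow_api_key"):
    """Return the API key from an environment variable or, failing that, a file.

    Both defaults are hypothetical names picked for this sketch;
    they are not defined by DataFlow.
    """
    # Prefer the environment variable when it is set and non-empty.
    key = os.environ.get(env_var, "").strip()
    if key:
        return key

    # Otherwise fall back to a text file stored apart from the scripts.
    path = Path(key_file).expanduser()
    if path.is_file():
        key = path.read_text(encoding="utf-8").strip()
        if key:
            return key

    raise RuntimeError(
        f"No API key found in ${env_var} or in {path}; "
        "create one on the API Access page."
    )
```

A script would then call load_api_key() once at start-up instead of embedding the key as a string literal, so the script can be shared without exposing the key.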

2. Encrypted Password

While the API Key authenticates you to use DataFlow, you still need to authenticate yourself to the data adapter (e.g. Globus) that ultimately copies data to the destination storage solution. For example, you need to specify the appropriate username and password in order to activate Globus endpoints. Since sending passwords as plain text is unsafe, DataFlow allows you to authenticate via an encrypted password instead.

Given that DataFlow currently supports only the Globus data adapter, we recommend encrypting the passwords needed to activate the source and destination Globus endpoints. By default, DataFlow uses ORNL XCAMS / UCAMS credentials to activate both the source (DataFlow server) endpoint and the default destination, CADES Open-Research Home. Thus, encrypting just the XCAMS / UCAMS password is sufficient for most users.

However, if you choose to use an alternate destination file system that uses different (not XCAMS / UCAMS) authentication, you also need to encrypt the password for that endpoint.

To encrypt a password, visit the "API Access" section of DataFlow.

  1. Under the "Encrypt my password" section, enter the password you would like to encrypt.
  2. Click on the "Encrypt" button.
  3. You should see a long string of letters, symbols, and numbers. Copy this string for use in your scripts.

Note: This encryption capability is not suitable for two-factor authentication steps that require temporary / one-time passwords, such as OLCF's passcode that is derived from an RSA token.
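Once you have both the API Key and an encrypted password, a script needs some way to carry them around without leaking them into logs or notebook output. The following is a minimal sketch of one approach, not part of DataFlow or ordflow: the Credentials class, the load_credentials function, and the JSON layout with the keys "api_key" and "encrypted_password" are all hypothetical names introduced for this example.

```python
import json
from dataclasses import dataclass, field
from pathlib import Path


@dataclass(frozen=True)
class Credentials:
    """Hypothetical container for the two strings copied from DataFlow."""

    # repr=False keeps the values out of print() output and tracebacks.
    api_key: str = field(repr=False)
    encrypted_password: str = field(repr=False)


def load_credentials(path):
    """Read both strings from a JSON file kept separate from the script.

    The expected file layout is an assumption of this sketch:
    {"api_key": "...", "encrypted_password": "..."}
    """
    text = Path(path).expanduser().read_text(encoding="utf-8")
    data = json.loads(text)
    return Credentials(
        api_key=data["api_key"].strip(),
        encrypted_password=data["encrypted_password"].strip(),
    )
```

Because both fields are excluded from the generated repr, accidentally printing a Credentials object does not reveal either string. The values remain available through the api_key and encrypted_password attributes.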

Python interface to REST API

To use the Python wrapper to the REST API, users need to:

  1. Set up a Python 3 environment if you do not already have one. One popular method is to download and install miniconda3.
  2. Install the ordflow Python package by typing the following command in your terminal (Mac / Linux) or PowerShell (Windows): pip install ordflow
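After installing, it can be useful to confirm that the package landed in the Python environment you intend to use. The sketch below relies only on the standard library's importlib.metadata (Python 3.8 or later); the helper name installed_version is made up for this example and is not part of ordflow.

```python
from importlib import metadata


def installed_version(package="ordflow"):
    """Return the installed version string of a package, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_version()
    if version is None:
        print("ordflow is not installed here; run: pip install ordflow")
    else:
        print(f"ordflow {version} is installed")
```

If the script reports that ordflow is missing even though pip install succeeded, the pip command probably ran in a different environment than the Python interpreter executing the script.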

API reference

  • REST API - The documentation for the REST API is available by clicking the "API Documentation" button at the top left of the "API Access" page. Consider using your API key to authenticate yourself and try out the REST API calls from the Swagger documentation.
  • Python Interface - The API reference is documented on this page, and a brief how-to guide is available in this Jupyter notebook.