Depot Object User Guide
Depot Object storage is a high-capacity, fast, reliable, and secure object storage service designed, configured, and operated for the needs of Purdue researchers in any field, and shareable with both on-campus and off-campus collaborators.
Depot Object Overview
As with the community clusters, research labs will be able to easily purchase capacity in the Depot Object Store through the Data Depot Purchase page on this site. For more information, please contact us.
Depot Object Features
The Depot Object Store offers research groups in need of centralized object storage unique features and benefits:
- Available
To any Purdue research group, for purchase in increments of 1 TB at a competitive annual price. You may also request a 100 GB trial space free of charge. Participation in the Community Cluster program is not required.
- Accessible
Access to the Depot Object Store is via S3-compatible APIs. Applications requiring POSIX filesystem access should continue to use the Data Depot Filesystem.
- Capable
The Depot Object Store facilitates joint work on shared files across your research group, avoiding the need for numerous copies of datasets across individuals' home or scratch directories. It is an ideal place to store group datasets, models, or raw data.
- Controllable Access
Access management is under your direct control. Unix groups can be created for your group, and staff can assist you in setting appropriate permissions to allow exactly the access you want and prevent any you do not. You can manage who has access through a simple web application, the same one used to manage access to Community Cluster queues.
- Data Retention
All data kept in the Depot Object Store remains owned by the research group's lead faculty. When researchers or students leave your group, any files left in their home directories may become difficult to recover. Files kept in the Depot Object Store remain with the research group, unaffected by turnover, and could head off potentially difficult disputes.
- Never Purged
Depot Object is never subject to purging.
- Reliable
Depot Object is redundant and protected against hardware failures by Ceph replication and erasure coding.
- Restricted Data
Depot Object is suitable for non-HIPAA human subjects data. See the Data Depot FAQ for a data security statement for your IRB documentation. The Data Depot is not approved for regulated data, including HIPAA, ePHI, FISMA, or ITAR data.
Depot Object Hardware Details
The Depot Object Store is built on a high-performance Ceph storage solution with an initial total capacity of over 4 PB. This storage is redundant and reliable, with APIs accessible from any Purdue network.
Managing Buckets and Objects
Accessing Depot Object Storage
The S3 endpoint provided by Depot Object can be accessed in multiple ways. Two popular options for interacting with S3 storage via the command line and GUI are listed below.
Endpoint: s3.rcac.purdue.edu
s3cmd User Guide
s3cmd is a free command line tool for managing data in S3 compatible storage resources that works on Linux and Mac. This section provides a basic overview of using s3cmd to manage Depot Object storage.
Installation
To use s3cmd, first ensure you have it installed on your system. You can install it via pip:
pip install s3cmd
Authentication
Before using s3cmd to interact with your S3 storage, you need to configure your .s3cfg file.
The s3cmd configuration file should have the following format. Access keys and secret keys can be obtained via rcac-help@purdue.edu.
[default]
host_base = s3.rcac.purdue.edu
host_bucket = s3.rcac.purdue.edu
access_key = <your access key>
secret_key = <your secret key>
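One way to create this file from the shell is sketched below. It assumes the default s3cmd configuration location of `~/.s3cfg`; the function name `write_s3cfg` and the `S3CFG_PATH` variable are hypothetical names introduced for illustration. Because the file holds your secret key, the sketch restricts it to owner-only access. Note that it overwrites any existing file at that path.

```shell
# Sketch only: write_s3cfg and S3CFG_PATH are illustrative names, not part of s3cmd.
S3CFG_PATH="${S3CFG_PATH:-$HOME/.s3cfg}"

write_s3cfg() {
    # $1 = access key, $2 = secret key (both obtained from RCAC)
    (
        umask 077   # new files are created readable by the owner only
        cat > "$S3CFG_PATH" <<EOF
[default]
host_base = s3.rcac.purdue.edu
host_bucket = s3.rcac.purdue.edu
access_key = $1
secret_key = $2
EOF
    )
    # Also tighten permissions in case the file already existed
    chmod 600 "$S3CFG_PATH"
}
```

Calling `write_s3cfg <your access key> <your secret key>` writes the file. Running `s3cmd --configure` is an interactive alternative.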
Basic Commands
- s3cmd ls: Lists all the buckets associated with your account. With a bucket argument (s3cmd ls s3://<bucket>), it lists the objects in that bucket.
- s3cmd sync: Syncs directories on your machine to or from S3.
- s3cmd put: Uploads an object to S3.
- s3cmd get: Downloads an object from S3.
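As an illustration of `sync`, the sketch below previews an upload before performing it. The function names, the `S3CMD` override variable, and the argument layout are assumptions made for this example; `--dry-run` is the s3cmd option that reports what would be transferred without copying anything.

```shell
# Sketch only: function names and the S3CMD override are illustrative.
S3CMD="${S3CMD:-s3cmd}"

preview_upload_sync() {
    # $1 = local directory, $2 = bucket name, $3 = prefix inside the bucket
    # --dry-run lists what would be transferred; nothing is copied.
    "$S3CMD" sync --dry-run "$1" "s3://$2/$3"
}

upload_sync() {
    # Same arguments as above, but actually transfers new or changed files.
    "$S3CMD" sync "$1" "s3://$2/$3"
}
```

For example, `preview_upload_sync ./results/ <bucket> results/` followed by `upload_sync ./results/ <bucket> results/` once the preview looks right.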
Bucket Management
- s3cmd mb s3://<bucket>: Creates a new bucket.
- s3cmd rb s3://<bucket>: Deletes a bucket. A bucket that still contains objects is typically rejected unless you add --recursive, which deletes every object in it as well. Deletion is irreversible, so use with caution.
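Because bucket removal cannot be undone, it can help to confirm that a bucket is empty before removing it. The following is a minimal sketch of that check, not an official workflow; `remove_bucket_if_empty` and the `S3CMD` override variable are hypothetical names, and the sketch assumes that `s3cmd ls s3://<bucket>` prints nothing for an empty bucket.

```shell
# Sketch only: remove_bucket_if_empty and S3CMD are illustrative names.
S3CMD="${S3CMD:-s3cmd}"

remove_bucket_if_empty() {
    # $1 = bucket name (without the s3:// prefix)
    bucket="$1"
    # List the bucket; give up with status 2 if the listing itself fails.
    contents="$("$S3CMD" ls "s3://$bucket")" || return 2
    if [ -n "$contents" ]; then
        echo "Refusing to remove s3://$bucket: it still contains objects" >&2
        return 1
    fi
    "$S3CMD" rb "s3://$bucket"
}
```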
Object Management
- s3cmd put <file> s3://<bucket>/<object>: Uploads a local file to your bucket.
- s3cmd get s3://<bucket>/<object> <file>: Downloads an object from S3. Specify the bucket and object name, and optionally a local destination path.
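The two commands can be wrapped as small helpers, sketched below. The function names, the argument order, and the `S3CMD` override variable are assumptions made for this example. By default, `s3cmd get` declines to overwrite an existing local file, so download to a new path or remove the old copy first.

```shell
# Sketch only: function names and the S3CMD override are illustrative.
S3CMD="${S3CMD:-s3cmd}"

upload_file() {
    # $1 = local file, $2 = bucket name, $3 = object key
    "$S3CMD" put "$1" "s3://$2/$3"
}

download_file() {
    # $1 = bucket name, $2 = object key, $3 = local destination path
    "$S3CMD" get "s3://$1/$2" "$3"
}
```

For example, `upload_file data.csv <bucket> raw/data.csv` stores the file under the key `raw/data.csv`, and `download_file <bucket> raw/data.csv copy.csv` retrieves it.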
More Information
For detailed usage and options, run `s3cmd --help` from your terminal/command prompt.
- Download: https://s3tools.org/download
- How-To Documentation: https://s3tools.org/s3cmd-howto
Cyberduck
Cyberduck is a free server and cloud storage browser that can be used on Windows and Mac.
- Launch Cyberduck.
- Click + Open Connection at the top of the UI.
- Select S3 from the dropdown menu.
- Fill in the Server, Access Key ID, and Secret Access Key fields.
- Click Connect.
- You can now right-click to bring up a menu of actions that can be performed against the storage endpoint.
Further information about using Cyberduck can be found on the Cyberduck documentation site.