Problems accessing publicly available Google Cloud Storage buckets. #36

Open
Mobius1D opened this issue Jul 30, 2021 · 1 comment

Hello everyone,

I am a contributor at ReinforcementLearning.jl, and we are working on enhancing the support for Offline Reinforcement Learning. One of the main goals is to make publicly available datasets readily accessible natively, and for that purpose I am creating a subpackage called RLDatasets.jl.

Some of the datasets are publicly available in GCS, but I am not sure how to access public GCS buckets using the given API.

For instance, we would be accessing this dataset GCP bucket. Since I am relatively new to Google Cloud Storage, it would be great if someone could shed light on this using that example.

Thanks in advance.

mhudecheck commented Nov 20, 2021

Hi Mobius1D,

This is already possible. As long as you have permissions to access the bucket, you should be able to use GoogleCloud.jl the same way you would with a private bucket. See Sentinel.jl for an example.

using GoogleCloud
using JSON

# Set Credentials - `credentials` is the path to your service account JSON key file
creds = JSONCredentials(credentials)
session = GoogleSession(creds, ["devstorage.full_control"])
set_session!(storage, session)

# Set Bucket - We'll use Google's Sentinel 2 repository for now. See Sentinel.jl for how this works in action.
bucketName = "gcp-public-data-sentinel-2"

# Get File List - You can set prefix = "folder/.../..." if you only want to retrieve files under a directory
rawFileList = GoogleCloud.storage(:Object, :list, bucketName; prefix="") 
io = IOBuffer()
write(io, rawFileList)
fileList = String(take!(io))
fileList = JSON.parse(fileList)

You can then download individual files by iterating through the file list, reading each entry's file["name"] key (if I remember correctly), and passing it to GoogleCloud.storage(:Object, :get, bucketName, file["name"]). The IO handling is the same as above.
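
The download loop described above could be sketched roughly as follows. This is only an illustration under a few assumptions: `object_names` and `download_objects` are hypothetical helpers (they are not part of GoogleCloud.jl), the parsed listing is assumed to follow the GCS JSON API layout with objects under an "items" key, and the fetch call is assumed to return bytes or a string that `write` accepts.

```julia
# Hypothetical helper, not part of GoogleCloud.jl.
# `listing` is the parsed list response (fileList above). Assumes the GCS JSON
# API layout, where matching objects sit under the "items" key; the key is
# treated as optional so an empty result yields an empty list.
object_names(listing::AbstractDict) =
    [item["name"] for item in get(listing, "items", [])]

# Hypothetical helper: save every listed object under `dest`, mirroring the
# bucket's folder layout. `fetch` maps an object name to its contents
# (assumed to be bytes or a string).
function download_objects(fetch, listing::AbstractDict, dest::AbstractString)
    paths = String[]
    for name in object_names(listing)
        endswith(name, "/") && continue          # skip "folder" placeholder objects
        path = joinpath(dest, split(name, "/")...)
        mkpath(dirname(path))                    # create intermediate directories
        write(path, fetch(name))
        push!(paths, path)
    end
    return paths
end
```

With the session from above in place, the call might look like `download_objects(name -> GoogleCloud.storage(:Object, :get, bucketName, name), fileList, "data")`, where "data" is an arbitrary local directory. Passing the fetch step in as a function keeps the loop independent of the exact return type of the storage call, which I have not verified. Note also that the GCS list endpoint is paginated, so one listing only covers the first page of a large bucket; a narrow prefix keeps the result manageable.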
