I would like to move files from Backblaze B2 to Amazon S3. The instructions here say that I should download them to a local directory. However, I am trying to transfer about 180 TB of data, so I would prefer not to download them locally.

I found this post with a similar question, but I was wondering if there was a way to do this using the command line instead of ForkLift.

Thank you

metadaddy (Best Answer)

Yes, you can do this using the AWS CLI. The aws s3 cp command can read from stdin or write to stdout when you pass - in place of a filename, so you can pipe two aws s3 cp commands together to read a file from Backblaze B2 and write it to Amazon S3 without the data ever touching the local disk.

First, configure two AWS profiles from the command line - one for B2 and the other for AWS. aws configure will prompt you for the credentials for each account:

% aws configure --profile b2
% aws configure --profile aws
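
When aws configure prompts you for the b2 profile, enter your Backblaze application key ID as the Access Key ID and the application key itself as the Secret Access Key. A sketch of the interaction (the prompt wording may vary slightly between CLI versions, and the region is just an example):

% aws configure --profile b2
AWS Access Key ID [None]: <Your B2 application key ID>
AWS Secret Access Key [None]: <Your B2 application key>
Default region name [None]: us-west-004
Default output format [None]: json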

After you run aws configure, edit the AWS config file (~/.aws/config on Mac and Linux, C:\Users\USERNAME\.aws\config on Windows) and add a value for endpoint_url to the b2 profile. This saves you from having to specify the --endpoint-url option every time you run aws s3 with the b2 profile.

For example, if your B2 region were us-west-004 and your AWS region were us-west-1, you would edit your config file to look like this:

[profile b2]
region = us-west-004
endpoint_url = https://s3.us-west-004.backblazeb2.com

[profile aws]
region = us-west-1
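
If you'd rather not edit the config file, the same thing can be done by passing the endpoint explicitly on each command. A quick sketch, assuming the same us-west-004 region and a placeholder bucket name:

% aws --profile b2 --endpoint-url https://s3.us-west-004.backblazeb2.com \
    s3 ls s3://<Your Backblaze bucket name>/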

Now you can specify the profiles in the two aws s3 cp commands:

aws --profile b2 s3 cp s3://<Your Backblaze bucket name>/filename.ext - \
| aws --profile aws s3 cp - s3://<Your AWS bucket name>/filename.ext

It's easy to run a quick test on a single file:

# Write a file to Backblaze B2
% echo 'Hello world!' | \
aws --profile b2 s3 cp - s3://metadaddy-b2/hello.txt

# Copy file from Backblaze B2 to Amazon S3
% aws --profile b2 s3 cp s3://metadaddy-b2/hello.txt - \
| aws --profile aws s3 cp - s3://metadaddy-s3/hello.txt

# Read the file from Amazon S3
% aws --profile aws s3 cp s3://metadaddy-s3/hello.txt -
Hello world!

One wrinkle is that, if the file is more than 50 GB, you will need to use the --expected-size argument to specify the file size so that the cp command can split the stream into parts for a large file upload. From the AWS CLI docs:

--expected-size (string) This argument specifies the expected size of a stream in terms of bytes. Note that this argument is needed only when a stream is being uploaded to s3 and the size is larger than 50GB. Failure to include this argument under these conditions may result in a failed upload due to too many parts in upload.
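
If you don't already know a file's size, you can look it up on B2 with s3api head-object and feed the result to --expected-size. A sketch using the same example buckets and a hypothetical bigfile.bin object (the size shown is made up):

# Look up the object's size in bytes on Backblaze B2 (example output)
% aws --profile b2 s3api head-object \
    --bucket metadaddy-b2 --key bigfile.bin \
    --query ContentLength --output text
107374182400

# Stream it to Amazon S3, telling cp how large the stream will be
% aws --profile b2 s3 cp s3://metadaddy-b2/bigfile.bin - \
    | aws --profile aws s3 cp - s3://metadaddy-s3/bigfile.bin --expected-size 107374182400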

Here's a one-liner that copies the contents of a bucket on B2 to a bucket on S3, outputting the filename (object key) and size of each file. It assumes you've set up the profiles as above.

aws --profile b2 s3api list-objects-v2 --bucket metadaddy-b2 \
| jq '.Contents[] | .Key, .Size' \
| xargs -n2 sh -c 'echo "Copying \"$1\" ($2 bytes)"; \
    aws --profile b2 s3 cp "s3://metadaddy-b2/$1" - \
    | aws --profile aws s3 cp - "s3://metadaddy-s3/$1" --expected-size $2' sh
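
Once the bulk copy finishes, one quick sanity check is to compare the two buckets; the Total Objects and Total Size lines at the end of each listing should match (again using the example bucket names):

% aws --profile b2 s3 ls s3://metadaddy-b2 --recursive --summarize
% aws --profile aws s3 ls s3://metadaddy-s3 --recursive --summarize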

Although this technique does not hit the local disk, the data still has to flow from B2 to wherever this script is running, then to S3. As @Mark B mentioned in his answer, run the script on an EC2 instance for best performance.