Matlab Online S3 speed: copyfile vs. websave
I benchmarked reads from an S3 bucket in MATLAB Online in two ways: over HTTPS using the object's URL (with websave) and via the S3 URI (with copyfile).
I performed two benchmarks. First, I downloaded the same 750 MB file 10 times with each of websave and copyfile and calculated the average time. Then I downloaded a 119.9 MB file a single time with each method.
The stats:
750MB file read 10 times:
copyfile via s3: 38.4165 MB/sec
websave via https: 71.8340 MB/sec
119.9MB file read once:
copyfile via s3: 9.0168 MB/sec
websave via https: 27.7525 MB/sec
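For anyone who wants to reproduce this, the timing procedure described above can be sketched roughly as follows. This is a minimal sketch, not the exact script used: the bucket name, object key, and the 120-second timeout are hypothetical placeholders, and it assumes the object is reachable both by S3 URI (with any required AWS credentials already configured in the session) and by a public or pre-signed HTTPS URL.

```matlab
% Hypothetical locations of the same object; replace with real values.
s3Uri    = "s3://my-bucket/data/testfile.bin";                      % assumed S3 URI
httpsUrl = "https://my-bucket.s3.amazonaws.com/data/testfile.bin";  % assumed HTTPS URL
nRuns    = 10;                                                      % number of repeated downloads

localFile = fullfile(tempdir, "testfile.bin");
opts  = weboptions("Timeout", 120);   % assumed value; the default timeout is short
tCopy = zeros(nRuns, 1);              % elapsed seconds per copyfile run
tWeb  = zeros(nRuns, 1);              % elapsed seconds per websave run

for k = 1:nRuns
    % Time the download via the S3 URI.
    if isfile(localFile), delete(localFile); end
    tic;
    copyfile(s3Uri, localFile);
    tCopy(k) = toc;

    % Record the file size in MB (10^6 bytes) from the local copy.
    info   = dir(localFile);
    sizeMB = info.bytes / 1e6;

    % Time the download of the same object via HTTPS.
    delete(localFile);
    tic;
    websave(localFile, httpsUrl, opts);
    tWeb(k) = toc;
end

% Throughput computed from the average elapsed time, as in the stats above.
fprintf("copyfile via s3:   %.4f MB/sec\n", sizeMB / mean(tCopy));
fprintf("websave via https: %.4f MB/sec\n", sizeMB / mean(tWeb));
```

Setting nRuns to 1 corresponds to the single-download case. Alternating the two methods inside the same loop, rather than running all of one method first, is a design choice meant to reduce the effect of network conditions drifting over time.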
I was surprised to see that accessing the URL via https was faster than reading directly from the S3 bucket on Matlab Online. I thought Matlab Online ran on AWS and I had expected direct S3 access to be faster.
Can anyone comment on whether this is the expected behavior?
Thanks
Steve
1 Comment
Umar
on 13 Aug 2024
Hi @Steve Van Hooser,
First, I would like to commend you for conducting this benchmarking experiment. The faster download speeds you observed using HTTPS with `websave`, compared to direct S3 access via `copyfile`, are not entirely unexpected given the various factors that influence network performance and protocol efficiency. However, your findings highlight an important aspect of cloud computing: assumptions about infrastructure capabilities do not always translate into the expected performance. This insight can guide future usage and optimization strategies in similar contexts.
Answers (1)
Divyam
on 20 Aug 2024
Hi @Steve Van Hooser, although it seems counter-intuitive that accessing and downloading files via the cloud infrastructure is slower than using an HTTPS request, this is expected behavior: network drives are slower.
MATLAB Online is not entirely hosted on AWS and uses in-house infrastructure, which makes accessing or downloading files from an S3 bucket slower than it would be for a RESTful service or a product that runs entirely on AWS. The added security protocols, the high cloud costs, and the optimization effort that would be needed to improve the AWS functionality also contribute to the slower performance of the "copyfile" function.
The "websave" function uses a RESTful web service to download data from the URL. This is faster than the network-drive counterpart because RESTful web services are lighter and use direct HTTP downloads with simpler network paths.