Hi HN, I’m Sergey, the author of pos3.
I built this because I hit a wall with ML pipelines: I needed to feed S3 data into libraries that only understand local paths (like OpenCV's imread, pandas, or PyTorch), and I didn't want to rewrite all my I/O code around boto3 or s3fs.
Unlike s3fs, which mounts S3 as a virtual filesystem (often slow for heavy random access), pos3 mirrors the specific data you need to a local cache before your code block runs, so your script reads it at native disk speed.
It handles the diffing/syncing automatically using a context manager:
---
import pos3, pandas as pd
with pos3.mirror():
---
It's open source (Apache 2.0). I’d love to hear your feedback, or whether you've solved this differently!
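
For anyone curious what "mirror first, then run" looks like in practice, here is a minimal sketch of the general pattern. It is not pos3's implementation or API. A plain local directory stands in for the S3 prefix, and `sync_tree`, `local_mirror`, and `demo` are hypothetical names made up for this illustration.

```python
import contextlib
import filecmp
import shutil
import tempfile
from pathlib import Path


def sync_tree(src: Path, dst: Path) -> int:
    """Copy files from src into dst if missing or changed; return how many were copied."""
    copied = 0
    for src_file in src.rglob("*"):
        if not src_file.is_file():
            continue
        dst_file = dst / src_file.relative_to(src)
        # Assumption: a stat/content comparison stands in for whatever diff
        # a real tool would run against S3 object metadata.
        if dst_file.exists() and filecmp.cmp(src_file, dst_file, shallow=True):
            continue
        dst_file.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src_file, dst_file)  # copy2 keeps mtime, so an unchanged file is skipped next time
        copied += 1
    return copied


@contextlib.contextmanager
def local_mirror(remote: Path, cache: Path):
    """Sync remote into cache before the block runs, then hand back the local path."""
    cache.mkdir(parents=True, exist_ok=True)
    sync_tree(remote, cache)
    yield cache


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        remote = Path(tmp) / "remote"  # stand-in for an S3 prefix
        cache = Path(tmp) / "cache"
        (remote / "images").mkdir(parents=True)
        (remote / "images" / "a.txt").write_text("hello")

        first = sync_tree(remote, cache)   # cold cache: the file is copied
        second = sync_tree(remote, cache)  # warm cache: nothing to copy

        # Inside the block, code only ever sees ordinary local paths.
        with local_mirror(remote, cache) as local:
            text = (local / "images" / "a.txt").read_text()
        return first, second, text


if __name__ == "__main__":
    print(demo())
```

The point of the sketch is the ordering: all syncing happens on entry to the `with` block, so anything inside it can use normal file I/O with no S3-aware code.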