In the AppImage project, we provide infrastructure for distributing desktop Linux applications inside squashfs files, bundled with all the dependencies that cannot be assumed to be part of each target system. In a way, an AppImage is very similar to a Linux Live ISO, but instead of holding an entire operating system, it holds a single application together with the resources that application needs to run.
We would like to make use of a distributed model for, well, distributing the AppImages, but the question is how to do so in an efficient way.
Technically, a type-2 AppImage file is a squashfs filesystem appended to a small ELF executable that FUSE-mounts the squashfs filesystem and executes the payload application from there.
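To make that layout concrete, here is a rough Python sketch of how one could find where the squashfs payload begins. It assumes the runtime ELF ends at the end of its section header table (`e_shoff + e_shentsize * e_shnum`) and that the squashfs image follows immediately and starts with the little-endian magic `hsqs`. The function names `elf_size` and `payload_offset` are made up for this illustration and are not part of any AppImage tool.

```python
import struct

SQUASHFS_MAGIC = b"hsqs"  # on-disk magic of a little-endian squashfs superblock


def elf_size(header: bytes) -> int:
    """Estimate the ELF size as the end of its section header table.

    Assumption: the section header table is the last thing in the ELF,
    which holds for typical linker output but is not guaranteed.
    """
    if header[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    is_64bit = header[4] == 2                 # EI_CLASS: 1 = 32-bit, 2 = 64-bit
    endian = "<" if header[5] == 1 else ">"   # EI_DATA: 1 = little, 2 = big
    if is_64bit:
        (e_shoff,) = struct.unpack_from(endian + "Q", header, 0x28)
        e_shentsize, e_shnum = struct.unpack_from(endian + "HH", header, 0x3A)
    else:
        (e_shoff,) = struct.unpack_from(endian + "I", header, 0x20)
        e_shentsize, e_shnum = struct.unpack_from(endian + "HH", header, 0x2E)
    return e_shoff + e_shentsize * e_shnum


def payload_offset(data: bytes) -> int:
    """Return the offset at which the appended squashfs image starts."""
    offset = elf_size(data[:64])
    if data[offset:offset + 4] != SQUASHFS_MAGIC:
        raise ValueError("no squashfs superblock found right after the ELF")
    return offset
```

With the offset known, the ELF runtime and the squashfs payload could be stored or chunked separately, for example `payload_offset(open(path, "rb").read())` for some AppImage at `path`.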
Every Qt application, for example, will ship the same Qt libraries. If we put the AppImages on IPFS as-is, there is probably little to no deduplication going on, and a cluster storing the AppImages would store the same Qt libraries many times over.
With content-aware chunking (e.g., chopping the compressed squashfs after each file), I would imagine much more deduplication to be possible, resulting in much lower storage and data transfer volume. squashfs supports several compression schemes, such as gzip, xz, and zstd.
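A toy Python sketch of the effect I have in mind: two bundles share one library, and we compare how many bytes a content-addressed store would keep with fixed-size chunks versus one chunk per file. Everything here is an assumption for illustration: the file names and sizes are invented, the contents are pseudo-random filler, and the "image" is a plain uncompressed concatenation. Whether a real squashfs stores a given file as identical bytes in two different images depends on the compressor settings and on how small files are packed together, so this only shows the principle.

```python
import hashlib


def fake_bytes(seed: str, n: int) -> bytes:
    """Deterministic pseudo-random filler standing in for real file contents."""
    out = bytearray()
    counter = 0
    while len(out) < n:
        out += hashlib.sha256(f"{seed}:{counter}".encode()).digest()
        counter += 1
    return bytes(out[:n])


def fixed_chunks(files: dict[str, bytes], size: int) -> list[bytes]:
    """Concatenate the files into one image, then cut every `size` bytes."""
    image = b"".join(files.values())
    return [image[i:i + size] for i in range(0, len(image), size)]


def per_file_chunks(files: dict[str, bytes]) -> list[bytes]:
    """Cut the image exactly at file boundaries: one chunk per file."""
    return list(files.values())


def stored_bytes(chunk_lists: list[list[bytes]]) -> int:
    """Bytes a content-addressed store keeps: each distinct chunk only once."""
    unique = {
        hashlib.sha256(chunk).digest(): len(chunk)
        for chunks in chunk_lists
        for chunk in chunks
    }
    return sum(unique.values())


# Two hypothetical bundles that ship the same library after differently sized binaries.
qt = fake_bytes("qt", 20000)
app_a = {"bin/app_a": fake_bytes("a", 1000), "lib/libQt5Core.so.5": qt}
app_b = {"bin/app_b": fake_bytes("b", 1500), "lib/libQt5Core.so.5": qt}

fixed = stored_bytes([fixed_chunks(app_a, 256), fixed_chunks(app_b, 256)])
per_file = stored_bytes([per_file_chunks(app_a), per_file_chunks(app_b)])
print(f"fixed-size chunks: {fixed} bytes stored")
print(f"per-file chunks:   {per_file} bytes stored")
```

Because the shared library sits at a different offset in each image, the fixed-size chunks never line up and the library is stored twice, while the per-file split stores it once.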
For AppImage, we are not married to squashfs. If there is a more suitable read-only compressed filesystem that can be FUSE-mounted and read with great performance, we could also switch to something else.
What would you recommend we do?
(Similar earlier question: https://discuss.ipfs.io/t/what-is-better-large-containers-or-large-sets-of-files/263/3)