How many CIDs can be direct children of a CID?

How many files or directories can be placed as direct children of a CID?
Is it possible to have thousands, millions, or billions of CIDs as first-level children of another CID?

There is a limit unless you enable sharding: go-ipfs/experimental-features.md at master · ipfs/go-ipfs · GitHub

The limit is reached when the directory block grows larger than 4MB (ipfs block stat <cid> tells you how big a block is). Such blocks can still be created locally, but they will not move around the network: no single block bigger than 4MB can be transferred to other nodes.


And does this sharding not introduce other problems, like IPNS, which is a solution but an unusably slow one?
Wouldn't you recommend a hierarchical folder layout in which the user avoids giving a CID more than a certain number of children, instead of using sharding, which, since it is not active by default, I assume puts some sort of burden on the system?

The main thing is that with sharding enabled the CIDs of the generated folders are different. That is not really a burden on the system; it is just a different way of building the DAGs.

The reasons why it is not active by default are in the doc I linked, the main one being that it uses the new format even when not needed (I think).

If you can live without enabling sharding then do, but it is there for the times when having folders with lots of links becomes a hard requirement.

If we're talking about not just more than 4MB but potentially millions, billions, or trillions of children, is sharding still good, or is another strategy needed?
And is there an estimate of how many children add up to about 4MB?

If we're talking about not just more than 4MB but potentially millions, billions, or trillions of children, is sharding still good, or is another strategy needed?

The difference between sharding and not sharding is that sharding makes the needed hierarchical structure transparent. In both cases there is a hierarchical structure. The same kind of issues will appear when you list the items in a sharded folder containing a trillion entries as when you recursively list a trillion items spread across a manually created hierarchy of folders.

I don’t know other approaches, other than not having a trillion objects.

And is there an estimate of how many children add up to about 4MB?

A ballpark would be 4MB / 34 bytes ≈ 100k directories? Please double-check.
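The ballpark above is easy to reproduce. Note that the ~34 bytes per entry is an assumption (roughly the size of one CID); a real dag-pb link entry also stores the child's name and size, so entries are typically somewhat larger and the real number of links per block somewhat smaller:

```python
# Rough estimate of how many directory links fit inside a 4 MiB block,
# assuming ~34 bytes per link entry (an assumption: about the size of a
# CID alone; real dag-pb links also carry a name and a size field).
BLOCK_LIMIT = 4 * 1024 * 1024   # the 4 MiB block-size limit
BYTES_PER_LINK = 34             # assumed bytes per link entry

links = BLOCK_LIMIT // BYTES_PER_LINK
print(links)  # → 123361, i.e. on the order of the ~100k ballpark above
```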


Any estimate of how many folders before things get weird and unusable (using sharding)?

I have no idea; it depends directly on machine specs, type of storage, etc.


And does this 4MB limit, or any other limit, apply to the number of objects one can place in an IPFS node's root?

Yes. By default, when adding a file to IPFS, it gets chunked into 256KB pieces. If those pieces were larger than 4MB, they would not be able to move around the network. This applies to any raw object (i.e. something manually created with ipfs dag put etc.).
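The size-based chunking described here can be sketched in a few lines. This is only an illustration of the default fixed-size strategy (IPFS also offers content-defined chunkers such as rabin, which this sketch does not model):

```python
def chunk(data: bytes, size: int = 256 * 1024) -> list[bytes]:
    """Split a byte string into fixed-size pieces, like the default
    size-based chunker; only the last piece may be smaller."""
    return [data[i:i + size] for i in range(0, len(data), size)]

# A 600 KiB payload becomes two full 256 KiB chunks plus an 88 KiB tail.
pieces = chunk(b"x" * (600 * 1024))
print([len(p) for p in pieces])  # → [262144, 262144, 90112]
```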

100K × 256KB = 25.6GB
That means a file bigger than ~25.6 GB could hit this problem.
But my question is about the number of objects at the root of an IPFS node, not the number of chunks in a CID.
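As a quick sanity check of the arithmetic: the 25.6 GB figure uses decimal units (256 KB = 256,000 bytes), while the actual default chunk size is 256 KiB (262,144 bytes), which pushes the figure slightly higher:

```python
LINKS = 100_000                  # ballpark number of links per block
CHUNK_KB = 256 * 1000            # 256 KB (decimal), as used in the post
CHUNK_KIB = 256 * 1024           # 256 KiB, the actual default chunk size

print(LINKS * CHUNK_KB / 1e9)    # → 25.6  (GB, the figure quoted above)
print(LINKS * CHUNK_KIB / 1e9)   # → 26.2144  (GB with binary-sized chunks)
```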

Is this a good method to avoid the 4MB limit problem:

instead of :
/3463901
have this :
/3/4/6/3/9/0/1/=
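The manual layout above can be sketched as splitting an identifier into one directory level per character, so no single directory accumulates too many children (shard_path is a hypothetical helper, and the trailing "=" marker is taken from the example):

```python
def shard_path(name: str) -> str:
    """Turn an identifier into a nested path, one directory per character,
    ending in a leaf marker, so fan-out per directory stays small."""
    return "/" + "/".join(name) + "/="

print(shard_path("3463901"))  # → /3/4/6/3/9/0/1/=
```

With a 10-character alphabet this caps each directory at 10 subdirectories, trading directory width for depth.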

Note that you can make your chunks bigger too (up to 4MB).

That is essentially what directory sharding does in a way that is transparent to the user.


So how would you compare this "transparent to the user" sharding to the sharding IPFS does itself?
Especially from a performance point of view: is it the same, better, or worse?