Unfortunately, I believe that's the only way. We have to re-read and re-hash the files to verify that they exist in the repo (we assume they may have changed, although we could probably relax that constraint for the filestore). Note that we won't (or at least shouldn't) actually write them to the repo again.
One way to avoid this would be to use MFS and add the files one by one. That is:
FROM="$1" # local directory
TO="$2"   # directory in MFS
find "$FROM" \( -type f -readable -o -type d -readable -executable \) -print0 |
while IFS= read -r -d '' fname; do
    if [[ -d "$fname" ]]; then
        ipfs files mkdir -p -- "$TO/$fname"
    elif [[ -f "$fname" ]]; then
        if ! ipfs files ls "$TO/$fname" >/dev/null 2>&1; then
            # will be pinned in the next command (you should probably disable GC)
            cid="$(ipfs add --pin=false --local -q "$fname")"
            ipfs files cp -- "/ipfs/$cid" "$TO/$fname"
        fi
    else
        echo "not a file or directory: $fname" >&2
    fi
done
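To run it (assuming the daemon is up and the script is saved as, say, import.sh, a hypothetical name), importing ./photos into /imported in MFS would look like:

./import.sh ./photos /imported

Note that find's output keeps the $FROM prefix, so files end up under "$TO/$fname", i.e. nested below $TO along their original relative paths.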
Note that the script above is rather slow... a better one would list the local directory you want to import and the target MFS directory, find the diff, and then add the missing files in batches, as sketched below. However, writing that script properly is a bit of an endeavor.
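For illustration only, here's a rough sketch of that batched approach for a flat directory. It assumes bash 4+ (for mapfile), filenames without newlines, that "ipfs add" prints one "added <cid> <name>" line per file, and a batch size of 100 picked arbitrarily; "ipfs files ls" is not recursive, so subdirectories would need extra handling:

FROM="$1" # local directory (flat; no subdirectories)
TO="$2"   # directory in MFS

ipfs files mkdir -p -- "$TO"

# diff: files present on disk but missing from MFS
mapfile -t missing < <(comm -23 \
    <(cd "$FROM" && find . -maxdepth 1 -type f | sed 's|^\./||' | sort) \
    <(ipfs files ls "$TO" | sort))

# add in batches of 100, then copy each new CID into MFS;
# "ipfs add" prints "added <cid> <name>" for every file it adds
for ((i = 0; i < ${#missing[@]}; i += 100)); do
    batch=("${missing[@]:i:100}")
    ipfs add --pin=false "${batch[@]/#/$FROM/}" |
    while read -r _ cid name; do
        ipfs files cp -- "/ipfs/$cid" "$TO/$name"
    done
done

Batching the adds amortizes the per-invocation startup cost of the ipfs CLI, which dominates when you call it once per file.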
I've opened an issue to discuss adding a command for this to ipfs:
FYI, the next release should make this a bit better. We figured out why adding large datasets causes problems with go-ipfs (we had a leak that has been fixed).