I’d take a look at https://docs.ipfs.io/concepts/how-ipfs-works/#distributed-hash-tables-dhts and some of the great videos from IPFS Camp https://www.youtube.com/playlist?list=PLuhRWgmPaHtSsHMhjeWpfOzr8tonPaePu (the "Lifecycle of Data in DWeb" IPFS Camp workshop may be particularly useful to you).
The most important thing to note is that IPFS is not free permanent data storage. At a high level, the DHT is a bunch of peers that have volunteered to provide some service to the network, and keeping the resource utilization of that volunteer service low is important because it encourages more peers to participate. If you could just ask random people on the internet to store 1TB of data for you, that would likely lead to serious problems with both adoption (I don’t have the disk space, bandwidth, or motivation to store TBs of data for random people online) and legality (I’m not a lawyer and cannot provide legal advice, but storing/providing illegal content just because a random person online asked me to seems like bad news).
So what does the DHT do? For immutable IPFS data, the DHT handles provider records. These are effectively advertisements where you tell the network “I have file F”. That makes it possible for you to then ask IPFS “get me F”: it will ask the DHT where to find F and then use Bitswap to actually transfer the data from the peers that have advertised having it. Contrast this with the classic web, where if data used to be stored at dnsdomain1.com/myfile and that site went offline, I’d have to use a search engine to look for “myfile” and hope it’s hosted somewhere else. Since IPFS data is content addressed and there’s a shared “advertising” space, the data can be found no matter where it might be.
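To make the provide/find flow concrete, here’s a minimal toy sketch in Python. This is an illustration of the concept only, not the real libp2p Kademlia DHT: the dictionaries, `provide`, and `fetch` names are all hypothetical, and the real network routes records across many peers and uses Bitswap for the actual transfer.

```python
import hashlib

# Toy model: provider records map a content ID to the set of peers
# that have advertised "I have this data".
dht_provider_records = {}   # cid -> set of peer ids
peer_storage = {}           # peer id -> {cid: data}

def cid_for(data: bytes) -> str:
    """Content addressing: the ID is derived from the data itself."""
    return hashlib.sha256(data).hexdigest()

def provide(peer: str, data: bytes) -> str:
    """Peer stores the data locally and advertises it to the DHT."""
    cid = cid_for(data)
    peer_storage.setdefault(peer, {})[cid] = data
    dht_provider_records.setdefault(cid, set()).add(peer)
    return cid

def fetch(cid: str) -> bytes:
    """Ask the DHT who has the data, then get it from any provider
    (the real network uses Bitswap for this last step)."""
    for peer in dht_provider_records.get(cid, set()):
        if cid in peer_storage.get(peer, {}):
            return peer_storage[peer][cid]
    raise KeyError(f"no providers found for {cid}")

cid = provide("peerA", b"hello ipfs")
provide("peerB", b"hello ipfs")        # same data -> same CID, second provider
print(len(dht_provider_records[cid]))  # prints 2: both peers advertised
print(fetch(cid))                      # prints b'hello ipfs'
```

Note how the lookup key comes from the content itself, so it doesn’t matter which peer ends up serving the data.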
Side Note: Bitswap will also just ask the peers you’re connected to whether they have the data, so a DHT search isn’t necessary if you’re already connected to someone who has what you’re looking for.
Side Note: By default, when you download data you also advertise to the network that others can download it from you. This means that if you waited a while (however long a reprovide cycle is in your config file), you’d find that an 11th node could download the data from all 10 pre-existing nodes.
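If you want to check or tune that reprovide behavior, in go-ipfs (kubo) it lives in the node config. A sketch assuming the kubo CLI; check your version’s config docs for the exact keys and defaults:

```shell
# How often your node re-publishes provider records for the data it holds
# ("12h" is the default in recent kubo versions, as I understand it):
ipfs config Reprovider.Interval

# What gets advertised: "all" (default), "pinned", or "roots":
ipfs config Reprovider.Strategy
```

Setting the strategy to "pinned" is one way to stop advertising data you merely cached while browsing.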