How BitTorrent Works

BitTorrent is sort of a download manager. More accurately, BitTorrent is a peer-to-peer file transfer protocol.

(Note: The following is intended for newcomers to the mere idea of BitTorrent. See this page for the actual protocol documentation.)

The problem

Imagine this. Everybody wants to download the latest 100MB patch for Super Game 76. So Super Game 76's makers, Genericasoft, upload the patch to their servers so that all of Super Game 76's 10,000 players can download it.

But Super Game 76 is insanely popular. This means that in a relatively short space of time, Genericasoft's servers are going to have to provide 10,000 copies of that 100MB file. That's gonna strain any server, right? Genericasoft runs out of bandwidth, everybody finds their download to be running horribly slowly (if at all), nobody gets the patch they need, and it's all bad.
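To put a number on that strain, here is a quick back-of-the-envelope sketch using the figures from the example. The function name is made up for illustration, and decimal units (1 TB = 1,000,000 MB) are an assumption.

```python
# Figures from the example above. Decimal units are an assumption.
PATCH_SIZE_MB = 100
PLAYERS = 10_000


def total_upload_mb(file_size_mb: int, downloads: int) -> int:
    """Total data a lone server has to send if it serves every download itself."""
    return file_size_mb * downloads


total_mb = total_upload_mb(PATCH_SIZE_MB, PLAYERS)
print(f"{total_mb:,} MB, about {total_mb / 1_000_000:g} TB")
# prints: 1,000,000 MB, about 1 TB
```

That is roughly a terabyte leaving Genericasoft's servers for one patch.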

Or maybe your site hosts some cool video - say 10MB - for people to download, but then suddenly Slashdot links to you or something, and bam! you have about a hundred thousand people trying to download the file at once, something your server isn't prepared for, causing it to nosedive.

The solution

If a hundred thousand people are downloading at once, that's a whole lot of combined downloading bandwidth. But it's also a whole lot of combined upload bandwidth... none of which is being used for anything. Can't we use that to our advantage? Yes, we can!

Suppose Genericasoft divides their 100MB patch into, say, four hundred smaller chunks. Then, to get the whole patch, you just need to get four hundred chunks instead of one big one. And let's suppose that when people connect to Genericasoft to get their patch, they don't download the chunks sequentially... they just get whatever chunks are available until they have the complete set. And let's suppose - this is the clever bit - that everybody knows which chunks everybody else has.
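The splitting idea can be sketched in a few lines. This is a toy illustration rather than what any real client does internally: the function names are invented here, and the "patch" is a scaled-down 4,000-byte stand-in so that it still comes out as four hundred chunks.

```python
from __future__ import annotations


def split_into_chunks(data: bytes, chunk_size: int) -> list[bytes]:
    """Cut data into consecutive chunks of chunk_size bytes (the last may be shorter)."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]


def reassemble(chunks: dict[int, bytes]) -> bytes:
    """Join chunks by index number, whatever order they arrived in."""
    return b"".join(chunks[i] for i in sorted(chunks))


# Stand-in for the patch: 4,000 bytes cut into 400 chunks of 10 bytes each.
patch = bytes(range(256)) * 15 + bytes(160)
chunks = split_into_chunks(patch, 10)

# Pretend the chunks arrived in reverse order; the index numbers put them right.
received = {i: chunks[i] for i in reversed(range(len(chunks)))}
restored = reassemble(received)
```

Because every chunk carries its index, the order in which the chunks turn up doesn't matter.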

Now you connect to a special Genericasoft server called a "tracker". A tracker is a server dedicated to keeping track of which chunks everybody has, and keeping everybody up to date. (Strictly speaking, a real tracker only keeps a list of who is taking part; the participants then tell each other directly which chunks they hold. The end result is the same, so let's pretend the tracker does all the talking.) So Genericasoft's tracker says to you "Right, I've got all four hundred chunks. Person A over there has chunks 2 and 3 only. Person B has all four hundred chunks. Person C just has chunks 1 to 100. Person D..." until it's told you who has what chunks available. Then, and this is the really clever bit, instead of getting all four hundred chunks from Genericasoft, you get each chunk from whoever happens to have it available. You could get chunk 1 from person C, chunks 2 and 3 from person A, chunk 4 from Genericasoft itself, or whatever. The point is, you don't get all the chunks from Genericasoft. The majority of them, you get from other people who are also downloading at the same time as you are.
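The "get each chunk from whoever happens to have it" step might look something like the sketch below. The function `plan_downloads` and its load-spreading rule are assumptions made up for this illustration; real clients use more refined strategies for choosing what to fetch and from whom.

```python
from __future__ import annotations


def plan_downloads(needed: set[int], holdings: dict[str, set[int]]) -> dict[int, str]:
    """For each chunk we need, pick one participant who has it.

    The participant with the fewest chunks assigned so far wins, which
    spreads the load. Chunks that nobody has are left out of the plan.
    """
    load = {name: 0 for name in holdings}
    plan: dict[int, str] = {}
    for chunk in sorted(needed):
        candidates = [name for name, have in holdings.items() if chunk in have]
        if not candidates:
            continue  # nobody has this chunk right now
        choice = min(candidates, key=lambda name: (load[name], name))
        plan[chunk] = choice
        load[choice] += 1
    return plan


# Who has what, as in the example above.
holdings = {
    "Genericasoft": set(range(1, 401)),
    "A": {2, 3},
    "B": set(range(1, 401)),
    "C": set(range(1, 101)),
}
plan = plan_downloads(set(range(1, 401)), holdings)
from_genericasoft = sum(1 for source in plan.values() if source == "Genericasoft")
```

With these holdings, `from_genericasoft` comes out well below four hundred, which is the whole point.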

The clear advantage here is that Genericasoft saves an awful lot of bandwidth. Since you can download many chunks at once, it also means you can download as fast as your personal internet connection can manage - you aren't limited by whatever connection speed Genericasoft is stuck with.

How to use BitTorrent

The above concepts were invented by Bram Cohen. He named the protocol "BitTorrent", and also came up with the first BitTorrent computer program ("client"), which is also called BitTorrent.

Once you have your client installed, you go to Genericasoft's website. Genericasoft will have set up a small file, of the order of a few dozen kilobytes, called a "torrent", freely available for anybody to download. It'll be called something like "sg76patch.torrent". Instead of downloading the 100MB patch, you download this relatively tiny file instead. The torrent contains all the information about the patch that your client needs to download it: the name and location of the tracker that it needs to connect to, the name of the file it's downloading, the size of the chunks it's been split up into, what order they go in... stuff like that.
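For the curious, here is a rough sketch of what goes into a torrent. None of this detail comes from the explanation above: the "bencoding" format and the key names (`announce`, `info`, `name`, `length`, `piece length`, `pieces`) follow the published BitTorrent specification for a single-file torrent, while the tracker URL, the helper names and the tiny chunk size are invented for illustration. The chunk order is captured by a list of SHA-1 fingerprints, one per chunk, stored in order.

```python
import hashlib


def bencode(value) -> bytes:
    """Encode ints, strings, lists and dicts in the torrent file format."""
    if isinstance(value, bool):
        raise TypeError("bencoding has no boolean type")
    if isinstance(value, int):
        return b"i%de" % value
    if isinstance(value, str):
        value = value.encode("utf-8")
    if isinstance(value, bytes):
        return b"%d:%s" % (len(value), value)
    if isinstance(value, list):
        return b"l" + b"".join(bencode(v) for v in value) + b"e"
    if isinstance(value, dict):
        items = [(k.encode("utf-8") if isinstance(k, str) else k, v)
                 for k, v in value.items()]
        items.sort(key=lambda kv: kv[0])  # keys must appear in sorted order
        return b"d" + b"".join(bencode(k) + bencode(v) for k, v in items) + b"e"
    raise TypeError(f"cannot bencode {type(value).__name__}")


def make_metainfo(name: str, data: bytes, chunk_size: int, tracker_url: str) -> dict:
    """Build the dictionary a single-file torrent holds (illustrative helper)."""
    chunks = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]
    return {
        "announce": tracker_url,                # where the tracker lives
        "info": {
            "name": name,                       # suggested file name
            "length": len(data),                # total size in bytes
            "piece length": chunk_size,         # size of each chunk
            # one 20-byte SHA-1 fingerprint per chunk, in order
            "pieces": b"".join(hashlib.sha1(c).digest() for c in chunks),
        },
    }


# Hypothetical tracker URL and a tiny stand-in for the patch.
meta = make_metainfo("sg76patch.exe", b"x" * 1000, 100,
                     "http://tracker.example.com/announce")
torrent_bytes = bencode(meta)
```

Writing `torrent_bytes` to a file called something like "sg76patch.torrent" would give the general shape of the small file described above.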

Then you double-click on it to open it. .txt files open in Notepad. .doc files open in Word. .torrent files open in BitTorrent! Your BitTorrent client will open up, ask you to select a location to save the patch, and whizz away, finding chunks and downloading them until it's got all four hundred (or however many there are; the number of chunks can vary hugely). As with KaZaA, Direct Connect and so on, all this downloading can be done in the background while you do other things, and could take any amount of time, depending on the size of the patch, how much bandwidth you have, and how many other people are also downloading it at the same time as you. When it's done, it'll stitch the chunks back together and say "I'm done!" Then you can close BitTorrent and get on with installing your patch.

Note that, unlike KaZaA, eMule or a Direct Connect hub, BitTorrent is not in any sense searchable. You can't just run it and type "Super Game 76" to find the patch you're after. Instead, you have to go out there on the big wide internet and find the torrent you're after manually.

BitTorrent etiquette

As you may have figured out, if you are downloading chunks from other people using BitTorrent, then they must be uploading chunks to you. Similarly, other people will be downloading chunks from you. If you are downloading, you must upload! Otherwise, the entire exercise is pointless, and Genericasoft might as well serve every patch individually all on their own. All BitTorrent clients will force you to upload as well as download. Most of them will also keep a record of how much you've uploaded compared to how much you've downloaded (your share ratio). Ideally, to keep the universe in karmic balance as it were (and to preserve the BitTorrent network), you should upload as much as you download; i.e. your share ratio should ultimately be 1.00 or higher.
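The share ratio is just a division, as in this small sketch. The function names are invented here, and what to report before anything has been downloaded is an assumption; clients differ on that.

```python
import math


def share_ratio(uploaded_bytes: int, downloaded_bytes: int) -> float:
    """Uploaded divided by downloaded."""
    if downloaded_bytes == 0:
        # Assumption: nothing downloaded yet means "infinite" if we've uploaded at all.
        return float("inf") if uploaded_bytes > 0 else 0.0
    return uploaded_bytes / downloaded_bytes


def bytes_left_to_upload(uploaded_bytes: int, downloaded_bytes: int,
                         target: float = 1.0) -> int:
    """How much more to upload before the share ratio reaches the target."""
    return max(0, math.ceil(target * downloaded_bytes) - uploaded_bytes)


# Downloaded the whole 100MB patch, uploaded 60MB so far.
MB = 1_000_000
ratio = share_ratio(60 * MB, 100 * MB)               # 0.6
remaining = bytes_left_to_upload(60 * MB, 100 * MB)  # 40,000,000 bytes to go
```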

People who have all of the chunks no longer need to download anything, so they can quit out. However, you can leave BitTorrent open and stick around anyway, letting people download anything they need from you. If you do this you are called a "seed". Without at least one seed, it should be obvious that any BitTorrent network will collapse, because sooner or later there will be a chunk which nobody has, meaning nobody can finish their download. In this example, Genericasoft's server would probably be a seed, but if you're feeling nice, you can seed too. The chances are that when your download has finished, your share ratio will be less than 1.00, so in this case you should seed until you reach 1.00 anyway.
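Why a missing seed is fatal can be shown with a small sketch. The helper names are made up for illustration; the holdings reuse persons A and C from the earlier example.

```python
from __future__ import annotations


def is_seed(have: set[int], total_chunks: int) -> bool:
    """A seed is anybody who holds every chunk."""
    return have >= set(range(1, total_chunks + 1))


def missing_chunks(total_chunks: int, holdings: dict[str, set[int]]) -> set[int]:
    """Chunks that nobody currently connected can supply."""
    available = set().union(*holdings.values())
    return set(range(1, total_chunks + 1)) - available


with_seed = {
    "Genericasoft": set(range(1, 401)),
    "A": {2, 3},
    "C": set(range(1, 101)),
}
without_seed = {name: have for name, have in with_seed.items()
                if not is_seed(have, 400)}

stuck = missing_chunks(400, without_seed)  # chunks 101 to 400: nobody has them
fine = missing_chunks(400, with_seed)      # empty set: everything is obtainable
```

Once the only seed leaves, three hundred chunks have no source at all, so nobody left can finish.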

Not that there's any reason you should stop there. In fact, you're entirely free to keep seeding for as long as you like. You can even open up the torrent again at a later date, after you're all done, and just do a little more seeding on general principle.

Different BitTorrent clients

Cohen's original client is very simple, which also means it's short on features. Fortunately, BitTorrent is an open-source program, meaning that any programmer can take a look at Cohen's code, and add bells and whistles of his own. This has resulted in a bundle of other BitTorrent clients being made independently. My preferred client is µTorrent (probably pronounced "microtorrent"), which is much more powerful and versatile than the original BitTorrent client, and designed specifically to be as small as possible (it's just over a hundred kilobytes!). There are many more out there though.

How the illegal downloading scene works

We've established that BitTorrent is a good way to get content to lots of people in a short space of time without expending huge amounts of bandwidth. What we also find is that BitTorrent is a great way to distribute copyrighted material.

It works like this. Instead of being something legal (a game patch or the latest Linux distribution), the file being distributed can easily be illegal, or at least of dubious legality: an album (a single torrent can carry many files at once), the CD or DVD images of a new game, a movie, a piece of software or the latest episode of a TV show. And instead of somebody reputable like Genericasoft or a Linux developer group, the person who makes and distributes the relevant torrent and maintains the tracker can be anybody. There are in fact a bundle of groups of individuals which churn out tonnes of these torrents.

Distributing the torrents for these movies/games/videos becomes a little more complicated because, while a site providing torrents technically isn't directly providing any copyrighted material, it is still on legally very dodgy ground. There are a whole bunch of sites dedicated to circulating these torrents, but finding them isn't trivial.

Pros and cons of BitTorrent compared to other P2P applications

BitTorrent is best consumed as part of a balanced diet of P2P apps because, like all P2P apps, it has both strengths and weaknesses.

Pros:

  • No leeching is permitted. Yes, this IS a pro if you think about it.
  • Alleviates server strain as described above.
  • Unlike KaZaA, there's no underlying network of nodes or servers which can be shut down. As long as torrents are made and distributed, and trackers remain online, BitTorrent will continue to run.
  • BitTorrent is totally free. It does not and never will contain adverts.
  • BitTorrent is open-source. Anybody can make a BitTorrent client with whatever features he wants.
  • No spyware, adware, malware, popups, or other undesirables are bundled with it.

Cons:

  • You have to upload as well as download. All BitTorrent clients which allow throttling will adjust your download throttle to match your upload throttle: if you're not letting people download from you then you won't be allowed to download from them.
  • Not searchable. Big sites full of torrents are easy to find, but no single site carries all the torrents. You may have to actually do some looking before you can find what you want.
  • BitTorrent is good for downloading stuff which is popular now, because the number of seeds and peers will be large. Old, obscure and non-mainstream material, on the other hand, is difficult to find. Looking for an album that's at number one in the charts right now? No problem. Searching for an obscure EP from 1998? Even if you manage to find a torrent, it's entirely possible that the tracker might have been taken offline, or that there are no seeds, leaving you up the creek.
  • BitTorrent is totally public and hence insecure. While nobody can use it to send you viruses or anything like that, your IP address is visible to everybody else using the torrent. If you're using BitTorrent, you can be tracked down easily.

Corrections

You can email me here if any of this is glaringly incorrect.
