Getting Deep With The Deep Web
If you’re not familiar with the deep web, it may all seem very mysterious. It’s often portrayed as an unregulated and lawless world beyond the reach of search engines, where everything from hard drugs to hit men is available in exchange for digital currencies like bitcoin.
The reality is rather more mundane.
The deep web is simply the part of the internet that sits beyond the search results we’re all familiar with, operating by similar rules but containing content that’s not designed to be scanned and ranked by Google or Bing. The similarly titled dark web is where most illicit activity takes place, yet even these murky waters have redeeming features. In this blog, we offer a potted overview of what’s out there, and how to find it…
Beneath the surface web
Rather than a specific area or concept, the deep web is any internet-hosted content which hasn’t been indexed by search engines. Much of it is surprisingly mundane, such as webmail and online banking. For reasons of personal privacy, these pages aren’t publicly indexed. Corporate and military databases are also protected against indexing to keep their data confidential. Deep content isn’t necessarily encrypted; what defines it is that it sits behind a login, a paywall or some other barrier that search engine crawlers can’t or won’t cross.
It’s surprising how often we dip below the surface of WWW sites and search engines in order to access deep web content. Every time you log into Yahoo webmail or check your balance with Bank of America, you’re using services that are cloaked from search engine crawlers. Other examples of deep content include paywalled media sites, ecommerce databases and archived web pages like the ones on the Wayback Machine. Content management software falls into this category as well. Editing your website in a CMS like WordPress takes place beneath the surface web, right up until you click ‘publish’ to make any changes live.
It’s impossible to quantify how much information exists below the surface web, but frequently quoted estimates put as much as 96% of online content in the deep. Common analogies include icebergs, or the ocean’s depth when viewed from the surface. Because deep content cannot be found through search engines, it’s usually necessary to find a document containing site links, or enter the web address directly into your browser. A login page for paywalled or personal content could be publicly visible and accessible via a standard bookmark in your preferred browser, but unique ID credentials will be necessary to access behind-the-scenes content.
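To make the split between surface and deep content more concrete, here is a minimal Python sketch of a crawler working through a tiny in-memory website. Everything in it is hypothetical: the `SITE` dictionary, the page paths and the `needs_login` flag are invented for illustration, and real search engine crawlers are far more sophisticated. The point is simply that a crawler without credentials can only index what it can reach by following public links.

```python
from collections import deque

# Hypothetical site: each page lists its outgoing links and whether it needs a login.
SITE = {
    "/": {"links": ["/news", "/login"], "needs_login": False},
    "/news": {"links": ["/"], "needs_login": False},
    "/login": {"links": ["/inbox"], "needs_login": False},
    "/inbox": {"links": ["/inbox/message-1"], "needs_login": True},
    "/inbox/message-1": {"links": [], "needs_login": True},
    "/unlinked-report": {"links": [], "needs_login": False},  # public, but nothing links to it
}


def crawl(site, start="/"):
    """Breadth-first crawl without credentials; returns the set of indexable paths."""
    indexed = set()
    queue = deque([start])
    while queue:
        path = queue.popleft()
        if path in indexed or path not in site:
            continue
        page = site[path]
        if page["needs_login"]:
            continue  # no credentials, so the crawler cannot see this page or its links
        indexed.add(path)
        queue.extend(page["links"])
    return indexed


surface = crawl(SITE)
deep = set(SITE) - surface
print("surface:", sorted(surface))
print("deep:", sorted(deep))
```

In this toy example the login page itself ends up on the surface, while the inbox behind it stays deep. So does the report that no public page links to, which is why such content has to be reached by typing in its address or following a link from elsewhere.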
Going dark
It’s important at this stage to differentiate between the deep and dark webs. These terms are used interchangeably by people who don’t understand them, but the latter is where most of the internet’s uglier secrets are buried. Dark web content requires specialist browsers to access it; because it’s intentionally hidden from public view, websites are generally found by word-of-mouth recommendation or directory listings. Similar to the surface web’s earlier years, sprawling pages contain links to hundreds of external sites with brief summaries of what’s located there. A considerable proportion of dark web material is highly illegal.
Deep, dark and surface web pages are all accessible through the mysterious Tor browser, which is commonly but wrongly believed to be illegal. In fact, Tor (short for The Onion Router) is an entirely legitimate piece of software. It grew out of onion routing research carried out at the US Naval Research Laboratory in the 1990s to protect government communications, and the non-profit Tor Project that now maintains it has continued to receive part of its funding from US government sources. As a result, it operates along rather different lines to conventional browsers like Safari and Chrome. Browsers normally try to receive and display information as quickly as possible from a host web server. With Tor, however, data packets are wrapped in layers of encryption and bounced through a series of volunteer-run relays around the world en route to their destination. This makes it impractical to track which data packets, and therefore what content, are being supplied to a particular device at any given moment.
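The idea of encrypting data and bouncing it between machines can be sketched in a few lines of Python. This is a toy model only, not how Tor is actually implemented: `toy_cipher` is a simple XOR stand-in that offers no real security, and the relay names and keys in `CIRCUIT` are made up for illustration. What it shows is the layering principle, where each relay can remove just one layer and learns only where to send the packet next.

```python
def toy_cipher(data: bytes, key: bytes) -> bytes:
    """XOR stand-in for real encryption. NOT secure; applying it twice with the same key undoes it."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))


def build_onion(message: bytes, circuit):
    """Wrap the message in one layer per relay, starting with the last relay's layer."""
    packet = message
    next_hop = "destination"
    for name, key in reversed(circuit):
        # Each layer hides both the next hop and everything wrapped inside it.
        packet = toy_cipher(next_hop.encode() + b"|" + packet, key)
        next_hop = name
    return next_hop, packet  # the first relay to contact, and the fully wrapped packet


def peel(packet: bytes, key: bytes):
    """A relay removes its own layer and learns only the next hop."""
    plain = toy_cipher(packet, key)
    next_hop, _, inner = plain.partition(b"|")
    return next_hop.decode(), inner


# Hypothetical three-relay circuit, listed in the order the packet travels.
CIRCUIT = [("guard", b"key-one"), ("middle", b"key-two"), ("exit", b"key-three")]
KEYS = dict(CIRCUIT)

message = b"GET /index.html"
hop, packet = build_onion(message, CIRCUIT)

hops = []
while hop != "destination":
    hop, packet = peel(packet, KEYS[hop])
    hops.append(hop)

delivered = packet
print(hops, delivered)
```

In this sketch no single relay sees both who sent the message and what it finally says, which is the property that makes tracking impractical. A real system would use proper cryptography and negotiate a fresh key with each relay.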
For your browser only
Tor goes to considerable lengths to protect user anonymity. It clears cookies and that session’s browsing history whenever the browser is closed. The homepage contains tips on browsing anonymously, and users can donate spare bandwidth to relay data for other Tor users, which helps to address the slow loading times caused by its connection model. However, it’s the way Tor hides your IP address and location from the sites you visit that really maximizes user privacy, in a similar spirit to a virtual private network. Some users also run Tor alongside a VPN for extra privacy, though opinions differ on how much this helps, and disabling JavaScript is a common precaution.
Once Tor loads (which takes a while), it’s effectively a standard web browser. Its rather dated-looking interface is based on a long-term support version of Mozilla Firefox, supplied with the privacy-oriented DuckDuckGo search engine pre-installed. Putting aside its absurd title, this useful engine keeps no logs of user history or search results. Search results and directory pages may include URLs comprising several dozen seemingly random characters and the .onion top level domain. These are dark web addresses, and Tor is the best-known browser capable of accessing them.
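If you want to recognize these addresses in a list of links, a rough check is easy to write. The Python sketch below assumes the current "version 3" onion address format, which is 56 characters drawn from the lowercase letters a to z and the digits 2 to 7, followed by .onion. The function name `looks_like_onion` is our own invention, and the check only looks at the shape of the hostname; it says nothing about whether the address is real, reachable or safe.

```python
import re

# Assumed format: 56 base32 characters (a-z, 2-7) followed by ".onion".
V3_ONION = re.compile(r"[a-z2-7]{56}\.onion")


def looks_like_onion(hostname: str) -> bool:
    """Return True if the hostname has the shape of a version 3 .onion address."""
    return V3_ONION.fullmatch(hostname.strip().lower()) is not None


# Made-up examples for illustration; neither points at a real service.
print(looks_like_onion("a" * 56 + ".onion"))
print(looks_like_onion("example.com"))
```

A fuller version would also need to handle subdomains and older, shorter address formats, which this sketch deliberately ignores.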
Honest endeavors
If you think the dark web provides a haven for pornographers and hackers, you’d be right. However, this unregulated corner of cyberspace offers positive attributes as well. As well as disruptive and malicious cybercriminals, its cloak of anonymity benefits ethical hackers, whistleblowers, journalists and dissidents. Political activists in repressive countries anonymously share news and information their governments may censor or cover up. And as the surface web becomes increasingly monitored by ISPs and state bodies, growing numbers of ordinary citizens are embracing the privacy provided by online software that won’t sell their browsing history to advertisers.
Buying and selling through the deep or dark webs usually involves bitcoin. This pseudonymous digital currency can be bought through ATMs and legitimate websites, but it quickly becomes difficult to trace what happens to it afterwards. While there are concerns bitcoin is becoming an investor-led bubble like the dotcom stocks of the late Nineties, it does provide a fast global currency that isn’t tied to any one country and doesn’t attach real names to payments, even though every transaction is recorded on a public ledger. As a result, it’s well suited to the deep web and its darker cousin.