Every time you click a link

Most people don’t know what goes on behind the curtain when they click a link while surfing the web.
And nor should they.
Know or care.

I guess it’s clear that I am not exactly normal, but in my humble opinion, everyone should have a smattering of knowledge about something they spend so much time doing… But most of all, I want people to know just how much of their life is leaking online.

So, this might be worth a read:

A small technological marvel occurs on almost every visit to a web page. In the seconds that elapse between the user’s click and the display of the page, an ad auction takes place in which hundreds of bidders gather whatever information they can get on the user, determine which ads are likely to be of interest, place bids, and transmit the winning ad to be placed in the page.

What follows is somewhat technical, but that’s the guts of what I wanted to point out. Every time you click a link, a LOT of stuff happens in the background, without the knowledge of the person doing the clicking.

Please keep in mind that much of this can still happen even if you run an ‘ad blocker’ in your browser. A blocker that works by hiding ads only stops them from showing up. It does not stop the auction, and it does not stop information about who or where you are from going to the ad companies. It does not stop the ad from being downloaded through your Internet connection. It just stops the ads from being displayed. That’s all.

OK, with the basics out of the way, let’s dive in a little and see what’s going on.

How can all that happen in approximately 100 milliseconds? Let’s explore the timeline and find out what goes on behind the scenes in a modern ad auction. Most of the information I have comes from two companies that handle different stages of the auction: the ad exchange AppNexus and the demand side platform Yashi. Both store critical data in an Aerospike database running on flash to achieve sub-second speeds.
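To make the idea concrete, here is a toy sketch of the auction step in Python. It is not how AppNexus or Yashi actually do it. The `Bid` class, the bidder names, and the rule that the highest on-time bid simply wins are all my own assumptions for illustration. The only thing taken from the article is the rough 100 millisecond budget.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Bid:
    bidder: str        # hypothetical bidder name
    price_cpm: float   # offered price per thousand impressions
    latency_ms: float  # how long the bidder took to answer


def run_auction(bids: List[Bid], deadline_ms: float = 100.0) -> Optional[Bid]:
    """Return the highest bid that arrived within the deadline, or None.

    Assumption: the highest on-time bid wins. Real exchanges may apply
    other pricing rules that the article does not describe.
    """
    on_time = [b for b in bids if b.latency_ms <= deadline_ms]
    if not on_time:
        return None
    return max(on_time, key=lambda b: b.price_cpm)


if __name__ == "__main__":
    sample = [
        Bid("dsp_a", 2.0, 40.0),
        Bid("dsp_b", 5.0, 150.0),  # best price, but too slow to count
        Bid("dsp_c", 3.0, 90.0),
    ]
    winner = run_auction(sample)
    print(winner.bidder if winner else "no ad")
```

The point of the sketch is the deadline: a bidder with a great price that answers late simply does not take part, which is why everyone involved cares so much about fast storage.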

First up, it’s interesting that the whole database is running on flash memory… Here speed is king and trumps even cost.
I suspect this is going to become more common: the right sort of memory for the data and the speed required will be balanced against cost.

Let’s set a bit of context for the magic of ad displays by looking at how much time a web page takes to load.

Steve Souders’s classic High Performance Web Sites (released seven years ago) reported that the top ten U.S. web sites in 2007 required download times that ranged from 1.7 to 22.4 seconds. Things have certainly improved since then — although it’s hard to say how much resulted from faster Internet connections and how much from performance tuning.

Rolling forward to this year, Akamai recently released its quarterly State of the Internet Report. Akamai does Real User Monitoring (RUM) through a product called Aqua Ion it released last year. Average page load time is not very impressive, although of course it varies widely among regions. Typical load times are 3 or 4 seconds for broadband, and 5 to 10 seconds for mobile networks. (I’m making gross generalizations from the table on p. 57 of the report.)

I have taken only the highest-level glance at this topic and would love to have the time to do a deep dive on it. I once sat enthralled for a full half hour and watched a video of a guy giving a tech talk on how to get a web page to load in under one second.
That video sparked my curiosity about what goes on behind the curtain every time a link is clicked. It also helps that a major part of my day job is building web pages that display different (aka oddball) content, and writing code to scrape data off web pages and reformat it for consumption in other processes.
I have to admit that my Dad only getting off dial-up in the past ‘little while’ has also given me some context – that fact has shaped (in a good way) my personal website design and construction for many years.

That will do for the topic of speed for now. How about privacy?

Andy McConaghie and Larry Nolan of Yashi explained to me that the browser passes the user’s IP address, the user agent string, and available cookies to the exchange (which shares this information ultimately with the DSP). The browser also passes information on the web page being visited and characteristics of the desired ad. These characteristics include whether it’s banner, pop-up, video, etc.; the size of the area where it will be displayed (which gives a good indication where it will appear on the screen); and whether a video will auto-play when loaded.
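If you like to see things as data, the information described above could be pictured something like this. This is only a sketch: the class names, the field names, and the sample values are ones I made up, and the real exchanges will have their own formats. The fields themselves (IP address, user agent, cookies, page, ad type, size, auto-play) are the ones listed in the paragraph above.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Dict


@dataclass
class AdSlot:
    ad_format: str          # e.g. "banner", "pop-up", "video"
    width: int              # size of the display area, in pixels
    height: int
    autoplay: bool = False  # only meaningful for video


@dataclass
class AdRequest:
    ip: str                 # the user's IP address
    user_agent: str         # the browser's user agent string
    page_url: str           # the page being visited
    slot: AdSlot            # characteristics of the desired ad
    cookies: Dict[str, str] = field(default_factory=dict)


def to_json(request: AdRequest) -> str:
    """Serialise the request, nested slot included, as a JSON string."""
    return json.dumps(asdict(request), sort_keys=True)


if __name__ == "__main__":
    request = AdRequest(
        ip="203.0.113.7",  # address from a range reserved for documentation
        user_agent="Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)",
        page_url="https://example.com/news",
        slot=AdSlot(ad_format="video", width=640, height=360, autoplay=True),
        cookies={"uid": "abc123"},  # hypothetical tracking cookie
    )
    print(to_json(request))
```

Seeing it laid out like that makes it obvious how little of it is under your control: most of those fields are sent by the browser without asking.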

Each party in the auction (the exchange and DSP) tries to identify users, normally through cookies. Here, of course, is where so many privacy advocates see risks to the user. Indeed, both exchanges and DSPs try to find out as much about our preferences as possible by sharing their knowledge of the user with third-party sites that collect demographic data about us, information on our shopping habits, etc. However, none of these companies really wants to identify who we are — they just want to get us the sports equipment or gadget we like. In short, the exchanges and DSPs are uninterested in personally identifying information such as your name.

Pretty crazy the lengths they will go to. I find it fascinating that so much can be gleaned by simply having big data sets to work with.
Just simple things like knowing the version of the browser you are using can tip off what sort of person you are – geeks are more likely to be up to date – an IE 6 user… never mind…
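Pulling the browser and version out of the user agent string takes only a few lines. The sketch below is deliberately crude: the function name `browser_hint` and the three patterns are my own, and real user agent strings are far messier than this (plenty of browsers mention each other’s names). It is just meant to show how cheap that first guess is.

```python
import re
from typing import Optional, Tuple

# Checked in order; the first pattern that matches wins.
_PATTERNS = [
    ("Internet Explorer", re.compile(r"MSIE (\d+)")),
    ("Firefox", re.compile(r"Firefox/(\d+)")),
    ("Chrome", re.compile(r"Chrome/(\d+)")),
]


def browser_hint(user_agent: str) -> Optional[Tuple[str, int]]:
    """Return (browser name, major version) if a pattern matches, else None."""
    for name, pattern in _PATTERNS:
        match = pattern.search(user_agent)
        if match:
            return name, int(match.group(1))
    return None


if __name__ == "__main__":
    ie6 = "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1)"
    print(browser_hint(ie6))
```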

If you are not aware, the user’s (that’s you) IP address gives a pretty good indication of your location. There is not a lot you can do about that unless you want to jump through some hoops, like using a VPN (Virtual Private Network), which will put you on the web as if from a different location – in some cases, as if you were in a different country!
So, what I said about your information leaking online? Yeah. I was not kidding. By just surfing the web, you are screaming to all sorts of people pretty much where you live.
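For the curious, an IP-to-location lookup boils down to checking which known block of addresses yours falls into. Here is a minimal sketch using Python’s standard `ipaddress` module. The table is completely invented: the three ranges are ones reserved for documentation, and the place names are made up. Real lookups use large commercial or community databases that I am not reproducing here.

```python
import ipaddress
from typing import Optional

# Invented table: documentation-only address ranges mapped to made-up places.
_FAKE_GEO_TABLE = [
    (ipaddress.ip_network("192.0.2.0/24"), "Exampleville"),
    (ipaddress.ip_network("198.51.100.0/24"), "Sampleton"),
    (ipaddress.ip_network("203.0.113.0/24"), "Testburg"),
]


def approximate_location(ip: str) -> Optional[str]:
    """Return the made-up place whose address block contains ip, else None."""
    address = ipaddress.ip_address(ip)
    for network, place in _FAKE_GEO_TABLE:
        if address in network:
            return place
    return None


if __name__ == "__main__":
    print(approximate_location("203.0.113.7"))
```

A VPN changes the answer because the lookup is done on the VPN’s address, not yours.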

Uh, I’m over 1000 words on this one. Sorry BA. I will stop now.

Long blog long. It’s really really really interesting what goes on behind the curtain every time you (or I) click a link.