What is the Value of Demographic Data?

Nov 16, 2006 @ 5:07 AM
Category: Web

The group I am working in at Microsoft has to do with business intelligence and advertising displays–this is a very interesting space to be in at the moment as it is central to the monetization strategy for the MSN division (now Windows Live). It’s something that has been forced upon all Internet players.

In the pre-Google era, advertising-based services were a failed model–to use the then oft-quoted quip, “there is no such thing as a free lunch”. That changed when Google proved that ad-based services can make money. They did this through state-of-the-art datamining and simple two-line text ads, which marked a complete shift from conventional thinking. It was very innovative, and it undercut premium services like Hotmail, which eventually followed suit. It is the driving force behind Google’s billions ($2.7 billion in revenue for their most recent quarter).

Recently I got an internal newsletter in which I stumbled upon a few Microsoft bloggers in this space, and I spent a bit of time reading through some interesting posts. What prompted me to write this post was Mark Jacobson’s point regarding user profiling and how Google lacks demographic data on its users. You have to ask yourself, why is it that Google is not interested in this information and how much of a disadvantage do they have? Do they know something we don’t?

If you sign up for Hotmail or any of the Passport services, one of the forms you encounter will require your birth date, gender and location. Google on the other hand doesn’t ask you for any of this information–their form is extremely minimalistic in that regard.

Why Demographic Data

Most conventional marketing and research is based on targeting demographic profiles. If you are a 29 year old male living in the 90210 zip code (Beverly Hills) it would indicate that you are in the 1MM+ income group. That helps in targeting ads to you. Your very first ad could be for a new Porsche or a Versace suit. This is traditional marketing, and while I don’t know how optimized those heuristics are, on the web you can–and should–leverage a lot of the newer metrics it has to offer.

As an aside, ads on television networks also have a lot of room for improvement in ad relevance as we step into the digital age; however, there are technical hurdles (it is harder to datamine videos and harder to show different ads to different people).

Context is Critical

So I am a 19 year old whose zip code is in the university district and you can show me ads about a Dave Matthews concert in the area but how likely am I to click it based on my age and zip code? The probability that I am a student is pretty high because I’m the right age and in the right zip code but that also assumes you have prior research to deduce user profiles for that demographic data. And now you still need to know if I like rock, heavy metal or house music. You also don’t know if at that very moment I am very busy researching symptoms of an illness or researching my next big investment move in a particular stock and don’t give a hoot about some concert.

On the web you are what you read. It’s about contextualization. What should matter to advertisers is what I am looking at, at any particular moment because that is what is most relevant to my interests at that very moment. With demographic data you are targeting most 21 year old students for most of the concerts most of the time (up until the concert is sold out or the marketing campaign ends). All the while you are hoping to cast the biggest net possible to grab a tiny audience of concert-goers in that region. There is a lot of wasted effort. Most of the people don’t care about the concert. The clickthrough rates will be very low. In fact, this model was so bad that nearly all services I can think of that were built around ads failed with this business model around the dot com glory days.

If you shift your focus to contextualization for achieving higher clickthrough, you get a lot more out. I am on a page looking at the discography of U2 and you show me an advertisement to purchase the U2 iPod Nano or a U2 CD. Those ads are agnostic to demographic data. This means that you cannot show me an ad for a U2 concert in my state because you don’t know my zip code. However, as an advertising company you can only show so many ads on any page and the U2 concert ad is better left for the Seattle radio station website where locals visit. Local ads on localized content, adult ads on adult content, children’s ads on children’s content, rich ads on luxury content and so on.

When you have good contextualization, the value of demographic data falls considerably. That said, demographic data obviously has its uses. A page like msn.com which is generic does not represent the mood of the reader since it’s dynamic and very general. On a page like that, knowing if your user is male or female helps promote the right kind of ads.

Anatomy of a Well Designed AJAX Login Experience

Nov 14, 2006 @ 6:29 AM
Category: Web

AJAX in itself is a very simple technology, especially with the availability of the right tools. You make a request and pass some parameters, get a response and you render appropriately. The only thing is that the entire process is tedious and if you are the kind of person that likes to do their own plumbing, things can take time. I recently started a Web 2.0 project and am nearing completion of the login page. If you are at the beginning of the Web 2.0 learning curve (barely familiar with JSON, REST, and Prototype) you will find this post interesting. Or, if you have ever designed your own login page, you might find some details here quite useful.

Get Equipped

First things first, find the right tools for the job.

You need a cross-browser JavaScript framework to overcome the discrepancies between browsers and simplify your life. You can skip all those AJAX tutorials out there and just take things for granted when you adopt the Prototype framework. It's small and has a clean-cut library. Though, I should add, I felt the absence of a cross-browser attachEvent but this here script filled that void. As with all things open source, there are different camps and I considered joining the jQuery camp; however, Prototype has a better roadmap and a larger community due to its Ruby on Rails endorsement so I went with that.

Next up is client-side validation to help you conserve server load and precious bandwidth by avoiding trips to the server just to validate the password field is not empty. Andrew Tetlaw's validation.js is built on top of Prototype so it fits the bill quite well. It's small and I really love the way it tackles validation using CSS classes. For example, the validate-email class verifies the field has a valid email address (I was also able to define my own custom validators and error messages very easily):

HTML:
  <input type="text" size="30" id="email" class="validate-email" />

And what would an AJAX application be if you didn't sprinkle in a few animations and effects? I simply wanted to fade text in and out (for usability reasons I will mention later). Be warned though, it is easy to get tempted by all the cool effects and I wouldn't be surprised if all sorts of fancy effects start showing up as these FX libraries trickle down to the lower echelons of web enthusiasts. At first, I went with moo.fx because it's supposed to be super lightweight. However, I ran into some issues with it in IE and switched to Script.aculo.us. Contrary to my initial beliefs, Script.aculo.us is just as small as moo.fx when compressed and possibly better in other arenas.

Speaking of compression, it's probably a good idea to use a reliable JavaScript compression tool before going live with the website. I haven't done this yet and I am certain I'll run into a few other bugs but I'm also certain someone else has already compressed the popular libraries and shared them out. Since all my libraries have relatively big communities, I should be safe here.

Lastly, to exchange data between the web service and the client-side JavaScript, there are two camps: JSON or XML. I like JSON (JavaScript Object Notation) because it's very lean compared to XML and much faster because it doesn't need any complicated parsing. Instead of using <firstname>aleem</firstname>, JSON simply passes { "firstname": "aleem" }. And then, instead of parsing the XML using XPath or whatever, the client simply calls the eval() function, which evaluates the string into an object. The only reason to go with XML is if you want to exchange that data with other non-AJAX services. Even so, JSON is well-formed so it's likely you can convert it to XML (there might be tools out there already to accomplish this). Prototype does the client-side parsing and on the server side I chose JSON.NET which takes .NET objects and converts them to JSON for JavaScript consumption. It's open source so I can optimize the library and very easily get rid of functionality I don't need.
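To make the JSON handling concrete, here is a minimal JavaScript sketch. It uses JSON.parse rather than the eval() call described above; native JSON.parse only reached browsers after this post was written, so treat it as a modern substitute and not the approach the post describes. The helper name parseOrNull is made up for illustration.

```javascript
// Sketch only: parseOrNull is a hypothetical helper, not part of Prototype or JSON.NET.
function parseOrNull(text) {
  try {
    // JSON.parse accepts strict JSON only (double-quoted keys and strings).
    return JSON.parse(text);
  } catch (err) {
    return null; // malformed input is rejected, never executed as code
  }
}

const person = parseOrNull('{ "firstname": "aleem" }');
console.log(person.firstname); // aleem

// An unterminated, single-quoted string is rejected instead of being evaluated.
console.log(parseOrNull("{ 'firstname':'aleem }")); // null

// Going the other way: object to JSON text.
console.log(JSON.stringify({ firstname: "aleem" })); // {"firstname":"aleem"}
```

The main design difference from eval() is that a parser only ever produces data, so a hostile or broken response cannot run code in the page.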

A Simple Form: Username, Password and Submit

It seems deceptively simple but if you want it done right, there's quite a bit of plumbing to do. Let's begin with the client-side plumbing. Using my validation tools I ran the following client-side validation:

  • Username: should be non-empty, minimum of 5 characters, maximum of 15, alphanumeric and must start with a letter. The regex looks like this: ^[a-zA-Z][a-zA-Z0-9]{4,14}$
  • Password: should be non-empty
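The two rules above can be sketched as a single JavaScript function. Only the regex comes from the post; the function name and the shape of the return value are assumptions for illustration, and this is not how validation.js wires things up through CSS classes.

```javascript
// The pattern is the one quoted in the post: a letter, then 4 to 14 letters or digits.
const USERNAME_PATTERN = /^[a-zA-Z][a-zA-Z0-9]{4,14}$/;

// Hypothetical helper: returns the names of the fields that failed validation.
function validateLogin(username, password) {
  const failed = [];
  if (!USERNAME_PATTERN.test(username)) {
    failed.push("username");
  }
  if (password.length === 0) {
    failed.push("password");
  }
  return failed;
}

console.log(validateLogin("aleem", "secret")); // []
console.log(validateLogin("abcd", "secret"));  // [ 'username' ]  (too short)
console.log(validateLogin("1aleem", ""));      // [ 'username', 'password' ]
```

Because the first character is matched separately, the {4,14} quantifier gives a total length of 5 to 15, which is why no separate length check is needed.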

Further, every time the user hits the submit button, the error message fades back in. The user will not otherwise know the form has been resubmitted using AJAX, so the fade-in provides a visual cue that a new action was performed.

The same validation needs to be repeated on the server side. This redundancy is necessary for obvious security reasons--anyone can bypass the form and submit directly to the server. Assuming the validation goes well, the server can respond in any of the following ways:

  • Username does not exist. Display the error on the client side in red and set the focus() to the Login field for convenience.
  • Password does not match. Display the error in red, set the focus() to the Password field and select() the password field text. The reason for selecting the password field is so that the user can start re-typing the password right away. This is not necessary for the Login field however, as it is not masked with asterisks like the password field and can be corrected visually. Also, select() is better than resetting the field since it gives the user the option to deselect and do the correction.
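The two responses above could be handled along these lines. This is a sketch under assumptions: the result codes "unknown-user" and "wrong-password" are invented (the post does not say what the server returns for failures), and the fields are passed in as plain objects with focus() and select() so the logic does not depend on any particular DOM library.

```javascript
// Hypothetical result codes mapped to what the page should do next.
function planLoginFeedback(result) {
  switch (result) {
    case "unknown-user":
      return { message: "Username does not exist.", focus: "username", selectText: false };
    case "wrong-password":
      // Select the masked text so the user can retype immediately.
      return { message: "Password does not match.", focus: "password", selectText: true };
    default:
      return { message: "", focus: null, selectText: false };
  }
}

// fields maps a field name to any object exposing focus() and select().
function applyLoginFeedback(plan, fields, showError) {
  showError(plan.message);
  const field = plan.focus ? fields[plan.focus] : null;
  if (!field) {
    return;
  }
  field.focus();
  if (plan.selectText) {
    field.select();
  }
}

console.log(planLoginFeedback("wrong-password").focus); // password
```

Keeping the decision (planLoginFeedback) apart from the DOM calls (applyLoginFeedback) makes the focus and select rules easy to check without a browser.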

I did run into JavaScript quirkiness here. The order in which the event handlers are executed is convoluted and seemingly non-deterministic in IE. It took me some time to figure this out and discover that it is a known issue. Remedying it was simple though--I stuck with only one event handler when order of execution was important. This problem came up during form validation--attaching client-side validation and server-side validation to the form and having them run in order so if the first failed, the latter would not be executed.
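A minimal sketch of the single-handler remedy follows. All names are hypothetical, and the server check is faked with a plain synchronous function; in a real page it would be the asynchronous AJAX call.

```javascript
// Runs checks one after another and stops at the first error message.
function runChecksInOrder(checks, form) {
  for (const check of checks) {
    const error = check(form);
    if (error) {
      return error; // later checks never run
    }
  }
  return null;
}

const calls = [];
const clientCheck = (form) => {
  calls.push("client");
  return form.password === "" ? "Password is required." : null;
};
const serverCheck = (form) => {
  calls.push("server");
  return null; // stand-in for the round trip to the server
};

// One submit handler owns the sequence, so the browser's handler ordering no longer matters.
function onSubmit(form) {
  return runChecksInOrder([clientCheck, serverCheck], form);
}

console.log(onSubmit({ username: "aleem", password: "" })); // Password is required.
console.log(calls); // [ 'client' ]
```

The order is now fixed by the array passed to runChecksInOrder rather than by the order in which handlers happened to be attached.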

The Server Side

On the server side I am using ASP.NET and wrote HTTP handlers using ASHX files. I also turned off AutoEventWireup which is excessive and unnecessary--it automatically wires up event handling functions to the page. The AJAX framework passes requests over HTTP using GET and POST in general. This is the REST approach and if the Wikipedia explanation seems confusing, just ignore it and grasp the following HTTP request:

HTTP:
  POST /ods/serviceall.ashx HTTP/1.1
  Host: localhost
  Content-Type: application/x-www-form-urlencoded
  Content-Length: 33

  r=login&username=FOO&password=BAR

That's the request that gets sent out when the AJAX framework makes the call to the server. The thing to note is the last line which contains key/value pairs with three keys: r, username, password. When the server gets this, it parses this and makes a call to login("FOO", "BAR") and sends an HTTP Response back:
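The parse-and-dispatch step can be sketched in JavaScript as below. The post's server is an ASP.NET handler, so this only illustrates the idea: the handler table and the string it returns are assumptions, and URLSearchParams is a modern API that stands in for whatever parsing the server actually does.

```javascript
// Decode an application/x-www-form-urlencoded body into a plain object.
function parseFormBody(body) {
  return Object.fromEntries(new URLSearchParams(body));
}

// Hypothetical routing table keyed on the "r" parameter.
const handlers = new Map([
  // Placeholder for the server's login(username, password) call.
  ["login", (params) => `login("${params.username}", "${params.password}")`],
]);

function dispatch(body) {
  const params = parseFormBody(body);
  const handler = handlers.get(params.r);
  return handler ? handler(params) : null;
}

const body = "r=login&username=FOO&password=BAR";
console.log(parseFormBody(body).username);          // FOO
console.log(dispatch(body));                        // login("FOO", "BAR")
console.log(new TextEncoder().encode(body).length); // 33
```

The last line counts the bytes in the body, which is the value a client puts in the Content-Length header.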

HTTP:
  HTTP/1.1 200 OK
  Server: Microsoft-IIS/5.1
  Date: Tue, 14 Nov 2006 03:19:38 GMT
  X-Powered-By: ASP.NET
  X-AspNet-Version: 2.0.50727
  Set-Cookie: ASP.NET_SessionId=cefnyt45pspxr1uiy34lyrj4; path=/; HttpOnly
  Set-Cookie: .ASPXAUTH=C3134DF74BDCD48131A084AFFA794C970C0F9998A244B8A707844EF5A8260C40E2164A4B98FD7AE2B6D52D40DD05391B19BEFC8F9D5BA4C627CF4D3C0864F42C6703C7525AA4A3F80DBB2A4774D43388; expires=Tue, 14-Nov-2006 03:49:38 GMT; path=/; HttpOnly
  Cache-Control: private
  Content-Type: text/html; charset=utf-8
  Content-Length: 4

  true

Ignore the X-Powered-By and other obviously useless headers (I intend to drop them from the server response). When the client-side AJAX framework gets that response it first looks to see if the content contains "true" (last line). If it does then authentication succeeded. For the sake of standardization I will switch to JSON and respond with something like { "result": true } and then simply use: if(result){...} in my JavaScript code. However, I use Enumeration types on the server and JSON.NET does not support serializing Enums to JSON, so there's some work for me here. Anyway, I then grab the entire Set-Cookie header and pass it to document.cookie to set the authentication cookie on the client. This cookie is used to verify that the user is authenticated and the login process is now complete.
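During a switch from the plain "true" body to a JSON body, the client-side check might look something like this sketch. It assumes the server sends strict JSON with a double-quoted "result" key; the function name is hypothetical.

```javascript
// Accepts both the current plain-text body and the planned JSON body.
function loginSucceeded(responseText) {
  const text = responseText.trim();
  if (text === "true") {
    return true; // current format: the body is just the word true
  }
  try {
    const data = JSON.parse(text);
    // Only an explicit boolean true counts as success.
    return data !== null && data.result === true;
  } catch (err) {
    return false; // not JSON at all, for example an HTML error page
  }
}

console.log(loginSucceeded("true"));                // true
console.log(loginSucceeded('{ "result": true }'));  // true
console.log(loginSucceeded('{ "result": false }')); // false
console.log(loginSucceeded("<html>error</html>"));  // false
```

Comparing against true with === avoids treating a truthy string such as "false" as a successful login.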

End of the Beginning

Understanding the guts of the system is not critical to getting the job done, but if you want to get intimately familiar with Web 2.0 underpinnings you might want to consider using WFetch or the Live HTTP Headers plugin for Firefox, both of which are terribly useful. The Venkman JavaScript Debugger is indispensable if you are into any kind of JavaScript development.

I have had a recent inclination toward ASP.NET even though PHP has been very good to me. If you are comfortable with ASP.NET and Visual Studio then you certainly want to use it for learning. Out of the box, PHP will not allow you to set breakpoints, inspect the stack or make runtime modifications. If you are using PHP though, be sure to turn on error reporting if you haven't already.

My login form still needs a "remember me" checkbox and it needs to handle timed-out users so that after logging in again they can be redirected to the page where they timed out. Once the login experience is all squared away, I will move on to designing a quick and simple sign-up experience.

5/29/2007 Update: I wrote that Prototype.js does not offer a cross-browser attachEvent, but Event.observe offers just that. For some reason I overlooked it at the time of writing this post.

Why Microsoft and ASP.NET Cannot Threaten the PHP Moat

Nov 7, 2006 @ 7:52 PM
Category: Web

ASP.NET does not aspire to PHP but it has good reason for envy and the two are undeniably in competition. The ASP.NET platform is quite something when you contrast it against PHP, which doesn't even have a development platform worthy of note. ASP.NET has a clean-cut architecture, easily discernible roles for each component and object-oriented support. The Visual Studio .NET development environment alone is enough to persuade people to switch to ASP.NET because it's that good--not only for development and debugging but also as a learning tool to understand the flow of the application, inspect the stack and heap, request and response, the current state of objects and other advanced features.

Why then is ASP not giving PHP a good run for its community and user adoption?

Timing

ASP was a late bloomer. PHP was one of my first obsessions with programming (discounting BASIC lessons I took when I was 13 and dropped a year later because I took to playing games) along with the then rudimentary JavaScript. ASP was nowhere to be seen, primarily because it was not free and was endorsed mostly by the professional community. Visual Studio Express attempts to remedy this, though the thought of what could have been if it had been introduced around the time PHP3 was making its rounds would certainly make some executives flinch.

Platform Strategy

Not all of it was timing. Arguably, even if all the tools made available in the Visual Studio Express suite (SQL, Visual Web Developer et al) had been offered earlier, ASP might not have achieved mass user acceptance. A product like WordPress (which powers this site) could never have gained the same level of support if built in ASP.NET. ASP.NET gives protection to the source and is most conducive to building proprietary solutions. You could write a charting library, wrap it up as a DLL and put out a trial version to everyone. Since it runs on the web, enforcing licensing is much easier by having it ping a licensing server. Monetizing it is easy. In PHP this is not the case. The source is there for everyone to read and anyone can and probably will write a free version of any WordPress plugin that you would want to charge for today.

Further, the development ecosystem is scattered with tools and libraries attached to viral licenses that oblige developers to share any derivations arising from these tools. When the ecosystem is built on an open-source, share-alike license, it's difficult to build on the shoulders of the community without giving back to the community. The community has a built-in self-defense mechanism that ensures its survival and results in rapid growth. WordPress and Firefox have quite literally every plugin you could possibly ask for. Installing the plugins is trivial and hacking or extending them isn't too hard if you are a developer.

No comparable WordPress equivalent exists in the ASP.NET world, where an option is available to keep the source protected and, more often than not, the option gets exercised, locking out community engagement in turn. ASP.NET has typically been aimed toward the professional, not the amateur. It is now trying to find some middle ground to increase user adoption, but I doubt it will ever get there for the fundamental reasons mentioned here.

Costs

The monetary costs, as well as the cost of adoption, are both higher for ASP.NET in my experience. Hosting an ASP.NET application on a Windows Server requires higher licensing costs for the hosting provider, which in turn are transferred to the customer, resulting in more expensive hosting for ASP.NET applications.

The cost of learning for ASP.NET is also higher. Visual Studio .NET does provide all the plumbing and templates for starting off but the underlying plumbing is quite complex. The ASP.NET page life cycle is not trivial and if you aren't comfortable with a debugger you'll find it harder to experiment with it. PHP's scripting model, although crude, allows would-be developers to simply dive in. My first PHP page was as simple as saving the file with a .php extension and wrapping the code in php tags with a print statement. No includes, no headers or strong typing, no nothing. ASP.NET will require building out an entire solution if you follow the prescribed development route and though automatic, it still makes it harder to just dive in.

The lower cost of initial adoption results in a wider user base falling into the PHP funnel.

Communities

PHP has a feedback loop built around good karma. Some random people helped me overcome my hurdles with PHP and in turn I learned to empathize with other newcomers and offered my support and experience. I find that ASP.NET does not have the same level of community support, which scares some people away and doesn't provide the confidence of crowds. The network effects of a larger community also mean that there are more people along each stage of the learning curve. Whether you are just starting off or you are making the switch from PHP4 to PHP5 or you are trying to hack some low-level module, the likelihood of someone else tackling the same problem around the same time is much higher and this kind of peer support is a good motivator.

My Sentiments

I am a long-time PHP fan and love the fact that WordPress is a product of it. However, if I were to build a web service or website for my own purposes I would be more inclined to go with ASP.NET. Some of my old tics are still there and I prefer to do much of the underlying plumbing by hand as I find generic solutions are excessive. I also get a greater understanding and consequently feel more comfortable with the application.

Microsoft's partnership with Zend should be a win-win for both communities. Imagine loading your PHP solution in VS.NET and having auto-complete--writing WordPress plugins would become a breeze.