The Psychotic State
Number 2
September 10, 1998

Recently Kyle Shannon and I had a fairly long discussion about what the tech department can do to raise our profile and serve the company better. One thing that we can enhance is what I might call "evangelical fervor" - selling ourselves into the rest of the company. They need to know what we can do, and what we see that is.

I decided to use this note to discuss some software products that IMHO we should be talking to the rest of the company about. By no means do I mean to imply that these are the coolest new things out there - in fact I'd love to see suggestions for other things we should be talking to the company about. Forgive me if these are too easy.

Cool Ideas To Think About

LDAP: Okay, this one isn't so new. LDAP stands for "Lightweight Directory Access Protocol" and it has been around since about 1995. LDAP is a sort of hierarchical database (not a relational database such as Oracle, Sybase, MSSQL, etc.) that is generally used for storing directories of users. It is optimized for the usual situation of a user directory: (a) the things being stored are mostly strings and (b) the items in the directory are looked up far more often than they are modified or inserted.

An entry in an LDAP directory looks something like this:

    CN=Mitchell Golden, O=Agency.com, C=US

This is a hierarchical name for an entry. The "Common Name" of the entry is "Mitchell Golden", the "Organization" is "Agency.com", and the "Country" is "US". This entry can have attributes attached to it, and each attribute can have one or more values. For example, we can have an "email" attribute attached to this entry, and the email attribute can have the values "mgolden@agency.com" and "mitch@spiralmedia.com".

What's the big deal? Can't we do this sort of thing with an ordinary relational database? Well, one could say that it's not that big a deal, in the sense that there are certainly other ways to do the same sorts of things.
And yes, one could certainly implement all of this in a SQL-based database. But there are a few reasons that LDAP is catching on.

(*) LDAP is a server-based protocol. This means that there can be _one_ LDAP database in the company, and all this directory data is centralized. The protocol is very nicely set up so that multiple servers can replicate data back and forth and stay synchronized (and yes, SQL databases can do this too).

(*) LDAP has a very nice, simple API, which allows many, many applications to connect to the LDAP server and use it for authentication or other purposes. This means that LDAP is singularly suited for an intranet, and indeed Agency.com is using it for ours.

(*) LDAP is fast and can handle huge numbers of entries.

(*) LDAP and its API are open standards (unlike the SQL protocols). The original implementation was written at the University of Michigan, and Netscape has one too.

So what could we do with it? Suppose a company is using LDAP to control email access, or access to their desktop machines. If we know that this is happening, we can use the same directory to control access to their intranet, and thereby eliminate the need to set up a separate authentication system for the intranet. We can build a system that requires a login to reach some pages on the site - and that login can be the same username that the person uses to fetch his email, log into the desktop, gain access to printers around the company, etc.

Actually we _have_ built a system that sits on top of LDAP - it's the virtually complete Agency.com intranet, which Tony Ward has been directing.

Interestingly, there is a company called Netegrity (why do all these companies have such silly, interchangeable names?) that builds a very rich, complex website administration and authorization tool. We got an evaluation copy recently, and Chris Stetson is looking at it. For you former-OM folks, it's sort of Sidney-on-steroids.
Sidney was OM's attempt to build a site authentication and tracking system that allowed different users to have access to different parts of a website. Netegrity's SiteMinder is the same, but it is built on top of LDAP, not Index Plus (which is an odd database I don't know too much about. Nor does anyone else in the world.)

As more of the sites we work on are intranets, or even public-facing sites with registration or personalization, I expect that we will be running into LDAP more and more often.

Automated Collaborative Filtering: Another old/new idea I'd like us to think about again. ACF is an idea that came out of MIT, and has launched at least two companies: Firefly and Net Perceptions. Most of you have probably heard of this - it is the engine for the recommendations made by the Amazon.com site. Many folks have raved about Amazon's recommendation engine. It literally puts books the visitor is likely to use right on the home page - and certainly it has helped them move merchandise.

A few months ago Ralph Seaman launched our first site that made use of this technology - Billboard Talent Net. Yet somehow it has always seemed to me that we have missed opportunities to use this software on various other sites.

The idea of ACF can be illustrated by considering Amazon.com. The site's goal is to sell books. In the process of selling, it asks people to rate books somehow. There are various ways to do this: it can note which ads for which books on the site the visitor looks at, it can note which books the person buys, or it can flat out ask what the visitor thinks of certain books. After a while the site will have ratings for a group of people. Some books are rated high, some low.

Now consider two people, A and B. There are some books that A and B have both rated. We can ask the question: on the set of books that both A and B have rated, are they "close"? That is, how well do A's ratings correlate with B's?
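
To make the "closeness" question concrete, here is a minimal sketch in Python. It assumes that ratings are numbers on a 1-to-5 scale and that "close" means a high Pearson correlation over the books both people have rated. The function names, the thresholds and the sample ratings are all made up for illustration; this is not how Firefly, Net Perceptions or Amazon actually do it.

```python
from math import sqrt


def similarity(ratings_a, ratings_b):
    """Pearson correlation of two people's ratings over the books both rated.

    Returns 0.0 when there is too little overlap (or no variation) to judge.
    """
    shared = sorted(set(ratings_a) & set(ratings_b))
    if len(shared) < 2:
        return 0.0
    xs = [ratings_a[book] for book in shared]
    ys = [ratings_b[book] for book in shared]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def recommend(ratings_a, ratings_b, min_similarity=0.5, min_rating=4):
    """Books A rated highly that B has not rated, offered only if A and B are close.

    min_similarity and min_rating are arbitrary illustrative thresholds.
    """
    if similarity(ratings_a, ratings_b) < min_similarity:
        return []
    return sorted(
        book
        for book, rating in ratings_a.items()
        if book not in ratings_b and rating >= min_rating
    )


# Hypothetical ratings on a 1-5 scale.
person_a = {"Dune": 5, "Emma": 2, "Ulysses": 1, "Neuromancer": 5, "Snow Crash": 4}
person_b = {"Dune": 4, "Emma": 1, "Ulysses": 2}

print(similarity(person_a, person_b))
print(recommend(person_a, person_b))
```

A real engine would compare B against many people at once and weight each person's opinion by how close they are, but the overlap-and-correlate step is the heart of it.
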
Suppose that A and B are "close", so that A and B tend to rate similarly on the overlap set. Then we can probably deduce that on the set of books that A has rated and B hasn't, B's ratings will be similar to A's. This allows us to recommend books to B - books that A says she liked but B hasn't seen.

What I'd like to think about is how to go beyond this model. There must be many other things we can do with this sort of software. We need not be locked into the notion that we can only use these techniques to sell goods. (Even the vendors themselves often seem to think about their products this narrowly.) Any time we want to know the answer to the question "Who is this visitor like?" we can use ACF. We don't need to be selling them anything, at least not at the moment.

We could, for example, think of the recommendation engine as a way to improve site navigation generally. If you know that two people are alike, you should put links to what A likes on B's home page. But there are even wilder ideas. Suppose we determine that B is like A, and A is known to have difficulty reading English. Then we could switch B to a version of the site written in simpler terms, or without slang.

As we move away from "one size fits all" sites, we should always keep the ACF arrow in our quiver, more than I think we do sometimes.

SMIL: Synchronized Multimedia Integration Language (pronounced "smile") is the latest thing to come out of the W3C. I can't tell if it's the wave of the future or not, but it's certainly interesting. Simply put, SMIL is to multimedia what HTML was to text. That is, it allows formatting and linking of multimedia elements.

Here's a sample SMIL file:

The <par> and <seq> tags indicate parallel and sequential elements, while the rest indicate the inclusion of text, audio and video. Naturally, you can hyperlink these documents with <a> tags. The hyperlinks are like image maps in HTML, except that they are extended to TIME as well.
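
The idea of a hyperlink that depends on time can be sketched in a few lines of Python. This is not SMIL syntax and not any player's API; the TimedLink type, the resolve_click function and the sample time windows are hypothetical, made up only to show how the moment of a click selects the target.

```python
from typing import NamedTuple, Optional, Sequence


class TimedLink(NamedTuple):
    """A link that is only active during part of the presentation (hypothetical type)."""
    begin: float  # seconds from the start of the presentation, inclusive
    end: float    # seconds from the start of the presentation, exclusive
    href: str


def resolve_click(links: Sequence[TimedLink], t: float) -> Optional[str]:
    """Return the target of the link active at time t, or None if no link is active."""
    for link in links:
        if link.begin <= t < link.end:
            return link.href
    return None


# Illustrative values: one target for the first 5 seconds, another for the next 15.
links = [
    TimedLink(0, 5, "A.HTML"),
    TimedLink(5, 20, "B.HTML"),
]

print(resolve_click(links, 2))
print(resolve_click(links, 12))
print(resolve_click(links, 30))
```

An image map asks "where did the user click?"; a timed link adds the question "and when?".
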
For example, you can say, "If you click here in the first 5 seconds you will link to A.HTML; in the next 15, to B.HTML."

Without going too much into it (and the spec for this is on the W3C's site) you can see that this is not as powerful as Director, for example, but it is built on top of HTML 4.0 with CSS and so it is likely to be cross-platform. (Except that Microsoft hasn't signed on to the spec, and seems to be going its own way in these matters. Aren't we tired of the browser wars?) Among other things, it lacks a good programming environment for responding in complex ways to user input. You have to write code in JavaScript or Java.

And there are other things coming. On top of SMIL, the TV broadcasters are building something called BHTML, Broadcast HTML. It's all happening very fast and it's tough to keep track of.

Expect that in just a few years, a huge amount of what we know will be obsolete! That means our clients will be looking to us for knowledge of things we've never done so far. I for one find the prospect exciting.