
Archive for the ‘Mashup’ Category

Online Ad Targeting: Fine Grain Targeting vs. Coarse Grain Delivery

Monday, July 21st, 2008

Behavioral ad targeting has been getting a lot of attention in the press lately, especially around the US Congress’ interest in the technology. In general, ad targeting of various shapes and forms is also becoming a busy space for startups - various types of targeting technologies trying to understand the user’s intent and provide them with an appropriate advertisement.

When I looked a bit closer into what most of these ad targeting companies do, it turns out that after they have used whatever mechanism (contextual, behavioral, demographic, psychographic etc.) to decide who you are and what interests you, they translate this into one of a very small number of consumer segments, pick an ad for that segment and display it to you. So all that fancy computation upfront to deliver a canned ad. Seems kind of a waste. Wouldn’t it make more sense to create a personal, data-driven ad (from one or more advertisers) to leverage that information? That would be the “holy grail” of true 1-1 personalized advertising.
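To make the contrast concrete, here is a rough sketch (all names and data are made up, not any real ad network’s code) of the two delivery models - collapsing rich signals into a handful of segments with canned creatives, versus assembling a data-driven ad per user:

```python
# Hypothetical sketch contrasting the two delivery models.

# Coarse-grain delivery: rich signals are collapsed into one of a few segments,
# and every user in a segment sees the same canned creative.
SEGMENT_CREATIVES = {
    "auto_intender": "Generic car ad",
    "other": "Run-of-network ad",
}

def coarse_grain_ad(user_profile: dict) -> str:
    segment = "auto_intender" if "cars" in user_profile.get("interests", []) else "other"
    return SEGMENT_CREATIVES[segment]

# Fine-grain (1-1) delivery: the same signals drive a creative assembled per user,
# possibly drawing on more than one advertiser's catalog.
def personalized_ad(user_profile: dict, catalog: list[dict]) -> str:
    interests = set(user_profile.get("interests", []))
    matches = [item for item in catalog if interests & set(item["tags"])]
    best = max(matches, key=lambda item: len(interests & set(item["tags"])), default=None)
    if best is None:
        return coarse_grain_ad(user_profile)
    return f"Ad for {best['product']} in {user_profile.get('city', 'your area')}"

if __name__ == "__main__":
    profile = {"interests": ["cars", "hybrids"], "city": "Boston"}
    catalog = [{"product": "Hybrid sedan", "tags": ["cars", "hybrids"]},
               {"product": "Beach resort", "tags": ["travel"]}]
    print(coarse_grain_ad(profile))           # same ad for every "auto_intender"
    print(personalized_ad(profile, catalog))  # assembled from this user's own signals
```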

This growing impedance mismatch between the fine-grain targeting ability of ad networks and the coarse-grain delivery capability of advertisers is going to “short-circuit” the ability of these targeting technologies to show their full potential.

Data Integration and Mashups

Saturday, November 10th, 2007

I am attending Mashup Camp and Mashup University here in Dublin (the weather reminds me of a poem that a friend of mine wrote about Boston in February - gray, gray, gray, Gray!). IBM was here in force at Mashup University, giving three good presentations (along with live demos) on their mashup stack. They were saying that the products based on this stack should be coming out early next year (we’ll see, since from my experience it can be very difficult to get a new product out in an emerging area at IBM - since you can’t prove that the space\product is valuable enough). They have decided to pull together a whole stack for the enterprise mashup space (the content management layer, the mashup layer and the presentation layer - see my previous post on mashup layers).

One thing that struck me, especially when listening to the IBM QEDwiki and Mashup Hub presentations, is how much this upcoming set of tools for enterprise mashup creation is starting to resemble “traditional” enterprise data integration tools (e.g. Informatica and IBM\Ascential). These new tools allow easy extraction from various data sources (including legacy data like CICS, web data and DBs), and easy wiring of data flows between operator nodes (sort of a bus concept). The end result isn’t a DB load as with ETL, but rather a web page to display. No real cleansing capability yet, but my guess is that will be coming as just another web service that can be called as a node in the flow. So mashups are like the lightweight cousin of ETL - for display rather than bulk load purposes. It will be interesting to follow how ETL tooling and mashup tooling come together at IBM, especially since both the ETL and mashup tools are part of the Data Integration group at IBM.
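A rough sketch of what I mean (hypothetical sources and nodes, not any vendor’s actual product) - the flow looks just like ETL, except the final node renders a page instead of loading a database:

```python
# Sketch of a mashup flow as "lightweight ETL" (all sources and nodes hypothetical):
# extract -> transform -> render a web page, instead of extract -> transform -> bulk load.

def extract_feed():
    # Stand-in for an extraction node (an RSS feed, a DB query, or a legacy CICS screen).
    return [{"customer": "Acme", "orders": 12}, {"customer": "Globex", "orders": 7}]

def extract_web():
    # Stand-in for a web-data extraction node.
    return {"Acme": "Boston", "Globex": "Dublin"}

def join_node(rows, locations):
    # An operator node wired between the two extractors (the "bus" concept).
    return [dict(row, city=locations.get(row["customer"], "unknown")) for row in rows]

def render_node(rows):
    # The "load" step of this flow is a page to display, not a DB load.
    items = "".join(f"<li>{r['customer']} ({r['city']}): {r['orders']} orders</li>" for r in rows)
    return f"<html><body><ul>{items}</ul></body></html>"

if __name__ == "__main__":
    print(render_node(join_node(extract_feed(), extract_web())))
```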

Microsoft seems to be taking another route: a more lightweight, desktop-like approach, focused on the presentation layer. Popfly is a tool that also allows you to wire together data extraction (only web data as far as I could tell, though it could be extended to other data types) and manipulation nodes - as you link the nodes, the output of one node becomes the input of the next, and so on. It seemed very presentation oriented, and I didn’t see any Yahoo! Pipes-like functionality or legacy extraction capability.

Serena is presenting tomorrow; it will be interesting to see what direction they have taken.

Open APIs

Tuesday, September 25th, 2007

Kudos to Google (soon) and Facebook (already) for offering open APIs, empowering the development community to create interesting (and hopefully profitable) applications based on those APIs. Opening the APIs allows the developer community to develop interesting applications and enrich everyone’s user experience. However, there is a basic limitation to the current notion of an open API (unless it is an open source project) - the owner of the API gets to decide for the developers what is opened (i.e. what programmatic access is allowed) and what remains unavailable. Sometimes limitations are created on purpose - limiting what developers have access to for business, security or other reasons. It is clear the owner has the right to limit usage to protect their rights - but limiting access will just stifle creativity - especially if the APIs are too limiting. Also, in many cases the limitations are artificial - the owner just hasn’t had time to develop all the possible APIs, or hasn’t thought through all the use cases (if that is even possible), leading to a limitation that stops somebody from building some really useful new application.
The only way to get around this is to allow the developers to create APIs themselves, or to make it possible for anyone to extend and change the APIs and submit them back to the community - not be reliant on the owners to develop them. This would lead to a rich, evolving set of APIs maintained by the developer community. Until then - open APIs will never be truly open.
And about the owner’s rights - my guess is that this will need to be handled contractually rather than programmatically.
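As a purely hypothetical illustration (none of these calls are any real vendor’s API), a community-maintained wrapper could compose the calls the owner does expose into a new call the owner never shipped:

```python
# Hypothetical sketch only: a community wrapper composing two "official" calls
# (names invented) into a derived call that the API owner never exposed.

class OfficialClient:
    """Stand-in for the vendor's limited, officially supported API."""

    def get_friends(self, user_id: str) -> list[str]:
        return {"alice": ["bob", "carol"], "bob": ["alice"], "carol": ["alice"]}.get(user_id, [])

    def get_profile(self, user_id: str) -> dict:
        return {"id": user_id, "city": "Dublin"}

class CommunityClient(OfficialClient):
    """Community-maintained extension built entirely on top of the official calls."""

    def get_friends_in_city(self, user_id: str, city: str) -> list[dict]:
        # A "missing" API call, derived from the calls the owner does allow.
        profiles = (self.get_profile(f) for f in self.get_friends(user_id))
        return [p for p in profiles if p["city"] == city]

if __name__ == "__main__":
    print(CommunityClient().get_friends_in_city("alice", "Dublin"))
```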

Vertical Mashup Platforms

Wednesday, September 12th, 2007

Gartner just put out a report on “Who’s Who in Enterprise Mashup Technologies” which contains all of the usual enterprise platform companies and all the usual web mashup players. They gave some good, though standard, advice that you should understand the problem before you choose the technology (duh?) - but I thought it was interesting that they didn’t try to define a best practices architecture, or give some guidance on how to combine technologies or choose between them (see my post below).

One thing that was clear is that all of the Mashup Platforms are trying to be generic - to allow users to build any type of mashup application. As always, being generic means being more abstract - and making it harder for people to easily build a mashup for a specific domain or vertical. This isn’t unusual for platform builders, since by building a generic tool they can capture the broadest audience of users. But I think that they might be making a mistake with respect to Mashup Platforms - the whole idea is to make it easy for anyone to build “situational applications” that solve a specific need for information quickly, and that can be used by non-developers. For me, that means that platforms will have to be tailored to the domain of the user.

I am expecting that in the next wave of Mashup Platforms we’ll start seeing vertically oriented mashup platforms that make it even easier to build a mashup for a specific vertical - from standard verticals like Finance to more consumer-oriented verticals like advertising.
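Here is one hedged sketch of what “vertical” could mean in practice (all names are invented) - a thin finance-flavored layer over generic mashup primitives, so the end user works with tickers and watchlists rather than feeds and joins:

```python
# Hypothetical sketch: a vertical (finance-flavored) layer over generic mashup
# primitives, so a non-developer works in domain terms, not in feeds and joins.

def fetch_rows(source: str) -> list[dict]:
    # Generic primitive: pull rows from some feed/DB/web source (stubbed here).
    sample = {
        "quotes": [{"ticker": "IBM", "price": 120.5}, {"ticker": "MSFT", "price": 29.1}],
        "news": [{"ticker": "IBM", "headline": "IBM ships mashup stack"}],
    }
    return sample.get(source, [])

def join(left: list[dict], right: list[dict], key: str) -> list[dict]:
    # Generic primitive: join two row sets on a shared key.
    index = {row[key]: row for row in right}
    return [dict(row, **index.get(row[key], {})) for row in left]

class FinanceMashup:
    """The vertical layer: domain vocabulary wrapped around the generic primitives."""

    def watchlist_dashboard(self, tickers: list[str]) -> list[dict]:
        rows = join(fetch_rows("quotes"), fetch_rows("news"), key="ticker")
        return [row for row in rows if row["ticker"] in tickers]

if __name__ == "__main__":
    print(FinanceMashup().watchlist_dashboard(["IBM"]))
```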

Walled Gardens on the Web (and elsewhere)

Wednesday, July 25th, 2007

Facebook has been getting a lot of press lately - one discussion item that caught my eye was a number of blogs and discussions around whether Facebook can thrive as a “walled garden” - a closed or exclusive set of information services provided for users (see the Wikipedia entry).

The main issues raised were the viability of a walled garden on the internet and the pluses and minuses of walled gardens - both for the provider and for the consumer (you can find an interesting discussion at http://www.micropersuasion.com/2007/06/walled-gardens-.html). Most of the examples talk about AOL and how it failed as a walled garden, as did cellular providers that tried to limit WAP access to only certain sites.

I am not sure I actually understand the point - since the whole internet is just sets of walled gardens - how many websites let you use their information freely? Very few have comprehensive (or any) APIs; more have feeds that give you limited access to the information actually available. So how is Facebook any different?

One key difference is that, as opposed to most sites, Facebook has collected your own, personal information (or that of your friends). People want to be able to do with their own information whatever they please. So I think the right analogy isn’t the AOL walled garden approach, but rather something even more “ancient” - the client-server revolution of the 80’s. For years after GUIs and PCs were available, it was still very hard to get your own organizational information out of various legacy systems to use in new applications. Even though the information was yours, you couldn’t get at it to use as you like - either because the vendors couldn’t keep pace with the emerging technologies, or didn’t want to (so they could keep it “hostage”). This gave rise to an imperfect, but usable, technical solution that let people get at their information even though the system didn’t have the capability - a whole new set of “screen scraping” technologies that emulated users to get the desired information out of applications.
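For those who haven’t seen it, screen scraping on the web boils down to fetching the page a user would see and pulling the data back out of the markup - a minimal sketch (hypothetical URL and page structure), using nothing but the Python standard library:

```python
# Minimal screen-scraping sketch (hypothetical URL and markup): emulate a user's
# page view and pull the data back out of the HTML.
from html.parser import HTMLParser
from urllib.request import urlopen

class FriendListParser(HTMLParser):
    """Collects the text of <li class="friend"> elements from a profile page."""

    def __init__(self):
        super().__init__()
        self.in_friend = False
        self.friends = []

    def handle_starttag(self, tag, attrs):
        if tag == "li" and ("class", "friend") in attrs:
            self.in_friend = True

    def handle_data(self, data):
        if self.in_friend and data.strip():
            self.friends.append(data.strip())
            self.in_friend = False

def scrape_friends(url: str) -> list[str]:
    # Fetch the page exactly as a browser would, then parse the data out of it.
    parser = FriendListParser()
    parser.feed(urlopen(url).read().decode("utf-8"))
    return parser.friends

if __name__ == "__main__":
    # Offline demo against a sample snippet of the hypothetical markup.
    demo = FriendListParser()
    demo.feed('<ul><li class="friend">Alice</li><li class="friend">Bob</li></ul>')
    print(demo.friends)  # ['Alice', 'Bob']
```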

So I think that the same will happen here - either the walled gardens will open up, or people will figure out how to get at their information some other way.

Mashups and Situational Apps

Saturday, July 7th, 2007

Mashups are intended both for prosumers (a new term that I first heard from Clare Hart at the “Buying & Selling eContent” conference) - high-end consumers and creators of content - and for scripters (my own term, since I am not sure what exactly to call these high-end users - for example, the departmental Excel gurus that create and manage departmental Excel scripts and templates).

The search for tools that empower these domain experts to create applications without programming has been around since at least the 80s (i.e. 4th generation programming languages) - which led to various new forms of application creation - but the only one that has really evolved into a “general use” corporate tool for non-programmers has been Excel (though not really a 4GL). The reasoning behind those tools was that putting the power to create applications into the hands of the domain expert would give you better applications, faster. One new evolution of these types of tools is Domain Specific Languages (DSLs), which make programming easier by focusing on a specific domain and building languages that are tailored to that domain.

So much for the history lesson - but what does that have to do with Mashups and Situational Apps? Well, they both focus on pulling together different data sources and combining them in new ways in order to discover new insights. Mashups seem to be the preferred web term; Situational Apps is a term coined by IBM for the same type of application in a corporate setting.

These types of applications (and application builders) have a lot in common:

1. They all start from a data feed of some sort, either RSS or XML (see the sketch after this list).

2. They focus on ease of use over robustness.

3. They allow users to create applications easily to solve short-term problems.
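A minimal sketch of point 1 (with made-up feed content) - a situational app that starts from an RSS feed and filters it down to the items someone actually cares about:

```python
# Minimal sketch (hypothetical feed content): a situational app that starts from
# an RSS feed and filters it down to the items of interest.
import xml.etree.ElementTree as ET

SAMPLE_RSS = """<rss><channel>
  <item><title>Office move update</title><link>http://example.com/1</link></item>
  <item><title>Q3 travel policy</title><link>http://example.com/2</link></item>
</channel></rss>"""

def items_matching(rss_text: str, keyword: str) -> list[dict]:
    # Parse the feed and keep only the items whose title mentions the keyword.
    root = ET.fromstring(rss_text)
    items = [{"title": i.findtext("title"), "link": i.findtext("link")}
             for i in root.iter("item")]
    return [i for i in items if keyword.lower() in i["title"].lower()]

if __name__ == "__main__":
    print(items_matching(SAMPLE_RSS, "travel"))
```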

Many of these tools are experimental and in the Alpha or Beta stage, or are research projects of one type or another (QEDWiki, Microsoft Popfly, Yahoo Pipes, Intel MashMaker, Google Mashup Editor). As these tools start maturing, I think we will see a layered architecture emerging, especially for the corporate versions of these tools. Here is how I see the corporate architecture layers evolving (see the chart below):

Mashup Layers

I think the layers are pretty self-explanatory, except for the top-most Universal Feed Layer, which is simply an easy way to use the new “mashup” data in other ways (e.g. other mashups, mobile).
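To make the Universal Feed Layer idea concrete, here is a small sketch (hypothetical data) of re-publishing a mashup’s output as RSS so that other mashups or a mobile client can consume it:

```python
# Sketch (hypothetical data) of the Universal Feed Layer idea: take the rows a
# mashup produced and re-publish them as RSS for reuse by other mashups or mobile.
import xml.etree.ElementTree as ET

def rows_to_rss(title: str, rows: list[dict]) -> str:
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = title
    for row in rows:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = row["title"]
        ET.SubElement(item, "link").text = row["link"]
    return ET.tostring(rss, encoding="unicode")

if __name__ == "__main__":
    mashup_output = [{"title": "Acme: 12 open orders", "link": "http://example.com/acme"}]
    print(rows_to_rss("Orders mashup", mashup_output))
```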

If you look at the stack there are players in all layers (though most of the mashup tools I mentioned above are in the presentation and mashup layers), and the stack as a whole competes very nicely with a lot of current corporate portal tools - but with a much nicer user experience - one that users are already familiar with from the web.

One important issue that is sometimes overlooked is that mashups require feeds - and even though the number of web feeds is growing, there is still a huge lack of appropriate feeds. Since most mashup makers rely on existing feeds, they have a problem when a required feed is not available. Even if the number of available feeds explodes exponentially, there is no way for the site provider to know how people would like to use the feeds - so for mashups to take off, the creation of appropriate filtered feeds is going to take on new importance, and the creation of these feeds is going to be a huge niche. Currently “Dapper” is the only tool that fills all the needs of the “universal feed layer” - site independence, web based, and an easy to use, intuitive interface for prosumers and scripters.

ACAP Conference

Friday, June 29th, 2007

I attended the first annual ACAP conference this week (ACAP stands for Automated Content Access Protocol; their website is http://www.the-acap.org/). They are defining a standard mechanism that will allow publishers to communicate permissions information about content that can be automatically recognized and interpreted by any automated process that would like to use the content.

Currently the only mechanism available to publishers is the Robots Exclusion Protocol (REP; some people just call it robots.txt, after the file placed in the root directory of a website that defines which files should or shouldn’t be visited by crawlers). There is also the Robots META tag, which allows HTML authors to indicate to visiting robots whether a specific document may be indexed or used to harvest more links.
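REP is simple enough that the Python standard library can evaluate it - a small sketch (the rules below are hypothetical) of the check a compliant crawler makes before fetching a URL:

```python
# Small sketch of what REP gives a publisher: a compliant crawler checks the
# robots.txt rules (hypothetical rules below) before fetching a URL.
# The Robots META tag works at the page level instead, e.g.
# <meta name="robots" content="noindex, nofollow">.
import urllib.robotparser

ROBOTS_TXT = """User-agent: *
Disallow: /private/
"""

parser = urllib.robotparser.RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

print(parser.can_fetch("ExampleBot", "http://example.com/public/page.html"))     # True
print(parser.can_fetch("ExampleBot", "http://example.com/private/report.html"))  # False
```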

These mechanisms don’t give publishers very much control over how their content is used, and they would like much finer grained control - both at the content level (not just files) and at the usage level (how the content can be used). A while back Yahoo announced a “robots-nocontent” tag that allows marking certain sections of a page as unrelated to its main content, so that they should be ignored by search engines. It is an interesting addition to REP, but it really doesn’t do much to take publishers’ concerns into account - since it is focused on control over indexing, not delivery (e.g. caching, summarization), and it hasn’t been picked up by other search engines.

So what happened at the conference? Attendance was good, lots of people from various publishers - heavily European, but not exclusively. Google, Yahoo and MSN sent representatives, but it was clear that this was a conference led by publishers. There was a certain amount of “Google\Yahoo\MSN” bashing going on, mostly because they haven’t been willing to sign up as members of ACAP.

It is clear that the current lack of control that publishers have over their content on the Internet is a problem and it is preventing them from putting more of their content online (books, periodicals etc). It is also clear that the two groups (search-engines and publishers) have very different mind-sets about what is important. Publishers want to control all aspects of delivery and use of their content, since they make money from that content. They worry that aggregators give away the baby with the bathwater, and by creating their own summaries and caching pages search engines are lowering the value of the actual content. Search engines want access to as much content as possible, index it, summarize it and make things as easy to find as possible on the web. They make money from people accessing this aggregated information.

Then of course there is the whole issue of copyright - the publishers’ whole business model is based on the fact that owners of content have a prescribed set of rights that allow them to control their content, its delivery and its usage. Search engines believe that copyright is a bit antiquated, and that by allowing people to find content, they are doing the publishers a service. I think that they are both right - but they come from such different worlds and mind-sets that they have trouble understanding each other, or addressing each other’s concerns. They need to understand that they are in a symbiotic relationship and need to figure out how to work together.

It seems to me that they may be worried about “yesterday’s” technology - with the advent of tools for extracting parts of a page, ad-enabled syndication and mashups, the issues will become even thornier, and any solution will need to take those emerging capabilities into account too. ACAP could become very influential as a forum for creators of these technologies and publishers to sit down, understand each other and work out solutions to the issues. If ACAP can make that happen, then they will have real influence over how content is delivered on the web.