
Archive for the ‘Content’ Category

Personalized Feeds (or more on Open APIs)

Friday, October 5th, 2007

I just read an interesting study on the problems with existing news RSS feeds from the University of Maryland’s International Center for Media and Public Relations. I think it is a great example of how users can’t depend on the organization that creates the content to provide access to that content in the form or format most useful to them, and of why the ability for users to create their own feeds is so valuable. To quote from the study:

“This study found that depending on what users want from a website, they may be very disappointed with that website’s RSS.  Many news consumers go online in the morning to check what happened in the world overnight—who just died, who’s just been indicted, who’s just been elected, how many have been killed in the latest war zone.  And for many of those consumers the quick top five news stories aggregated by Google or Yahoo! are all they want.  But later in the day some of those very same consumers will need to access more and different news for use in their work—they might be tracking news from a region or tracking news on a particular issue.

It is for that latter group of consumers that this RSS study will be most useful.  Essentially, the conclusion of the study is that if a user wants specific news on any subject from any of the 19 news outlets the research team looked at, he or she must still track the news down website by website.”

Bottom line, as long as we depend on publishers as both content providers and access providers, we as consumers of content won’t be able to get what we need in the way we need it - just like with APIs.  The only way to solve the problem is to allow users or some unaffiliated community to create the access to content (or API), as opposed to limiting that ability to only the publisher.  As web 2.0 paradigms catch on with the masses, turning more and more of us into prosumers, this will become more and more of an issue.  Publishers that try to control access will lose out to those that give users the ability to tailor the content to their own needs. Publishers need to understand that this benefits both them and the users.
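To make this concrete, here is a minimal sketch of the kind of feed a user could assemble today without any help from the publishers, assuming the Python feedparser library; the feed URLs and the keyword are placeholders I made up, not anything from the study:

    # Sketch: a personal, topic-filtered feed built by merging several source feeds.
    # Assumes the third-party feedparser library; URLs and keyword are illustrative only.
    import feedparser

    FEEDS = [
        "http://example.com/worldnews.rss",
        "http://example.org/business.rss",
    ]
    KEYWORD = "energy"  # the region or issue this particular user happens to be tracking

    def personalized_feed(feeds, keyword):
        matches = []
        for url in feeds:
            parsed = feedparser.parse(url)
            for entry in parsed.entries:
                text = (entry.get("title", "") + " " + entry.get("summary", "")).lower()
                if keyword.lower() in text:
                    matches.append((entry.get("published", ""), entry.get("title", ""), entry.get("link", "")))
        return sorted(matches, reverse=True)  # crude newest-first ordering by date string

    for published, title, link in personalized_feed(FEEDS, KEYWORD):
        print(published, title, link)

The point is not the code itself - it is that nothing in it requires the publisher’s cooperation beyond exposing a raw feed.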

I see signs that this is actually starting to happen (in a small way) with the NYTimes and WSJ both announcing personal portals for their users. The jump to personalized feeds isn’t that unthinkable…

ACAP Conference

Friday, June 29th, 2007

I attended the first annual ACAP conference this week (ACAP stands for Automated Content Access Protocol; their website is http://www.the-acap.org/). They are defining a standard mechanism that will allow publishers to communicate permissions information about content in a form that can be automatically recognized and interpreted by any automated process that wants to use the content.
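To give a flavor of what that could mean, here is a purely illustrative sketch of the per-resource permissions a publisher might want to express - this is my own invented notation, not actual ACAP syntax, which is still being worked out:

    # Invented notation for illustration only - NOT real ACAP directives.
    # The idea: state, per resource, what an automated agent may do with the content.
    resource: /articles/2007/06/lead-story.html
      allow: crawl, index
      allow: snippet (max 200 characters)
      deny:  cache, full-text-syndication
      remove-from-index-after: 2007-12-31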

Currently the only mechanism available to publishers is the Robots Exclusion Protocol (REP - some people just call it robots.txt, the file placed in the root directory of a website that defines which files should or shouldn’t be visited by crawlers). There is also the Robots META tag, which allows HTML authors to tell visiting robots whether a specific document may be indexed or used to harvest more links.
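For reference, this is roughly the full extent of the control those two mechanisms offer. Path-level exclusions in a robots.txt file (the paths here are just examples):

    # robots.txt - lives at the root of the site, path-level exclusions only
    User-agent: *
    Disallow: /drafts/
    Disallow: /internal/

And the page-level equivalent:

    <!-- Robots META tag - per page, controls indexing and link harvesting -->
    <meta name="robots" content="noindex, nofollow">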

These mechanisms don’t give publishers very much control over how their content is used, and they would like much finer grained control - both at the content level (not just files) and at the usage level (how the content can be used). A while back Yahoo announced a “robots-nocontent” tag that allows marking certain sections of a page as unrelated to the main content, so that they are ignored by search engines. It is an interesting addition to REP, but it really doesn’t do much to address publishers’ concerns - it is focused on control over indexing, not delivery (e.g. caching, summarization), and it hasn’t been picked up by the other search engines.
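As I understand it, Yahoo’s mechanism is just a class value you add to the markup you want their crawler (Slurp) to skip when indexing - the surrounding markup here is my own example:

    <!-- Yahoo's robots-nocontent: mark sections Slurp should ignore when indexing -->
    <div class="robots-nocontent">
      Navigation, ads and other boilerplate the publisher does not want indexed.
    </div>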

So what happened at the conference? Attendance was good, lots of people from various publishers - heavily European, but not exclusively. Google, Yahoo and MSN sent representatives, but it was clear that this was a conference led by publishers. There was a certain amount of “Google/Yahoo/MSN” bashing going on, mostly because they haven’t been willing to sign up as members of ACAP.

It is clear that the current lack of control that publishers have over their content on the Internet is a problem, and that it is preventing them from putting more of their content online (books, periodicals etc). It is also clear that the two groups (search engines and publishers) have very different mind-sets about what is important. Publishers want to control all aspects of delivery and use of their content, since they make money from that content. They worry that aggregators give away the baby with the bathwater, and that by creating their own summaries and caching pages, search engines are lowering the value of the actual content. Search engines want access to as much content as possible, so they can index it, summarize it and make everything as easy to find as possible on the web. They make money from people accessing this aggregated information.

Then of course there is the whole issue of copyright - the publishers’ whole business model is based on the fact that owners of content have a prescribed set of rights that give them control over their content, its delivery and its usage. Search engines believe that copyright is a bit antiquated, and that by allowing people to find content they are doing the publishers a service. I think that they are both right - but they come from such different worlds and mind-sets that they have trouble understanding each other, or addressing each other’s concerns. They need to understand that they are in a symbiotic relationship and need to figure out how to work together.

It seems to me that they may be worried about “yesterday’s” technology - with the advent of tools for extracting parts of a page, ad-enabled syndication and mashups, the issues will become even thornier, and any solution will need to take those emerging capabilities into account too. ACAP could become very influential as a forum for the creators of these technologies and publishers to sit down, understand each other and work out solutions to these issues. If ACAP can make that happen, then they will have real influence over how content is delivered on the web.