Posts filed under ‘web 2.0’

John Battelle revisits the Database of Intentions

An excellent read, both from Battelle himself and from the many comments, on how the Database of Intentions has evolved since the early years of the century, when Google was coming of age.

With the ‘eruption’ of Web 2.0 he now identifies four key fields in the database, each with its own signal, plus a fifth added later:

  • The query → “What I want”
  • The social graph → “Who I am” / “Who I know”
  • The update → “What I’m doing” / “What’s happening”
  • The check-in → “Where I am”
  • The purchase → “What I buy” (the fifth field, added later)

Battelle argues for a more catholic definition of search – an extension from web search to include these other signals.

Not so sure, myself. Whilst these signals, and many others, are vitally important to the evolution of the web, I’m inclined to agree with the commenter who suggests that the social graph, updates and check-ins are refinements of, or attributes of, intention rather than fields in themselves.

Also surprised there is no reference to semantics or the Semantic Web – aggregation, filtering and pull around a user’s intention or need.

Still, the key message for me is in his conclusion:

If you’re not viewing your job to be a curator, clarifier, interpreter, and amplifier of the Database of Intentions, you’re soon going to be out of business. The Database of Intentions is the fuel that drives media platforms, and as I’ve argued elsewhere, every business is now a media business.


8 March 2010 at 23:20

Wikia search launches

Wikia Search launched today. Nice simple interface; but, as the site says, “the results are pretty bad”. The concept, though, is that trusted feedback from a community of users acting together in an open, transparent, public way will improve results. This seems to be based on users rating results and creating mini articles that provide short definitions, disambiguations, photos and ‘see also’s.

Users get to see the Nutch relevancy score by clicking on the numerical score by each result.

7 January 2008 at 23:13

Google PSE or Google’s Semantic Web

A summary of an interesting Bear Stearns equity research paper (PDF) from May 2007.

Google is introducing a new layer to its search and indexing methodology. Google’s patent applications were published in February 2007 and call for a Programmable Search Engine (PSE).

PSE will augment its current PageRank algorithm and change the way in which relevance ranking occurs for some types of web pages.

Under the PSE, web page data will be more structured and webmasters will be able to communicate two-way with Google’s PSE.

Web pages will be indexed more effectively and web site owners will have the ability to instruct Google about what it can and cannot do with the web page’s content:

  1. Provide more granular detail on search results (think car inventory on a lot, not just the local dealer’s phone number)
  2. Provide more personal results (Google could customize search results to the individual user, based on their preferences and past behavior)
  3. Reduce spoofed results by spammers and SEOs
  4. Index password-protected information (sometimes called “deep web” or “invisible web” material), with permission (think of information behind a login on sites to which people subscribe)
  5. Index dynamic sites (sites that change based on what the user asks for – think of flight information on sites like Expedia)
  6. Do a much better job indexing non-text based information (think video or audio based content)
  7. Cross-integrate information from different web pages to provide more complete results to answer a question more completely
  8. Finally, Google would be able to leverage this newfound ability to provide more granular information to better target advertising, increasing advertisers’ ROI

PageRank no longer enough:

  1. Spammers/Black Hat ‘arms race’
  2. Can’t offer vertical search or deal with deep web/rich media
  3. Advertisers want more precision.

Key components of PSE:

  • Programmable – via XML to guide indexing – essentially Sitemaps
  • Partnerships – webmasters to become content partners, not anonymous sources of data. Onus will be on webmasters to conform to structured data format
  • Aggregate multiple data sources – across the web – will alter the playing field for web search, relevance and current advantage of vertical search engines
  • Targeted for users and advertisers – customised to the context (incl. device) of the user
  • PSE will learn and grow – as it accepts instructions from usage analysis, webmasters (Sitemaps etc), users and advertisers. As well as an ontology for all the data held in PSE, there will be a ‘database of databases’
  • Opening up Rich Media Web – users can use XML to specify info, output and formats they want
  • Barriers to competition – Competitors could emulate, but Google has scale in place
  • Semantic Web – PSE takes an important step towards Google delivering Semantic Web functionality. Instead of a flat keyword index, Google has a database of databases. With the original content plus the site’s metadata, data can be accessed and manipulated in more structured forms.
  • PSE does not replace PageRank
  • PSE will take instructions from XML files – so it’s a more open, two-way design, but Google ultimately defines the formats
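The ‘Programmable via XML’ component above maps onto what is already publicly documented as the Sitemaps protocol. As a rough illustration – noting that the PSE-specific instruction format itself was never published, so anything beyond the standard elements below would be speculation – a Sitemap file that tells Google’s crawler about a site’s pages looks like this:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Standard sitemap per the sitemaps.org 0.9 protocol -->
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <!-- Page location, last modification date, expected change
         frequency and relative crawl priority for this site -->
    <loc>http://www.example.com/inventory/used-cars</loc>
    <lastmod>2007-05-01</lastmod>
    <changefreq>daily</changefreq>
    <priority>0.8</priority>
  </url>
</urlset>
```

The Bear Stearns paper’s claim is essentially that PSE extends this one-way ‘here are my pages’ declaration into a two-way channel, with webmasters also supplying structured data and usage rules for each page.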

The five patents

  1. PSE – a layer on top of PageRank. Importance of context: Metrics of the rules and instructions – PSE will learn which are the ‘popular’ rules; Metadata at any level; Usage tracking will help PSE focus on a contextual subset of results – possible privacy concerns; generates metadata on pre- and post-processing operations – so can learn what users do after seeing results and push related content/ads
  2. Aggregating context data
  3. Sharing context data
  4. Detecting spam
  5. Generating ads

2 January 2008 at 23:22

But search will eventually change all websites

Matt Chapman, in Information World Review, 01 Jun 2007, summarises John Battelle’s conversation at HP’s Print 2.0 conference in New York.

Google is now the default interface for the web but will not always be the dominant point of access: “The web has an interface and I would argue that it is Google right now.”

“Search is that interface but it is not always going to be, just like it was not always DOS and it is not always going to be Windows.”

Battelle compared Google’s sparse homepage to the DOS command line that was used to get information from a computer.

“Where are we now in search? Well, the command line, but with a huge difference: we are not talking in the computer’s language, we are talking in our language,” he said. So search is facilitating a conversation.

Battelle argued that the way search treats its users would eventually have wide-reaching effects for all websites.

“Think about what search does. You come to a place (Google, MSN, Yahoo, whatever), you say something and the whole place reorganises around what you just said,” he explained.

“And this is an interface that we are getting so used to that we are going to start getting mad at businesses that do not do that for us.”

See the videos at HP Corporate TV.

9 June 2007 at 22:18

Web 2.0 as ‘content customisation and personalisation’

Mark Iremonger, head of digital at Proximity London, has an interesting article in May’s Revolution magazine. In Digital direct: You have to make your content useful, entertaining, or both (registration required) he makes the point that:

“…brands need to create content that consumers value to earn the right to a relationship. …This means creating content that is either useful or entertaining. … People currently see web 2.0 as synonymous with ‘community creation’, but it is ‘content customisation and personalisation’ that is really at the heart of it. We will see a growing ‘unbundling‘ of services that are traditionally anchored to web sites, setting them free to be available anywhere at anytime.”

5 May 2007 at 13:14

