Introducing Chipwrapper - a UK newspaper search engine

 by Martin Belam, 3 September 2007

This week I'm launching a new web project - Chipwrapper.

Chipwrapper

Chipwrapper is a hub for a set of tools to search news content from the UK's major newspapers and TV news sources.

It was originally intended just to cover the printed press, but I found it odd to be searching for news from the UK and to not be seeing content from the BBC or Sky. Adding the TV news giants does rather spoil the chipwrapper metaphor, of course, but I think it makes for a better and more useful service overall.

Search engine

The main part of the service is a Google Custom Search Engine. This allows you to search Google's index, but only see results from the major newspapers and news sources in the UK. At the moment the search engine will return material from:

  • BBC News
  • BBC Sport
  • Daily Express
  • Daily Mail
  • Daily Mirror
  • Daily Star
  • The Financial Times
  • The Guardian
  • The Independent
  • ITN
  • News Of The World
  • The People
  • The Scotsman
  • Sky News
  • Sky Sports
  • The Sun
  • The Telegraph
  • The Times

I'm hoping to expand this shortly to include some of the UK's larger regional and national titles.

Chipwrapper browser tools

As well as being able to search from the Chipwrapper homepage, I've also made available a couple of browser tools so that you can integrate Chipwrapper search into your web experience.

There is a Chipwrapper OpenSearch plugin, which will allow you to add Chipwrapper search to Internet Explorer 7, and to Firefox version 2 and above.

Chipwrapper Browser Plugin  [Add this search to your browser]

I've also made a Chipwrapper custom button for the Google Toolbar. This works across all of the platforms that the Google Toolbar supports, and allows you to search Chipwrapper direct from your toolbar. It also includes a drop-down menu which will bring you the latest UK newspaper and news headlines via the wonder of RSS.

Chipwrapper custom Google Toolbar button  [Add this button to your Google Toolbar]

Headline aggregator

You'll also find a headline aggregator on the Chipwrapper homepage.

This brings the current top online story from the BBC, Daily Express, Daily Mail, Daily Mirror, Financial Times, Guardian, Independent, ITN, Sky News, The Sun, The Telegraph and The Times all together in one place.

Chipwrapper headlines

Each of the headlines is credited to the newspaper or source it comes from, and links through to their original story.

On Sundays the mix of sources changes, and Chipwrapper automatically adds the top story from the News Of The World.
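For anyone curious how a day-dependent source list might look, here is a minimal Python sketch. It is not the code Chipwrapper runs: the names DAILY_SOURCES, SUNDAY_EXTRAS and sources_for are made up for illustration, and it assumes the Sunday change is purely an addition to the usual list.

```python
from datetime import date

# The everyday sources named in this post.
DAILY_SOURCES = [
    "BBC", "Daily Express", "Daily Mail", "Daily Mirror",
    "Financial Times", "Guardian", "Independent", "ITN",
    "Sky News", "The Sun", "The Telegraph", "The Times",
]

# Assumption: on Sundays this title is simply added to the usual mix.
SUNDAY_EXTRAS = ["News Of The World"]


def sources_for(day: date) -> list[str]:
    """Return the list of sources to aggregate on the given day."""
    sources = list(DAILY_SOURCES)
    # date.weekday() numbers Monday as 0, so Sunday is 6.
    if day.weekday() == 6:
        sources += SUNDAY_EXTRAS
    return sources
```

Calling sources_for(date.today()) would give the list to feed into the aggregator for the current day.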

Chipwrapper RSS feeds

The headline aggregation is also available as RSS feeds, which come flavoured for news, sport, or football headlines.

Chipwrapper live bookmark

Again, each headline is credited to the newspaper or news source it originates from, and links to the original published story.

RSS Feed  Chipwrapper - UK Newspaper Headlines Feed

RSS Feed  Chipwrapper - UK Newspaper Sport Headlines Feed

RSS Feed  Chipwrapper - UK Newspaper Football Headlines Feed

All of the Chipwrapper RSS feeds are built using Yahoo! Pipes that pull together content from the newspapers.

They are then processed by the Chipwrapper server to append a newspaper credit to each headline using a tiny bit of my rusty old Perl, and distributed using Feedburner.
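The real crediting step is a small Perl script that isn't reproduced here, but the idea can be sketched in Python. Everything in this example is an assumption made for illustration: the CREDITS mapping, the add_credits function name, and the guess that the source can be worked out from the hostname of each item's link.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlparse

# Hypothetical hostname-to-credit mapping, for illustration only.
CREDITS = {
    "news.bbc.co.uk": "BBC",
    "www.guardian.co.uk": "Guardian",
}


def add_credits(rss_xml: str) -> str:
    """Append a source credit to each item title in an RSS document."""
    root = ET.fromstring(rss_xml)
    for item in root.iter("item"):
        title = item.find("title")
        link = item.find("link")
        if title is None or link is None:
            continue
        # Work out which source the item came from using its link.
        host = urlparse((link.text or "").strip()).hostname
        credit = CREDITS.get(host)
        if credit:
            title.text = f"{(title.text or '').strip()} ({credit})"
    return ET.tostring(root, encoding="unicode")
```

Items from hosts that are not in the mapping pass through with their titles untouched, which seems a safer default than guessing a credit.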

Headline buzz

Underneath the search box on the Chipwrapper homepage you'll find the Headline Buzz - 7 of the most popular words in the UK's headlines within the last hour.

Chipwrapper headline buzz

This is based on the kind of work that I used to do way-back-when for the BBC on search log analysis and looking at word bursts within the logs.

The headline buzz is built by using a Yahoo! Pipe which takes the top ten headlines from the 11 main UK news sources - BBC, Daily Express, Daily Mail, Daily Mirror, Guardian, Independent, ITN, Sky News, The Sun, The Telegraph and The Times.

A script then analyses those headlines, and makes a league table of popular words, with the dull ones like 'we', 'they', 'and', 'of' etc stripped out. The top 7 of these appear on the Chipwrapper homepage and link through to the Chipwrapper search results for that word.

The whole list is also available as an RSS feed. This refreshes every hour, and contains each word that occurred in 3 or more headlines and the number of times that it appeared.

RSS Feed  Chipwrapper Headline Buzz
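The word counting described above can be sketched in a few lines of Python. This is an illustration rather than the script that runs on the Chipwrapper server: the STOP_WORDS set is a short stand-in for the real list of dull words, the headline_buzz name is made up, and it assumes a word is counted once per headline it appears in.

```python
import re
from collections import Counter

# Illustrative stop words only; the real list of dull words is longer.
STOP_WORDS = {"we", "they", "and", "of", "the", "a", "in", "to", "for", "on"}


def headline_buzz(headlines, top_n=7, min_headlines=3):
    """Return (top words for the homepage, (word, count) pairs for the feed)."""
    counts = Counter()
    for headline in headlines:
        # Use a set so each word counts once per headline.
        words = set(re.findall(r"[a-z']+", headline.lower()))
        counts.update(words - STOP_WORDS)
    ranked = counts.most_common()
    top = [word for word, _ in ranked[:top_n]]
    feed = [(word, n) for word, n in ranked if n >= min_headlines]
    return top, feed
```

The first returned list corresponds to the 7 words shown on the homepage, and the second to the hourly feed of words that occurred in 3 or more headlines.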

Build stuff yourself

One of the defining ideas of Chipwrapper has been to build it using free (as in costs nothing) software and Web 2.0 tools and services. In fact, aside from the domain registration and a bit of Bytemark hosting, it hasn't cost me anything apart from time.

I have dreams of Chipwrapper widgets and gadgets and map mash-ups well beyond my capabilities though, which is why there is a 'Make stuff' page.

This lists all of the contributory pipes, feeds, XML and bits'n'bobs that make up the service so that people can hopefully build new tools based upon it. I hope I'll be able to list those on Chipwrapper in return.

...And finally

There is the obligatory Chipwrapper blog as well. Whilst I'd dearly love to have the time to blog all day, every day, about newspapers, I don't, so this is primarily for me to announce new features as they come along. There is also a useful links page with some external links to news about the news industry.

Any feedback is welcome, so please either post a comment here or on the Chipwrapper blog, or email me at the usual address - martin.belam@currybet.net

8 Comments

chipwrappr, surely?

I did prepare a version of the banner with the obligatory web 2.0 vowel-dropping in place, but having discussed it, I thought the name should reflect the idea that the target audience of journos, PR people and students can (mostly) actually spell

My dumb browser has just gamely ambled off to http://www.google.com/reader/view/#browser_plugin

:)

I don't know, you people who want to read sites without actually visiting them, it were all HTML around here in my day, and relative links used to work etc etc :-)

On a more serious note, knowing that people view the site in feed readers has changed the way I code it - for example, all image links have to be fully qualified with a www.currybet.net at the start to make sure the pictures actually turn up in the Feedburner feed, Bloglines et al.

There's a Currybet site now? :)

If you want to search through EVERY SINGLE UK MEDIA OUTLET then you can also try the search below...

http://www.google.com/coop/cse?cx=011504164764696342474%3Agsg-hi7oiuk

That's, um, over 3,000 websites... crikey.

"headline buzz" can take hilarious turns, as in:

madeleine police mother fears charged killing pavarotti

Just a thought, but how about some editorialising on the sources themselves? I mean, why include the Daily Express, which ceased to be a "news source" when Desmond took it over. Should that appear on the homepage alongside the BBC and the broadsheets?
