Scrapheap Challenge at SPA2007, part 2: The Presentation Package

Here is my solution to the Presentation Package challenge from the SPA 2007 Scrapheap Challenge workshop. The challenge was:

A Presentation Package. I want to be able to type in a list of sentences that summarise what I will talk about during each slide of the presentation. For each summary the tool should suggest pictures that illustrate the point I want to put across and let me pick one picture per slide to build a presentation. It shows that presentation full-screen.

To write a solution in 90 minutes I used as much of the infrastructure of the GNOME desktop environment as I could.

The user writes their slide summaries in a text file using the Gedit text editor. For example:

Scrapheap Challenge is a workshop about using other peoples software
We have created a scrapheap for you to use called the Internet
You have to work in pairs
You will be given three challenges
The first pair to complete the challenge wins
Then swap pairs before the next challenge
We will have a short retrospective after each challenge
And a long retrospective at the end of the workshop
Prizes will be awarded on completely arbitrary criteria

I wrote a little Python script that turned those summaries into comma-separated tags and used a Python API to the Flickr search webservice to pull down ten pictures that matched each set of tags. I chose Python because I know it well, it has a large standard library for doing internet stuff, and it lets you write terse but readable code, which is good when you want to get a lot done in a short time. I chose Flickr because it contains a lot of stunning photos, Google don't provide automatable search APIs any more and I've had problems with the Yahoo image search in the past.

The script is below. It's what I wrote on the day, in 90 minutes, while experimenting with the Flickr API, so it could be tidied up. I think it's still pretty readable, though, which is one of Python's big strengths in my opinion.

import sys
import os
from itertools import *
from urllib2 import urlopen

from flickr import photos_search

BatchSize = 10

fluff = set([
    "then", "there", "with", "have", "will"
])

def search(title):
    words = set([word.lower() for word in title.split() if len(word) > 3])
    tags = ",".join(words - fluff)
    return photos_search(tags=tags,
                         tag_mode="any",
                         sort="interestingness-desc",
                         per_page=BatchSize)


titles = [line for line in
          [line.strip() for line in open(sys.argv[1]).readlines()]
          if line != ""]

results = [(title, search(title)) for title in titles]

os.system("rm -rf slides/")
os.makedirs("slides/chosen")
for (title, photos), slide_index in izip(results, count(1)):
    print title
    slide_dir = "slides/choose/%02i - %s"%(slide_index,title)
    os.makedirs(slide_dir)
    
    for photo, photo_index in izip(photos,count(1)):
        url = photo.getURL(urlType='source')
        print "    Loading ", url
        data = urlopen(url).read()
        
        local_file = slide_dir + "/%02i.%02i - %s.jpg"%(slide_index,photo_index,title)
        
        open(local_file, "wb").write(data)

The script creates two folders, slides/choose and slides/chosen. Under slides/choose it creates a folder per slide, named with the slide number followed by the summary of that slide.
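The folder and file names come from the two format strings in the script. As a minimal sketch, here they are pulled out into standalone functions so the naming scheme can be tried without touching Flickr. The function names slide_dir_name and photo_file_name are made up for this illustration and do not appear in the original script, and the sketch is written in Python 3 syntax rather than the Python 2 of the script.

```python
def slide_dir_name(slide_index, title):
    # Same format string the script uses for each slide's folder.
    return "slides/choose/%02i - %s" % (slide_index, title)


def photo_file_name(slide_index, photo_index, title):
    # Same format string the script uses for each downloaded photo.
    return "%02i.%02i - %s.jpg" % (slide_index, photo_index, title)


if __name__ == "__main__":
    title = "You have to work in pairs"
    print(slide_dir_name(3, title))      # slides/choose/03 - You have to work in pairs
    print(photo_file_name(3, 7, title))  # 03.07 - You have to work in pairs.jpg
```

Because both indexes are zero-padded, an alphabetical listing of the files matches slide order for up to 99 slides. One thing to watch: a summary containing a "/" would be treated as a path separator, so os.makedirs would create nested folders for it.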

For each summary in the user's text file the script downloads ten photos from Flickr that have any tags in common with the words in the summary, ordered by "interestingness", whatever that means. The downloaded photos are saved into the appropriate folder under slides/choose.
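To see which tags a summary turns into, the filtering step from the script's search function can be run on its own. This is a sketch under a couple of assumptions: the name summary_to_tags is invented for the example, it uses Python 3 syntax, and it sorts the tags so the result is repeatable, whereas the original script joins an unordered set.

```python
# Noise words, copied from the `fluff` set in the script.
FLUFF = {"then", "there", "with", "have", "will"}


def summary_to_tags(summary):
    """Return the tags for one slide summary, sorted for repeatability."""
    # Keep words longer than three characters, lower-cased, as the script does.
    words = {word.lower() for word in summary.split() if len(word) > 3}
    return sorted(words - FLUFF)


if __name__ == "__main__":
    tags = summary_to_tags("You will be given three challenges")
    print(",".join(tags))  # challenges,given,three
```

The comma-separated string is what gets passed as the tags argument of photos_search. With tag_mode="any", a photo only needs to match one of those tags to be returned.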

The user then opens the slides/choose and slides/chosen folders in Nautilus, the GNOME file manager, and drags one picture per slide from the subfolders of slides/choose into slides/chosen.
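Nothing in the script checks that exactly one picture ended up in slides/chosen for each slide. A check like that could be sketched as follows. This helper is hypothetical and was not part of the solution written on the day. It relies only on the "NN.MM - summary.jpg" file names that the script produces, and it is written in Python 3 syntax.

```python
import re
from collections import Counter

# Matches names such as "03.07 - You have to work in pairs.jpg".
NAME_PATTERN = re.compile(r"^(\d+)\.(\d+) - .*\.jpg$")


def check_chosen(file_names, slide_count):
    """Return (missing, duplicated) lists of slide numbers."""
    counts = Counter()
    for name in file_names:
        match = NAME_PATTERN.match(name)
        if match:
            # The first number in the file name is the slide index.
            counts[int(match.group(1))] += 1
    slides = range(1, slide_count + 1)
    missing = [i for i in slides if counts[i] == 0]
    duplicated = [i for i in slides if counts[i] > 1]
    return missing, duplicated
```

It could be called as check_chosen(os.listdir("slides/chosen"), 9) for the nine example summaries above. Any file whose name does not fit the pattern is ignored.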

To give a presentation, the user opens the slides/chosen folder in Nautilus and double-clicks on the first slide to open it in the GNOME image viewer. Hitting F11 in the image viewer shows the slides fullscreen. Hitting Space shows the next slide in the folder. The user can also navigate forwards and back with the Page-Up and Page-Down keys.

The final presentations are surprisingly good.

Mistaeks I Hav Made
