The Resurrection of Newspaper Obituaries
September 7th, 2010 Brian Herzog
Last week I started talking about newspaper obituaries. Today's post details how we're improving access to the obituaries we do have in our newspaper microfilm records, using an online index created with Yahoo Pipes.
Our microfilm records of the local papers go back to 1940. But microfilm is primarily an archival format, rather than an accessible format, so it can be cumbersome to use. Our biggest impediment was that we didn't know what was there - when a patron contacted the Reference Desk asking for someone's obituary, it was very time-consuming for us to search the microfilm for an obituary, which may or may not have even appeared in the paper - we wouldn't even know until we checked.
So we created an online searchable index to the newspaper's obituaries - not the text of the obituaries, just a name/date/page index. Patrons and staff can use this to know whether someone's obituary appeared in our newspaper, instead of having to check the microfilm every time.
Here's how we did it: first, for about the past 10 months, volunteers have been going through every microfilm reel we have, page by page, and building an Excel spreadsheet with the following information:
Newspaper | Year | Month | Day | Page | FirstName | MiddleInitial | LastName | Maiden-Jr-Sr |
The first column is necessary because we have records for both the Chelmsford Newsweekly (1940-1993) and the Chelmsford Independent (1986-present). The middle columns are reference and retrieval information. In the last column, we included extra information, like maiden name, whether a person was a "Jr." or "Sr." etc., and anything else that was random and didn't fit into another column.
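To picture the data, here is a minimal Python sketch of the spreadsheet saved as a csv file and read back one record at a time. The header names match the columns above, and the sample row reuses the Katherine M. Polley listing that appears later in this post. Storing the month as a word and leaving the last column empty are assumptions, and load_index is a hypothetical helper name, not part of the actual setup.

```python
import csv
import io

# Header row uses the same single-word column names as the spreadsheet.
# The data row is illustrative; the empty Maiden-Jr-Sr value is an assumption.
SAMPLE_CSV = """Newspaper,Year,Month,Day,Page,FirstName,MiddleInitial,LastName,Maiden-Jr-Sr
Chelmsford Newsweekly,1940,December,31,7,Katherine,M.,Polley,
"""


def load_index(csv_text):
    """Return the index as a list of dicts, one dict per obituary listing."""
    return list(csv.DictReader(io.StringIO(csv_text)))


records = load_index(SAMPLE_CSV)
first = records[0]
print(first["LastName"], first["Year"], first["Page"])
```

Each record is keyed by column name, so the reference and retrieval fields (Year, Month, Day, Page) can be read directly.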
The spreadsheet itself is useful, but I wanted to put this online so anyone could search it. The tool I chose was Yahoo Pipes, which has both pros and cons:
Pros:
- It's easy to play with and learn (like most Web 2.0 tools), but is also very powerful so we can grow into it
- It can use a csv file for the data, which is easy to create with Excel
- Beyond a simple search, it also provides fancy features like RSS feeds and tie-ins with other social media tools
- Using Yahoo Pipes is covered in Chapter 7 of Library Mashups, written by Nicole Engard
- The data is easy to update as the file continues to grow
- It worked
Cons:
- Searching a database is not what Pipes is intended to do, so it's probably not the best tool out there (I wanted to use DabbleDB, but they're in transition right now)
- The csv file must be ftp'ed to the webserver, which will be increasingly problematic - right now the file is 17,000+ lines and over 1MB. It will only get bigger, and the entire thing needs to be uploaded each time it's updated
- Pipes has funny rules that you don't know about until something breaks. For instance, field names must be single words (hence "FirstName" and "Maiden-Jr-Sr"), you can't use certain characters in the data (like /), and the search doesn't let you combine keywords. I'm sure there must be some kind of fancy loop setup that will allow it, but so far people can only search by first name or last name or year, one at a time
- There isn't an easy way to embed the search box back into our website (there are Badge options, but only for search output) - you have to use the Pipe interface to search
- There doesn't seem to be a wildcard for search
- The results can't not link to something - I wanted the names and dates just to be displayed, but the way Pipes works requires the results to link to something
The last point was initially a pain, but it forced me to be creative, and I think the solution is actually more helpful for patrons than what I originally wanted. Now, when a patron finds the obituary listing they'd like to read, they click the link, and it automatically fills the obituary information into an email contact form on our website. That request gets sent to Reference staff, who then have an easy time of retrieving the obituary from the microfilm. Unfortunately, our microfilm machine isn't connected to a computer, so we'll just print and mail or fax the obituary to the patron. When possible we'll type them in and email them, and of course that will go into the searchable database too.
To make the connection from the Pipes listing to our email form, I had to use some javascript (which introduced another glitch: javascript makes names like O'Conner problematic, because it stops at the ', but I'll worry about this later).
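If the apostrophe is being lost somewhere in the url, one possible fix is to percent-encode the listing before it goes into the link and decode it on the other end. The sketch below shows the idea in Python rather than in the javascript on the library's page, so it is an illustration of the encoding and not a drop-in repair. build_link and read_obit are hypothetical helper names, and the listing itself is made up.

```python
from urllib.parse import parse_qs, quote_plus, urlsplit

FORM_URL = "http://www.chelmsfordlibrary.org/reference/ask_us-obits.html"


def build_link(listing):
    """Encode the listing so spaces become '+' and the apostrophe becomes %27."""
    return FORM_URL + "?obit=" + quote_plus(listing)


def read_obit(url):
    """Decode the obit parameter back into the original text."""
    return parse_qs(urlsplit(url).query)["obit"][0]


# Made-up listing, used only to exercise the apostrophe case.
listing = "Mary O'Conner, Chelmsford Independent, June 3, 1990, Page 12"
link = build_link(listing)
print(link)
print(read_obit(link))
```

Because the encoded link contains no literal apostrophe, anything that would stop reading at a ' has nothing to stop on, and the decoded value comes back with the full name.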
Here's what the whole Pipe's source code looks like:
Here's what it does:
- The "Fetch CSV" module is the path to the csv file on our webserver
- The module to the right of that controls what the patron search input box looks like. The "Label" field is "Enter EITHER a First Name, Last Name OR Year:" and you can see where that displays on the Pipe page
- Both of those modules feed into the "Filter" module - this one takes what the patron enters into the search box and filters the data from the csv file to create a subset of just matching records. Whatever the patron enters gets searched for in all the fields listed in the "Filter" module
- The next module is "Rename" and I'm not sure I'm using it properly - I needed to create two new fields, so I'm just taking two existing fields, copying them, and renaming them so I can work with them later. The fields that got copied still exist untouched
- Next is the "Regex" module, which is the most complicated and powerful, and I use it to create what the patron sees for the search results. The "Title" field is one I created in the "Rename" step, and here I'm replacing its copied contents with the text the patron will see on the screen - the code for it is "${FirstName} ${MiddleInitial} ${LastName} ${Maiden-Jr-Sr} - ${Newspaper}, ${Month} ${Day}, ${Year}, Page ${Page} ${Obituary}" which also includes punctuation formatting. So, for example, the result looks like this:
Katherine M. Polley - Chelmsford Newsweekly, December 31, 1940, Page 7
Because this field has to be a link, I also had to define what it links to, which is what I'm doing in the "Link" field. The value for that field is being written as
http://www.chelmsfordlibrary.org/reference/ask_us-obits.html?obit=${FirstName}+${MiddleInitial}+${LastName},+${Newspaper},+${Month}+${Day},+${Year},+Page+${Page}
which carries the data over to the library's website, where some javascript pulls the data from the url and puts it in an email form. The patron can then add their name and contact info to the form and submit it to us as an email message
- The "Sort" module is self-explanatory, and I chose to list them with most recent first
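The steps above (filter, build the title, sort) can be approximated outside of Pipes. What follows is a minimal Python sketch, not the Pipe itself. It assumes the filter is a case-insensitive "contains" match on FirstName, LastName and Year, that Month is stored as a word, and it leaves out the ${Obituary} part of the title. The function names are hypothetical, and the second sample record is made up to show the sort order.

```python
MONTHS = ["January", "February", "March", "April", "May", "June", "July",
          "August", "September", "October", "November", "December"]

# Fields the search term is checked against (an assumption based on the label text).
SEARCH_FIELDS = ("FirstName", "LastName", "Year")

RECORDS = [
    {"Newspaper": "Chelmsford Newsweekly", "Year": "1940", "Month": "December",
     "Day": "31", "Page": "7", "FirstName": "Katherine", "MiddleInitial": "M.",
     "LastName": "Polley", "Maiden-Jr-Sr": ""},
    # Made-up record, included only to demonstrate sorting.
    {"Newspaper": "Chelmsford Independent", "Year": "1990", "Month": "March",
     "Day": "5", "Page": "3", "FirstName": "John", "MiddleInitial": "A.",
     "LastName": "Polley", "Maiden-Jr-Sr": "Jr."},
]


def filter_records(records, term):
    """Filter step: keep records where the term appears in any search field."""
    term = term.strip().lower()
    if not term:
        return []
    return [r for r in records
            if any(term in r[field].lower() for field in SEARCH_FIELDS)]


def make_title(r):
    """Regex step: build the display text, skipping empty name parts."""
    parts = (r["FirstName"], r["MiddleInitial"], r["LastName"], r["Maiden-Jr-Sr"])
    name = " ".join(p for p in parts if p)
    return f"{name} - {r['Newspaper']}, {r['Month']} {r['Day']}, {r['Year']}, Page {r['Page']}"


def date_key(r):
    """Sort key: (year, month number, day) as integers."""
    return (int(r["Year"]), MONTHS.index(r["Month"]) + 1, int(r["Day"]))


def search(records, term):
    """Filter, sort most recent first, and return the display titles."""
    hits = sorted(filter_records(records, term), key=date_key, reverse=True)
    return [make_title(r) for r in hits]


for title in search(RECORDS, "polley"):
    print(title)
```

Sorting on a numeric (year, month, day) key avoids the trap of sorting month names alphabetically, which would put April ahead of January.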
This feels far more complicated than it should be, and I'm sharing it here both to save someone else from having to figure it all out again on their own and, hopefully, to get suggestions on how to simplify/improve it.
Although, speaking of improving it, I do have one idea for future development: the local Cemetery Department has a spreadsheet online listing complete burial locations - it would be neat to mash up that data, so the obituary is linked to the cemetery plot location.
That's down the road a bit, so in the meantime I just keep adding whatever new obituaries appear in the paper to the csv data file - I had planned to do that weekly, but lately there have been many weeks without any obituaries in the paper (see my previous post). Anyway, we'll see how this works - it only went live last week, but already patrons have been using it, and it certainly does save a lot of staff time.