Race(ing) the Archives

Research with WCU Yearbooks

WCU Digital Collections, part of Special Collections at FHG Library, hosts a variety of digital collections for primary research. Digitized versions of the West Chester yearbooks are available, covering the class of 1910 (1909-10 AY) through the class of 2008 (2007-08 AY).

The West Chester University Yearbooks collection within Digital Collections lets you browse yearbook by yearbook, organized by decade. Each issue has been digitized: scanned and OCRed. The items are stored on the Internet Archive, which is a great setup for storage but not the best for searching. Identify a decade and click on a year to head to that yearbook.

Browsing the yearbooks is the best approach. Here’s the 1913 Serpentine (https://archive.org/details/serpentine1913penn).

The site really wants you to browse and read the books. It uses a JavaScript page turner to animate the act of reading, analogous to a paper book (we thought this was fascinating in 2008). At the top of the screen, there’s a menu for the Internet Archive that reflects the variety of their holdings (not just books; some really cool stuff). You can make a profile and log in, upload stuff to the archive, and search *everything* they hold (there’s a lot of stuff there). What you *can’t* do is search across one collection; you can’t search just within the WCU yearbooks. That’s a bummer. You can search inside each individual yearbook. Be sure you search in the correct place: this book, not everything in the archive.

A word about searching vs. browsing the yearbooks. These documents were digitized long enough ago (~2009), with old enough software (~2005), and the printed text is non-standard enough, that the OCR (optical character recognition, the process that turns an image of text into machine-readable text) is error-prone. You can use it, but don’t treat an empty search result as proof that something isn’t there. Sometimes the OCR misrecognized the text on the page, and no search can make up for that original error.

At the bottom of your page image is the menu for this item: scroll between the pages, flip them one at a time, make the page smaller or larger, and navigate using thumbnails. 

We urge you to browse.  These documents were produced by students, somewhat independently, so there’s interesting stuff there. 

Finding Something Great

There’s a ton of stuff there. We are, of course, interested in the yearbook entry for every black student who attended West Chester. But there’s more than just students in the yearbooks: there’s a window into the culture of that year, as curated by the yearbook’s staff. When you find something you like, you need to download it to your local machine in order to make an item for the archive. For that, scroll down below the page images.

Below the page image, there’s a list of metadata identifying the item and, in the right column, some download options. The PDF option downloads the entire book as a PDF, with decent image quality. The single page processed JP2 zip option is much higher quality and, helpfully, downloads the book as a set of pages. That’s the one we’ll be using. Download that option, and wait: the Internet Archive has great storage, but not great speed. Expect a zip file of about 70-80 MB for each yearbook. Click on the zip file to expand the archive into a directory of files, one for each page. Because the software counts every scanned image as a page, the file numbers won’t match the page numbers printed in the book.
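If you'd rather expand the zip with a script than by clicking, here is a minimal Python sketch. It assumes only what the paragraph above describes: a downloaded zip containing one .jp2 file per page. The function name and the example file and folder names are made up for illustration; use whatever your download is actually called.

```python
import zipfile
from pathlib import Path


def expand_yearbook_zip(zip_path, dest_dir):
    """Extract a downloaded yearbook zip and return its page images in order."""
    dest = Path(dest_dir)
    with zipfile.ZipFile(zip_path) as archive:
        # extractall creates the destination folder if it does not exist yet.
        archive.extractall(dest)
    # Page files are numbered sequentially, so sorting by path keeps page order.
    return sorted(dest.rglob("*.jp2"))


# Example use (hypothetical file and folder names):
#   pages = expand_yearbook_zip("serpentine1913penn_jp2.zip", "serpentine1913")
#   print(len(pages), "page images")
```

The returned list includes covers and endpapers along with the numbered pages, since every scanned image counts as a page.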


Everything is a page: front cover, inside front cover, and so on. The files are sequential, so once you work out the difference between the file number and the printed page number, you should be able to find any page quickly. Here’s the file named serpentine1913penn_0033.jp2, which the software counts as page 33:

The number printed at the bottom identifies the page as 25, so at this point in the 1913 book the file number runs 8 ahead of the printed page number. That’s cool. When you find the page you love, you need to save the page and name it correctly. Here’s our naming convention for yearbook files: Serpentine_YEAR#_PAGE#. So in our system, this image would get renamed Serpentine_1913_25. Click File > Save As and name the file using the correct information. When you save, select png (not jpeg or jp2) as the file format.
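The page arithmetic and the naming convention can be sketched in a few lines of Python. This is only an illustration: the function names are invented here, and it assumes the file names always end in an underscore, a number, and .jp2, as in the serpentine1913penn_0033.jp2 example. The difference may also shift within a book if unnumbered plates are inserted, so check it against a printed page number near the page you want.

```python
import re


def file_index(filename):
    """Return the sequence number from a name like serpentine1913penn_0033.jp2."""
    match = re.search(r"_(\d+)\.jp2$", filename)
    if match is None:
        raise ValueError(f"unexpected file name: {filename}")
    return int(match.group(1))


def page_offset(filename, printed_page):
    """How far the file number runs ahead of the printed page number."""
    return file_index(filename) - printed_page


def printed_page(filename, offset):
    """Estimate the printed page number for a file, given a known offset."""
    return file_index(filename) - offset


def archive_name(year, page):
    """Build a name following the Serpentine_YEAR#_PAGE# convention."""
    return f"Serpentine_{year}_{page}"


# Worked example from the text: file 0033 shows printed page 25.
offset = page_offset("serpentine1913penn_0033.jp2", 25)
print(offset)                  # 8
print(archive_name(1913, 25))  # Serpentine_1913_25
```

The name has no extension because the png format is chosen in the Save As dialog.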

Selecting an item for research from the Yearbook

Find something?  Here’s what you need for each found item in the Yearbook:

  1. The name of the item. If it is an illustration or a candid photograph, the item has no stated name and you’ll have to name it. Put on your librarian’s hat and try to describe the item objectively. For the example above, I would use “Untitled illustration of Faculty”. That’s going to be the Title of the item.
  2. The date of the yearbook (1913)
  3. The page(s) for the item (25)
  4. The URL to the yearbook on the Internet Archive (https://archive.org/details/serpentine1913penn)
  5. The URL to that page (https://archive.org/details/serpentine1913penn/page/25)
  6. The image(s) of the page(s)
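If you want to keep these six pieces together while you work, one option is a small record like the Python sketch below. The class and field names are hypothetical, not part of any archive software. The two URL patterns are copied from the example URLs in the list, and the image file name assumes the page was saved as png under the naming convention described earlier.

```python
from dataclasses import dataclass

ARCHIVE_BASE = "https://archive.org/details"


@dataclass
class YearbookItem:
    title: str              # 1. the name you gave the item
    year: int               # 2. the date of the yearbook
    pages: list[int]        # 3. the printed page number(s)
    identifier: str         # the last part of the yearbook URL
    image_files: list[str]  # 6. the saved page image(s)

    @property
    def yearbook_url(self):
        """4. The URL to the yearbook on the Internet Archive."""
        return f"{ARCHIVE_BASE}/{self.identifier}"

    def page_url(self, page):
        """5. The URL to one page of the yearbook."""
        return f"{self.yearbook_url}/page/{page}"


# The worked example from this guide; the .png file name is an assumption.
item = YearbookItem(
    title="Untitled illustration of Faculty",
    year=1913,
    pages=[25],
    identifier="serpentine1913penn",
    image_files=["Serpentine_1913_25.png"],
)
print(item.yearbook_url)
print(item.page_url(item.pages[0]))
```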