Friday, April 20, 2012

Half Way Plus

First of all, I have to say that I had really hoped to make more use of this blog.  Here we are at the halfway point (well, a little past the halfway point) and this is only my ninth post.  Um, yeah ... things have been slow going.

Since we are beyond the halfway point, I really should evaluate where I am.

Number one on my Project List: Clean up the recital database.  Is this done?  No.  Is it much improved?  Yes!

  • I have separated out all personal names from all corporate names in the "performers" column of the database.
  • In the composer column I have cleaned up the vast majority of the names and fixed all the problem names that were in red (except for about 10) and blue, but decided to leave the problems that were in yellow and green (more on that later).
  • In many places there was more than one name in the composer column.  I separated out all the extra people and have them all in their own columns.
  • I started working on the Notes columns
What is still left to do:
  • Create a spreadsheet of all the names that are not authorized (yellow) or uncertain (green) so they can be looked at at a future time.  This was bogging me down too much and I finally decided that I could spend the next three months fixing all these or I could skip it and get more work done in other areas.  Skipping it made the most sense.
  • Authorize all the "other" names.  These are arrangers, transcribers, writers of lyrics, etc.
  • Notes, notes, notes, and more notes.  I found this extremely time consuming when I started it and difficult to do.  I feel like I need to see the programs for each one or at least the recordings in order to better organize the notes.  They really are a terrible mess!!
Number two on the list: Creating templates.  Not done yet.  This is something that I feel confident will be easy to do once I get to a point where I'm ready to move the spreadsheet into MARC records.  Plus, I know someone at another institution who told me she has a template already.  Cooperative cataloging at its finest!!

Number five on the list (yes, I know I skipped some): Survey.  I actually forgot that it was five on the list, but really my list isn't exactly in priority order.  Well, maybe a little bit, but I have my reasons for skipping three and four.  I did finally take all my notes from when I was at MOUG and MLA in Dallas and transcribe the questions and suggestions I got from my awesome, supremely more intelligent colleagues and I'm ready to start reorganizing that list and editing the questions into a survey.  I have a reference question out to help with the "demographics" part and I have a list of colleagues I respect (pared down from 300+ to about 5) who I would like to ask to look the survey questions over and give me suggestions before I send it out.  So I feel like I've made progress here.

Number three: investigate digital possibilities.  Haven't done this yet.  This feels like just an intellectual exercise, which is the kind of thing I like.  The kind of thing that led my Master's thesis adviser to tell me that I'm a great researcher but not much of a wordsmith (thankyouverymuch).  So this is on the back burner for now.

Number four: Requirements for recital programs.  Also an item that has not been done yet.  But I have started thinking about it.  That's a start, right?

A little beyond the halfway point and this is where we stand.  I feel like I still have so much to do with this nasty spreadsheet!!  There is just still so much wrong with it.

Friday, March 2, 2012

Sabbatical Challenges


As I work on this sabbatical project, there are many challenges I have as well.

The biggest cutest challenge is caring for two babies who are just now nine months old while I work at home.  I love having this opportunity to be home with them, but it is a challenge.  My work schedule consists of about an hour and a half in the morning during nap time, another hour in the afternoon during their second nap time, a couple hours in the evening a few nights a week after the boys are in bed, and occasionally some time on the weekends.  Doesn't really add up to a lot of time, so I try to make use of every minute I can once the boys are sleeping.  It's a challenge but it's a wonderful challenge to have.

An additional challenge has been trying to find a daycare for the boys for when I go back to work in July.  It seems I could find a daycare for later in the fall, but not necessarily in July.  I'm still on the hunt, but going out to interview and tour daycare facilities takes time away from my project.  

On another front, just when this sabbatical started we got an offer on our house.  We put it up for sale early last fall and this was great news!  But it also meant we had to find a new place. We did and we're going to be moving soon.  So the project will probably take a backseat temporarily while we pack up and move from one house to another.  I assume we'll also be without internet for a short period while we're between houses.

I am also working on two other projects at the same time as the sabbatical project.  One is an ongoing thing with a deadline in June.  I'm working on getting that project wrapped up and then not worrying about that project again until I'm back at work full time.  I'll have about 11 months before the next deadline.  The second project is one that I thought was complete, but I was asked to do a few more things to it.  I never found the time at the end of the year to squeeze it in along with the many other things I was doing in trying to prepare to be gone from the office for six months, and since I was also not given a deadline I've put it aside for the time being.  But it weighs on my mind and I feel like I need to just go in and get it done.  I don't think it'll take long, but I'm afraid once I start that I'll discover otherwise and get bogged down in that.  I really need to find out what the deadline is; that would give me the sense of urgency (or not) that I probably need.

It's a bit of a long list and it feels overwhelming to me to list it all out like this.  But I wanted to do this post because this is part of the reality of this sabbatical and this blog is a journal of this six month project.  In the middle of all this, I am pleased with the amount of work I have actually been able to accomplish.  And when I go back to work in July I think I'll look back on these six months as the busiest ever.  This might be a break from work, but it's not a break by any means.  I feel even busier than I do when I'm working in the office full-time.  That's the reality.

Thursday, March 1, 2012

Two Months Gone, Four To Go

One third of the way through this sabbatical deserves an update.  Especially since I haven't posted about anything here in a while.

Since my last update I am still working through the Composer column to research the names that were problems during the first pass.  I'm currently a little more than 30% done.  As expected it is taking a while to work on each name.  Some are easier than others.  The problem names are color coded four different ways.  Those that are highlighted in yellow are names that are not in the authority file.  I've decided to leave those alone for now.  I am focusing instead on the other three problem types and for the most part I have been able to either figure out who in the authority file the person is or change the color coding to yellow because the name is not in the authority file.

Two weeks ago I attended the Music OCLC Users Group Meeting and the Music Library Association Annual Conference.  While at these meetings I chatted with several colleagues about my project and specifically about what they were each interested in knowing about how others manage their recital recordings.  I had some very interesting conversations with people from small institutions and large institutions.  It helped give me more perspective on how I would like to put together a survey and the kinds of questions I should ask.  I also received a lot of good ideas on questions to ask on the survey.  I am hoping to start putting together my questions soon based on my notes of my ideas and the notes I took while at the conference.  I even had someone volunteer to look over a draft of the survey questions.

After two months I do feel like I have accomplished something, although right now it doesn't feel like it as I work through the problem names.  But I have a much more organized spreadsheet and a plan for the parts that are still to be dealt with.  Plus the survey is coming together even if it is currently in my notes and in my head.  There is still a lot to do in the next four months!

Thursday, February 9, 2012

Progress Update

In the last update I gave I had completed about 70% of the "Name" column on the spreadsheet.  That column consisted of student performer names, ensemble names, faculty ensemble names, faculty names, guest performers, and a few other random names.  Mostly I was separating out the individual names from the ensemble names.  That is now done!

What is now left in that column and the columns that got created from it is to separate out multiple names.  Some cells have two or more performers' names or two or more ensemble names.  I also haven't gone through those columns for authorized names.  So that is all still to be done.

I have also just completed my first pass through the column labeled "Composer."  This column had been worked on previously.  So I sorted the spreadsheet so it was in order according to this column and started going through it again.  I created four new columns to go along with this one:

  • Composer: name only (existing column)
  • Composer#$b: any sort of number associated with a name
  • ComposerOther$c: any sort of title (for example, Sir) associated with a name
  • ComposerFuller: for the fuller version of a name (when a name is commonly used with initials and we know what the initials stand for, we often put the fuller form of the name in parentheses in another subfield in the name field).
  • ComposerDates: Birth and/or death dates associated with a name
Since this field was already checked against the authority file, I was just moving the info around.  It went pretty quickly.  It's not 100% done, but the good majority of it is done.  The remaining cells have some sort of issue associated with them: multiple names, just a last name, two or more possible authorized headings, problems with diacritics in Excel, unauthorized names, and a few other minor problems.  Those are all highlighted in different colors to let me know what the problem is.
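For anyone curious what this kind of column split could look like outside of Excel, here is a minimal Python sketch.  It is not how I actually did the work (that was all by hand); it only illustrates the idea of pulling one heading apart into the five columns listed above.  The function name split_heading, the short TITLES list, and the sample headings are all made up for illustration, and it assumes dates only ever appear at the end in a form like "1712-1786" or "1950-".  In MARC terms the fuller form would land in subfield $q and the dates in subfield $d.

```python
import re

# Dates at the very end of the heading, e.g. ", 1712-1786" or ", 1950-".
DATES = re.compile(r",\s*(\d{4}-(?:\d{4})?)\s*$")
# Fuller form of name given in parentheses, e.g. " (John Robert)".
FULLER = re.compile(r"\s*\(([^)]+)\)")
# Roman numeral at the end of a forename-only heading, e.g. "Frederick II".
NUMERATION = re.compile(r"\s+([IVXLC]+)$")
# Hypothetical list of titles; a real list would be much longer.
TITLES = {"Sir", "Dame", "Mrs.", "King of Prussia"}


def split_heading(heading):
    """Split one name heading into the five composer columns."""
    parts = {
        "Composer": "",
        "Composer#$b": "",
        "ComposerOther$c": "",
        "ComposerFuller": "",
        "ComposerDates": "",
    }
    rest = heading.strip()

    match = DATES.search(rest)
    if match:
        parts["ComposerDates"] = match.group(1)
        rest = rest[:match.start()]

    match = FULLER.search(rest)
    if match:
        parts["ComposerFuller"] = match.group(1)
        rest = rest[:match.start()] + rest[match.end():]

    # Whatever is left is the name plus any titles, separated by commas.
    pieces = [p.strip() for p in rest.split(",") if p.strip()]
    titles = [p for p in pieces if p in TITLES]
    name = ", ".join(p for p in pieces if p not in TITLES)
    parts["ComposerOther$c"] = ", ".join(titles)

    # Only look for numeration when there is no surname/forename comma,
    # so a trailing initial is not mistaken for a Roman numeral.
    if "," not in name:
        match = NUMERATION.search(name)
        if match:
            parts["Composer#$b"] = match.group(1)
            name = name[:match.start()]

    parts["Composer"] = name
    return parts


print(split_heading("Smith, J. R. (John Robert), Sir, 1900-1980"))
print(split_heading("Frederick II, King of Prussia, 1712-1786"))
```

Anything the sketch cannot place simply stays in the Composer column, which is roughly the equivalent of highlighting the cell for a second look.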

My next step is to go through the Composer column again and work on those highlighted cells.  I expect this to take a little longer since I'll have to go into the authority file for each one and try to determine the correct form of name.  I'll also have to move the additional names to a different column.  I currently have three additional columns for "Other name."  

I'll get started on this this week, but then I'm leaving early next week to attend the Music OCLC Users Group Meeting and the Music Library Association's annual conference in Dallas, TX.  While there I'll probably take the opportunity to try and talk to other librarians about the cataloging and management of the recital recordings at their institutions.  I'm looking forward to the trip!

Wednesday, January 25, 2012

Authority Control Matters

Authority control: The procedures by which consistency of form is maintained in the headings (names, uniform titles, series titles, and subjects) used in a library catalog or file of bibliographic records through the application of an authoritative list (called an authority file) to new items as they are added to the collection. Authority control is available from commercial service providers.  (Online Dictionary for Library and Information Science http://www.abc-clio.com/ODLIS/odlis_A.aspx)

Authority Control is a necessary part of a library catalog.  And it sure helps with any large database.  If I ever doubted the value of authority control (I never have), working on this database would be enough to convince me of its importance.  I'll explain.

Since my last post regarding the multitude of problems I was encountering I discussed the issues with a friend who works in computers and databases and she made a number of wonderful suggestions.  She also is going to help me with an aspect of moving the database once I get it cleaned up.

Speaking of cleaning it up ... wow, what a mess!!

After discussing this with my friend, I am no longer trying to move everything from multiple lines into one line as I was doing before.  That was taking way too long!  After two weeks on this project, I had managed to combine about 60 lines into 10 records.  I have a whole new approach now.  Currently I am going down the "Name" column only.  The goal is to separate out the individual names (remain in the column) from the ensemble names (put in a new column) from other phrases that are most likely titles (another new column).
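A first rough sort like this could in principle be roughed out in code before a person goes through the column.  The sketch below is only a guess at how that might look, not something I ran on the real data: the function name classify_name_cell and the ENSEMBLE_WORDS list are invented for illustration, and anything the rules cannot place is sent to "review" rather than guessed at.

```python
import re

# Hypothetical keywords that suggest an ensemble rather than a person.
ENSEMBLE_WORDS = {
    "ensemble", "orchestra", "choir", "chorus",
    "quartet", "quintet", "band", "symphony",
}


def classify_name_cell(cell):
    """Return 'ensemble', 'person', 'review', or 'empty' for one cell."""
    text = cell.strip()
    if not text:
        return "empty"
    # Compare whole words so "Bandy" is not mistaken for "band".
    words = set(re.findall(r"\w+", text.lower()))
    if words & ENSEMBLE_WORDS:
        return "ensemble"
    # A comma or a short phrase looks like a personal name.
    if "," in text or len(text.split()) <= 3:
        return "person"
    # Longer phrases are most likely titles; leave them for a human.
    return "review"


for cell in ["University Wind Ensemble", "Last, R. Matthew",
             "An Evening of French Song"]:
    print(cell, "->", classify_name_cell(cell))
```

The output is only a starting point for sorting the column; every "review" row, and probably a sample of the others, would still need to be checked by eye.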

Sounds easy, right?  Mostly it is.  I have managed to get down around line 7000 of the 10,100+ line spreadsheet.  That's big progress!

It is enlightening to see how incredibly inconsistently names were entered into this database.  Notice in the definition I quoted at the beginning of this post that authority control requires "consistency of form."  Obviously that wasn't a concern with this database ... ever.

Just today I found a recital that consisted of about 10 lines of data (i.e. 10 pieces performed).  The same name appeared in all 10 rows of the column, but in about three different forms, just as an example:

  • Last, Matthew R.
  • R. Matthew Last, piano
  • Last, R. Matthew
Hmmm.  So my first question: why is the person's instrument listed in some places and not others?  Second, is the initial a first initial or a middle initial?  And finally, could they not decide if the name should be listed last name-comma-first or first-last?

The order of the names is constantly changing as I go down the list.  The addition of instrument or voice part is also inconsistent.  It seems to me that there is a tendency to prefer last name-comma-first name unless there is an instrument name added on in which case it becomes first name-last name-comma- instrument/voice.  But not always.  Another issue is nicknames: sometimes they are used (Jim or Ben) and sometimes not (James or Benjamin); and it is obvious that it is the same person.  
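To make the pattern concrete, here is a small Python sketch of how the three forms above could be pulled into one shape and flagged as probably the same person.  It is an illustration under assumptions, not a tool I used: the names parse_performer and match_key and the short INSTRUMENTS list are made up, and it assumes single-word surnames.  It also cannot decide whether "R." is a first or a middle initial; that still takes a person and the authority file.

```python
import re

# Hypothetical list of instruments and voice parts seen after a name.
INSTRUMENTS = {"piano", "clarinet", "violin", "guitar", "soprano", "tenor"}


def parse_performer(raw):
    """Return (name in 'Last, First' order, instrument or '')."""
    text = raw.strip()
    instrument = ""
    # "R. Matthew Last, piano": the part after the last comma is an instrument.
    if "," in text:
        head, tail = text.rsplit(",", 1)
        if tail.strip().lower() in INSTRUMENTS:
            instrument = tail.strip().lower()
            text = head.strip()
    if "," in text:
        # Already inverted: "Last, R. Matthew".
        surname, forenames = [p.strip() for p in text.split(",", 1)]
    else:
        # Direct order: treat the final word as the surname.
        forenames, _, surname = text.rpartition(" ")
    inverted = f"{surname}, {forenames}" if forenames else surname
    return inverted, instrument


def match_key(inverted):
    """Order-independent key so variant forms of one name compare equal."""
    return frozenset(re.findall(r"\w+", inverted.lower()))


forms = ["Last, Matthew R.", "R. Matthew Last, piano", "Last, R. Matthew"]
parsed = [parse_performer(f) for f in forms]
print(parsed)
print(len({match_key(name) for name, _ in parsed}))  # 1: all three share a key
```

Nicknames (Jim versus James) would slip right past a key like this, so it would only catch the easy cases.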

Oh, authority control, how I miss you!

There are similar issues with ensemble names.  Some are more complete than others.  Sometimes ensembles are combined into one makeshift name, sometimes they are just listed together.

The one thing I haven't bothered with yet is lines that have two or more names or two or more ensembles.  I'm going to have to add some new columns.  I'm also not bothering with changing the names all into the same format or dealing with inconsistent punctuation.  For now, I just want to separate everything out and then I can go back through and do all those little details.  

I'm considering going on to the notes column and then coming back to the names once that is done.  Mostly because I'm expecting some duplicate info in the notes area.  That will be something to evaluate once I get done with this current column.  

Only about 3000+ lines to go!  

Thursday, January 12, 2012

Problems, problems, and more problems

There are lots of problems in the spreadsheet I am working off of.  I knew there would be, but I'm realizing that it is even worse than I could have imagined.

I've had various people over the years work on this spreadsheet, but at the time I didn't know that each recording was represented by multiple rows in the spreadsheet.  So they would sort the sheet by composer name and then be able to edit all the names to the authorized heading and only have to look up the composer once since they were listed in alphabetical order.  Sounds like a great plan, right?

So now the spreadsheet is organized by date (since that is the only way to know which rows belong together) and the composers are organized alphabetically within the concert or recital date.  Thus, I don't know the actual order in which the works were performed.

Why does this matter?  Because when we catalog sound recordings we list the works in a contents note in the order in which they were performed.  And if there are multiple performers who played on some works but not on others, we list which work they performed on, by number of the work (e.g. Kerri Baunach, clarinet (3rd work)), in a performers note.  I have no way of knowing that info at this time, so I am just listing the various performers or groups of performers in no particular order.

As a result, my performer notes aren't that helpful at the moment.  And my contents notes appear as if the works were all performed in alphabetical order according to the composer's name.  If you're a bit OCD, that might seem kind of cool.  For this OCD cataloger, it's not.

Other problems:

  • Misspellings in names of performers or groups
  • Inaccurate title info (how about Sarabande for guitar ensemble by J.S. Bach.  Or Fugue for guitar ensemble by Handel.  Seriously, no arrangers listed, no further title info to let me know WHICH sarabande or WHICH fugue.)
  • Incomplete title info (kind of goes along with the point above)
  • I have no idea which concerts or recitals have programs and which do not (I may need to contact the School of Music and spend some time with a scanner, which means going to the office and finding a babysitter.)
  • The authority work
As for dealing with the spreadsheet itself:
I scrolled down to the bottom of the sheet and found that I have about 10160+ rows of information.  I have so far converted the first 57 (well 56, row 1 is column headings) into 10 rows on a new spreadsheet that would create 10 MARC records.

That doesn't sound like much when I look at those numbers, but really ... that did take me a long time.  Refer back to the problems I'm dealing with.  Those problems are on each performance.  Every. Single. One.

On the spreadsheet, I've created four new tabs.  One for the student recitals and one for the ensemble concerts.  Since I'm creating columns for each MARC field, it seemed easier to do it this way so there was only one 1XX field in each spreadsheet: 100 on the student recitals and 110 on the ensembles.  Then I created a tab as a "transfer" space.  This is just for me to copy the info for the recording I'm currently working on over to this space so I can see it better.  It was getting hard on the master list to see just the parts I needed and I kept losing my place, thus wasting time.  It's working well so far.  Finally, the fourth tab I added today as a place to list the date of recordings where there is insufficient info and what that insufficient information is.  This will help when I have to go back and fix things; it should be easier to locate the problem items.

As for the original spreadsheet, I'm not changing anything on it.  I'm keeping it as a master in case I mess something up somewhere and need to refer back to something.  I've made the date column bold and as I complete a recording I un-bold those dates.  That way I can keep track of where I am visually, especially since I am working in short stints and sometimes have to walk away in the middle of something.  Seven-month-old babies don't like to be kept waiting.

I also now have MarcEdit on my work computer and someone sent me a link to a tutorial on YouTube.  So new part of the sabbatical project: learn how to use MarcEdit and transfer all these records I'm working on into the actual MARC format.

Lots going on.  Problems galore, lots of cutting and pasting between spreadsheets, heavy use of the authority file, and eventually learning a new program.  On top of that, I really have to figure out a better schedule!

Wednesday, January 4, 2012

The Sabbatical Starts

The Sabbatical Project has officially started!  It's nice to work at a more leisurely pace and be able to focus on one thing rather than juggling a gazillion responsibilities (seemingly).

The biggest challenge is just figuring out a schedule.  I'm doing this sabbatical project with two babies at home with me, currently 7 months old.  Today was tough, we got a little off their schedule so they were each sleeping and eating at different times from each other.  Not good!  On the positive side, that rarely happens, so I have hope that we'll be back on track tomorrow.

I have discovered that I can hold a baby in one arm, hold a bottle in that same hand, and have the other hand free to check email, type, and search other institutions' OPACs (ooooh, a new term for my Definitions page!).  That worked for one baby, not so much with the second baby.

One thing I did today was search for recital cataloging at other institutions.  This helped give me a better idea of what I really need to do with the data we have.  Biggest discovery: the data we have totally sucks.

The second thing today was looking at the data in our spreadsheet (used to be in a database format that is no longer supported, thus the spreadsheet).  Literally "looking."  I had a baby in my arms that was fascinated by the laptop keyboard.  I was a bit concerned looking at the first several lines of the spreadsheet.  I had to email the manager of our Fine Arts Media Center to see if programs were available for a couple mid-80s recitals or concerts so I could make sense of what I was looking at.  He provided the program for one and described the two cassette tapes for the others, which helped answer my questions.

(Should I add "Cassette tape" to my list of definitions?  It was recently brought to my attention that there is now a generation of people who don't know what a cassette tape is.  Wow, I'm getting old!)

Lastly, just a note about the spreadsheet I'm working on.  Each performed piece of music is listed on a separate row of the spreadsheet.  Each entry contains the recital date, the student's name, and then the composer and title of the piece and any notes (all in one cell, by the way).  That's it.

So imagine a student who gives a senior recital and performs 5 pieces.  Then they stay to get a master's degree and they give another recital performing another 5 pieces.  That's ten lines on the spreadsheet that will contain their name.  The date is the ONLY way I am able to tell which recital pieces go together; it is THE most important piece of information I have.

And then you have lines with the same date but different performers.  Oy!  Two different recitals on the same day?  One recital with different performers?
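Since the date is the only thing tying rows together, the grouping itself is simple to sketch in Python.  This is only an illustration: the column names "date", "name", and "piece" and the sample rows are invented, and the real spreadsheet is messier.  The second function lists the dates where more than one performer name turns up, which are exactly the "two recitals or one recital?" cases that need a person to look at a program.

```python
from collections import defaultdict


def group_by_date(rows):
    """Collect spreadsheet rows into one list per recital date."""
    recitals = defaultdict(list)
    for row in rows:
        recitals[row["date"]].append(row)
    return dict(recitals)


def dates_needing_review(recitals):
    """Dates where more than one performer name appears."""
    return sorted(
        date for date, rows in recitals.items()
        if len({row["name"] for row in rows}) > 1
    )


# Invented sample rows; one piece per row, as in the spreadsheet.
rows = [
    {"date": "1985-04-12", "name": "Last, R. Matthew", "piece": "Sonata"},
    {"date": "1985-04-12", "name": "Last, R. Matthew", "piece": "Nocturne"},
    {"date": "1986-03-02", "name": "Doe, Jane", "piece": "Partita"},
    {"date": "1986-03-02", "name": "Roe, Ann", "piece": "Fantasie"},
]

recitals = group_by_date(rows)
print(len(recitals["1985-04-12"]))     # 2 pieces on that date
print(dates_needing_review(recitals))  # ['1986-03-02']
```

Note that inconsistent forms of the same name would also trip the review check, so the names would want cleaning up first.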

I may be taking more trips into the office than I originally thought I would.  It'll be good for the babies to get out.

Tomorrow's goal: Set up a second tab on the spreadsheet for editing purposes.  I'm also considering moving all the info for each recital or concert into one row, rather than multiple rows.  And then maybe also separate out large ensemble concerts from the student recitals (more tabs).  I feel like this week is mainly about realistic organizing (as opposed to the previous planning I did); I'm getting a feel for how this is really going to work and what really needs to be done.  It's already looking a little different than I had thought it would.