Showing posts with label Data. Show all posts

Tuesday, June 12, 2012

In the Home Stretch

I can't believe this sabbatical time is almost at a close.  In a little less than three weeks I will be back at work and any further work on this project will become just a small part of my regular job again.  There is SO MUCH work that still needs to be done!!  I've done a lot but a lot more could be done.

Here's a quick summary of what I have tackled and hope to accomplish in the next few weeks.

The largest part of the project has been the database itself.  This is the main instigator of doing this project in the first place.  I have handled all the composer names and cleaned up a lot of other names (performers, arrangers, etc.) throughout most of the database: composers are done all the way through; other names are semi-done.

Currently I'm working through the notes, which has been way more of a headache than I anticipated.  But in the last few weeks I've gotten into a rhythm with those.  Basically I am taking all the notes and getting rid of duplicate information and condensing all the note info onto one line for each recording.  This is in contrast to the current setup where the notes appear on every line.  I had originally thought that they were all duplicates and that all I had to do was delete all lines but one and then arrange them appropriately into the columns I had established (general note, performers note, location note, etc.).  What a silly assumption!  Many of those lines of notes were specific to the piece that line represented.  So this is what I have done:

  • Find something in the notes section that can be or seems to represent a title for the whole recording and cut and paste it into the title column.  Delete all other instances of that title.
  • Move all performers into one line regardless of which pieces they do or do not perform on.  IF a program is available, designate which pieces individuals perform on within that note.  List each name or group once and delete all other instances of the name.
  • Move any other general notes into a general note field.  Delete all other instances of that note.
  • If a program is available create a time and place note (518) for the performance and recording.
  • If there is a note on a student recital regarding the recital being "in partial fulfillment of" a certain degree type, put it in a general note.  Delete all other instances of the note.  [May reconsider moving these notes to a 502 note later, for now keeping them general (500) notes.]
  • Delete notes made that are typically not used in MARC records for sound recordings.  In other words, if it doesn't fit, delete it.  [This was a really hard concept for me at first.  Delete information??  But something had to give and there were many instances of extraneous info.]
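The deduplicating part of those steps can be sketched in a few lines of Python.  This is only a rough illustration, not what I actually ran: it assumes each spreadsheet line has been reduced to a (recording, note) pair, and the function name, the sample data, and the use of the date as the recording key are all made up for the example.  It also only catches notes that are exact duplicates, so piece-specific notes still need a human eye.

```python
def condense_notes(rows):
    """Collapse per-piece note lines into one note line per recording.

    rows: iterable of (recording_id, note) pairs, one per spreadsheet line.
    Returns a dict mapping each recording_id to a single string holding
    its unique notes in first-seen order.
    """
    notes_by_recording = {}
    for recording_id, note in rows:
        seen = notes_by_recording.setdefault(recording_id, [])
        cleaned = " ".join(note.split())  # collapse stray whitespace
        if cleaned and cleaned not in seen:  # keep only the first copy
            seen.append(cleaned)
    return {rec: "; ".join(notes) for rec, notes in notes_by_recording.items()}


# Hypothetical sample: three lines for one recital, one for a concert.
sample_rows = [
    ("1985-03-02", "Senior recital"),
    ("1985-03-02", "Senior recital"),
    ("1985-03-02", "Jane Doe,  piano"),
    ("1985-04-10", "Wind Ensemble"),
]
condensed = condense_notes(sample_rows)
```

Sorting the surviving notes into the right columns (general, performers, location) is still a judgment call for each recording.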
Right now my goal is to finish working on these notes through 1989 (I'm almost there) which will give me 8 years' worth of recordings to play with in MARC format.  I am then going to copy and paste those 8 years into a new spreadsheet.  Once I have a new spreadsheet for them I will create two tabs and separate the recordings by recitals in one tab and large ensemble concerts in another tab.  For the most part every recording fits one of those two categories; exceptions are few and I can put those where they best fit.  Once I have them separated in this way, I can customize a few things that are specific to those types.  I should be able to complete all of this by the end of this week.  Next steps will be to finally move the info to MARC records.

The second part of my project was to create templates for future recordings.  I don't think this will be difficult to accomplish.  Especially with all the work I have done so far, I can see much more clearly what information we even have for these recordings and how I have been dealing with that info.  I think now that I have this experience, creating the templates won't be too hard.  That is unless I decide to create templates in RDA format (which makes the most sense, unfortunately).  Even then though, I don't think there would be too many differences.

The third part of the project was the digital side.  I have a proposal in my head for digitizing the programs and linking them up with the finished MARC records in our local catalog.  I haven't yet explored the possibilities of digitizing the sound.  Sometime in the next couple weeks I want to get my hands on some of the articles written about the Variations project at Indiana and maybe talk to someone who was heavily involved in that project.  I need to find out who else is doing something similar.  What I don't know is if they are digitizing recitals or not.  I'm not interested (at this time) in digitizing our regular sound collection.

Researching requirements of recital programs (the fourth aspect of my project) has also not happened.  The first part of this project really was very time consuming!  This is something I REALLY want to do though.  I know the approach I want to take with this, but haven't been able to devote the time to it yet.

Finally, surveying other institutions on their cataloging practices regarding recital and concert performances at their institutions.  I have put together a survey and I sent it out to a few colleagues just for feedback.  I have received that feedback and made a few changes, but I need to make more based on that feedback.  But then I stopped short of actually setting it up as a survey (I originally thought I'd get a survey out before the end of May).  The reason is that I put together a short survey for a personal project and realized in that process that the setup I had in mind for this survey wasn't going to work the way I wanted it to.  Plus it occurred to me that the university may have guidelines that need to be followed in doing surveys for research and since I want to eventually publish an article with the information I gather, I need to investigate further.  Plus, maybe the university has a survey tool that I could take advantage of that would be better than the free tools I have been using online for other projects.  So this aspect of my project has started to look bigger than I anticipated.

So that's where I stand right now.  In the last month of this project I am looking back at the project overall, what I have accomplished, what I know I can get done in the remaining three weeks, and what still needs to be done and I wonder if this project was too big from the beginning.  I knew it was going to be a lot (and one aspect was added by my superiors {part 3--researching digitization possibilities} so that wasn't even in my original plan) and it has proved itself so.

It probably didn't help that we sold a house, bought a house, and moved during my sabbatical.  That probably cost me about a month of time.  But other than that I have worked pretty regularly most every day.  I do feel like I have accomplished a lot and I know this database far better than I ever did.  If I can make MARC records of the first 8 years of the database and get templates established for creating MARC records from 2012 forward, that will be a big boost to what was there before.  And I will know how to deal with the years 1990-2004 and everything added since 2004 in a much more efficient way.

Friday, April 20, 2012

Half Way Plus

First of all, I have to say that I had really hoped to make more use of this blog.  Here we are at the halfway point (well, a little past the halfway point) and this is only my ninth post.  Um, yeah ... things have been slow going.

So being that we are beyond the half way point, I really should evaluate where I am.

Number one on my Project List: Clean up the recital database.  Is this done?  No.  Is it much improved?  Yes!

  • I have separated out all personal names from all corporate names in the "performers" column of the database.
  • In the composer column I have cleaned up the vast majority of the names, fixed all the problem names that were in red (except for about 10) and blue, but decided to leave the problems that were in yellow and green (more on that later)
  • In many places there was more than one name in the composer column.  I separated out all the extra people and have them all in their own columns.
  • I started working on the Notes columns
What is still left to do:
  • Create a spreadsheet of all the names that are not authorized (yellow) or uncertain (green) so they can be looked at at a future time.  This was bogging me down too much and I finally decided that I could spend the next three months fixing all these or I could skip it and get more work done in other areas.  Skipping it made the most sense.
  • Authorize all the "other" names.  These are arrangers, transcribers, writers of lyrics, etc.
  • Notes, notes, notes, and more notes.  I found this extremely time consuming when I started it and difficult to do.  I feel like I need to see the programs for each one or at least the recordings in order to  better organize the notes.  They really are a terrible mess!!
Number two on the list: Creating templates.  Not been done yet.  This is something that I feel confident will be easy to do once I get to a point where I'm ready to move the spreadsheet into MARC records.  Plus, I know someone at another institution who told me she has a template already.  Cooperative cataloging at its finest!!

Number five on the list (yes, I know I skipped some): Survey.  I actually forgot that it was five on the list, but really my list isn't exactly in priority order.  Well, maybe a little bit, but I have my reasons for skipping three and four.  I did finally take all my notes from when I was at MOUG and MLA in Dallas and transcribe the questions and suggestions I got from my awesome, supremely more intelligent colleagues and I'm ready to start reorganizing that list and editing the questions into a survey.  I have a reference question out to help with the "demographics" part and I have a list of colleagues I respect (pared down from 300+ to about 5) who I would like to ask to look the survey questions over and give me suggestions before I send it out.  So I feel like I've made progress here.

Number three: investigate digital possibilities.  Haven't done this yet.  This feels like just an intellectual exercise, which is the kind of thing I like.  The kind of thing that led my Master's thesis adviser to tell me that I'm a great researcher but not much of a wordsmith (thankyouverymuch).  So this is on the back burner for now.

Number four: Requirements for recital programs.  Also an item that has not been done yet.  But I have started thinking about it.  That's a start, right?

A little beyond the halfway point and this is where we stand.  I feel like I still have so much to do with this nasty spreadsheet!!  There is just still so much wrong with it.

Wednesday, January 4, 2012

The Sabbatical Starts

The Sabbatical Project has officially started!  It's nice to work at a more leisurely pace and be able to focus on one thing rather than juggling a gazillion responsibilities (seemingly).

The biggest challenge is just figuring out a schedule.  I'm doing this sabbatical project with two babies at home with me, currently 7 months old.  Today was tough; we got a little off their schedule so they were each sleeping and eating at different times from each other.  Not good!  On the positive side, that rarely happens, so I have hope that we'll be back on track tomorrow.

I have discovered that I can hold a baby in one arm, hold a bottle in that same hand, and have the other hand free to check email, type, and search other institutions' OPACs (ooooh, a new term for my Definitions page!).  That worked for one baby, not so much with the second baby.

One thing I did today was search for recital cataloging at other institutions.  This helped give me a better idea of what I really need to do with the data we have.  Biggest discovery: the data we have totally sucks.

The second thing today was looking at the data in our spreadsheet (used to be in a database format that is no longer supported, thus the spreadsheet).  Literally "looking."  I had a baby in my arms that was fascinated by the laptop keyboard.  I was a bit concerned looking at the first several lines of the spreadsheet.  I had to email the manager of our Fine Arts Media Center to see if programs were available for a couple mid-80s recitals or concerts so I could make sense of what I was looking at.  He provided the program for one and described the two cassette tapes for the others, which helped answer my questions.

(Should I add "Cassette tape" to my list of definitions?  It was recently brought to my attention that there is now a generation of people who don't know what a cassette tape is.  Wow, I'm getting old!)

Lastly, just a note about the spreadsheet I'm working on.  Each performed piece of music is listed on a separate row of the spreadsheet.  Each entry contains the recital date, the student's name, and then the composer and title of the piece and any notes (all in one cell, by the way).  That's it.

So imagine a student who gives a senior recital and performs 5 pieces.  Then they stay to get a master's degree and they give another recital performing another 5 pieces.  That's ten lines on the spreadsheet that will contain their name.  The date is the ONLY way I am able to tell which recital pieces go together; it is THE most important piece of information I have.

And then you have lines with the same date but different performers.  Oy!  Two different recitals on the same day?  One recital with different performers?
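To make that concrete, here is a minimal Python sketch of grouping the rows by date and flagging the dates that list more than one performer.  The column names ("date", "performer", "piece"), the sample rows, and the function names are all assumptions for illustration; the real spreadsheet is messier, with composer, title, and notes crammed into one cell.

```python
from collections import defaultdict


def group_by_date(rows):
    """Group spreadsheet rows (dicts) by recital date."""
    groups = defaultdict(list)
    for row in rows:
        groups[row["date"]].append(row)
    return dict(groups)


def ambiguous_dates(rows):
    """Return the dates that list more than one performer.

    These are the cases that need a program (or the tape itself) to tell
    whether it was two recitals or one recital with several performers.
    """
    performers = defaultdict(set)
    for row in rows:
        performers[row["date"]].add(row["performer"])
    return sorted(date for date, names in performers.items() if len(names) > 1)


# Hypothetical rows: one piece per row, as in the spreadsheet.
sample = [
    {"date": "1985-03-02", "performer": "Doe, Jane", "piece": "Sonata"},
    {"date": "1985-03-02", "performer": "Doe, Jane", "piece": "Nocturne"},
    {"date": "1985-04-10", "performer": "Doe, Jane", "piece": "Etude"},
    {"date": "1985-04-10", "performer": "Roe, Richard", "piece": "Suite"},
]
recitals = group_by_date(sample)
needs_checking = ambiguous_dates(sample)
```

A check like this can only point at the dates that need a closer look; it can't answer the question of which situation it actually was.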

I may be taking more trips into the office than I originally thought I would.  It'll be good for the babies to get out.

Tomorrow's goal: Set up a second tab on the spreadsheet for editing purposes.  I'm also considering moving all the info for each recital or concert into one row, rather than multiple rows.  And then maybe also separate out large ensemble concerts from the student recitals (more tabs).  I feel like this week is mainly about realistic organizing (as opposed to the previous planning I did); I'm getting a feel for how this is really going to work and what really needs to be done.  It's already looking a little different than I had thought it would.