The presentation was prepared over the last week and will be given later today.


The final few weeks involved writing this up (hence no posts) and completing the final document.

This was submitted on the 27th of October.

Update – 5 Oct 10

Wednesday 29th

  • Had a meeting with Mike
    • Using inverse fractions to determine the ‘weight’ of a note depending on where it is located in a bar (beats 1, 2, 3, 4 of 4/4 music = 1, quavers off them = 2, and so on).
  • MIC – now running on a faster computer; it is still running today.
  • Started looking at a BLAST-type algorithm with indexes; discovered it was better to create my seq. alignment algorithm with these new weightings.
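
The bar-position weighting from the meeting could be sketched roughly as below. This is only one reading of "inverse fractions": the function name beat_weight and the rule "weight = denominator of the fractional part of the beat position" are my assumptions, chosen so that on-beat notes get 1 and quavers off the beat get 2. What "and so on" means for finer subdivisions is not stated above; here semiquaver positions come out as 4.

```python
from fractions import Fraction


def beat_weight(position_in_beats):
    """Weight of a note onset, given its offset from the start of the bar in beats.

    Assumption (not from the notes above): the weight is the denominator of
    the fractional part of the position, so x.0 -> 1, x.5 -> 2, x.25/x.75 -> 4.
    """
    offset_within_beat = Fraction(position_in_beats) % 1
    return offset_within_beat.denominator


# Beat 1 of the bar, the quaver after beat 2, and a semiquaver position.
for pos in (Fraction(0), Fraction(3, 2), Fraction(7, 4)):
    print(pos, beat_weight(pos))
```

Using Fraction rather than floats keeps positions such as 1/3 of a beat (triplets) exact.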

Monday 4th October

  • Finished the seq. alignment algorithm; it is currently running.

Tuesday 5th October

  • Sequence alignment has shown interesting results after running overnight. 3 melodies came up with a 100% match; it turns out they are the same melody, transposed. This has given me some points to look for in other techniques too. While doing this I checked the MIC matchings of the same files and discovered that they have a close percentage match too. I will use these to check against my BLAST-like algorithm as well when that is implemented (today’s job).
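
As an illustration of why transposed copies can score 100%, here is a minimal sketch that aligns melodies on their pitch intervals rather than absolute pitches. It is not the algorithm described in these posts (which also uses the bar-position weightings); the function names, the +1/-1/-1 scoring and the percentage normalisation are all assumptions made for the example.

```python
def to_intervals(pitches):
    """Convert MIDI pitch numbers to successive semitone intervals."""
    return [b - a for a, b in zip(pitches, pitches[1:])]


def alignment_score(a, b, match=1, mismatch=-1, gap=-1):
    """Global (Needleman-Wunsch style) alignment score of two sequences."""
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            curr.append(max(diag, prev[j] + gap, curr[j - 1] + gap))
        prev = curr
    return prev[-1]


def percent_match(pitches_a, pitches_b):
    """Alignment score over intervals, scaled to 0-100 (assumed normalisation)."""
    ia, ib = to_intervals(pitches_a), to_intervals(pitches_b)
    longest = max(len(ia), len(ib))
    if longest == 0:
        return 100.0
    return 100.0 * max(alignment_score(ia, ib), 0) / longest


original = [60, 62, 64, 65, 67]
transposed = [p + 5 for p in original]  # same tune, a fourth higher
print(percent_match(original, transposed))
```

Because transposition shifts every pitch by the same amount, the interval sequences are identical and the alignment is a perfect match.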


Currently I am parsing MusicXML files so I can obtain more accurate data from my input, as MIDI is quite restricted.

Whilst implementing it I am obtaining example MusicXML files to place in my database once implementation is complete.


While looking into beat detection, I came across MusicXML, which appears to give a better indication of many features of a note, and explicitly specifies beats, note lengths, and even grace notes.

Implementing the use of MusicXML appears to have many advantages and could make many things easier.
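
To give an idea of what MusicXML exposes per note, here is a small sketch using Python's standard xml.etree.ElementTree. The SAMPLE string is a trimmed, hand-written fragment (not a complete valid score and not one of my DB files), and read_notes is a hypothetical helper; the element names (note, pitch, step, octave, duration, grace, rest) are standard MusicXML.

```python
import xml.etree.ElementTree as ET

# Hand-written fragment for illustration only; a real file also has a
# part-list, attributes (divisions, time signature) and so on.
SAMPLE = """<score-partwise version="3.1">
  <part id="P1">
    <measure number="1">
      <note><grace/><pitch><step>D</step><octave>4</octave></pitch><type>eighth</type></note>
      <note><pitch><step>C</step><octave>4</octave></pitch><duration>4</duration><type>quarter</type></note>
      <note><rest/><duration>4</duration><type>quarter</type></note>
    </measure>
  </part>
</score-partwise>"""


def read_notes(xml_text):
    """Return one dict per <note>, flagging grace notes and rests."""
    root = ET.fromstring(xml_text)
    notes = []
    for measure in root.iter("measure"):
        for note in measure.findall("note"):
            pitch = note.find("pitch")
            notes.append({
                "measure": measure.get("number"),
                "step": pitch.findtext("step") if pitch is not None else None,
                "octave": int(pitch.findtext("octave")) if pitch is not None else None,
                # Grace notes carry no <duration>, so default to 0.
                "duration": int(note.findtext("duration", default="0")),
                "is_grace": note.find("grace") is not None,
                "is_rest": note.find("rest") is not None,
            })
    return notes


for n in read_notes(SAMPLE):
    print(n)
```

The grace note is marked explicitly by its own element, which is something that has to be guessed heuristically from MIDI.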

Scatter Plot 2

Did this across melody IDs 1-100 in the larger DB and got some interesting results.

This scatter plot shows 4 interesting melodies:

id: 4, 23, 47, 62

23, 47, 62 are very simple melodies, and consist of one or very few notes.

4, however, is quite elaborate, and looking at this melody could be quite interesting.

Average compression of these 100 melodies is 240.7

mId:4 -> 1822

mId:23 -> 44

mId:47 -> 20

mId:62 -> 17

Scatter Plot

Created a scatter plot of MIC (Mutual Information Content), firstly on the small DB, which contains: 1-4 (Land Down Under), 5-6 (Kookaburra), 7-12 (Random Melody).

Getting Ready for index processing (BLAST style)

Today I started getting classes ready to run a BLAST-style seq. alignment from the created indexes. For testing I set up a small DB with 12 melodies so that it runs quickly while I am creating this.


Grace Note Detection

While trying to gather stats on MIDI inputs, I discovered that there was an error with my import of melodies.

Fixed this issue; it had to do with reading the note on/offs in a MIDI file.
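
For reference, a minimal sketch of pairing note-on and note-off events into notes. The exact bug is not recorded above, so this is not the fix itself; it just shows one well-known pitfall, namely that a note-on with velocity 0 has to be treated as a note-off. The event tuples are a hypothetical simplified form of decoded MIDI messages, and pair_notes is an illustrative name.

```python
def pair_notes(events):
    """Pair note-on/note-off events into (pitch, start, end) tuples.

    `events` is a list of (time, kind, pitch, velocity) tuples; this layout
    is an assumption for the example, not a real MIDI library's format.
    """
    open_notes = {}  # pitch -> start times of notes still sounding
    notes = []
    for time, kind, pitch, velocity in sorted(events, key=lambda e: e[0]):
        # A note-on with velocity 0 means "note off" in MIDI.
        is_off = kind == "note_off" or (kind == "note_on" and velocity == 0)
        if is_off:
            starts = open_notes.get(pitch)
            if starts:
                notes.append((pitch, starts.pop(0), time))
        elif kind == "note_on":
            open_notes.setdefault(pitch, []).append(time)
    return sorted(notes, key=lambda n: n[1])


events = [
    (0, "note_on", 60, 64),
    (480, "note_on", 60, 0),    # velocity 0: ends the C
    (480, "note_on", 62, 64),
    (960, "note_off", 62, 0),
]
print(pair_notes(events))
```

Unmatched note-offs are simply ignored here; a real importer might want to log them instead.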

Indexing Stats:

1747 Melodies in my DB (took 53 mins to compress)

Motif Length 3: 1,339,216 rows created

Motif Length 4: 2,567,181 rows created

Motif Length 5: 5,574,372 rows created

Now need to create a BLAST-like algorithm that can use these indexes and extend on them.

-> Scoring for sequence alignments, based on clusters? Need to look at beat detection to test seq. alignment with clusters.
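
A rough sketch of the seed-and-extend idea on a motif index, to make the plan concrete. Everything here is an assumption for illustration: the in-memory dict stands in for the DB index tables, melodies are plain lists of symbols (e.g. intervals), extension is ungapped and exact-match only, and the names build_index, extend_hit and best_hits are made up. The scoring question above (clusters, beat weights) is left out.

```python
def build_index(melodies, k):
    """Map each motif of length k to the (melody_id, offset) rows where it occurs."""
    index = {}
    for mid, seq in melodies.items():
        for i in range(len(seq) - k + 1):
            index.setdefault(tuple(seq[i:i + k]), []).append((mid, i))
    return index


def extend_hit(query, target, q_pos, t_pos, k):
    """Grow an exact seed match left and right while the symbols agree."""
    left = 0
    while (q_pos - left > 0 and t_pos - left > 0
           and query[q_pos - left - 1] == target[t_pos - left - 1]):
        left += 1
    right = k
    while (q_pos + right < len(query) and t_pos + right < len(target)
           and query[q_pos + right] == target[t_pos + right]):
        right += 1
    return left + right  # length of the extended match


def best_hits(query, melodies, k=3):
    """Longest extended match per melody for every seed found in the index."""
    index = build_index(melodies, k)
    best = {}
    for q_pos in range(len(query) - k + 1):
        for mid, t_pos in index.get(tuple(query[q_pos:q_pos + k]), []):
            length = extend_hit(query, melodies[mid], q_pos, t_pos, k)
            best[mid] = max(best.get(mid, 0), length)
    return best


melodies = {1: [2, 2, 1, 2, 2, 2, 1], 2: [0, 0, 5, -5, 0], 3: [7, 2, 2, 1, 2, 9]}
index = build_index(melodies, 3)
print(sum(len(rows) for rows in index.values()), "rows for motif length 3")
print(best_hits([2, 2, 1, 2], melodies))
```

The row count printed for the toy data is the same kind of figure as the "rows created" stats above, just on a tiny scale.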

Mutual Information Content

Run across the whole DB, collect statistics, and place them into a scatter plot to show percentages. Could be interesting?

Insert some of the same melodies into the DB intentionally to see if they are detected; also insert random melodies.


Heuristic, what about turns?

Not on the beat, less than a standard length – gather a standard for the melody?

Compression Results

       Normal   Compressed
A        44         28
B        46         31
A+B      90         49

A/(A+B) = 28/49 = 57%

B/(A+B) = 31/49 = 63%
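
The same arithmetic as a small sketch, in case it is useful for running over more pairs. zlib is only a stand-in here (the compressor used for the figures above is not named in these notes), and the helper names compressed_size, compression_table and share_of_combined are made up for the example.

```python
import zlib


def compressed_size(data: bytes) -> int:
    """Size in bytes after compression (zlib is an assumed stand-in)."""
    return len(zlib.compress(data, 9))


def compression_table(a: bytes, b: bytes) -> dict:
    """Compressed sizes of A, B and the concatenation A+B."""
    return {
        "A": compressed_size(a),
        "B": compressed_size(b),
        "A+B": compressed_size(a + b),
    }


def share_of_combined(part_size: int, combined_size: int) -> int:
    """Compressed size of one melody as a rounded percentage of compressed A+B."""
    return round(100 * part_size / combined_size)


# Reproduce the percentages from the compressed sizes in the table.
print(share_of_combined(28, 49))
print(share_of_combined(31, 49))

# With real byte strings the same table can be built directly.
melody = bytes(range(44))
print(compression_table(melody, melody))
```

When A and B share material, the compressed A+B is smaller than the two compressed sizes added together, which is what makes the ratio informative.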

Is this not filled with lots of stuff between/around?

Domain agnostic – leave the chords