Friday, 15 September 2017

Match of the Day

- or a simple example of using OpenCV Template Matching (or where is Billy Bunter these days?).


This post is not about sports or old Genesis songs - but a bit about comics and mostly about Delphi and OpenCV - the Open Source Computer Vision Library.


For a couple of years I have spent some time indexing, approving and editing, like other volunteers - on GCD/comics.org, that is - where the vision statement says:

"The Grand Comics Database™ Project intends to be the most comprehensive online comics database for comic readers, collectors, scholars and professionals."

Some do crosswords or Sudoku - I do comics indexing, and feel free to contribute if you find that kind of detective work interesting, or have boxes of old comics lying around :).

Disclaimer: As always, this "tutorial essay" is a conceptual example, and to keep the code to a minimum, exception handling has been left out.

By using OpenCV and a few lines of code, I thought I would see if I could start making a small tool to minimize my frustrations when identifying and indexing reprints of comics - living in a country that made heavy use of reprints from all over the place.

The various reprint editors did not make it easier when they edited the panels, changed the names or otherwise added their creative touch - which often makes things even harder to find.

The search features on comics.org are great when you have just a bit of textual info, but when I stumble upon a picture reference I have seen before, it is often hard to find it again in my collection.

To the rescue - OpenCV are GO.


Since I am fairly new to OpenCV - bear with me - I decided to try to solve my problem using either Template Matching or Cascade Classification and Facial Recognition. I started off with Template Matching since it seemed simpler. I was also afraid that Face Recognition would fail out of the box on comic characters and anthropomorphic funny animals - and that Cascade Classification would require some "training" to get a high-confidence hit rate.

My need was also to find panels (or parts of these), since many comic blogs reference a panel when revealing useful information - so matching a template/patch seemed the obvious choice.

Find the official doc for Template Matching here.

I will be using Delphi-OpenCV translations by Laentir Valetov and Mikhail Grigorev found on GitHub here.

Follow the instructions given in the README.md on GitHub - and try some of the samples to ensure that everything is working.

Apart from the classes and library wrappers, there are also VCL and FMX components, which make it very easy and natural for Delphi developers to build something with a minimum of code. In this case I will refrain from using the provided components - and just use standard components found in the free Starter Edition and up.

I will just start a new VCL project, throw in 3 TButtons, 2 TImages, a TListBox, 2 TLabels, a TFileOpenDialog and a TStatusBar. Placed, aligned and anchored I get something like this at design-time:


The only property changes worth mentioning are setting the TImages' Proportional and Stretch to True, and TListBox.Style to lbVirtual - more on that later.

1. Loading the template.


Before starting, add the following to your uses clause:

  Vcl.Imaging.jpeg, Vcl.FileCtrl, System.Generics.Collections, System.Generics.Defaults, ocv.highgui_c, ocv.core_c, ocv.core.types_c, ocv.imgproc_c, ocv.imgproc.types_c, System.IOUtils

On the first button's OnClick event I added the following code:

  if fileOpenDialog.Execute then
  begin
    templateFileName := fileOpenDialog.FileName;
    imgTemplate.Picture.LoadFromFile(templateFileName);
    templw := imgTemplate.Picture.Width;
    templh := imgTemplate.Picture.Height;
  end;

It is bloated with a couple of variables I use later - but basically the first TImage is loaded with a file picked by the user via the dialog.

2. Selecting where to search.


The second button does a bit more than just showing a dialog to select a directory:

  if not Assigned(FilesToMatch) then
    FilesToMatch := TObjectList<TFileMatchAttr>.Create(True);
  fileList.Clear;
  if Vcl.FileCtrl.SelectDirectory(
    'Please select directory with files (currently jpg only), you want to do a template match on. Subdirectories are included.',
    TPath.GetPicturesPath, dir, []) then
  begin
    ListDirectory(dir+'\*.jpg', FilesToMatch);
    fileList.Count := FilesToMatch.Count;
    fileList.Invalidate;
    lblSearchDir.Caption := '...in '+FilesToMatch.Count.ToString+' file(s), in: '+ dir;
  end;

I start off by creating a generic TObjectList<T>, letting it own the objects it is going to contain, since I want more than a plain TListBox would give me. It holds objects that are defined as:

  TFileMatchAttr = class(TObject)
    Name: string;
    Fullpath: string;
    Confidence: Double;
    MatchRect: TRect;
  end;

So for every file I run the Template Matching against, we keep the filename (for the display), the full path, how confident we think the match was and the position of the match in the given image.

I have left out the ListDirectory helper - which recursively finds all the *.jpg files in the directory and its sub-folders - but you can find the whole thing in the source. It creates items and adds them to the TObjectList - and there is no need to worry about freeing them, since that is auto-magically done by the owner.
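As a rough idea of what such a helper could look like, here is a minimal console sketch. It is not the ListDirectory from the source: CollectFiles and TFileEntry are made-up names, and instead of hand-written recursion it leans on TDirectory.GetFiles with soAllDirectories to walk the sub-folders. True to the disclaimer, there is no exception handling, so a folder you are not allowed to read will stop it.

```pascal
program ListJpegsSketch;

{$APPTYPE CONSOLE}

uses
  System.SysUtils, System.IOUtils, System.Generics.Collections;

type
  // Hypothetical, trimmed-down cousin of TFileMatchAttr (no match fields)
  TFileEntry = class
    Name: string;
    Fullpath: string;
  end;

// Hypothetical stand-in for ListDirectory: adds one entry per file matching
// Mask found in Dir or any of its sub-folders.
procedure CollectFiles(const Dir, Mask: string; List: TObjectList<TFileEntry>);
var
  FileName: string;
  Entry: TFileEntry;
begin
  for FileName in TDirectory.GetFiles(Dir, Mask, TSearchOption.soAllDirectories) do
  begin
    Entry := TFileEntry.Create;
    Entry.Name := TPath.GetFileName(FileName); // short name for display
    Entry.Fullpath := FileName;                // full path for loading
    List.Add(Entry);                           // the list now owns Entry
  end;
end;

var
  Files: TObjectList<TFileEntry>;
  Entry: TFileEntry;
begin
  Files := TObjectList<TFileEntry>.Create(True);
  try
    CollectFiles(TPath.GetPicturesPath, '*.jpg', Files);
    for Entry in Files do
      Writeln(Entry.Fullpath);
    Writeln(Files.Count, ' file(s) found');
  finally
    Files.Free;
  end;
end.
```

Note that this variant takes the directory and the mask as two separate arguments, whereas the call in the button handler passes them as one combined string.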

Note that I set the count on the TListBox and invalidate it - the reason being that I want to update it through its OnData event - this is also the reason the Style property needs to be lbVirtual.

  Data := FilesToMatch.Items[Index].Name+
  ' ('+FormatFloat('0.00', FilesToMatch.Items[Index].Confidence*100)+'%)';

This populates the TListBox with the filename and the percentage of confidence in the match whenever Invalidate is called.

3. Firing OpenCV up, looking for matches and storing the results.


On the third button I have the following code:

procedure TmainForm.btnStartMatchClick(Sender: TObject);
var
  imgSrc, imgTempl, imgMat: pIplImage;
  min, max: double;
  p1, p2: TCvPoint;
  fma: TFileMatchAttr;
  tfile, sfile: pCVChar;
  i: Integer;
begin
  i  := 0;
  tfile := pCVChar(AnsiString(templateFileName));
  imgTempl := cvLoadImage(tfile, CV_LOAD_IMAGE_GRAYSCALE);
  for fma in FilesToMatch do
  begin
    sfile := pCVChar(AnsiString(fma.Fullpath));
    imgSrc := cvLoadImage(sfile, CV_LOAD_IMAGE_GRAYSCALE);
    imgMat := cvCreateImage(CvSize(imgSrc.width-imgTempl.width+1, imgSrc.height-imgTempl.height+1), IPL_DEPTH_32F, 1);
    cvMatchTemplate(imgSrc, imgTempl, imgMat, CV_TM_CCOEFF_NORMED);
    cvMinMaxLoc(imgMat, @min, @max, nil, @p1, nil);
    fma.Confidence := max;
    p2.X := p1.X + templw - 1;
    p2.Y := p1.Y + templh - 1;
    fma.MatchRect := Rect(p1.x, p1.y, p2.x, p2.y);
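    inc(i);
    StatusBar1.SimpleText := 'Files processed: '+i.ToString;
    // Release the per-file images here - otherwise every iteration leaks two images
    cvReleaseImage(imgSrc);
    cvReleaseImage(imgMat);
  end;
  // The template is shared by all iterations, so it is released once at the end
  cvReleaseImage(imgTempl);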

First a couple of OpenCV specific types are declared; 3 images - where we are looking, what we are looking for and the result matrix. And some string types for fun :/

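The template image is loaded and converted to grayscale - and to ensure the colour depth is the same, the images we want to search in get the same treatment. The result matrix "image" is then created with the correct size: one cell for every position the template can take, so (W - w + 1) by (H - h + 1) for a W x H source and a w x h template.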

Then we run the cvMatchTemplate function with the matching method CV_TM_CCOEFF_NORMED - since it seemed to give the best results in my case. After that, cvMinMaxLoc locates the maximum confidence and the point where the best match was.
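To get a feel for what that score means, here is a minimal sketch of the normalised correlation coefficient for a single template position, as I understand the formula in the OpenCV documentation: subtract the mean from both patches, multiply them cell by cell, and divide by the square root of the product of their summed squares. CcoeffNormed is my own illustrative function, not part of OpenCV or Delphi-OpenCV, and it assumes neither patch is completely flat (that would mean a division by zero).

```pascal
program CcoeffNormedSketch;

{$APPTYPE CONSOLE}

// Illustrative only: correlation coefficient of two equally sized patches,
// each given as a flat list of grey values.
function CcoeffNormed(const Templ, Patch: array of Double): Double;
var
  k: Integer;
  meanT, meanP, dT, dP, num, sumT2, sumP2: Double;
begin
  Assert(Length(Templ) = Length(Patch), 'patches must have the same size');

  // Mean brightness of each patch
  meanT := 0;
  meanP := 0;
  for k := 0 to High(Templ) do
  begin
    meanT := meanT + Templ[k];
    meanP := meanP + Patch[k];
  end;
  meanT := meanT / Length(Templ);
  meanP := meanP / Length(Patch);

  // Correlate the deviations from the mean
  num := 0;
  sumT2 := 0;
  sumP2 := 0;
  for k := 0 to High(Templ) do
  begin
    dT := Templ[k] - meanT;
    dP := Patch[k] - meanP;
    num := num + dT * dP;
    sumT2 := sumT2 + dT * dT;
    sumP2 := sumP2 + dP * dP;
  end;

  // Assumes non-constant patches, otherwise the denominator is zero
  Result := num / Sqrt(sumT2 * sumP2);
end;

begin
  // Identical patch
  Writeln(CcoeffNormed([10, 20, 30, 40], [10, 20, 30, 40]):0:2);
  // Same pattern, but 50 grey levels brighter
  Writeln(CcoeffNormed([10, 20, 30, 40], [60, 70, 80, 90]):0:2);
  // Inverted pattern
  Writeln(CcoeffNormed([10, 20, 30, 40], [40, 30, 20, 10]):0:2);
end.
```

The first two calls print 1.00 and the last one prints -1.00: subtracting the mean makes the score indifferent to a uniform brightness shift, which is handy when reprints were scanned or printed lighter or darker than the original. It is also why the maximum, not the minimum, of the result matrix is the interesting value for this method.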

I update the items in my generic list with the confidence and the TRect data for the match. A bit of housekeeping is also needed - releasing the OpenCV "images" once they are no longer in use.

Then I sort the list - getting the best matches first - by using an anonymous method for the IComparer function, and update the TListBox.

// Sort according to level of confidence - update ListBox
  FilesToMatch.Sort(TComparer<TFileMatchAttr>.Construct(
      function (const L, R: TFileMatchAttr): integer
      begin
         if L.Confidence=R.Confidence then
            Result:=0
         else if L.Confidence > R.Confidence then
            Result:=-1
         else
            Result:=1;
      end)
  );
  fileList.Invalidate;
end;


4. Showing the results and closing.


Lastly, on the TListBox.OnClick event I want the selected file's image displayed and a red rectangle drawn to indicate where we found the match.

procedure TmainForm.fileListClick(Sender: TObject);
var
  pic: TPicture;
  bmp: TBitmap;
  R: TRect;
begin
  R := FilesToMatch.Items[fileList.ItemIndex].MatchRect;
  pic := TPicture.Create;
  try
    pic.LoadFromFile(FilesToMatch.Items[fileList.ItemIndex].Fullpath);
    bmp := TBitmap.Create;
    try
      bmp.Width := pic.Width;
      bmp.Height := pic.Height;
      bmp.Canvas.Draw(0, 0, pic.Graphic);
      bmp.Canvas.Pen.Color := clRed;
      bmp.Canvas.Pen.Width := 10;
      bmp.Canvas.Polyline([R.TopLeft, Point(R.Right, R.Top), R.BottomRight, Point(R.Left, R.Bottom), R.TopLeft]);
      imgPreview.Canvas.StretchDraw(Rect(0, 0, imgPreview.Width, imgPreview.Height), bmp);
    finally
      bmp.Free;
    end;
  finally
    pic.Free;
  end;
end;

This ended up a bit of a mess - I need to draw the jpeg file onto a bitmap canvas and add the rectangle there, before I stretch it onto the preview TImage. Doing that on a TocvView would probably have been trivial - but I did refrain from using the components.

Also take a look at the Delphi-OpenCV\samples\Components\cMatchTemplate project that is using the web camera and doing Template Matching on the live video stream - with practically no code.

On the form's OnCloseQuery event I free the TObjectList<T>, so that it, as the owner, can free its items.
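If you want to convince yourself that the owner really does the cleaning up, a small console sketch along these lines should do. TNoisyItem is a made-up class that just announces its own destruction; the file names are placeholders.

```pascal
program OwnedListSketch;

{$APPTYPE CONSOLE}

uses
  System.SysUtils, System.Generics.Collections;

type
  // Hypothetical item class that reports when it is destroyed
  TNoisyItem = class
  public
    Name: string;
    constructor Create(const AName: string);
    destructor Destroy; override;
  end;

constructor TNoisyItem.Create(const AName: string);
begin
  inherited Create;
  Name := AName;
end;

destructor TNoisyItem.Destroy;
begin
  Writeln('Freeing ', Name);
  inherited;
end;

var
  Items: TObjectList<TNoisyItem>;
begin
  // True = the list owns its items and frees them itself
  Items := TObjectList<TNoisyItem>.Create(True);
  try
    Items.Add(TNoisyItem.Create('page01.jpg'));
    Items.Add(TNoisyItem.Create('page02.jpg'));
  finally
    // Freeing the list prints one "Freeing ..." line per item it still holds
    Items.Free;
  end;
end.
```

Had the list been created with False instead, nothing would be printed and both items would leak.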

Support the contributors of Delphi-OpenCV and others who help us make fun things easier.

I will probably start reading up on all my blank spots in this huge topic - and improve the tool for personal use - adding .cbz support, GCD data awareness (local REST api), search matches from mobile, cross-platform and optional "face recognition".

"I probably also need ter add more image re-sharpin' and wot else - ter cop end better results."

Sorry - just playing with a Cockney dialect translator... oops.

Source code can be found here.

Enjoy.