Automatically tagging, captioning and categorising locally stored images using the Azure Computer Vision API

It’s easy in the digital age to amass tens of thousands of photos (or more!). Categorising these can be a challenging task, let alone searching through them to find that one happy snap from 10 years ago.

Significant advances in machine learning over the past decade have made it possible to automatically tag and categorise photos without user input (assuming a machine learning model has been pre-trained). Many social media and photo sharing platforms make this functionality available for their users — for example, Flickr’s “Magic View”.  What if a user has a large number of files stored locally on a Hard Disk?

The problem

  • 49,049 uncategorised digital images stored locally
  • Manual categorisation
  • No easy way to search (e.g. “red dress”, “mountain”, “cat on a mat”)

The solution

Steps

  1. Obtain a Microsoft Azure cloud subscription (note – Azure is not free, however free trials may be available):
    https://azure.microsoft.com/en-us/free/
  2. Start a cognitive services account from the Azure portal and take note of one of the “Keys” (keys are interchangeable):
    https://portal.azure.com/
    computer_vision-azure_keys
  3. Log in to your Linux machine and ensure you have python3 installed:
    user@host.site:~> which python3
    /usr/bin/python3
  4. Ensure you have these python libraries installed:
    sudo su -
    pip3 install python-xmp-toolkit
    pip3 install argparse
    pip3 install Pillow
    exit
  5. Obtain a copy of the image-auto-tag script:
    git clone https://github.com/niftimusmaximus/image-auto-tag
  6. Automatically tag, caption and categorise an image (e.g. image.jpg):
    cd image-auto-tag
    ./image-auto-tag.py --key XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
      --captionConfidenceLevel 0.50 --tagConfidenceLevel 0.5
      --categoryConfidenceLevel 0.5 image.jpg

    Note – replace key with one of the ones obtained from the Azure Portal above

    Script will process the image:

    INFO: [image.jpg] Reading input file 1/1                                                                                                                      
    INFO: [image.jpg] Temporarily resized to 800x600                                                                                                              
    INFO: [image.jpg] Uploading to Azure Computer Vision API
                      (length: 107330 bytes)                                                                               
    INFO: [image.jpg] Response received from Azure Computer Vision API
                      (length: 1026 bytes)                                                                       
    INFO: [image.jpg] Appended caption 'a river with a mountain in the
                      background' (confidence: 0.67 >= 0.50)                                                     
    INFO: [image.jpg] Appended category 'outdoor_water'
                      (confidence: 0.84 >= 0.50)                                                                                
    INFO: [image.jpg] Appending tag 'nature' (confidence: 1.00 >= 0.50)                                                                                           
    INFO: [image.jpg] Appending tag 'outdoor' (confidence: 1.00 >= 0.50)                                                                                          
    INFO: [image.jpg] Appending tag 'water' (confidence: 0.99 >= 0.50)                                                                                            
    INFO: [image.jpg] Appending tag 'mountain' (confidence: 0.94 >= 0.50)                                                                                         
    INFO: [image.jpg] Appending tag 'river' (confidence: 0.90 >= 0.50)                                                                                            
    INFO: [image.jpg] Appending tag 'rock' (confidence: 0.89 >= 0.50)                                                                                             
    INFO: [image.jpg] Appending tag 'valley' (confidence: 0.75 >= 0.50)                                                                                           
    INFO: [image.jpg] Appending tag 'lake' (confidence: 0.60 >= 0.50)                                                                                             
    INFO: [image.jpg] Appending tag 'waterfall' (confidence: 0.60 >= 0.50)                                                                                        
    INFO: [image.jpg] Finished writing XMP data to file 1/1
  7. Verify the results:
    Auto tagging

    computer_vision-keyword_search
    API has applied “tags” which can be searched

    Auto captioning

    computer_vision-auto_caption
    API has captioned this image as “a beach with palm trees”

    Auto categorisation

    "plant_tree" hierarchical category has been applied
    API has applied the category “plant_tree” to this image

    Note – please see here for the API’s 86 category taxonomy

Script features

  • Writes to standard XMP metadata tags within JPG images which can be read by image management applications such as XnView MP and digiKam
  • Sends downsized images to Azure to improve performance

    Example
    – only send image of width 640 pixels (original image will retain its dimensions)

    ./image-auto-tag.py --key XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
    --azureResizeWidth 640 image.jpg
  • Allows customisation of thresholds for tags, description and caption. This is useful because whilst good, the API is not perfect!

    Example – only caption image if caption confidence score from API is 0.5 or above:

    ./image-auto-tag.py --key XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
    --captionConfidenceLevel 0.5 image.jpg

Useful date formulas for Hive

Hive comes with some handy functions for transforming dates.  These can be helpful when working with date dimension tables and performing time-based comparisons and aggregations.

e.g. Convert a native Hive date formatted date string:

date_format(myDate,'dd-MM-yyyy')

Return the week number (within the year) of a particular date – i.e. first week of the year is 1, the week of new year’s eve is 52, etc:

weekofyear(myDate)

Other less obvious examples

Current month’s name (e.g. January, February, etc):

date_format(myDate, 'MMMMM')

First date of the current quarter:

cast(trunc(add_months(myDate,-pmod(month(myDate)-1,3)),'MM') as date)

Last date of the current quarter:

cast(date_add(trunc(add_months(myDate,3-pmod(month(myDate)-1,3)),'MM'),-1) as date)

Day number of the current quarter (e.g. April 2nd is day 2 of the second quarter, December 9th is day 70 of the fourth quarter, etc):

datediff(myDate,cast(trunc(add_months(myDate,-pmod(month(myDate)-1,3)),'MM') as date))+1