Fixing incorrect photo data

A few months ago I went to the States. Naturally, one big outcome of that trip was a handful of photographs documenting it: 1.6GB of them, to be precise. The bad part is that I had forgotten to reset the camera's date settings beforehand, so all the pictures were dated more than a year too early.

The images I took with my Canon camera (SX110) use JPEG for compression and Exif for the metadata, including the date and time each picture was taken, which is what we're interested in. Being the perfectionist that I am, and having some familiarity with Python, I set out to modify the metadata on the pictures so that it contains the right dates for the trip.

Looking at the task at hand, I decided that a small script in a rich, dynamic language would be a good fit, and what is better than Python for such a plan?

The first thing needed is a module to parse the aforementioned Exif format. Python has a fairly large user base, so it was quick and easy to find pyexiv2: a Python binding to the mature exiv2 image metadata library. The tutorial and interfaces seemed straightforward and useful, so there was no need to look for a different tool.

Looking at available Exif tags, the following seem to be the most relevant:

  1. Exif.Image.DateTime – The date and time of image creation. In the Exif standard, it is the date and time the file was changed.
  2. Exif.Photo.DateTimeOriginal – The date and time when the original image data was generated.
  3. Exif.Photo.DateTimeDigitized – The date and time when the image was stored as digital data.
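
To make these tags a little more concrete: in the file itself, each of them is a plain text value of the form "YYYY:MM:DD HH:MM:SS". The sketch below only illustrates that format using the standard library; the helper names are made up for this example, and the script itself relies on pyexiv2 handing the values over as datetime objects.

```python
from datetime import datetime

# Exif stores timestamps as text in the form "YYYY:MM:DD HH:MM:SS".
EXIF_DATETIME_FORMAT = "%Y:%m:%d %H:%M:%S"


def parse_exif_datetime(raw):
    """Convert a raw Exif timestamp string into a datetime object."""
    return datetime.strptime(raw, EXIF_DATETIME_FORMAT)


def format_exif_datetime(moment):
    """Convert a datetime object back into the Exif text form."""
    return moment.strftime(EXIF_DATETIME_FORMAT)


raw = "2009:09:19 09:33:37"  # sample value, for illustration only
parsed = parse_exif_datetime(raw)
print(parsed)                        # 2009-09-19 09:33:37
print(format_exif_datetime(parsed))  # 2009:09:19 09:33:37
```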

Adding some glue logic for traversing all images and making the right date changes, my final script is given below:

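import os
import datetime
import pyexiv2  # assumes a build exposing ImageMetadata (e.g. py3exiv2 on Python 3)

# 'have' is the wrong timestamp the camera stamped on one picture,
# 'want' is the real date and time that same picture was taken.
have = datetime.datetime(2009, 9, 19, 9, 33, 37)
want = datetime.datetime(2011, 1, 13, 11, 33, 37)
diff = want - have  # offset to add to every timestamp

# renamed from 'dir', which shadows a Python built-in
photo_dir = 'C:/20120611-155806'

for name in os.listdir(photo_dir):
    path = photo_dir + '/' + name
    metadata = pyexiv2.ImageMetadata(path)
    print('reading metadata for:', path, '..', end=' ')
    metadata.read()
    print('done')
    print('updating fields:', end=' ')
    for key in metadata.exif_keys:
        # patch every Exif tag whose name contains 'DateTime'
        if 'DateTime' in key:
            print(key, '..', end=' ')
            metadata[key].value = metadata[key].value + diff
    metadata.write()  # save the shifted dates back into the file
    print('done')

print('all done!')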

The business logic here is to “patch” all dates by adding a predefined time difference to each of them. The patched amount (‘diff’) was calculated by taking one of the images and subtracting the incorrect date the camera recorded for it (‘have’) from the real-world date it was taken (‘want’).
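
The arithmetic can be checked on its own, without touching any image. Here is a minimal sketch using only the standard library; ‘other_wrong’ is a hypothetical timestamp of another photo, invented purely for illustration.

```python
import datetime

have = datetime.datetime(2009, 9, 19, 9, 33, 37)   # what the camera recorded
want = datetime.datetime(2011, 1, 13, 11, 33, 37)  # when the picture was really taken
diff = want - have                                 # a datetime.timedelta
print(diff)  # 481 days, 2:00:00

# Hypothetical timestamp of another picture taken with the same wrong clock.
other_wrong = datetime.datetime(2009, 9, 20, 14, 0, 0)
other_fixed = other_wrong + diff
print(other_fixed)  # 2011-01-14 16:00:00
```

Since the camera clock was off by a constant amount, the same offset should be right for every picture taken before the clock was next changed.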

Here is a sample of the output:

..
reading metadata for: C:/20120611-155806/IMG_4907.JPG .. done
updating fields: Exif.Image.DateTime .. Exif.Photo.DateTimeOriginal .. Exif.Photo.DateTimeDigitized .. done
reading metadata for: C:/20120611-155806/IMG_4908.JPG .. done
updating fields: Exif.Image.DateTime .. Exif.Photo.DateTimeOriginal .. Exif.Photo.DateTimeDigitized .. done
..

An important thing to note here is that the given script will alter every Exif field that contains ‘DateTime’ in its name. It worked well for me since the only fields that matched this lookup were the ones I wanted to change (namely, the 3 mentioned above). You should consider modifying the source to name only the fields you want it to change.
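
One possible way to do that, as a rough sketch: keep an explicit list of the three tags and only patch keys that appear in it. The names ‘WANTED_KEYS’ and ‘keys_to_patch’ are my own for this example, and ‘Exif.Hypothetical.DateTimeBackup’ is a made-up key that only demonstrates what the substring match would have caught.

```python
# The three tags the post sets out to fix; nothing else is touched.
WANTED_KEYS = (
    'Exif.Image.DateTime',
    'Exif.Photo.DateTimeOriginal',
    'Exif.Photo.DateTimeDigitized',
)


def keys_to_patch(exif_keys, wanted=WANTED_KEYS):
    """Return only the keys explicitly listed in 'wanted', in their original order."""
    return [key for key in exif_keys if key in wanted]


# Sample key list for illustration; the last entry is a made-up key.
sample_keys = [
    'Exif.Image.Make',
    'Exif.Image.DateTime',
    'Exif.Photo.DateTimeOriginal',
    'Exif.Photo.DateTimeDigitized',
    'Exif.Hypothetical.DateTimeBackup',
]

print(keys_to_patch(sample_keys))
```

In the script, the inner loop would then iterate over keys_to_patch(metadata.exif_keys) instead of testing each name for ‘DateTime’.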
