exiftool vendored.js

Fast, cross-platform Node.js access to ExifTool

451
48
TypeScript

exiftool-vendored

Fast, cross-platform Node.js access to ExifTool. Built and supported by PhotoStructure.

npm version
Node.js CI
GitHub issues
Known Vulnerabilities

Features

  1. Best-of-class cross-platform performance and reliability.

    This library enables PhotoStructure and 500+ other projects to read and write metadata in photos and videos.

    Expect an order of magnitude faster performance than other Node.js ExifTool modules.

    Thanks to being based on ExifTool, it’s the state of the art in high quality metadata extraction for thousands of file types.

  2. Best-effort extraction of

  3. Support for

  4. Robust type definitions of the top 99.5% tags used by over 6,000
    different camera makes and models (see an example)

  5. Automated updates to ExifTool (as new versions come out
    frequently
    )

  6. Robust test coverage, performed with on macOS, Linux, and
    Windows

Installation

npm install --save exiftool-vendored

Debug logging

If anything doesn’t work, the first thing to try is enabling the logger.

You can provide a Logger implementation via ExifToolOptions.logger, or set the environment variable NODE_DEBUG=exiftool-vendored. See the debuglog() documentation for more details.

Regarding use within Electron

Due to how different every Electron application setup is, and how new versions
frequently have breaking changes, do not ask for help by opening a github
issue on this project.

Please seek help via StackOverflow, the Electron discord, or other channels.

Electron-builder support

Add the following pattern to electron-builder.yml’s asarUnpack:

- "node_modules/exiftool-vendored.*/**/*"

The default exiftoolPath implementation will detect app.asar in your require
path and replace it with app.asar.unpacked automatically.

Electron-forge support

Version 25.0 of this library added experimental support for electron-forge:
add the following element to your ForgeConfig.packagerConfig.extraResource
string array, and things should “just work” for the main process.

"./node_modules/exiftool-vendored." +
  (process.platform === "win32" ? "exe" : "pl")

If your main process forks any node subprocesses, process.resourcesPath will
not be set
in those subprocesses, and the default exiftoolPath won’t work.

If this is your case, you must provide a correct implementation of
ExifToolOptions.exiftoolPath,
either by passing through resourcesPath via process.env, or some other
method.

Installation notes

  • exiftool-vendored provides an installation of ExifTool relevant for your
    local platform through
    optionalDependencies.

  • You shouldn’t include either exiftool-vendored.exe or
    exiftool-vendored.pl as direct dependencies to your project, unless you know
    what you’re doing.

  • If you’re installing on a minimal Linux distribution, you may need to install perl. On Alpine, run apk add perl.

  • Node.js’s -slim docker images don’t include a working perl build. Use the non-slim image instead. See the issue report for details.

  • If the platform-correct vendor module (exiftool-vendored.exe or exiftool-vendored.pl) is not found, exiftool is searched for on your PATH. Note that very old versions of exiftool are found on currently-supported Linux distributions which this library will not work correctly with.

Upgrading

See the
CHANGELOG
for breaking changes since you last updated.

Major version bumps

I bump the major version if there’s a chance existing code might be
affected.

I’ve been bit too many times by my code breaking when I pull in minor or patch
upgrades with other libraries. I think it’s better to be pessimistic in code
change impact analysis: “over-promise and under-deliver” your breaking-code
changes.

When you upgrade to a new major version, please take a bit more care in
validating your own systems, but don’t be surprised when everything still works.

Usage

There are many configuration options to ExifTool, but all values have (more or
less sensible) defaults.

Those defaults have been used to create the
exiftool singleton.
Note that if you don’t use the default singleton, you don’t need to .end()
it.

// We're using the singleton here for convenience:
const exiftool = require("exiftool-vendored").exiftool

// And to verify everything is working:
exiftool
  .version()
  .then((version) => console.log(`We're running ExifTool v${version}`))

If the default ExifTool constructor
parameters

wont’ work for you, it’s just a class that takes an options hash:

const ExifTool = require("exiftool-vendored").ExifTool
const exiftool = new ExifTool({ taskTimeoutMillis: 5000 })

You should only use the exported default exiftool singleton, or only create one instance of ExifTool as a singleton.

Remember to .end() whichever singleton you use.

General API

ExifTool.read() returns a Promise to a Tags instance. Note
that errors may be returned either by rejecting the promise, or for less
severe problems, via the errors field.

All other public ExifTool methods return Promise<void>, and will reject
the promise if the operation is not successful.

Tags types

ExifTool knows how to extract several thousand different tag fields.

Unfortunately, TypeScript crashes with error TS2590: Expression produces a union type that is too complex to represent if the Tags interface was comprehensive.

Instead, we build a corpus of “commonly seen” tags from over 10,000 different
digital camera makes and models, many from the ExifTool metadata
repository
and <raw.pixls.us>.

Here are some example fields:

  /** ★☆☆☆ ✔ Example: 200 */
  ISO?: number

  /** ★★★★ ✔ Example: 1920 */
  ImageHeight?: number

  /** ★★★★ ✔ Example: 1080 */
  ImageWidth?: number

  /** ★★★★ ✔ Example: "image/jpeg" */
  MIMEType?: string

The stars represent how common that field has a value in the example corpus. ★★★★ fields are found in > 50% of the examples.
☆☆☆☆ fields are found in < 1% of examples.

The checkmark denotes if the field is found in “popular” cameras (like recent
Nikon, Canon, Sony, and Apple devices).

Caveats with Tags

The fields in Tags are not comprehensive.

Just because a field is missing from the Tags interface does not mean the
field doesn’t exist in the returned object
. This library doesn’t exclude
unknown fields, in other words. It’s up to you and your code to look for other
fields you expect and cast to a more relevant interface.

Logging and events

To enable trace, debug, info, warning, or error logging from this library and
the underlying batch-cluster library, provide a Logger instance to the ExifTool constructor options.

ExifTool instances emits many lifecycle and error events via batch-cluster.

Reading tags

exiftool
  .read("path/to/image.jpg")
  .then((tags /*: Tags */) =>
    console.log(
      `Make: ${tags.Make}, Model: ${tags.Model}, Errors: ${tags.errors}`
    )
  )
  .catch((err) => console.error("Something terrible happened: ", err))

Extracting embedded images

Extract the low-resolution thumbnail in path/to/image.jpg, write it to
path/to/thumbnail.jpg, and return a Promise<void> that is fulfilled
when the image is extracted:

exiftool.extractThumbnail("path/to/image.jpg", "path/to/thumbnail.jpg")

Extract the Preview image (only found in some images):

exiftool.extractPreview("path/to/image.jpg", "path/to/preview.jpg")

Extract the JpgFromRaw image (found in some RAW images):

exiftool.extractJpgFromRaw("path/to/image.cr2", "path/to/fromRaw.jpg")

Extract the binary value from “tagname” tag in path/to/image.jpg
and write it to dest.bin (which cannot exist already
and whose parent directory must already exist):

exiftool.extractBinaryTag("tagname", "path/to/file.exf", "path/to/dest.bin")

Writing tags

Note that only a portion of tags is writable. Refer to the
documentation

and look under the “Writable” column.

If you apply malformed values or ask to write to tags that aren’t
supported, the returned Promise will be rejected.

Only string and numeric primitives are supported as values to the object.

To write a comment to the given file so it shows up in the Windows Explorer
Properties panel:

exiftool.write("path/to/file.jpg", { XPComment: "this is a test comment" })

To change the DateTimeOriginal, CreateDate and ModifyDate tags (using the
AllDates
shortcut) to 4:56pm UTC on February 6, 2016:

exiftool.write("path/to/file.jpg", { AllDates: "2016-02-06T16:56:00" })

To write to a specific metadata group’s tag, just prefix the tag name with the group.
(TypeScript users: you’ll need to cast to make this compile).

exiftool.write("path/to/file.jpg", {
  "IPTC:CopyrightNotice": "© 2021 PhotoStructure, Inc.",
})

To delete a tag, use null as the value.

exiftool.write("path/to/file.jpg", { UserComment: null })

The above example removes any value associated with the UserComment tag.

Always Beware: Timezones

If you edit a timestamp tag, realize that the difference between the
changed timestamp tag and the GPS value is used by exiftool-vendored to
infer the timezone.

In other words, if you only edit the CreateDate and don’t edit the GPS
timestamps, your timezone will either be incorrect or missing. See the
section about Dates below for more information.

Rewriting tags

You may find that some of your images have corrupt metadata and that writing
new dates, or editing the rotation information, for example, fails. ExifTool can
try to repair these images by rewriting all the metadata into a new file, along
with the original image content. See the
documentation for more
details about this functionality.

rewriteAllTags returns a void Promise that will be rejected if there are any
errors.

exiftool.rewriteAllTags("problematic.jpg", "rewritten.jpg")

ExifTool configuration support (.ExifTool_config)

ExifTool has an extensive user configuration system. There are several ways to use one:

  1. Place your user configuration
    file
    in your HOME
    directory
  2. Set the EXIFTOOL_HOME environment variable to the fully-qualified path that
    contains your user configuration.
  3. Specify the in the ExifTool constructor options:
new ExifTool({ exiftoolEnv: { EXIFTOOL_HOME: resolve("path", "to", "config", "dir") }

Resource hygiene

Call ExifTool.end() when you’re done

You must explicitly call
.end()
on any used instance of ExifTool to allow node to exit gracefully.

ExifTool child processes consume system resources, and prevents node from
exiting due to the way Node.js streams
work
.

Note that you can’t call cannot be in a process.on("exit") hook, as the stdio streams
attached to the child process cannot be unref’ed. (If there’s a solution to
this, please post to the above issue!)

Mocha v4.0.0

If you use mocha v4 or later, and you don’t call
exiftool.end(), you will find that your test suite hangs. The relevant
change is described here
,
and can be solved by adding an after block that shuts down the instance
of ExifTool that your tests are using:

after(() => exiftool.end()) // assuming your singleton is called `exiftool`

Dates

The date metadata in all your images and videos are, most likely,
underspecified.

Images and videos rarely specify a time zone in their dates. If all your files
were captured in your current time zone, defaulting to the local time zone is a
safe assumption, but if you have files that were captured in different parts of
the world, this assumption will not be correct. Parsing the same file in
different parts of the world result in different times for the same file.

Prior to version 7, heuristic 1 and 3 were applied.

As of version 7.0.0, exiftool-vendored uses the following heuristics. The
highest-priority heuristic to return a value will be used as the timezone offset
for all datetime tags that don’t already have a specified timezone.

Heuristic 1: explicit metadata

If the EXIF
TimeZoneOffset tag is present it will be applied as per the spec to
DateTimeOriginal, and if there are two values, the ModifyDate tag as well.
OffsetTime, OffsetTimeOriginal, and OffsetTimeDigitized are also
respected, if present (but are very rarely set).

Heuristic 2: GPS location

If GPS latitude and longitude is present and valid (the value of 0, 0 is
considered invalid), the tz-lookup library will be used to determine the time
zone name for that location.

Heuristic 3: UTC timestamps

If GPSDateTime or DateTimeUTC is present, the delta with the dates found
within the file, as long as the delta is valid, is used as the timezone offset.
Deltas of > 14 hours are considered invalid.

ExifDate and ExifDateTime

Because date-times have this optionally-set timezone, and some tags only specify
the date, this library returns classes that encode the date, the time of day, or
both, with an optional timezone and an optional tzoffset: ExifDateTime and
ExifTime. It’s up to you, then, to determine what’s correct for your
situation.

Note also that some smartphones record timestamps with microsecond precision
(not just milliseconds!), and both ExifDateTime and ExifTime have floating point
milliseconds.

Tags

Official EXIF tag names
are PascalCased, like
AFPointSelected and ISO. (“Fixing” the field names to be camelCase, would
result in ungainly aFPointSelected and iSO atrocities).

The Tags interface is
auto-generated by the mktags script, which parses through over 6,000 unique
camera make and model images, in large part sourced from the ExifTool site.
mktags groups tags, extracts their type, popularity, and example values such
that your IDE can autocomplete.

Tags marked with “★★★★”, like
MIMEType,
should be found in most files. Of the several thousand metadata tags, realize
less than 50 are found generally. You’ll need to do your research to
determine which tags are valid for your uses.

Note that if parsing fails (for, example, a date-time string), the raw string
will be returned. Consuming code should verify both existence and type as
reasonable for safety.

Serialization

The Tags object returned by ExifTool.read() can be serialized to JSON with JSON.stringify.

To reconstitute, use the parseJSON() method.

import { exiftool, parseJSON } from "exiftool-vendored"

const tags: Tags = await exiftool.read("/path/to/file.jpg")
const str: string = JSON.stringify(tags)

// parseJSON doesn't validate the input, so we don't assert that it's a Tags
// instance, but you can cast it (unsafely...)

const tags2: Tags = parseJSON(str) as Tags

Performance

The default exiftool singleton is intentionally throttled. If full system
utilization is acceptable:

  1. set
    maxProcs
    higher

  2. consider setting
    minDelayBetweenSpawnMillis
    to 0

  3. On a performant linux box, a smaller value of streamFlushMillis may work as
    well: if you see noTaskData
    events
    ,
    you need to bump the value up.

Benchmarking

The yarn mktags ../path/to/examples target reads all tags found in a directory
hierarchy of sample images and videos, and parses the results.

exiftool-vendored v16.0.0 on a 2019 AMD Ryzen 3900X running Ubuntu 20.04 on an
SSD can process 20+ files per second per thread, or 500+ files per second when
utilizing all CPU threads.

Batch mode

Using ExifTool’s -stay_open batch mode means we can reuse a single
instance of ExifTool across many requests, dropping response latency
dramatically as well as reducing system load.

Parallelism

To avoid overwhelming your system, the exiftool singleton is configured with a
maxProcs set to a quarter the number of CPUs on the current system (minimally
1); no more than maxProcs instances of exiftool will be spawned. If the
system is CPU constrained, however, you may want a smaller value. If you have
very fast disk IO, you may see a speed increase with larger values of
maxProcs, but note that each child process can consume 100 MB of RAM.

Author

Contributors 🎉

CHANGELOG

See the
CHANGELOG on github.