GPUImage 3 is a BSD-licensed Swift framework for GPU-accelerated video and image processing using Metal.
Janie Larson
@RedQueenCoder
@[email protected]
Brad Larson
http://www.sunsetlakesoftware.com
GPUImage 3 is the third generation of the GPUImage framework, an open source project for performing GPU-accelerated image and video processing on Mac and iOS. The original GPUImage framework was written in Objective-C and targeted Mac and iOS, the second iteration was rewritten in Swift using OpenGL to target Mac, iOS, and Linux, and this third generation is redesigned to use Metal in place of OpenGL.
The objective of the framework is to make it as easy as possible to set up and perform realtime video processing or machine vision against image or video sources. Previous iterations of this framework wrapped OpenGL (ES), hiding much of the boilerplate code required to render images on the GPU using custom vertex and fragment shaders. This version of the framework replaces OpenGL (ES) with Metal. The change is largely driven by Apple's deprecation of OpenGL (ES) on their platforms in favor of Metal, and it allows for exploring performance optimizations over OpenGL and tighter integration with Metal-based frameworks and operations.
The API is a clone of that used in GPUImage 2, and is intended to be a drop-in replacement for that version of the framework. Swapping between Metal and OpenGL versions of the framework should be as simple as changing which framework your application is linked against. A few low-level interfaces, such as those around texture input and output, will necessarily be Metal- or OpenGL-specific, but everything else is designed to be compatible between the two.
As of this point, we are not approving enhancement requests from outside contributors. We are actively working to port all of the functionality between this version of GPUImage and previous versions. Once this task has been completed we will be happy to take community contributions.
The license is BSD-style, with the full license available with the framework in License.txt.
The framework relies on the concept of a processing pipeline, where image sources are targeted at image consumers, and so on down the line until images are output to the screen, to image files, to raw data, or to recorded movies. Cameras, movies, still images, and raw data can be inputs into this pipeline. Arbitrarily complex processing operations can be built from a combination of a series of smaller operations.
This is an object-oriented framework, with classes that encapsulate inputs, processing operations, and outputs. The processing operations use Metal vertex and fragment shaders to perform their image manipulations on the GPU.
Examples for usage of the framework in common applications are shown below.
GPUImage is provided as a Swift package. To add it to your Mac or iOS application, go to your project settings, choose Package Dependencies, and click the plus button. Enter this repository's URL in the upper-right and hit enter. GPUImage will appear as a package dependency of your project.
In any of your Swift files that reference GPUImage classes, simply add
import GPUImage
and you should be ready to go.
Note that you may need to build your project once to parse and build the GPUImage framework in order for Xcode to stop warning you about the framework and its classes being missing.
To filter live video from a Mac or iOS camera, you can write code like the following:
do {
    camera = try Camera(sessionPreset: .vga640x480)
    filter = SaturationAdjustment()
    camera --> filter --> renderView
    camera.startCapture()
} catch {
    fatalError("Could not initialize rendering pipeline: \(error)")
}
where renderView is an instance of RenderView that you’ve placed somewhere in your view hierarchy. The above instantiates a 640x480 camera instance, creates a saturation filter, and directs camera frames to be processed through the saturation filter on their way to the screen. startCapture() initiates the camera capture process.
The --> operator chains an image source to an image consumer, and many of these can be chained in the same line.
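For example, a longer pipeline that also adjusts a filter property might look like the following. This is a minimal sketch: it reuses the camera and renderView from the example above, and the saturation property name is an assumption based on the GPUImage 2 API.

let blur = BoxBlur()
let saturation = SaturationAdjustment()
saturation.saturation = 0.5 // assumed property name; halves color saturation
camera --> blur --> saturation --> renderView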
Several other input and output workflows, including still image and movie processing, have not yet been completed.
The framework uses a series of protocols to define types that can output images to be processed, take in an image for processing, or do both. These are the ImageSource, ImageConsumer, and ImageProcessingOperation protocols, respectively. Any type can conform to these, but typically classes are used.
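As a conceptual sketch only (these are not the framework's actual protocol definitions, which carry additional requirements), the relationship between the three looks like this:

// Sources push processed image textures to downstream targets.
protocol ImageSource {
}

// Consumers accept incoming image textures from upstream sources.
protocol ImageConsumer {
}

// A processing operation is both: it consumes images and acts as a source for its results.
protocol ImageProcessingOperation: ImageConsumer, ImageSource {
}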
Many common filters and other image processing operations can be described as subclasses of the BasicOperation class. BasicOperation provides much of the internal code required for taking in an image frame from one or more inputs, rendering a rectangular image (quad) from those inputs using a specified shader program, and providing that image to all of its targets. Variants on BasicOperation, such as TextureSamplingOperation or TwoStageOperation, provide additional information to the shader program that may be needed for certain kinds of operations.
To build a simple, one-input filter, you may not even need to create a subclass of your own. All you need to do is supply a fragment shader and the number of inputs needed when instantiating a BasicOperation:
let myFilter = BasicOperation(fragmentFunctionName:"myFilterFragmentFunction", numberOfInputs:1)
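The resulting operation can then be wired into a pipeline like any built-in filter. For instance, reusing the camera and renderView from the earlier example:

camera --> myFilter --> renderView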
A shader program is composed of matched vertex and fragment shaders that are compiled and linked together into one program. By default, the framework uses a series of stock vertex shaders based on the number of input images feeding into an operation. Usually, all you’ll need to do is provide the custom fragment shader that is used to perform your filtering or other processing.
Fragment shaders used by GPUImage look something like this:
#include <metal_stdlib>
#include "OperationShaderTypes.h"

using namespace metal;

fragment half4 passthroughFragment(SingleInputVertexIO fragmentInput [[stage_in]],
                                   texture2d<half> inputTexture [[texture(0)]])
{
    constexpr sampler quadSampler;
    half4 color = inputTexture.sample(quadSampler, fragmentInput.textureCoordinate);

    return color;
}
and are saved within .metal files that are compiled at the same time as the framework / your project.
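As a hedged illustration of writing your own, a color-inversion fragment shader might follow the same pattern (the invertColorFragment function name is hypothetical, not something shipped with the framework):

#include <metal_stdlib>
#include "OperationShaderTypes.h"

using namespace metal;

// Inverts the RGB channels of each sampled pixel, leaving alpha untouched.
fragment half4 invertColorFragment(SingleInputVertexIO fragmentInput [[stage_in]],
                                   texture2d<half> inputTexture [[texture(0)]])
{
    constexpr sampler quadSampler;
    half4 color = inputTexture.sample(quadSampler, fragmentInput.textureCoordinate);

    return half4(half3(1.0) - color.rgb, color.a);
}

Instantiated as BasicOperation(fragmentFunctionName: "invertColorFragment", numberOfInputs: 1), this would be expected to behave much like the framework's built-in ColorInversion operation.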
If you wish to group a series of operations into a single unit to pass around, you can create a new instance of OperationGroup. OperationGroup provides a configureGroup property that takes a closure which specifies how the group should be configured:
let boxBlur = BoxBlur()
let contrast = ContrastAdjustment()
let myGroup = OperationGroup()

myGroup.configureGroup { input, output in
    input --> boxBlur --> contrast --> output
}
Frames coming in to the OperationGroup are represented by the input in the above closure, and frames going out of the entire group by the output. After setup, myGroup in the above will appear like any other operation, even though it is composed of multiple sub-operations. This group can then be passed or worked with like a single operation.
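A brief usage sketch, again assuming the camera and renderView from the earlier examples:

camera --> myGroup --> renderView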
[TODO: Rework for Metal]
The framework uses several platform-independent types to represent common values. Generally, floating-point inputs are taken in as Floats. Sizes are specified using Size types (constructed by initializing with width and height). Colors are handled via the Color type, where you provide the normalized-to-1.0 color values for red, green, blue, and optionally alpha components.
Positions can be provided in 2-D and 3-D coordinates. If a Position is created by only specifying X and Y values, it will be handled as a 2-D point. If an optional Z coordinate is also provided, it will be dealt with as a 3-D point.
Matrices come in Matrix3x3 and Matrix4x4 varieties. These matrices can be built using a row-major array of Floats, or can be initialized from CATransform3D or CGAffineTransform structs.
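A few hedged construction examples follow; the exact initializer signatures are assumptions based on the GPUImage 2 API that this version clones, and may differ in detail:

let videoSize = Size(width: 1280.0, height: 720.0)
let opaqueRed = Color(red: 1.0, green: 0.0, blue: 0.0, alpha: 1.0)
let center = Position(0.5, 0.5)        // 2-D point
let inSpace = Position(0.5, 0.5, 1.0)  // 3-D point
let identity = Matrix3x3(rowMajorValues: [1.0, 0.0, 0.0,
                                          0.0, 1.0, 0.0,
                                          0.0, 0.0, 1.0])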
Operations are currently being ported over from GPUImage 2. Here are the ones that are currently functional:
BrightnessAdjustment: Adjusts the brightness of the image.
ExposureAdjustment: Adjusts the exposure of the image.
ContrastAdjustment: Adjusts the contrast of the image.
SaturationAdjustment: Adjusts the saturation of an image.
GammaAdjustment: Adjusts the gamma of an image.
LevelsAdjustment: Photoshop-like levels adjustment. The minimum, middle, maximum, minOutput and maxOutput parameters are floats in the range [0, 1]. If you have parameters from Photoshop in the range [0, 255] you must first convert them to be [0, 1]. The gamma/mid parameter is a float >= 0. This matches the value from Photoshop. If you want to apply levels to RGB as well as individual channels you need to use this filter twice - first for the individual channels and then for all channels.
ColorMatrixFilter: Transforms the colors of an image by applying a matrix to them
RGBAdjustment: Adjusts the individual RGB channels of an image.
WhiteBalance: Adjusts the white balance of an image.
HighlightsAndShadows: Adjusts the shadows and highlights of an image
HueAdjustment: Adjusts the hue of an image
ColorInversion: Inverts the colors of an image.
Luminance: Reduces an image to just its luminance (greyscale).
MonochromeFilter: Converts the image to a single-color version, based on the luminance of each pixel
Haze: Used to add or remove haze (similar to a UV filter)
SepiaToneFilter: Simple sepia tone filter
OpacityAdjustment: Adjusts the alpha channel of the incoming image
LuminanceThreshold: Pixels with a luminance above the threshold will appear white, and those below will be black
Vibrance: Adjusts the vibrance of an image
HighlightAndShadowTint: Allows you to tint the shadows and highlights of an image independently using a color and intensity. The shadow tint defaults to {1.0f, 0.0f, 0.0f, 1.0f} (red), and the highlight tint defaults to {0.0f, 0.0f, 1.0f, 1.0f} (blue).
LookupFilter: Uses an RGB color lookup image to remap the colors in an image. First, use your favourite photo editing application to apply a filter to lookup.png from framework/Operations/LookupImages. For this to work properly, each pixel color must not depend on other pixels (e.g. blur will not work). If you need a more complex filter you can create as many lookup tables as required. Once ready, use your new lookup.png file as the basis of a PictureInput that you provide for the lookupImage property.
AmatorkaFilter: A photo filter based on a Photoshop action by Amatorka: http://amatorka.deviantart.com/art/Amatorka-Action-2-121069631. If you want to use this effect you have to add lookup_amatorka.png from the GPUImage framework/Operations/LookupImages folder to your application bundle.
MissEtikateFilter: A photo filter based on a Photoshop action by Miss Etikate: http://miss-etikate.deviantart.com/art/Photoshop-Action-15-120151961. If you want to use this effect you have to add lookup_miss_etikate.png from the GPUImage framework/Operations/LookupImages folder to your application bundle.
SoftElegance: Another lookup-based color remapping filter. If you want to use this effect you have to add lookup_soft_elegance_1.png and lookup_soft_elegance_2.png from the GPUImage framework/Operations/LookupImages folder to your application bundle.
FalseColor: Uses the luminance of the image to mix between two user-specified colors
AdaptiveThreshold: Determines the local luminance around a pixel, then turns the pixel black if it is below that local luminance and white if above. This can be useful for picking out text under varying lighting conditions.
ChromaKeying: For a given color in the image, sets the alpha channel to 0. This is similar to the ChromaKeyBlend, only instead of blending in a second image for a matching color this doesn’t take in a second image and just turns a given color transparent.
Sharpen: Sharpens the image
GaussianBlur: A hardware-optimized, variable-radius Gaussian blur
BoxBlur: A hardware-optimized, variable-radius box blur
iOSBlur: An attempt to replicate the background blur used on iOS 7 in places like the control center.
MedianFilter: Takes the median value of the three color components, over a 3x3 area
TiltShift: A simulated tilt shift lens effect
Convolution3x3: Runs a 3x3 convolution kernel against the image
SobelEdgeDetection: Sobel edge detection, with edges highlighted in white
PrewittEdgeDetection: Prewitt edge detection, with edges highlighted in white
ThresholdSobelEdgeDetection: Performs Sobel edge detection, but applies a threshold instead of giving gradual strength values
LocalBinaryPattern: This performs a comparison of intensity of the red channel of the 8 surrounding pixels and that of the central one, encoding the comparison results in a bit string that becomes this pixel intensity. The least-significant bit is the top-right comparison, going counterclockwise to end at the right comparison as the most significant bit.
ColorLocalBinaryPattern: This performs a comparison of intensity of all color channels of the 8 surrounding pixels and that of the central one, encoding the comparison results in a bit string that becomes each color channel’s intensity. The least-significant bit is the top-right comparison, going counterclockwise to end at the right comparison as the most significant bit.
LowPassFilter: This applies a low pass filter to incoming video frames. This basically accumulates a weighted rolling average of previous frames with the current ones as they come in. This can be used to denoise video, add motion blur, or be used to create a high pass filter.
HighPassFilter: This applies a high pass filter to incoming video frames. This is the inverse of the low pass filter, showing the difference between the current frame and the weighted rolling average of previous ones. This is most useful for motion detection.
ZoomBlur: Applies a directional motion blur to an image
ColourFASTFeatureDetection: Brings out the ColourFAST feature descriptors for an image
ChromaKeyBlend: Selectively replaces a color in the first image with the second image
DissolveBlend: Applies a dissolve blend of two images
MultiplyBlend: Applies a multiply blend of two images
AddBlend: Applies an additive blend of two images
SubtractBlend: Applies a subtractive blend of two images
DivideBlend: Applies a division blend of two images
OverlayBlend: Applies an overlay blend of two images
DarkenBlend: Blends two images by taking the minimum value of each color component between the images
LightenBlend: Blends two images by taking the maximum value of each color component between the images
ColorBurnBlend: Applies a color burn blend of two images
ColorDodgeBlend: Applies a color dodge blend of two images
ScreenBlend: Applies a screen blend of two images
ExclusionBlend: Applies an exclusion blend of two images
DifferenceBlend: Applies a difference blend of two images
HardLightBlend: Applies a hard light blend of two images
SoftLightBlend: Applies a soft light blend of two images
AlphaBlend: Blends the second image over the first, based on the second’s alpha channel
SourceOverBlend: Applies a source over blend of two images
NormalBlend: Applies a normal blend of two images
ColorBlend: Applies a color blend of two images
HueBlend: Applies a hue blend of two images
SaturationBlend: Applies a saturation blend of two images
LuminosityBlend: Applies a luminosity blend of two images
LinearBurnBlend: Applies a linear burn blend of two images
Pixellate: Applies a pixellation effect on an image or video
PolarPixellate: Applies a pixellation effect on an image or video, based on polar coordinates instead of Cartesian ones
PolkaDot: Breaks an image up into colored dots within a regular grid
Halftone: Applies a halftone effect to an image, like news print
Crosshatch: This converts an image into a black-and-white crosshatch pattern
SketchFilter: Converts video to look like a sketch. This is just the Sobel edge detection filter with the colors inverted
ThresholdSketchFilter: Same as the sketch filter, only the edges are thresholded instead of being grayscale
ToonFilter: This uses Sobel edge detection to place a black border around objects, and then it quantizes the colors present in the image to give a cartoon-like quality to the image.
SmoothToonFilter: This uses a similar process as the ToonFilter, only it precedes the toon effect with a Gaussian blur to smooth out noise.
EmbossFilter: Applies an embossing effect on the image
SwirlDistortion: Creates a swirl distortion on the image
BulgeDistortion: Creates a bulge distortion on the image
PinchDistortion: Creates a pinch distortion of the image
StretchDistortion: Creates a stretch distortion of the image
SphereRefraction: Simulates the refraction through a glass sphere
GlassSphereRefraction: Same as SphereRefraction, only the image is not inverted and there’s a little bit of frosting at the edges of the glass
Vignette: Performs a vignetting effect, fading out the image at the edges
KuwaharaRadius3Filter: A modified version of the Kuwahara filter, optimized to work over just a radius of three pixels
CGAColorspace: Simulates the colorspace of a CGA monitor
Solarize: Applies a solarization effect