Computer Vision for Human Rights Researchers
VFRAME (Visual Forensics and Metadata Extraction) is a computer vision toolkit designed for human rights researchers. It aims to bridge the gap between state-of-the-art artificial intelligence used in the commercial sector and the needs of human rights researchers and investigative journalists working with large video or image datasets, making that technology accessible and tailored to their work. Read more about VFRAME and its objectives at vframe.io. VFRAME is under active development through 2019.
During the first stage of development (April-Oct 2018), three prototypes were built: the core image processing software for analyzing a large collection of videos, a large-scale visual search interface into the video collection, and an annotation interface for creating training data.
The core image processing software is a Python pipeline with a Click command line interface, designed with modularity in mind. Though still a prototype, the toolkit includes object detection (Darknet and OpenCV DNN), classification (Darknet and OpenCV DNN), and content-based keyframe splitting utilities that iteratively reduce large video datasets to usable metadata. Follow vframeio/vframe for updates on the core image processing pipeline and vframeio/vcat for updates on the search engine and annotation software. Below are several screenshots of the annotation interface, which is integrated with the search engine and uses FAISS to search over 10 million keyframes in under 25 milliseconds on a basic commodity Linux server.
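The content-based keyframe splitting step can be illustrated with a minimal sketch: keep a frame whenever it differs enough from the last kept frame. The function name, threshold, and synthetic frame data below are illustrative assumptions, not VFRAME's actual implementation.

```python
# Hypothetical sketch of content-based keyframe selection: keep a frame
# when its mean absolute pixel difference from the last kept frame
# exceeds a threshold. Threshold and frame data are assumptions.
import numpy as np

def select_keyframes(frames, threshold=10.0):
    """Return indices of frames that differ enough from the previous keyframe."""
    keyframes = [0]                          # always keep the first frame
    last = frames[0].astype(np.float32)
    for i, frame in enumerate(frames[1:], start=1):
        f = frame.astype(np.float32)
        if np.abs(f - last).mean() > threshold:
            keyframes.append(i)
            last = f
    return keyframes

# Synthetic "video": three identical dark frames, then a bright scene change
frames = [np.zeros((4, 4), dtype=np.uint8)] * 3 + [np.full((4, 4), 255, dtype=np.uint8)]
print(select_keyframes(frames))              # → [0, 3]
```

A real pipeline would compare perceptual features or color histograms rather than raw pixels, but the reduction principle is the same: only frames that add new visual content proceed to detection and indexing.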
The first stage of development was supported by PrototypeFund.de / BMBF.
The second stage of development will explore two main improvements: a neural-network-guided annotation interface and synthetic training data. One of the main challenges during Phase 1 was creating enough training data to develop robust munition detectors. While dozens of videos with cluster munitions were found, there were fewer than 1,000 annotations per object in total. Training an object detection model typically requires around 2,000 samples per class, though this number varies widely.
To generate more training data, each target object is being 3D-modeled, rendered into photorealistic scenes, then post-processed (e.g. with Pix2PixHD) to match the quality of footage appearing in documentation from conflict zones (medium-quality smartphone camera sensors with noticeable compression artifacts at 480p-720p). Current prototypes include the ShOAB-0.5 and AO-2.5RT cluster munitions, which are both high-priority objects for the Syrian Archive researchers.
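The degradation step described above can be sketched with simple hand-crafted operations: downscaling, sensor noise, and coarse quantization to mimic compression artifacts. The real pipeline uses learned translation (Pix2PixHD); the function and parameters below are a simplified stand-in for illustration only.

```python
# Hypothetical sketch: degrade a clean rendered image toward the look of
# low-quality smartphone footage. A hand-crafted stand-in for the learned
# post-processing (Pix2PixHD) used in the actual pipeline.
import numpy as np

def degrade(render, scale=2, noise_std=8.0, levels=32, seed=0):
    """Downsample, add Gaussian sensor noise, and coarsely quantize an image."""
    small = render[::scale, ::scale].astype(np.float32)   # crude downscale
    rng = np.random.default_rng(seed)
    noisy = small + rng.normal(0.0, noise_std, small.shape)
    step = 256 // levels
    quantized = (np.clip(noisy, 0, 255) // step) * step   # blocky quantization
    return quantized.astype(np.uint8)

render = np.full((8, 8), 200, dtype=np.uint8)   # flat synthetic "render"
out = degrade(render)
print(out.shape)                                # → (4, 4)
```

Matching the degraded renders to the statistics of real conflict-zone footage is what lets a detector trained on synthetic images transfer to the documentation videos it must ultimately analyze.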
The first 3D-printed model prototypes were produced for the Ars Electronica Export exhibition in Berlin. In a darkly poetic coincidence, the 3D-printed cluster bombs were produced at a former WWII munitions factory (HfG Karlsruhe) and are now on display at the VW Drive showroom in Berlin from November 17th, 2018 to February 17th, 2019.