Smart Video Analytics: Motion Detection for Security & Monitoring

Link: https://github.com/philippe-heitzmann/Video_Background_Subtraction_OpenCV
Introduction
Motion detection in video footage is essential for many smart monitoring applications like security systems, parking occupancy tracking, and retail analytics. This project evaluated different motion detection algorithms to find the most reliable solution for real-world deployment.
We used the publicly available CDTNet-14 dataset to test a range of motion detection algorithms under both favorable and challenging conditions, including scenes with shadows and variable lighting.
CDTNet-14 Dataset
The CDTNet-14 dataset was developed for the 2014 Change Detection Workshop and contains 53 videos with ~140,000 frames covering various indoor and outdoor monitoring scenarios. The dataset categorizes videos by scenario difficulty; this project focuses on two categories:
- Baseline: Videos with minimal background movement and good lighting
- Shadows: Videos with moving objects that cast shadows, creating detection challenges
We tested our algorithms on highway traffic and office foot traffic videos from both categories to compare performance across different environments.
Figure 1. CDTNet-14 Baseline highway.avi and Shadows cubicle.avi videos, along with the publicly available highway traffic recording used in this analysis
Methodology
We implemented two primary motion detection algorithms using OpenCV in C++:
- k-Nearest Neighbors (kNN) - A non-parametric approach that classifies each pixel against samples kept from recent video frames
- MOG2 Adaptive Gaussian Mixture - A per-pixel Gaussian mixture model that adapts to gradually changing backgrounds
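Both factory functions expose tuning parameters that matter in practice. A short Python sketch showing OpenCV's documented defaults (the C++ signatures mirror these):
import cv2
# history: number of recent frames used to build the background model
# dist2Threshold / varThreshold: squared-distance thresholds for calling a pixel foreground
# detectShadows: if True, shadow pixels are labeled gray (127) instead of white (255)
knn = cv2.createBackgroundSubtractorKNN(history=500, dist2Threshold=400.0, detectShadows=True)
mog2 = cv2.createBackgroundSubtractorMOG2(history=500, varThreshold=16, detectShadows=True)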
The C++ implementation takes a video file and algorithm type as input:
#include <iostream>
#include <sstream>
#include <opencv2/imgcodecs.hpp>
#include <opencv2/imgproc.hpp>
#include <opencv2/videoio.hpp>
#include <opencv2/highgui.hpp>
#include <opencv2/video.hpp>
using namespace cv;
using namespace std;
const char* params
    = "{ help h |                      | Print usage }"
      "{ input  | highway_traffic.mp4  | Path to a video or a sequence of images }"
      "{ algo   | KNN                  | Background subtraction method (KNN, MOG2) }";
The main function creates a motion detection model and processes each video frame:
int main(int argc, char* argv[])
{
    CommandLineParser parser(argc, argv, params);
    parser.about("This program shows how to use background subtraction methods provided by "
                 "OpenCV. You can process both videos and images.\n");
    if (parser.has("help"))
    {
        // print help information and exit
        parser.printMessage();
        return 0;
    }

    //! [create]
    // create the requested Background Subtractor object
    Ptr<BackgroundSubtractor> pBackSub;
    if (parser.get<String>("algo") == "MOG2")
        pBackSub = createBackgroundSubtractorMOG2();
    else
        pBackSub = createBackgroundSubtractorKNN();
    //! [create]

    //! [capture]
    VideoCapture capture(samples::findFile(parser.get<String>("input")));
    if (!capture.isOpened()) {
        // error opening the video input
        cerr << "Unable to open: " << parser.get<String>("input") << endl;
        return -1;
    }
    //! [capture]

    Mat frame, fgMask;
    while (true) {
        capture >> frame;
        if (frame.empty())
            break;

        //! [apply]
        // update the background model and compute the foreground mask
        pBackSub->apply(frame, fgMask);
        //! [apply]

        //! [display_frame_number]
        // get the frame number and write it on the current frame
        rectangle(frame, cv::Point(10, 2), cv::Point(100, 20),
                  cv::Scalar(255, 255, 255), -1);
        stringstream ss;
        ss << capture.get(CAP_PROP_POS_FRAMES);
        string frameNumberString = ss.str();
        putText(frame, frameNumberString.c_str(), cv::Point(15, 15),
                FONT_HERSHEY_SIMPLEX, 0.5, cv::Scalar(0, 0, 0));
        //! [display_frame_number]

        //! [show]
        // show the current frame and the foreground mask
        imshow("Frame", frame);
        imshow("FG Mask", fgMask);
        //! [show]

        // exit on 'q' or Esc
        int keyboard = waitKey(30);
        if (keyboard == 'q' || keyboard == 27)
            break;
    }
    return 0;
}
Running this program displays two synchronized video windows: the original footage with the frame number overlaid, and a motion detection mask showing moving objects in white against a black background.
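For reference, a typical way to build and run it, assuming OpenCV 4 is installed with pkg-config support (the source file name here is hypothetical):
g++ bg_subtraction.cpp -o bg_subtraction $(pkg-config --cflags --libs opencv4)
./bg_subtraction --input=highway_traffic.mp4 --algo=MOG2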
Figure 2. Sample KNN-based Background Subtractor output using OpenCV
Figure 3. Sample MOG2-based Background Subtractor output using OpenCV
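A practical aside on reading these masks: when detectShadows is enabled (the default for both KNN and MOG2), OpenCV labels shadow pixels gray (value 127) and confident foreground white (255), so a simple threshold can strip shadows from the mask. A minimal Python sketch, with a hypothetical input file name:
import cv2

subtractor = cv2.createBackgroundSubtractorMOG2(detectShadows=True)
capture = cv2.VideoCapture('highway_traffic.mp4')  # hypothetical input file
while True:
    ok, frame = capture.read()
    if not ok:
        break
    mask = subtractor.apply(frame)
    # Shadow pixels are 127; keep only confident foreground (255).
    _, foreground = cv2.threshold(mask, 200, 255, cv2.THRESH_BINARY)
    cv2.imshow('Foreground without shadows', foreground)
    if cv2.waitKey(30) in (ord('q'), 27):
        break
capture.release()
cv2.destroyAllWindows()
This relies on the subtractor's own shadow labeling, however, which is itself imperfect.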
Both algorithms successfully detect the moving cars, but they also capture their shadows, which could cause false alarms in security applications. To find more robust options for shadow-heavy environments, we evaluated eight additional algorithms from OpenCV’s background subtraction module (a construction sketch follows this list):
- MOG
- GMG
- LSBP-vanilla
- LSBP-speed
- LSBP-quality
- LSBP-comp
- GSOC
- GSOC-comp
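All eight come from the opencv-contrib bgsegm module rather than OpenCV's core. A sketch of how the four base algorithms are constructed in Python (the -speed, -quality, and -comp variants are, to my understanding, the same constructors with different parameter presets):
import cv2  # requires the opencv-contrib-python package

mog = cv2.bgsegm.createBackgroundSubtractorMOG()    # original Gaussian-mixture model
gmg = cv2.bgsegm.createBackgroundSubtractorGMG()    # GMG per-pixel Bayesian model
lsbp = cv2.bgsegm.createBackgroundSubtractorLSBP()  # Local SVD Binary Pattern descriptor
gsoc = cv2.bgsegm.createBackgroundSubtractorGSOC()  # LSBP-based variant developed during Google Summer of Code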
We used a Python evaluation script to test all algorithms on both Baseline and Shadows datasets:
import argparse
import numpy as np

# find_relevant_dirs, evaluate_on_sequence, and ALGORITHMS_TO_EVALUATE
# are defined elsewhere in the full evaluation script.

def main():
    # parse command line arguments used later in our args variable
    parser = argparse.ArgumentParser(description='Evaluate all background subtractors using Change Detection 2014 dataset')
    parser.add_argument('--dataset_path', help='Path to the directory with dataset. It may contain multiple inner directories. It will be scanned recursively.', required=True)
    parser.add_argument('--algorithm', help='Test a particular algorithm instead of all.')
    args = parser.parse_args()

    # get ground truth and input data dirs
    dataset_dirs = find_relevant_dirs(args.dataset_path)
    assert len(dataset_dirs) > 0, ("Passed directory must contain at least one sequence from the Change Detection dataset. There are no relevant directories in %s. Check that this directory is correct." % (args.dataset_path))

    if args.algorithm is not None:
        global ALGORITHMS_TO_EVALUATE
        # restrict evaluation to the requested OpenCV background subtraction algorithm
        ALGORITHMS_TO_EVALUATE = [algo_tuple for algo_tuple in ALGORITHMS_TO_EVALUATE if algo_tuple[1].lower() == args.algorithm.lower()]

    summary = {}

    # calculate pixel-level recall, precision and F1-score of each model vs. ground truth
    for seq in dataset_dirs:
        evaluate_on_sequence(seq, summary)

    # average the performance metrics over all sequences in each category
    for category in summary:
        for algo_name in summary[category]:
            summary[category][algo_name] = np.mean(summary[category][algo_name], axis=0)

    # print a performance summary for each model
    for category in summary:
        print('=== SUMMARY for %s (Precision, Recall, F1, Accuracy) ===' % category)
        for algo_name in summary[category]:
            print('%05s: %.3f %.3f %.3f %.3f' % ((algo_name,) + tuple(summary[category][algo_name])))

if __name__ == '__main__':
    main()
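The script is then pointed at a local copy of the dataset, optionally restricted to a single algorithm (script name and dataset path here are hypothetical):
python evaluate_bgs.py --dataset_path ./CDTNet-14 --algorithm GSOC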
The evaluation process:
- Parse command line arguments for dataset path and algorithm selection
- Load ground truth data and input videos
- Create algorithm objects as specified
- Calculate pixel-level metrics (precision, recall, F1-score) by comparing predictions to ground truth (see the sketch after this list)
- Compile performance summaries for all models across different categories
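For the metrics step, here is a minimal sketch of how pixel-level precision, recall, and F1 can be computed between a predicted mask and a ground-truth mask. The function name is mine, not from the project script; real CDTNet-14 ground truth also contains intermediate label values (e.g. for regions outside the area of interest), which are ignored here for brevity:
import numpy as np

def pixel_metrics(pred_mask, gt_mask):
    # Binarize: foreground pixels are 255 in both masks.
    pred = pred_mask > 127
    gt = gt_mask > 127
    tp = np.logical_and(pred, gt).sum()    # foreground correctly detected
    fp = np.logical_and(pred, ~gt).sum()   # background flagged as foreground
    fn = np.logical_and(~pred, gt).sum()   # foreground missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1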
Results
Our evaluation revealed significant performance differences between ideal and challenging lighting conditions. The best-performing GSOC algorithm achieved:
- 96% recall and 99% precision in ideal lighting conditions (Baseline dataset)
- 82% recall and 52% precision in shadow-heavy environments (Shadows dataset)
This means that in shadow-heavy environments the system would miss roughly 20% of actual threats (82% recall leaves ~18% of true events undetected), and with 52% precision nearly half of its detections would be false alarms. That is unacceptable for security applications requiring high reliability.
Visual comparison of different algorithms on a challenging shadow frame:
Figure 5. Shadows input frame #2450 image
Figure 6. Ground truth mask on Shadows input frame #2450
Figure 7. GSOC prediction mask on Shadows input frame #2450
Figure 8. GSOC-comp prediction mask on Shadows input frame #2450
Figure 9. GMG prediction mask on Shadows input frame #2450
Figure 10. MOG prediction mask on Shadows input frame #2450
Figure 11. LSBP-vanilla prediction mask on Shadows input frame #2450
Figure 12. LSBP-comp prediction mask on Shadows input frame #2450
Figure 13. LSBP-quality prediction mask on Shadows input frame #2450
Figure 14. LSBP-speed prediction mask on Shadows input frame #2450
Conclusion
This project demonstrated that while motion detection algorithms can achieve excellent performance in ideal conditions, they struggle significantly in challenging lighting environments with shadows. The GSOC algorithm performed best overall but still showed concerning reliability issues in shadow-heavy scenarios.
For production deployment, these results suggest that:
- High-security applications requiring near-perfect accuracy should avoid shadow-heavy environments or implement additional preprocessing
- General monitoring applications in well-lit environments can achieve reliable performance with the GSOC algorithm
- Future development should focus on shadow-resistant algorithms or multi-sensor fusion approaches
Thanks for reading!