Video_SemanticSegmentation_MaskRCNN

Understanding Mask R-CNN

Mask R-CNN is an extension of Faster R-CNN that introduces a branch for pixel-wise object segmentation. It enhances previous methods by incorporating:

Bounding Box Detection: Identifies objects in an image.
Instance Segmentation: Predicts pixel-wise masks for each object.
Feature Pyramid Networks (FPN): Improves multi-scale feature detection.
ROI Align: Eliminates misalignment issues in object detection. The original paper: He, K., Gkioxari, G., Dollár, P., & Girshick, R. (2017). "Mask R-CNN."

Repository Features

This project is an adapted version of alsombra/Mask_RCNN-TF2 with key enhancements:

Upgraded to TensorFlow 2.x for modern compatibility.
Pretrained COCO weights for transfer learning.
Custom dataset support for instance segmentation.
ROI Align implementation for precise mask extraction.
Multi-GPU training support.
Background removal & visualization tools.

Project Workflow

Step 1: Downloading the Repository

Clone the repository to your local machine or Google Colab environment.

Step 2: Importing the Required Libraries

2(A): Installing Older Versions of NumPy I had to downgrade and install the numpy version (1.23.5) that:
- supports 'np.bool' alias
- is compatible to the Tensoflow version (2.15.0) to be installed below.
- is compatible with Python version 3.11 I am currently using
2(B): Installing Older Versions of TensorFlow
- TensorFlow 2.16+ introduced changes that break compatibility with older .h5 weights. TensorFlow 2.15.0 still supports by_name=True for .h5 models without issues.
2(C): Install 'opendatasets'
- Opendatasets allows you to easily download datasets from various sources, especially Kaggle.
2(D): Import other libraries for segmentation task
- imports inclue: os, sys, cv2, skimage.io, cv2_imshow

Step 3: Installing Required Dependencies

Navigate to the repository folder (Mask_RCNN-TF2) and install all required libraries.
Some dependencies require compatibility adjustments for TensorFlow 2.x.

Step 4: Importing Files from the MRCNN Folder

Import the necessary files from the MRCNN directory, which includes model architecture, utility, and visualisation functions.

Step 5: Importing the (coco) Dataset

5(A): Creating the Logs and Accessing the Images Folder
- Create a logs/ directory to store model checkpoints and training logs.
- Access the images/ folder to load sample images for segmentation.

Step 6: Updating Compatibility with TensorFlow v2

Since the original implementation was designed for TensorFlow 1.x, certain modifications are necessary. Make sure to import TensorFlow 1.x configuration and session classes while using the 2.x-compatible interface.

Step 7: Loading the Pre-trained Neural Network

7(A): Specifying the Path and Data Format
- Define the file path where the COCO pre-trained weights are stored.
7(B): Downloading the Pre-trained Model and Weights
- The model uses pre-trained COCO weights to perform instance segmentation on images.
- These weights allow the model to detect multiple object classes.
7(C): Specifying GPU Resources
- Configure the GPU settings to optimize memory usage.
- Ensure that TensorFlow allocates only necessary resources instead of consuming all available memory.
7(D): Instantiating the Mask R-CNN Model for Inference
- Create an instance of the MaskRCNN class in inference mode.
- Set up the configuration parameters needed for testing.
7(E): Loading the Weights into the Network
- Load the pre-trained weights into the Mask R-CNN model.
- Ensure that the weight format is compatible with the TensorFlow version being used.

Step 8: Segmenting in Videos

8(A): Create a List of Objects from the COCO Dataset
- Define a list of object classes that the model should detect in videos.
8(B): Import .mp4 Video from Kaggle
- Load a video file from Kaggle datasets for segmentation.
8(C): Load the .mp4 File
- Read the video file using OpenCV (cv2.VideoCapture()).
- Extract frames for processing.
8(D): Saving the Segmented Video
- Define the format of saving the processed video
- Save the segmented output using OpenCV's cv2.VideoWriter().
8(E): Importing a Custom Function for Visualizing Segmented Videos
- A custom file is provided to visualizing segmentation results, particularly for Mask R-CNN outputs, named 'video_functions.py'
- Download the function from my 'Github' link
8(F): Import the video_functions.py File to Project Session
- Import video_functions.py from GitHub.
- This file contains custom utilities for handling video segmentation.

Step 9: Visualizing the Objects in the Video

9(A): Define a Function to Display an Image
- Create a function to display segmented frames using Matplotlib or OpenCV.
9(B): Set Processing Frequency and Customize Video Length
- Allow users to define how many processed frames are displayed and lenght of video to process
9(C): OpenCV-Based Custom Video Frame Processing with Time Limit
- Implement frame-skipping logic to improve efficiency.
- Limit video processing time to a user-defined duration.
9(D): Display the Segmented .mp4 File
- Use OpenCV (cv2.imshow()) or Colab's display(Video()) to visualize the output.

Conclusion

This repository provides a TensorFlow 2.x-compatible implementation of Mask R-CNN for instance segmentation. It builds upon the original research paper while ensuring compatibility with modern libraries. Key features include:

Accurate object detection and segmentation.
Pretrained model inference using COCO weights.

Name		Name	Last commit message	Last commit date
Latest commit History 6 Commits
README.md		README.md
Video_SemanticSegmentation_with_Mask_R_CNN.ipynb		Video_SemanticSegmentation_with_Mask_R_CNN.ipynb
video_functions.py		video_functions.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Video_SemanticSegmentation_MaskRCNN

About

Releases

Packages

Languages

ShivSubedi/Video_SemanticSegmentation_MaskRCNN_TensorFlow

Folders and files

Latest commit

History

Repository files navigation

Video_SemanticSegmentation_MaskRCNN

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages