Image Caption Generator Using Convolutional Recurrent Neural Network Feature Fusion

Fida Hussain Dahri, Asghar Ali Chandio, Nisar Ahmed Dahri & Muhammad Ali Soomro


Abstract

In this research, we introduce a novel approach to image captioning based on a Convolutional Recurrent Neural Network (CRNN) model with Bidirectional Gated Recurrent Units (BiGRU). The model fuses the features of convolutional and recurrent neural networks and uses transfer learning from a VGG16 model pre-trained on the ImageNet dataset. We evaluated the model on the Flickr8K dataset, which was partitioned into training, validation, and testing sets, and assessed its performance with the BLEU score. The results indicate that the proposed CRNN model outperforms traditional encoder-decoder models in generating informative and diverse captions for images. Specifically, the model achieved a BLEU-1 score of 0.603, a BLEU-2 score of 0.359, a BLEU-3 score of 0.219, and a BLEU-4 score of 0.122.
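As a reading aid for the BLEU-1 to BLEU-4 figures above, the following is a minimal sketch of how a cumulative BLEU-n score can be computed for one generated caption. It assumes uniform n-gram weights, the standard brevity penalty, and no smoothing; the abstract does not state the exact evaluation setup (for example, corpus-level versus sentence-level scoring), so the function names and the example captions here are illustrative assumptions, not the authors' implementation.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return a Counter of all n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def modified_precision(references, candidate, n):
    """Clipped n-gram precision of one candidate against its references."""
    cand_counts = ngrams(candidate, n)
    if not cand_counts:
        return 0.0
    # For each n-gram, the largest count seen in any single reference.
    max_ref = Counter()
    for ref in references:
        for gram, count in ngrams(ref, n).items():
            max_ref[gram] = max(max_ref[gram], count)
    clipped = sum(min(count, max_ref[gram]) for gram, count in cand_counts.items())
    return clipped / sum(cand_counts.values())


def sentence_bleu(references, candidate, max_n):
    """Cumulative BLEU-max_n: uniform weights, brevity penalty, no smoothing."""
    precisions = [modified_precision(references, candidate, n)
                  for n in range(1, max_n + 1)]
    if min(precisions) == 0.0:
        return 0.0  # without smoothing, one zero precision zeroes the score
    log_mean = sum(math.log(p) for p in precisions) / max_n
    c = len(candidate)
    # Reference length closest to the candidate length (ties: shorter one).
    r = min((len(ref) for ref in references),
            key=lambda length: (abs(length - c), length))
    brevity_penalty = 1.0 if c > r else math.exp(1 - r / c)
    return brevity_penalty * math.exp(log_mean)


# Hypothetical captions, chosen only to illustrate the computation.
references = [["a", "dog", "runs", "on", "the", "grass"]]
candidate = ["a", "dog", "runs", "on", "grass"]
for n in range(1, 5):
    print(f"BLEU-{n}: {sentence_bleu(references, candidate, n):.3f}")
```

Because each higher-order score multiplies in a stricter n-gram precision, BLEU-n typically decreases as n grows, which matches the pattern of the scores reported in the abstract.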

 

Index Terms: Image captioning, deep learning, Convolutional Recurrent Neural Network (CRNN), attention mechanism, Flickr8K dataset.

 
