ImageCaptioningSystem is a Python-based application that generates descriptive captions for images using deep learning techniques. By leveraging pre-trained neural networks, the system interprets the content of images and produces relevant textual descriptions.
- Automated Image Captioning: Generates descriptive captions for input images.
- Deep Learning Models: Utilizes pre-trained models for image analysis and caption generation.
- Multiple Caption Styles: Supports General, Creative, Professional, Descriptive, and Quote-based captions.
- Text Overlay on Image: Allows users to overlay the selected caption on the image.
- Audio Output: Generates speech for each caption using text-to-speech (TTS).
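This README does not name the pre-trained model behind the captions. As an illustration only, the sketch below assumes the BLIP captioning model published on Hugging Face (`Salesforce/blip-image-captioning-base`), loaded through `transformers` with PyTorch and Pillow installed. The function names `generate_caption` and `tidy_caption` are hypothetical and are not taken from this project's code.

```python
def tidy_caption(raw):
    """Normalize model output: collapse spaces, capitalize, end with punctuation."""
    text = " ".join(raw.split())
    if not text:
        return ""
    text = text[0].upper() + text[1:]
    if text[-1] not in ".!?":
        text += "."
    return text


def generate_caption(image_path, model_name="Salesforce/blip-image-captioning-base"):
    """Caption one image with an assumed BLIP model (weights download on first use)."""
    # Heavy dependencies are imported lazily so tidy_caption works without them.
    from PIL import Image
    from transformers import BlipForConditionalGeneration, BlipProcessor

    processor = BlipProcessor.from_pretrained(model_name)
    model = BlipForConditionalGeneration.from_pretrained(model_name)

    image = Image.open(image_path).convert("RGB")
    inputs = processor(images=image, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=30)
    raw = processor.decode(output_ids[0], skip_special_tokens=True)
    return tidy_caption(raw)


if __name__ == "__main__":
    # Only the text helper runs here; generate_caption needs a real image file.
    print(tidy_caption("  a sunset over   the mountains "))
```

A base caption produced this way could then be rephrased into the different styles. How this project does that step is not described here.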
Ensure you have the following installed:
- Python 3.x
- Required Python libraries (as specified in requirements.txt)
- Clone the Repository:
git clone https://github.com/PhilemonTJ/ImageCaptioningSystem.git
- Navigate to the Project Directory:
cd ImageCaptioningSystem
- Install Required Libraries:
pip install -r requirements.txt
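After installation, it can help to confirm that the listed packages are actually present in the active environment. The following is a minimal sketch using only the standard library. The helper names are hypothetical, and the parser handles only simple `name==version` style lines, not URLs or editable installs.

```python
import re
from importlib import metadata
from pathlib import Path


def requirement_names(text):
    """Extract package names from requirements.txt-style text."""
    names = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments
        if not line or line.startswith("-"):  # skip blanks and pip options
            continue
        match = re.match(r"[A-Za-z0-9][A-Za-z0-9._-]*", line)
        if match:
            names.append(match.group(0))
    return names


def missing_packages(names):
    """Return the names that are not installed in the current environment."""
    missing = []
    for name in names:
        try:
            metadata.version(name)
        except metadata.PackageNotFoundError:
            missing.append(name)
    return missing


if __name__ == "__main__":
    path = Path("requirements.txt")
    if path.exists():
        absent = missing_packages(requirement_names(path.read_text()))
        print("Missing:", ", ".join(absent) if absent else "none")
    else:
        print("requirements.txt not found; run this from the project directory.")
```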
- Prepare Input Images: Place the images you want to caption in the designated input directory.
- Run the Application:
python ImageCaptioningSystem.py
Follow the prompts to input the image file path and receive the generated captions.
- Select Caption Style: Choose from General, Creative, Professional, Descriptive, or Quote-based captions.
- Overlay Caption on Image (Optional): The selected caption can be overlaid on the image with a simple white background and Times New Roman font.
- Audio Output: The generated captions can also be played as speech output.
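The overlay and audio steps above could look roughly like the sketch below. It is not this project's implementation. It assumes Pillow for drawing and `pyttsx3` for offline speech, neither of which is confirmed by this README. The font file name `times.ttf` is also an assumption (the usual Windows name for Times New Roman), with a fallback to Pillow's default font. All function names are hypothetical.

```python
import textwrap


def wrap_caption(caption, max_chars=40):
    """Break a caption into lines no longer than max_chars characters."""
    return textwrap.wrap(caption, width=max_chars)


def overlay_caption(image_path, caption, output_path, font_size=24):
    """Draw the caption on a white strip along the bottom of the image."""
    from PIL import Image, ImageDraw, ImageFont  # assumed dependency: Pillow

    image = Image.open(image_path).convert("RGB")
    try:
        # Assumed font file name; the path may differ on other systems.
        font = ImageFont.truetype("times.ttf", font_size)
    except OSError:
        font = ImageFont.load_default()

    # Rough estimate: an average glyph is about half the font size wide.
    max_chars = max(10, (image.width - 20) // max(1, font_size // 2))
    lines = wrap_caption(caption, max_chars)
    line_height = font_size + 6
    strip_top = max(0, image.height - (line_height * len(lines) + 12))

    draw = ImageDraw.Draw(image)
    draw.rectangle([0, strip_top, image.width, image.height], fill="white")
    y = strip_top + 6
    for line in lines:
        draw.text((10, y), line, fill="black", font=font)
        y += line_height
    image.save(output_path)


def speak_caption(caption, audio_path=None):
    """Speak the caption aloud, or save it to audio_path if one is given."""
    import pyttsx3  # assumed TTS library; the project may use another

    engine = pyttsx3.init()
    if audio_path is None:
        engine.say(caption)
    else:
        engine.save_to_file(caption, audio_path)
    engine.runAndWait()
```

For example, `overlay_caption("sunset.jpg", caption, "sunset_captioned.jpg")` followed by `speak_caption(caption)` would write a captioned copy and read the caption aloud. The file names here are placeholders.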
- Image Input: A majestic sunset over the mountains.
- Generated Captions:
- General: "A stunning sunset casting golden hues over the mountain peaks."
- Creative: "Nature's masterpiece, where the sky kisses the earth in fiery passion."
- Professional: "A breathtaking view of the mountains at sunset, captured in warm tones."
- Descriptive: "Golden light spilling over rugged peaks as the sun sets behind the mountains."
- Quote: "Every sunset brings the promise of a new dawn. - Ralph Waldo Emerson"
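One way to picture the style selection is as a lookup over captions keyed by style name. The sketch below is hypothetical: the `STYLES` tuple and `select_caption` function are illustrative names, not part of this project, and the sample captions are copied from the example above.

```python
STYLES = ("General", "Creative", "Professional", "Descriptive", "Quote")


def select_caption(captions, style):
    """Return the caption for a style, matching the name case-insensitively."""
    wanted = style.strip().lower()
    for name in STYLES:
        if name.lower() == wanted:
            if name not in captions:
                raise KeyError(f"No caption generated for style: {name}")
            return captions[name]
    raise ValueError(f"Unknown style {style!r}; choose one of: {', '.join(STYLES)}")


if __name__ == "__main__":
    captions = {
        "General": "A stunning sunset casting golden hues over the mountain peaks.",
        "Professional": "A breathtaking view of the mountains at sunset, captured in warm tones.",
    }
    print(select_caption(captions, "professional"))
```

Rejecting unknown style names early keeps a typo at the prompt from silently falling back to a default caption.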
We welcome contributions! If you'd like to enhance the system by improving the model, adding features, or fixing bugs, please follow these steps:
- Fork the Repository.
- Create a New Branch:
git checkout -b feature-branch
- Make Your Changes and Commit:
git commit -m "Description of changes"
- Push to Your Fork:
git push origin feature-branch
- Submit a Pull Request.
Pull requests will be reviewed and merged once approved.
This project is for learning purposes. :)
- Thanks to the developers of the pre-trained models used in this project.
- Inspired by the advancements in computer vision and natural language processing.