A command-line tool for managing, enhancing, and interacting with YouTube transcripts.
- Overview
- Features
- Installation
- Dependencies
- Usage
- File Structure
- Examples
- Next TODOs
- Contributing
- License
- Acknowledgments
SegScript lets you download, view, and query YouTube video transcripts directly from your terminal. It provides a clean interface for working with transcripts, including extracting specific time ranges and viewing enhanced transcript content. Transcript enhancement uses the langchain-google-genai package together with Google's Gemini 2.0 Flash model, which has produced very good results in my use.
- Download transcripts from any YouTube video using its ID
- List all downloaded transcripts stored in your local collection
- View full transcripts or segments based on time ranges
- Interactive mode for browsing and working with your transcript collection
- Rich text formatting for improved readability in the terminal
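To illustrate the "segments based on time ranges" feature, here is a minimal sketch of how transcript segments could be filtered by a time window. It is not SegScript's actual implementation. The `Segment` shape (`text`, `start`, `duration`) and the function name `select_segments` are assumptions made for this example.

```python
from typing import TypedDict


class Segment(TypedDict):
    # Assumed segment shape for this sketch, not SegScript's confirmed schema.
    text: str
    start: float     # seconds from the start of the video
    duration: float  # length of the segment in seconds


def select_segments(
    segments: list[Segment], start_s: float, end_s: float
) -> list[Segment]:
    """Return the segments that overlap the half-open window [start_s, end_s)."""
    if end_s < start_s:
        raise ValueError("end of range precedes start")
    return [
        seg
        for seg in segments
        if seg["start"] < end_s and seg["start"] + seg["duration"] > start_s
    ]


sample: list[Segment] = [
    {"text": "intro", "start": 0.0, "duration": 4.0},
    {"text": "first point", "start": 4.0, "duration": 6.0},
    {"text": "second point", "start": 10.0, "duration": 5.0},
]

# Keep only what is spoken between 0:05 and 0:10.
picked = select_segments(sample, 5.0, 10.0)
print(" ".join(seg["text"] for seg in picked))  # prints "first point"
```

A segment is kept when any part of it falls inside the window, so a sentence that starts just before the requested range is not cut off.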
pip install segscript
For testing purposes, install from source:
# Clone the repository
git clone https://github.com/keshavsharma25/segscript.git
cd segscript
# Install the package in editable mode (this also installs the dependencies declared in pyproject.toml)
pip install -e .
- youtube-transcript-api: Fetch YouTube transcripts with ease
- click: Command-line interface creation kit
- rich: Terminal formatting and styling
- python-dotenv: Load GOOGLE_API_KEY from the command-line environment
- pathlib: Object-oriented filesystem paths
- langchain-google-genai: Synthesize transcripts into a well-structured format
# List all downloaded transcripts
segscript list
# Download a transcript for a YouTube video
segscript download VIDEO_ID
# Get a transcript (downloads if not already available)
segscript get VIDEO_ID
# Get a transcript for a specific time range
segscript get VIDEO_ID --time-range "10:00;20:00"
# Start interactive mode
segscript prompt
Interactive mode provides a user-friendly interface for:
- Browsing your transcript collection
- Selecting a transcript to work with
- Viewing full transcripts or specific segments
- Querying transcripts by time range
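The `--time-range` value in the usage commands above has the form `"START;END"`, for example `"10:00;20:00"`. The sketch below shows one way such a string could be turned into seconds. It is an illustration only, not SegScript's own parser. The helper names `to_seconds` and `parse_time_range` are hypothetical, and accepting `SS` or `HH:MM:SS` in addition to `MM:SS` is an assumption.

```python
def to_seconds(stamp: str) -> int:
    """Convert "SS", "MM:SS", or "HH:MM:SS" into a number of seconds."""
    parts = stamp.strip().split(":")
    if not 1 <= len(parts) <= 3 or not all(p.isdigit() for p in parts):
        raise ValueError(f"bad timestamp: {stamp!r}")
    seconds = 0
    for part in parts:
        # Each further field is a finer unit, so scale what we have by 60.
        seconds = seconds * 60 + int(part)
    return seconds


def parse_time_range(value: str) -> tuple[int, int]:
    """Split "START;END" and return both ends in seconds."""
    start_text, sep, end_text = value.partition(";")
    if not sep:
        raise ValueError("expected a range of the form 'START;END'")
    start, end = to_seconds(start_text), to_seconds(end_text)
    if end <= start:
        raise ValueError("range end must come after range start")
    return start, end


print(parse_time_range("10:00;20:00"))  # prints (600, 1200)
```

Validating the range up front means a mistyped value such as `"20:00;10:00"` fails with a clear message instead of silently returning an empty segment.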
Transcripts are stored in the ~/.segscript/ directory with the following structure:
~/.segscript/
├── .env # Environment variables file
├── VIDEO_ID_1/
│ ├── VIDEO_ID_1.json # Raw transcript data
│ └── metadata.json # Video metadata
├── VIDEO_ID_2/
│ ├── VIDEO_ID_2.json
│ └── metadata.json
└── ...
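Given this layout, the stored transcripts can be discovered by scanning the directory. The following is a minimal sketch that relies only on the structure shown above. The function name `list_stored_video_ids` is hypothetical, and the sketch makes no assumption about the contents of the JSON files.

```python
from pathlib import Path


def list_stored_video_ids(base_dir: Path) -> list[str]:
    """Return the names of sub-directories that hold both expected files."""
    if not base_dir.is_dir():
        return []
    video_ids = []
    for entry in sorted(base_dir.iterdir()):
        if not entry.is_dir():
            continue  # skips plain files such as .env
        has_transcript = (entry / f"{entry.name}.json").is_file()
        has_metadata = (entry / "metadata.json").is_file()
        if has_transcript and has_metadata:
            video_ids.append(entry.name)
    return video_ids


if __name__ == "__main__":
    # Default storage location described above.
    for video_id in list_stored_video_ids(Path.home() / ".segscript"):
        print(video_id)
```

Requiring both files skips directories left behind by an interrupted download.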
segscript download dQw4w9WgXcQ
segscript get dQw4w9WgXcQ --time-range "1:30;2:45"
segscript prompt
- Add transcript summary support.
- Add a prompt that puts each sentence on its own line for better readability.
- In prompt, improve the UX by clearing the screen before running a command (as already happens after a download in prompt).
- Improve the copy of the segscript prompt so it is easier to understand.
Contributions are welcome! Please feel free to submit a Pull Request.
- Fork the repository
- Create your feature branch (git checkout -b feature/amazing-feature)
- Commit your changes (git commit -m 'Add some amazing feature')
- Push to the branch (git push origin feature/amazing-feature)
- Open a Pull Request
This project is licensed under the MIT License - see the LICENSE file for details.
- A huge thanks to YouTube Transcript API for making transcript retrieval so easy and accessible.
- Also kudos to LangChain Google for the langchain-google-genai package.
- Built with Rich for beautiful terminal output.
- Uses Click for command-line interface.
Note: SegScript is not affiliated with YouTube or Google.