A powerful Python script that allows you to scrape messages and media from Telegram channels using the Telethon library. Features include real-time continuous scraping, media downloading, and data export capabilities.
Before running the script, you'll need:
pip install -r requirements.txt
Contents of requirements.txt:
telethon
aiohttp
asyncio
- api_id: A number
- api_hash: A string of letters and numbers

Keep these credentials safe; you'll need them to run the script!
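One way to keep the credentials out of the source file is to read them from environment variables. The sketch below is an assumption, not the script's own mechanism: the variable names `TELEGRAM_API_ID` and `TELEGRAM_API_HASH` and the helper `load_credentials` are hypothetical.

```python
import os


def load_credentials(env=os.environ):
    """Return (api_id, api_hash) read from environment variables.

    The variable names are hypothetical; the script itself may prompt
    for these values or store them differently.
    """
    raw_id = env.get("TELEGRAM_API_ID", "")
    api_hash = env.get("TELEGRAM_API_HASH", "")
    # api_id is a number, api_hash is a string of letters and numbers.
    if not raw_id.isdigit():
        raise ValueError("TELEGRAM_API_ID must be a number")
    if not api_hash:
        raise ValueError("TELEGRAM_API_HASH must be set")
    return int(raw_id), api_hash
```

The returned pair is what Telethon's `TelegramClient(session, api_id, api_hash)` constructor expects as its second and third arguments.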
git clone https://github.com/unnohwn/telegram-scraper.git
cd telegram-scraper
pip install -r requirements.txt
python telegram-scraper.py
When scraping a channel for the first time, please note:
The script provides an interactive menu with the following options:
You can use either:
- Channel username (e.g., channelname)
- Channel ID (e.g., -1001234567890)
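The two accepted forms can be told apart with a small helper. This is a minimal sketch; `parse_channel_ref` is a hypothetical name, and whether the script normalizes input this way is an assumption.

```python
import re


def parse_channel_ref(text):
    """Return an int for a numeric channel ID, else a cleaned username."""
    value = text.strip()
    # Channel IDs such as -1001234567890 are digits with an optional minus sign.
    if re.fullmatch(r"-?\d+", value):
        return int(value)
    # Usernames are sometimes typed with a leading "@"; drop it.
    return value.lstrip("@")
```

For example, `parse_channel_ref("-1001234567890")` yields an integer, while `parse_channel_ref("@channelname")` yields the string `channelname`.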
Data is stored in SQLite databases, one per channel:
- Location: ./channelname/channelname.db
- Table: messages
  - id: Primary key
  - message_id: Telegram message ID
  - date: Message timestamp
  - sender_id: Sender's Telegram ID
  - first_name: Sender's first name
  - last_name: Sender's last name
  - username: Sender's username
  - message: Message text
  - media_type: Type of media (if any)
  - media_path: Local path to downloaded media
  - reply_to: ID of replied message (if any)
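The table layout above could be created with the standard `sqlite3` module roughly as follows. The column names come from the list; the column types and the helper name `open_channel_db` are assumptions, since the actual schema statement is not shown here.

```python
import sqlite3

# Column types are assumed; only the column names are documented.
SCHEMA = """
CREATE TABLE IF NOT EXISTS messages (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    message_id INTEGER,
    date TEXT,
    sender_id INTEGER,
    first_name TEXT,
    last_name TEXT,
    username TEXT,
    message TEXT,
    media_type TEXT,
    media_path TEXT,
    reply_to INTEGER
)
"""


def open_channel_db(path):
    """Open (or create) a channel database and ensure the table exists."""
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    conn.commit()
    return conn
```

With a connection in hand, ordinary SQL works against the data, for example `conn.execute("SELECT COUNT(*) FROM messages WHERE media_type IS NOT NULL")` to count messages that carry media.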
Media files are stored in:
- Location: ./channelname/media/
- Files are named using message ID or original filename
Data can be exported in two formats:
1. CSV: ./channelname/channelname.csv
   - Human-readable spreadsheet format
   - Easy to import into Excel/Google Sheets
2. JSON: ./channelname/channelname.json
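An export of this kind can be sketched with the standard library alone. The function name `export_messages` is hypothetical and the script's own export code may differ; the sketch only assumes a `messages` table with an `id` column.

```python
import csv
import json
import sqlite3


def export_messages(conn, csv_path, json_path):
    """Write every row of the messages table to a CSV and a JSON file."""
    cursor = conn.execute("SELECT * FROM messages ORDER BY id")
    columns = [col[0] for col in cursor.description]
    rows = cursor.fetchall()

    # newline="" lets the csv module control line endings itself.
    with open(csv_path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(columns)
        writer.writerows(rows)

    # One JSON object per message, keyed by column name.
    with open(json_path, "w", encoding="utf-8") as f:
        json.dump([dict(zip(columns, row)) for row in rows], f,
                  ensure_ascii=False, indent=2)
    return len(rows)
```

Writing both files from the same query result keeps the CSV and JSON exports consistent with each other.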
The continuous scraping feature ([C] option) allows you to:
- Monitor channels in real time
- Automatically download new messages
- Download media as it's posted
- Run indefinitely until interrupted (Ctrl+C)
- Maintain state between runs
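Keeping state between runs usually means recording how far scraping has progressed. The sketch below shows one possible approach using a small JSON file; the file format, the `last_message_id` key, and the helper names are all hypothetical, as the script's actual state mechanism is not described here.

```python
import json
import os


def load_state(path):
    """Return the saved state, or a fresh one if no file exists yet."""
    if not os.path.exists(path):
        return {"last_message_id": 0}
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def save_state(path, state):
    """Write to a temporary file, then swap it into place.

    os.replace swaps the file in one step, so an interrupt (such as
    Ctrl+C) during the write does not leave a half-written state file.
    """
    tmp_path = path + ".tmp"
    with open(tmp_path, "w", encoding="utf-8") as f:
        json.dump(state, f)
    os.replace(tmp_path, path)
```

On restart, a scraper following this pattern would call `load_state` and resume from the message after `last_message_id`.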
The script can download:
- Photos
- Documents
- Other media types supported by Telegram

It also automatically retries failed downloads and skips existing files to avoid duplicates.
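Retrying failed downloads and skipping existing files can be sketched as below. This is an illustration under assumptions, not the script's code: `download_with_retry` is a hypothetical helper, the `download` argument stands in for the real Telegram download call, and the retry count and backoff delays are made up.

```python
import asyncio
import os


async def download_with_retry(download, dest_path, retries=3, base_delay=1.0):
    """Call `download(dest_path)` until it succeeds or retries run out.

    `download` is any coroutine function that writes the file to
    dest_path. Returns dest_path on success, or None after the final
    failed attempt.
    """
    if os.path.exists(dest_path):
        return dest_path  # already on disk, skip to avoid a duplicate
    for attempt in range(retries):
        try:
            await download(dest_path)
            return dest_path
        except Exception:
            # Broad catch for brevity; real code would log the error.
            if attempt < retries - 1:
                # Wait base_delay, then double the wait on each retry.
                await asyncio.sleep(base_delay * 2 ** attempt)
    return None
```

Doubling the delay between attempts gives a temporary network or rate-limit problem time to clear before the next try.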
The script includes:
- Automatic retry mechanism for failed media downloads
- State preservation in case of interruption
- Flood control compliance
- Error logging for failed operations
Contributions are welcome! Please feel free to submit a Pull Request.
This project is licensed under the MIT License - see the LICENSE file for details.
This tool is for educational purposes only. Make sure to:
- Respect Telegram's Terms of Service
- Obtain necessary permissions before scraping
- Use responsibly and ethically
- Comply with data protection regulations