ranet-clone
A searchable clone of russianplanes.net, for transparency and ease of identifying planes.
A torrent of the (partial) dataset is available here.
What is this?
This is a tool for:
- Downloading the entirety of the images hosted on russianplanes.net
- Re-hosting them and allowing people to search them
- Re-creating the metadata that was once attached to these images via OCR
Why?
The website russianplanes.net was ordered by the Russian government to take down all of its military aircraft listings. This project aims to archive every image hosted on the site's CDN so that aircraft remain easy to identify.
Usage
git clone https://github.com/5HT2/ranet-clone
cd ranet-clone
# Create the data directory first, then an empty config inside it
RANET_DATA=/path/to/images/dir
mkdir -p "$RANET_DATA"
echo "{}" > "$RANET_DATA/config.json"
#
# Run directly
go build -o ranet .
./ranet -dir "$RANET_DATA" -threads 4
#
# Or, run via Docker
docker build -t ranet .
docker run --name ranet --mount type=bind,source="$RANET_DATA",target=/ranet-data --network host -d -e MODE=all -e THREADS=4 ranet
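For readers wondering what a thread count such as `-threads 4` typically controls, here is a minimal worker-pool sketch in Go. It is an illustration only, not the project's actual downloader: `downloadAll` and the `fetch` callback are hypothetical names, and the fake `fetch` in `main` stands in for a real HTTP download.

```go
package main

import (
	"fmt"
	"sync"
)

// downloadAll fans the given image IDs out to `threads` workers and returns
// how many fetches succeeded. fetch is a hypothetical stand-in for the real
// per-image download; it is not part of ranet-clone.
func downloadAll(ids []int, threads int, fetch func(id int) error) int {
	if threads < 1 {
		threads = 1 // always run at least one worker
	}
	jobs := make(chan int)
	var (
		wg sync.WaitGroup
		mu sync.Mutex
		ok int
	)
	for w := 0; w < threads; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			// Each worker pulls IDs until the channel is closed.
			for id := range jobs {
				if err := fetch(id); err == nil {
					mu.Lock()
					ok++
					mu.Unlock()
				}
			}
		}()
	}
	for _, id := range ids {
		jobs <- id
	}
	close(jobs)
	wg.Wait()
	return ok
}

func main() {
	// Fake fetch for demonstration: pretend odd IDs are missing.
	fetch := func(id int) error {
		if id%2 == 1 {
			return fmt.Errorf("image %d not found", id)
		}
		return nil
	}
	fmt.Println("downloaded:", downloadAll([]int{1, 2, 3, 4}, 4, fetch))
}
```

The unbuffered channel keeps at most `threads` downloads in flight at once, which is the usual reason to expose such a setting.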
TODO
- Async downloading
- Distributed hosting
- Searching
- OCR
Images start even lower than 100000
If you zero-pad the JPG filenames to six digits, you can start even lower:
from: https://russianplanes.net/images/to1000/000001.jpg to: https://russianplanes.net/images/to100000/099999.jpg
This should yield ~100,000 more images.
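The padding scheme can be sketched in Go as follows. The six-digit zero padding comes from the two example URLs; the directory rule is an assumption inferred from those same two URLs (IDs grouped into `toN` directories in steps of 1000, rounding up), and it has not been checked against the site, particularly for IDs that are exact multiples of 1000. `imageURL` is a hypothetical helper, not a function from this repository.

```go
package main

import "fmt"

// imageURL builds a candidate URL for a numeric image ID.
// Assumption: images live in "toN" directories, where N is the ID rounded
// up to the next multiple of 1000. This is inferred only from the two
// example URLs (000001.jpg in to1000, 099999.jpg in to100000).
func imageURL(id int) string {
	bucket := ((id + 999) / 1000) * 1000
	// %06d left-pads the ID with zeros to six digits.
	return fmt.Sprintf("https://russianplanes.net/images/to%d/%06d.jpg", bucket, id)
}

func main() {
	fmt.Println(imageURL(1))
	fmt.Println(imageURL(99999))
}
```

Under that assumption, the two calls in `main` reproduce the two URLs quoted above.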
Docker build fails due to missing "leptonica/allheaders.h"
Prior to running, I had set:
RANET_DATA=/mnt/user/downloads/ranetImages
echo "{}" > "$RANET_DATA/config.json"