Say Mojeek discovered a website a while ago and ranks it favorably. How does Mojeek “notice” new content on that site, and are there plans to add more ways of doing so?
Some that come to mind:
- No special logic. Notice new pages the same way as always, by following referring links
- Polling the sitemap
- Polling individual already-known pages
- Polling feeds (RSS, Atom, JSON feeds)
- IndexNow API (maybe Mojeek doesn’t have to join the initiative but could still accept pings through the same API)
- WebSub (an open W3C protocol made specifically for pushing updates! Google uses this)
- Manual submission (discussed already; source of spam)
- Data sharing with a partner
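
To make the sitemap-polling option a bit more concrete, here is a rough Python sketch of what a crawler could do with a fetched sitemap: compare each `<lastmod>` against the time of its last crawl and re-queue only what changed. I have no idea how Mojeek actually does this; the function name `changed_urls` and the "treat a missing `<lastmod>` as changed" rule are my own assumptions.

```python
from datetime import datetime, timezone
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def changed_urls(sitemap_xml: str, last_crawl: datetime) -> list:
    """Return URLs whose <lastmod> is newer than last_crawl.

    Hypothetical helper: entries without <lastmod> are returned too,
    since the crawler cannot tell whether they changed.
    last_crawl must be timezone-aware.
    """
    root = ET.fromstring(sitemap_xml)
    changed = []
    for entry in root.findall("sm:url", SITEMAP_NS):
        loc = entry.findtext("sm:loc", namespaces=SITEMAP_NS)
        if loc is None:
            continue  # malformed entry, nothing to crawl
        lastmod = entry.findtext("sm:lastmod", namespaces=SITEMAP_NS)
        if lastmod is None:
            changed.append(loc.strip())
            continue
        # Sitemaps use W3C datetime; normalise a trailing "Z" so that
        # older Python versions can parse it with fromisoformat().
        modified = datetime.fromisoformat(lastmod.strip().replace("Z", "+00:00"))
        if modified.tzinfo is None:
            # Date-only or naive values: assume UTC (an assumption).
            modified = modified.replace(tzinfo=timezone.utc)
        if modified > last_crawl:
            changed.append(loc.strip())
    return changed
```

Calling `changed_urls(xml_text, last_crawl)` with the sitemap body and an aware `datetime` gives the list of pages worth revisiting. The obvious weakness is that it trusts the site to keep `<lastmod>` honest.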
I’m planning on running a personal WebSub publisher, so the thought just crossed my mind again.
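
For anyone curious what the publisher side involves, this is roughly what I have in mind, as a sketch only. The WebSub spec requires the topic to advertise its hub and its own URL (for example via `Link` headers with `rel="hub"` and `rel="self"`), but it leaves the publisher-to-hub notification up to the hub. The `hub.mode=publish` form POST below is the convention many public hubs accept, so treat it as an assumption and check your hub's docs. The function names and example URLs are made up.

```python
from urllib.parse import urlencode
from urllib.request import Request


def discovery_link_header(hub_url: str, topic_url: str) -> str:
    """Value for the Link header served with the topic (feed or page)."""
    return f'<{hub_url}>; rel="hub", <{topic_url}>; rel="self"'


def build_publish_ping(hub_url: str, topic_url: str) -> Request:
    """Build (but do not send) the 'content changed' ping for the hub.

    Assumes the hub accepts the common hub.mode=publish convention,
    which is not mandated by the WebSub spec itself.
    """
    body = urlencode({"hub.mode": "publish", "hub.url": topic_url}).encode("ascii")
    return Request(
        hub_url,
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )
```

Passing the returned request to `urllib.request.urlopen` would actually send the ping. A search engine supporting WebSub would sit on the other side as a subscriber and get notified by the hub, with no polling needed.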