Service that allows Discord forums to be indexed by Google and read without a Discord account.

  • By null
  • Last update: Nov 21, 2022
  • Comments: 15

dforum

dforum is a Discord bot that you can invite to your server. It publishes all of the server's forums to a website, so that Google and other search engines can index them and people can read them without a Discord account. Anonymous posting is not supported; posting still requires a Discord account.

We have a Discord server.

Download

dforum.zip

Comments(15)

  • 1

    Move Google fonts to local branch

  • 2

    Archived/"older" posts not shown.

    Since @samhza's Arikawa rewrite, "archived posts" are no longer shown. This should be fixed.

    Since this was implemented once in the past (and is also just a really stupid feature on Discord's part; I almost feel that Arikawa should just group archived posts with the posts array sent normally), this is marked as a bug.
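
    Discord's API lists active and archived threads through separate endpoints, so one way to restore the old behaviour is to fetch both lists and merge them. A minimal Go sketch of the merge step only; the Thread type and mergeThreads helper are hypothetical and not taken from dforum or Arikawa:

```go
package main

import (
	"fmt"
	"sort"
)

// Thread is a hypothetical, trimmed-down view of a forum post.
type Thread struct {
	ID       uint64 // Discord snowflake; a larger ID means a newer thread
	Title    string
	Archived bool
}

// mergeThreads combines the active and archived lists into one slice,
// dropping duplicate IDs (the active copy wins) and sorting newest first.
func mergeThreads(active, archived []Thread) []Thread {
	seen := make(map[uint64]bool, len(active)+len(archived))
	merged := make([]Thread, 0, len(active)+len(archived))
	for _, list := range [][]Thread{active, archived} {
		for _, t := range list {
			if seen[t.ID] {
				continue
			}
			seen[t.ID] = true
			merged = append(merged, t)
		}
	}
	sort.Slice(merged, func(i, j int) bool { return merged[i].ID > merged[j].ID })
	return merged
}

func main() {
	active := []Thread{{ID: 30, Title: "newest"}, {ID: 20, Title: "middle"}}
	archived := []Thread{
		{ID: 10, Title: "oldest", Archived: true},
		{ID: 20, Title: "middle", Archived: true},
	}
	for _, t := range mergeThreads(active, archived) {
		fmt.Println(t.ID, t.Title, t.Archived)
	}
}
```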

  • 3

    Caching messages in a database.

    Messages should be cached for performance reasons as well as to avoid hitting Discord's rate limits too often.

    As discussed in the general chat of the Discord, the best way to go about this is probably to use an SQL-based database for storing the messages.

    As of November this has proven to be important as we realize the site cannot handle many concurrent connections due to Discord rate limits, which trips up Google when it tries to index the sitemap.
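
    The idea can be sketched as a read-through cache: serve from the store when possible and only call Discord on a miss. In this Go sketch an in-memory map stands in for the SQL table so that it runs without a database driver; the Cache type, its fetch callback and the Message fields are all hypothetical:

```go
package main

import (
	"fmt"
	"sync"
)

// Message is a hypothetical, trimmed-down cached message.
type Message struct {
	ID      uint64
	Content string
}

// Cache is a read-through cache keyed by channel ID. The map stands in
// for the SQL table proposed above.
type Cache struct {
	mu    sync.Mutex
	store map[uint64][]Message
	fetch func(channelID uint64) ([]Message, error) // hypothetical Discord call
}

// NewCache wraps an upstream fetch function with an empty cache.
func NewCache(fetch func(uint64) ([]Message, error)) *Cache {
	return &Cache{store: make(map[uint64][]Message), fetch: fetch}
}

// Messages returns cached messages and calls fetch only on a miss.
// Holding the lock during fetch keeps concurrent visitors from
// triggering duplicate upstream requests for the same channel.
func (c *Cache) Messages(channelID uint64) ([]Message, error) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if msgs, ok := c.store[channelID]; ok {
		return msgs, nil
	}
	msgs, err := c.fetch(channelID)
	if err != nil {
		return nil, err // failures are not cached
	}
	c.store[channelID] = msgs
	return msgs, nil
}

func main() {
	calls := 0
	cache := NewCache(func(id uint64) ([]Message, error) {
		calls++
		return []Message{{ID: 1, Content: "hello"}}, nil
	})
	for i := 0; i < 3; i++ {
		if _, err := cache.Messages(42); err != nil {
			panic(err)
		}
	}
	fmt.Println("upstream calls:", calls)
}
```

    A real version would also need invalidation when messages are edited or deleted, which this sketch leaves out.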

  • 4

    Anonymous Posting via Webhooks

    I'm not sure whether anonymous posting is unsupported for security reasons or because of a technical limitation, but would it be possible to add anonymous posting via Discord webhooks? I remember seeing a server where users could post under different names through a webhook. I haven't looked at the Discord docs yet to see whether webhooks can create or reply to forum posts. Just a thought. Maybe a cookie could match a user to a post they made if they revisit the page on the same device, for example by encoding the message ID in the cookie. I've never worked with Go before, but if it's something you're open to, I could take a look into it.

  • 5

    Optimize the HTML and whatnot so that bigger pages can get past Cloudflare.

    We now host a post that has nearly 3000 messages. The post itself is only about 3 MB, but 56 MB worth of resources are transferred, and Cloudflare refuses to send this to the user.

    We need to look into optimization of resources sent in order to have larger posts like this sent.

    The post in question

  • 6

    User consent popup.

    This PR addresses #42

    • There is now a popup, by default, on each post page: "Due to GDPR we require your consent to serve content from third party sites such as cdn.discordapp.com.".
    • Nothing from cdn.discordapp.com or Google Fonts is loaded by default until you click "Consent" on that popup.
    • Clicking "Consent" sets a cookie (this is communicated to the user) that saves this setting.
    • Clicking "Don't Consent" closes the popup. We don't set a cookie that time because the people who care enough not to have .png files loaded from a site they don't trust probably don't want a cookie set.
    • The Google Fonts dependency is removed (#41 accidentally got mixed into this). The Ubuntu font is now served locally.
  • 7

    Apply the flexbox CSS to legacy browsers.

    This is an in-progress pull request. It is being made to track progress and link important things to keep in mind with this.

    The goal of this PR is to implement the "legacy" flexbox syntax that browsers of the early 2010s supported (alongside the newer, better one) as a time-effective way of having this website render somewhat correctly on older browsers.

    Many would consider this a weird and fruitless idea, and they wouldn't be entirely wrong, but my counterpoint is that this is the only thing that's required. The rest of the CSS is basic enough to work on many of the oldest browsers. Flexbox is specifically what needs to be handled in order for the site to be viewable on legacy browsers, so I don't think much time is wasted here.

    Important links:

  • 8

    Update the Privacy Policy and add a TOS

    Pull Request

    Describe it here:

    • Change the Privacy Policy based on Sam's message caching changes
    • Add a TOS that covers which servers we will and won't host.

    I have made sure all the things still work:

    • [x] The index page and privacy policy page are still served.
    • [x] The sitemap isn't broken.
    • [x] The guild listing is still functioning
    • [x] A guild's message listing is still functioning.
    • [x] Many but not all of the posts in the test forum are working; you are welcome to invite your self-hosted instance to the server to index it yourself.
  • 9

    Rust program that allows us to swiftly extract the proper nouns from a paragraph, while retaining the performance target.

    Pull Request

    Describe it here:

    This adds a sub-directory named "wordsearch" which contains a Rust program that does what the title describes.

    Some things to note:

    • It works using the official Scrabble dictionary. When the program starts, it creates 26 vectors, one for each letter of the alphabet, and loads each dictionary word into the vector for its first letter. When it receives a paragraph, it takes the first letter of each word and checks the word against the corresponding vector; if the word is not in there, it's considered a proper noun. This is not a 100% correct solution, but it has been given large paragraphs and works with about 95% accuracy. However, some words may have to be removed from or added to that list, since some brand names are a single word that could also be in the dictionary ("Discord" and "Meta" come to mind. "Twitter" is also there.)

      • It removes any non-alphabetical character from a word before comparing, which is useful for matching words that carry special symbols, such as trademark and restricted symbols. It also means we save computational time by not considering words that aren't alphabetical, and on that note:
      • It does not compare words with an apostrophe at all. This is because Scrabble doesn't have that character, so it doesn't consider "you've" a word anyways, and "youve" isn't in there either. This seems like one solution to many problems, but the dictionary we're using is English only and non-alphabetical words will not make it through the program anyways, so this is probably the only case where words with specific glyphs need to be skipped.
      • On that note, we should likely consider the Spanish dictionary. But at the moment this is an English-only site, so it's fine.
    • It is a TCP server. This seems weird and wasteful, but in my opinion it's not much weirder (in fact, in the Golang code this will be less weird) than having it be a run-once program, and having it be a server allows us to retain data that would otherwise be reread and recalculated on each program startup.

    • The reason this is in Rust and not Golang is that I wanted this to add minimal overhead to the program, and I'm afraid even optimized Golang code will do this much slower than Rust.

    There is not yet code to utilize this from the golang side, so none of the checkboxes apply.
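
    To make the lookup described above easier to follow, here is a rough sketch of the same idea written in Go rather than Rust. It is not the PR's code: it swaps the 26 vectors for 26 hash sets, uses a tiny stand-in word list instead of the Scrabble dictionary, and all names are hypothetical:

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// Dictionary holds one bucket of known words per first letter, a to z.
type Dictionary [26]map[string]struct{}

// NewDictionary loads each word into the bucket for its first letter.
func NewDictionary(words []string) *Dictionary {
	var d Dictionary
	for i := range d {
		d[i] = make(map[string]struct{})
	}
	for _, w := range words {
		if w = normalize(w); w != "" {
			d[w[0]-'a'][w] = struct{}{}
		}
	}
	return &d
}

// normalize lowercases a word and drops every character outside A-Z/a-z.
func normalize(word string) string {
	var b strings.Builder
	for _, r := range word {
		if r <= unicode.MaxASCII && unicode.IsLetter(r) {
			b.WriteRune(unicode.ToLower(r))
		}
	}
	return b.String()
}

// ProperNouns returns the normalized words that are not in the dictionary.
func (d *Dictionary) ProperNouns(paragraph string) []string {
	var out []string
	for _, raw := range strings.Fields(paragraph) {
		if strings.ContainsRune(raw, '\'') {
			continue // words with an ASCII apostrophe, e.g. "you've", are skipped
		}
		w := normalize(raw)
		if w == "" {
			continue // nothing alphabetical left to compare
		}
		if _, known := d[w[0]-'a'][w]; !known {
			out = append(out, w)
		}
	}
	return out
}

func main() {
	dict := NewDictionary([]string{"we", "met", "at", "the", "office", "in"})
	fmt.Println(dict.ProperNouns("We met Sam at the Arikawa office in Berlin."))
}
```

    Like the approach it mirrors, this flags any word missing from the list, so its accuracy depends entirely on the dictionary.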

  • 10

    Ensure, once and for all, that Discord shows us all the messages it has in a channel.

    Pull Request

    Describe it here:

    Forces Discord, once and for all, to give us ALL the messages it has in a channel.
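
    Discord returns at most 100 messages per request, so getting everything means paging backwards with the "before" parameter until a short page comes back. A minimal Go sketch of that loop; PageFunc stands in for the real API call, and fakeChannel exists only to make the sketch runnable, so neither is the PR's actual code:

```go
package main

import "fmt"

// Message is a hypothetical, trimmed-down Discord message.
type Message struct {
	ID uint64
}

const pageSize = 100 // Discord's per-request maximum

// PageFunc fetches up to limit messages older than before, newest first.
// before == 0 means "start from the newest message".
type PageFunc func(before uint64, limit int) ([]Message, error)

// FetchAll keeps requesting older pages until a short page signals the end.
func FetchAll(fetch PageFunc) ([]Message, error) {
	var all []Message
	var before uint64
	for {
		page, err := fetch(before, pageSize)
		if err != nil {
			return all, err
		}
		all = append(all, page...)
		if len(page) < pageSize {
			return all, nil
		}
		before = page[len(page)-1].ID // oldest message seen so far
	}
}

// fakeChannel simulates a channel holding messages with IDs 1..total.
func fakeChannel(total uint64) PageFunc {
	return func(before uint64, limit int) ([]Message, error) {
		start := total
		if before != 0 {
			start = before - 1
		}
		var page []Message
		for id := start; id >= 1 && len(page) < limit; id-- {
			page = append(page, Message{ID: id})
		}
		return page, nil
	}
}

func main() {
	msgs, err := FetchAll(fakeChannel(250))
	if err != nil {
		panic(err)
	}
	fmt.Println("fetched", len(msgs), "messages")
}
```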

    I have made sure all the things still work:

  • 11

    Allow the site URL to be configurable

    I have made sure all the things still work:

  • 12

    "Number of pages" halves after a certain amount of messages are cached

    After running a script overnight that sends a request to every page, I found that at some point the number of pages in the sitemap halved. It was at 1,790 or so before refreshing; now it's at 657.

    image

  • 13

    Site "locks up" after a few days of being online, begins taking 5 minutes to serve pages.

    I wanted to be transparent about this issue to the people who might be wondering why the site isn't working or why a lot of pages aren't indexed, but there's honestly nothing else to report other than this image.

    image

  • 14

    Per-server custom CSS/HTML(/JS?) additions

    In addition to custom URLs, servers should be able to add custom CSS to their listings. Optionally, it would be nice if they could inject HTML before certain elements, although this would be hard to implement and might not make it.

    The ability to add custom JS wouldn't hurt either, but there's a lot that goes against the idea. In particular, it would be nice to ensure every page on the site loads in a timely manner. We would also open the gates to free hosting for aggressive popup/"alert spam" scams, which could hurt our ranking on Google, since these would also be usable on the dfs.ioi-xd.net URL.

  • 15

    Slash command that gives the dforum.org link to a post

    If this is also gonna be a "cleaner, linkable way to view a post" then a "/link" command or a custom "Copy Link" button that gets you a link to the channel you're in should be considered