Recover notes full IDs

Alfred456654 · January 20, 2023, 3:11pm

My version of HedgeDoc is: 1.9.5

What I expected to happen:

Our instance used LDAP to log users in, but it was removed unexpectedly, so now we use the GitHub login. Obviously, this means that the users can no longer see their list of notes.

I launched psql to look at the table contents, but could not find the full IDs for the documents, the ones that are used in the URL (like this: https://my.hedgedoc.instance/KcELsw5LSXK1AcrIRDuagA)

What actually happened:

I only found IDs like this: 025c3b99-a1f3-4abf-8e15-bc0f3c6e868a

I already tried:

Looking through the different tables, using base64 one way and another to match both IDs, looking through this forum, looking through the documentation, reading the source code (but I’ll be honest, I got nowhere and gave up fast).

What I’d like to get is just the list of “full” URLs for our notes; we can figure out the rest ourselves.

Thanks a lot in advance for any help!

DerMolly · January 27, 2023, 5:18pm

Hey @Alfred456654 ,

welcome to the community forum.

The long IDs in the URL are an encoded version of the id column. You can find the decoding part here:

https://github.com/hedgedoc/hedgedoc/blob/master/lib/models/note.js#L172-L183

Note.decodeNoteId = function (encodedId) {
  // decode from url-safe base64
  const id = base64url.toBuffer(encodedId).toString('hex')
  // add dashes between the UUID string parts
  const idParts = []
  idParts.push(id.substr(0, 8))
  idParts.push(id.substr(8, 4))
  idParts.push(id.substr(12, 4))
  idParts.push(id.substr(16, 4))
  idParts.push(id.substr(20, 12))
  return idParts.join('-')
}
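
For readers more comfortable with Python, the same conversion can be sketched in both directions. This is an unofficial port, not HedgeDoc code: the function names decode_note_id and encode_note_id are made up for this example, and it assumes the IDs are plain UUIDs encoded as URL-safe base64 without "=" padding, as the snippet above suggests.

```python
import base64
import uuid


def decode_note_id(encoded_id: str) -> str:
    """Turn the short ID from a note URL into the dashed UUID stored in the database."""
    # Restore the '=' padding that URL-safe IDs omit, then decode to 16 raw bytes.
    padded = encoded_id + "=" * (-len(encoded_id) % 4)
    raw = base64.urlsafe_b64decode(padded)
    # uuid.UUID formats the bytes as 8-4-4-4-12 hex groups, like the JavaScript above.
    return str(uuid.UUID(bytes=raw))


def encode_note_id(note_uuid: str) -> str:
    """Turn a dashed UUID from the database into the short ID used in note URLs."""
    raw = uuid.UUID(note_uuid).bytes
    # URL-safe alphabet ('-' and '_' instead of '+' and '/'), padding stripped.
    return base64.urlsafe_b64encode(raw).decode("ascii").rstrip("=")


if __name__ == "__main__":
    db_id = "025c3b99-a1f3-4abf-8e15-bc0f3c6e868a"
    short_id = encode_note_id(db_id)
    print(short_id)
    # Round trip: decoding the short ID should give back the database ID.
    print(decode_note_id(short_id) == db_id)
```

Going through uuid.UUID rather than slicing the hex string by hand also rejects input that is not a valid 16-byte UUID, which may help catch stray lines early.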

Greetings,
DerMolly

Alfred456654 · January 30, 2023, 12:59pm

Hello DerMolly,
Thank you very much for this perfect answer!
I’m still figuring out how this works, since I’m not familiar with JavaScript, but with a bit of patience I’ll be able to write a function that does the opposite.
I will then update my post with said script just in case.
Wish you the best!
Alfred

Alfred456654 · February 10, 2023, 12:08pm

Here is how I recovered my URLs:

First, I extracted the IDs from the DB like this:

psql -U hackmd -W -c 'select "id" from "Notes";' > ids.txt

I cleaned up the file a bit manually so that only the IDs remain in it (basically, remove the first 2 lines and the last line, and de-indent the whole file).
Then I executed the following Python code:

#!/usr/bin/env python
import base64
import uuid

if __name__ == '__main__':
    with open("ids.txt", mode='r', encoding="utf-8") as id_file:
        # skip blank lines left over from the manual cleanup
        hex_ids = [line.strip() for line in id_file if line.strip()]
    # note URLs use URL-safe base64 ('-' and '_' instead of '+' and '/') without '=' padding
    b64_ids = [base64.urlsafe_b64encode(uuid.UUID(hex_id).bytes).decode().rstrip("=") for hex_id in hex_ids]
    print("\n".join(f"https://hedgedoc.example.com/{b64_id}" for b64_id in b64_ids))
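
The manual cleanup of ids.txt could probably be skipped by picking out only the UUID-shaped strings from the raw psql output. The following is a sketch under that assumption and has not been run against a real HedgeDoc database; the helper names extract_uuids and note_urls are made up for this example, and the base URL is a placeholder. (Running psql with its -t and -A options, which print tuples only in unaligned format, should give a clean file as well.)

```python
import base64
import re
import uuid

# Matches the dashed 8-4-4-4-12 hex form that psql prints for the id column.
UUID_PATTERN = re.compile(
    r"[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}"
)


def extract_uuids(psql_output: str) -> list:
    """Return every UUID found in the text, ignoring headers, rulers and row counts."""
    return UUID_PATTERN.findall(psql_output)


def note_urls(psql_output: str, base_url: str) -> list:
    """Build one note URL per UUID found in the raw psql output."""
    urls = []
    for hex_id in extract_uuids(psql_output):
        raw = uuid.UUID(hex_id).bytes
        # URL-safe base64 with the '=' padding stripped, as used in note URLs.
        short_id = base64.urlsafe_b64encode(raw).decode("ascii").rstrip("=")
        urls.append(f"{base_url.rstrip('/')}/{short_id}")
    return urls


if __name__ == "__main__":
    # Sample input shaped like default psql output (header, ruler, row, row count).
    sample = """\
                  id
--------------------------------------
 025c3b99-a1f3-4abf-8e15-bc0f3c6e868a
(1 row)
"""
    for url in note_urls(sample, "https://hedgedoc.example.com"):
        print(url)
```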