Hey!
I had to do a migration, and I wanted to share my experience!
In the end it was a smooth migration, but it was not all rainbows and unicorns along the way.
Context
At https://indiehosters.net/ we host https://lescommuns.org, an organization that provides tools and data for the commons. It is a nice place for small collectives to get tools like HedgeDoc.
They were using:
- hackmd/codimd version 2.2
- deployed on a VM
- running on MySQL
- and using the filesystem for uploads.
It is now running on:
- kubernetes
- Postgres provided by the Zalando operator
- Ceph for the object store (using the minio JS client)
Migration
hackmd/codimd 2.2
First we decided to migrate to codimd 1.6, and then perform the migration to HedgeDoc.
hackmd 2.2 and codimd 1.6 seem to be compatible; so far we haven't had any issues.
MySQL → Postgres
We used pgLoader.
There is a tutorial provided by DigitalOcean that goes more in-depth.
Connect to the 2 endpoints
I decided to run the pgloader script from my machine, with access to both endpoints.
So first, I needed to establish the SSH tunnels:
# forward Postgres: tunnel to the kube master, then port-forward the master pod from there
ssh -L 5432:localhost:5432 kube-master-1
kubectl -n namespace port-forward pg-pod-that-is-master 5432:5432
# forward MySQL from the old VM
ssh -L 3306:localhost:3306 vm-with-mysql-running
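Before launching pgloader, a quick check that both tunnels answer doesn't hurt (this uses the same placeholder credentials and database names as the rest of this post; 127.0.0.1 forces the mysql client through the TCP tunnel instead of the local socket):
psql -h localhost -p 5432 -U hedgedoc -d hedgedoc -c 'SELECT 1;'
mysql -h 127.0.0.1 -P 3306 -u user -p database -e 'SELECT 1;'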
Configure the pgloader script
You’ll need the following binaries on your machine:
- mysql-client
- postgres-client
- pgloader
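On a Debian/Ubuntu machine, installing them looks roughly like this (package names vary by distro, the MySQL client is for example packaged as default-mysql-client on recent Debian):
sudo apt install postgresql-client mysql-client pgloader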
In the end, I decided to use a script, as I had trouble getting the options right on the CLI.
Here is the content of the file load-script:
load database
from mysql://user:pass@localhost/database
into postgresql://hedgedoc:pass@localhost/hedgedoc
WITH include drop, quote identifiers, create tables, create indexes, reset sequences;
The option I had the most trouble with is quote identifiers: HedgeDoc's table names are CamelCase (e.g. "Notes"), and Postgres folds unquoted identifiers to lowercase, so the names have to be quoted to keep their case.
I had to fiddle around this a lot, but once I found the right set of options, it was smooth!
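To make it concrete, here is what that quoting changes once the data is in Postgres ("Notes" is one of HedgeDoc's CamelCase tables):
SELECT count(*) FROM Notes;   -- ERROR: relation "notes" does not exist
SELECT count(*) FROM "Notes"; -- works, the quoted identifier keeps its case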
Start the SQL migration
Once you are good to go:
pgloader -v load-script
The -v makes it verbose so you can see what is happening!
One last thing: the public schema already exists, and pgloader copies the data into a schema named after the source database, so I used these SQL commands to fix it:
ALTER SCHEMA public RENAME TO public2;
ALTER SCHEMA database RENAME TO public;
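If you want to double check, you can list the tables through the same tunnel as before (hedgedoc is the user/database from the load-script above):
psql -h localhost -p 5432 -U hedgedoc -d hedgedoc -c '\dt'
The HedgeDoc tables ("Notes", "Users", "Revisions", …) should now show up under the public schema.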
Object store migration
I reused a script that I contributed.
But I hit a problem: the async calls were quickly exhausting the open file limit of the small server.
I spent a couple of hours on it, with my not-so-great JS skills…
In the end I used a queue:
// async (from npm), fs and path are the only requires needed here;
// minioClient, config and getImageMimeType come from the rest of the original script.
const async = require('async');
const fs = require('fs');
const path = require('path');

// Upload at most 4 files at a time so we stop exhausting the open file limit.
var q = async.queue(function (task, callback) {
  const key = path.join('uploads', task.filename);
  const filePath = path.join(config.uploadsPath, task.filename);
  const metadata = {
    'Content-Type': getImageMimeType(filePath),
  };
  console.log(`Uploading ${filePath} to ${key}...`);
  minioClient.fPutObject(config.s3bucket, key, filePath, metadata, callback);
}, 4);

fs.readdir(config.uploadsPath, function (err, files) {
  if (err) {
    console.log('Unable to scan directory: ' + err);
    process.exit(1);
  }
  files.forEach(function (file) {
    // the per-file callback fires when the upload (or its error) comes back
    q.push({ filename: file }, function (err) {
      if (err) {
        console.log(err);
        process.exit(1);
      }
      console.log('File uploaded successfully.');
    });
  });
});
I’m happy the solution is elegant and I learnt something.
But… I should have use an s3 client to just sync the local folder to the distant folder Anyway, here is the script if you need it.
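For the record, that sync approach could look roughly like this with MinIO's mc client (the alias, endpoint, bucket and path below are made-up placeholders):
mc alias set ceph https://my-ceph-endpoint ACCESS_KEY SECRET_KEY
mc mirror /path/to/hedgedoc/public/uploads ceph/my-bucket/uploads
One command, and no open file limit to worry about.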
Conclusion
Hope it helps, either my future self or somebody else.
Cheers!