@appinteractive thanks for pointing out `split`. You just saved me several days of work refactoring the import statements to use CSV instead of JSON files.

@Tirokk when I enter `:schema` in the Neo4j web UI, I see the following:

```
:schema
Indexes
   ON :Badge(id)    ONLINE
   ON :Category(id) ONLINE
   ON :Comment(id)  ONLINE
   ON :Post(id)     ONLINE
   ON :Tag(id)      ONLINE
   ON :User(id)     ONLINE

No constraints
```

So I temporarily removed the unique constraints on `slug` and added plain indices on `id` for all relevant node types (sketched below). Unfortunately we cannot omit the `:Label`; Neo4j does not allow this, so I had to add indices for all known node labels instead. With indices the import finishes in:

```
Time elapsed: 351 seconds
```

🎉

@appinteractive when I keep the unique indices on `slug`, I get an error during import that a node with label `:User` and slug `tobias` already exists. I.e. we have unique constraint violations in our production data.

@mattwr18 @ulfgebhardt @ogerly I started the application on my machine on the production data and it turns out that the index page http://localhost:3000/ takes way too long to load. Visiting my profile page at http://localhost:3000/profile/5b1693daf850c11207fa6109/robert-schafer is fine, though. Even pagination works. When I visit a post page with not too many comments, the application is fast enough, too: http://localhost:3000/post/5bbf49ebc428ea001c7ca89c/neues-video-format-human-connection-tech-news
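For reference, the schema changes described above could be scripted roughly like this. This is only a sketch in legacy Neo4j 3.x syntax; it assumes `cypher-shell` can reach the database and that the label list from the `:schema` output is complete:

```bash
# Sketch: drop unique slug constraints and add plain id indices (Neo4j 3.x syntax).
# Labels are taken from the :schema output above and may need adjusting.
for label in Badge Category Comment Post Tag User; do
  # Drop the unique slug constraint if one exists for this label
  cypher-shell "DROP CONSTRAINT ON (n:${label}) ASSERT n.slug IS UNIQUE;" || true
  # A plain index on id speeds up node lookups during the import
  cypher-shell "CREATE INDEX ON :${label}(id);"
done
```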
# Legacy data migration
This setup is completely optional and only required if you have data on a server running our legacy code that you want to import. It will import the uploads folder and migrate a dump of the legacy MongoDB database into our new Neo4j graph database.
## Configure Maintenance-Worker Pod
Create a ConfigMap with the connection details of your legacy server:
```bash
$ kubectl create configmap maintenance-worker \
    --namespace=human-connection \
    --from-literal=SSH_USERNAME=someuser \
    --from-literal=SSH_HOST=yourhost \
    --from-literal=MONGODB_USERNAME=hc-api \
    --from-literal=MONGODB_PASSWORD=secretpassword \
    --from-literal=MONGODB_AUTH_DB=hc_api \
    --from-literal=MONGODB_DATABASE=hc_api \
    --from-literal=UPLOADS_DIRECTORY=/var/www/api/uploads
```
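If you want to double-check the values before proceeding, you can print the ConfigMap back out (an optional sanity check):

```bash
# Inspect the ConfigMap you just created
$ kubectl --namespace=human-connection get configmap maintenance-worker -o yaml
```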
Create a secret with your public and private SSH keys. As the Kubernetes documentation points out, you should be careful with your SSH keys: anyone with access to your cluster will have access to them. It is better to create a new key pair with `ssh-keygen` and copy the public key to your legacy server with `ssh-copy-id`.
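For example, here is a sketch of creating such a dedicated key pair; the file name, comment, and `someuser@yourhost` are placeholders matching the ConfigMap above:

```bash
# Generate a dedicated key pair for the migration (file name is an example)
$ ssh-keygen -t rsa -f ~/.ssh/legacy_migration_rsa -C "legacy-migration"

# Authorize the new public key on the legacy server
$ ssh-copy-id -i ~/.ssh/legacy_migration_rsa.pub someuser@yourhost
```

Then create the secret from the key files: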
```bash
$ kubectl create secret generic ssh-keys \
    --namespace=human-connection \
    --from-file=id_rsa=/path/to/.ssh/id_rsa \
    --from-file=id_rsa.pub=/path/to/.ssh/id_rsa.pub \
    --from-file=known_hosts=/path/to/.ssh/known_hosts
```
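Optionally, confirm that all three files made it into the secret; `describe` lists the key names and sizes without revealing their contents:

```bash
$ kubectl --namespace=human-connection describe secret ssh-keys
```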
## Deploy a Temporary Maintenance-Worker Pod
Bring the application into maintenance mode.
{% hint style="info" %} TODO: implement maintenance mode {% endhint %}
Then temporarily delete the backend and database deployments:
```bash
$ kubectl --namespace=human-connection get deployments
NAME            READY   UP-TO-DATE   AVAILABLE   AGE
nitro-backend   1/1     1            1           3d11h
nitro-neo4j     1/1     1            1           3d11h
nitro-web       2/2     2            2           73d

$ kubectl --namespace=human-connection delete deployment nitro-neo4j
deployment.extensions "nitro-neo4j" deleted

$ kubectl --namespace=human-connection delete deployment nitro-backend
deployment.extensions "nitro-backend" deleted
```
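Before you continue, make sure the corresponding pods have actually terminated (an optional check):

```bash
# Wait until no nitro-backend or nitro-neo4j pods are listed anymore
$ kubectl --namespace=human-connection get pods
```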
Deploy the one-time maintenance-worker pod:
```bash
# in deployment/legacy-migration/
$ kubectl apply -f db-migration-worker.yaml
pod/nitro-maintenance-worker created
```
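The pod may take a moment to pull its image. If you like, you can block until it is ready; the timeout value here is an arbitrary choice:

```bash
$ kubectl --namespace=human-connection wait --for=condition=Ready \
    pod/nitro-maintenance-worker --timeout=120s
```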
Import legacy database and uploads:
```bash
$ kubectl --namespace=human-connection exec -it nitro-maintenance-worker bash
$ import_legacy_db
$ import_uploads
$ exit
```
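If you want a rough sanity check before leaving the pod, you can list a few of the imported files. The mount path below is hypothetical; check `db-migration-worker.yaml` for the actual one:

```bash
# Still inside the pod: spot-check that upload files arrived
# (the /uploads mount path is an assumption; adjust to your manifest)
$ ls /uploads | head
```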
Delete the pod when you're done:
```bash
$ kubectl --namespace=human-connection delete pod nitro-maintenance-worker
```
Oh, and of course you have to get those deleted deployments back. One way of doing it would be:
```bash
# in folder deployment/
$ kubectl apply -f human-connection/deployment-backend.yaml -f human-connection/deployment-neo4j.yaml
```
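Afterwards, confirm that both deployments report as available again:

```bash
$ kubectl --namespace=human-connection get deployments
# nitro-backend and nitro-neo4j should both show READY 1/1 again
```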