Robert Schäfer 497f77ae10 Breakthrough! Use split+indices for performance
@appinteractive thanks for pointing out `split`. You just saved me days
of work refactoring the import statements to use CSV instead of JSON
files.

@Tirokk when I enter `:schema` in the Neo4j web UI, I see the following:
```
:schema
Indexes
   ON :Badge(id) ONLINE
   ON :Category(id) ONLINE
   ON :Comment(id) ONLINE
   ON :Post(id) ONLINE
   ON :Tag(id) ONLINE
   ON :User(id) ONLINE

No constraints
```

So I temporarily removed the unique constraints on `slug` and added
plain indices on `id` for all relevant node types. Unfortunately we
cannot omit the `:Label`; Neo4j does not allow an index without a label.
So I had to create an index on `id` for every known node label instead.

With indices the import finishes in:
```
Time elapsed: 351 seconds
```
🎉

@appinteractive when I keep the unique constraints on `slug`, I get an
error during import that a node with label `:User` and slug `tobias`
already exists. I.e. we have unique constraint violations in our
production data.
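
To see how many of those duplicates exist after an import without the
constraint, something along these lines should list them (again just a
sketch; credentials are placeholders):

```
# list User slugs that occur more than once
$ echo "MATCH (u:User) WITH u.slug AS slug, count(*) AS n WHERE n > 1 RETURN slug, n ORDER BY n DESC;" \
  | cypher-shell -u neo4j -p secret
```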

@mattwr18 @ulfgebhardt @ogerly I started the application on my machine
on the production data and it turns out that the index page
http://localhost:3000/ takes way too long. Visiting my profile page at
http://localhost:3000/profile/5b1693daf850c11207fa6109/robert-schafer
is fine, though. Even pagination works. When I visit a post page with
not too many comments, the application is fast enough, too:
http://localhost:3000/post/5bbf49ebc428ea001c7ca89c/neues-video-format-human-connection-tech-news

Legacy data migration

This setup is completely optional and only required if you have data on a server running our legacy code that you want to import. It will import the uploads folder and migrate a dump of the legacy MongoDB database into our new Neo4j graph database.

Configure Maintenance-Worker Pod

Create a configmap with the specific connection data of your legacy server:

$ kubectl create configmap maintenance-worker          \
  --namespace=human-connection                          \
  --from-literal=SSH_USERNAME=someuser                  \
  --from-literal=SSH_HOST=yourhost                      \
  --from-literal=MONGODB_USERNAME=hc-api                \
  --from-literal=MONGODB_PASSWORD=secretpassword        \
  --from-literal=MONGODB_AUTH_DB=hc_api                 \
  --from-literal=MONGODB_DATABASE=hc_api                \
  --from-literal=UPLOADS_DIRECTORY=/var/www/api/uploads
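
To double-check the values before running the migration, you can inspect the configmap, for example:

```
# print the configmap the maintenance-worker pod will consume
$ kubectl --namespace=human-connection get configmap maintenance-worker -o yaml
```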

Create a secret with your public and private SSH keys. As the Kubernetes documentation points out, you should be careful with SSH keys: anyone with access to your cluster will have access to them. It is better to create a new key pair with `ssh-keygen` and copy the public key to your legacy server with `ssh-copy-id`.
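
A rough sketch of those steps; the key path, user and host are the same placeholders as above:

```
# generate a fresh key pair dedicated to the migration (path is an example)
$ ssh-keygen -t rsa -f /path/to/.ssh/id_rsa
# authorize the new public key on the legacy server
$ ssh-copy-id -i /path/to/.ssh/id_rsa.pub someuser@yourhost
# record the host key so known_hosts can go into the secret as well
$ ssh-keyscan yourhost >> /path/to/.ssh/known_hosts
```

Then create the secret from those key files: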

$ kubectl create secret generic ssh-keys          \
  --namespace=human-connection                    \
  --from-file=id_rsa=/path/to/.ssh/id_rsa         \
  --from-file=id_rsa.pub=/path/to/.ssh/id_rsa.pub \
  --from-file=known_hosts=/path/to/.ssh/known_hosts

Deploy a Temporary Maintenance-Worker Pod

Bring the application into maintenance mode.

{% hint style="info" %} TODO: implement maintenance mode {% endhint %}

Then temporarily delete the backend and database deployments:

$ kubectl --namespace=human-connection get deployments
NAME            READY   UP-TO-DATE   AVAILABLE   AGE
nitro-backend   1/1     1            1           3d11h
nitro-neo4j     1/1     1            1           3d11h
nitro-web       2/2     2            2           73d
$ kubectl --namespace=human-connection delete deployment nitro-neo4j
deployment.extensions "nitro-neo4j" deleted
$ kubectl --namespace=human-connection delete deployment nitro-backend
deployment.extensions "nitro-backend" deleted

Deploy the one-time maintenance-worker pod:

# in deployment/legacy-migration/
$ kubectl apply -f db-migration-worker.yaml
pod/nitro-maintenance-worker created
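
Depending on how quickly the image is pulled, the pod may need a moment to start; something like this waits until it is ready (optional):

```
# block until the maintenance-worker pod reports Ready
$ kubectl --namespace=human-connection wait --for=condition=Ready pod/nitro-maintenance-worker
```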

Import legacy database and uploads:

$ kubectl --namespace=human-connection exec -it nitro-maintenance-worker bash
$ import_legacy_db
$ import_uploads
$ exit

Delete the pod when you're done:

$ kubectl --namespace=human-connection delete pod nitro-maintenance-worker

Oh, and of course you have to get those deleted deployments back. One way of doing it would be:

# in folder deployment/
$ kubectl apply -f human-connection/deployment-backend.yaml -f human-connection/deployment-neo4j.yaml
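
Afterwards it is worth checking that both deployments come back up:

```
# nitro-backend and nitro-neo4j should report READY again
$ kubectl --namespace=human-connection get deployments
```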