Robert Schäfer 497f77ae10 Breakthrough! Use split+indices for performance
@appinteractive thanks for pointing out `split`. You just saved me some
days of work to refactor the import statements to use CSV instead of
JSON files.

@Tirokk when I enter `:schema` in the Neo4j web UI, I see the following:
```
:schema
Indexes
   ON :Badge(id) ONLINE
   ON :Category(id) ONLINE
   ON :Comment(id) ONLINE
   ON :Post(id) ONLINE
   ON :Tag(id) ONLINE
   ON :User(id) ONLINE

No constraints
```

So I temporarily removed the unique constraints on `slug` and added
plain indices on `id` for all relevant node types. Unfortunately, we
cannot omit the `:Label`; Neo4j does not allow this. So I had to add
indices for all known node labels instead.
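
A minimal sketch of how the per-label indices could be generated, assuming
the Neo4j 3.x `CREATE INDEX ON :Label(property)` syntax and the labels from
the `:schema` output. The `LABELS` array and the `index_statements` helper
are hypothetical names, not part of the repository:

```bash
# Node labels taken from the `:schema` output (hypothetical array name).
LABELS=("Badge" "Category" "Comment" "Post" "Tag" "User")

# Print one index statement per label, since the label cannot be omitted.
index_statements() {
  local label
  for label in "${LABELS[@]}"; do
    echo "CREATE INDEX ON :${label}(id);"
  done
}

# Print the statements for review.
index_statements

# Assumption: cypher-shell is on the PATH and can connect without extra
# flags, as in the import script. Uncomment to apply the statements:
# index_statements | cypher-shell
```

Newer Neo4j versions use a different index syntax, so the statement template
would need adjusting there.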

With indices the import finishes in:
```
Time elapsed: 351 seconds
```
🎉

@appinteractive when I keep the unique indices on slug, I get an error
during import that a node with label `:User` and slug `tobias` already
exists. I.e. we have unique constraint violations in our production data.
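
One way to list the offending slugs before re-adding the constraint might
look like the sketch below. The `duplicate_slug_query` helper is a
hypothetical name, and only `:User` is known from the error message to be
affected; other labels are an assumption:

```bash
# Print a Cypher query that lists slugs shared by more than one node
# of the given label (hypothetical helper, not part of the repository).
duplicate_slug_query() {
  local label="$1"
  echo "MATCH (n:${label}) WITH n.slug AS slug, count(*) AS total WHERE total > 1 RETURN slug, total ORDER BY total DESC;"
}

# Print the query for the label named in the import error.
duplicate_slug_query "User"

# Assumption: cypher-shell is on the PATH and can connect without extra
# flags. Uncomment to run the query against the database:
# duplicate_slug_query "User" | cypher-shell
```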

@mattwr18 @ulfgebhardt @ogerly I started the application on my machine
with the production data and it turns out that the index page
http://localhost:3000/ takes way too long to load. Visiting my profile page at
http://localhost:3000/profile/5b1693daf850c11207fa6109/robert-schafer
is fine, though. Even pagination works. When I visit a post page with
not too many comments, the application is fast enough, too:
http://localhost:3000/post/5bbf49ebc428ea001c7ca89c/neues-video-format-human-connection-tech-news
2019-05-01 12:25:28 +02:00

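#!/usr/bin/env bash
set -e

# Directory containing this script and the per-collection *.cql files
SCRIPT_DIRECTORY="$( cd "$( dirname "${BASH_SOURCE[0]}" )" >/dev/null 2>&1 && pwd )"

# Wipe the database before importing
echo "MATCH (n) DETACH DELETE n;" | cypher-shell

# Bash increments SECONDS automatically; reset it to time the import
SECONDS=0
for collection in "badges" "categories" "users" "follows" "contributions" "shouts" "comments"
do
  for chunk in /tmp/mongo-export/splits/"$collection"/*
  do
    # Move each chunk to a fixed path (presumably the one the *.cql files load from)
    mv "$chunk" /tmp/mongo-export/splits/current-chunk.json
    echo "Import ${chunk}" && cypher-shell < "$SCRIPT_DIRECTORY/$collection.cql"
  done
done
echo "Time elapsed: $SECONDS seconds"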