- we need to remove the whitespaces first, then check if there are still illegitimate characters in the hashtag before removing them
- this is to maintain the maximum number of hashtags imported from the legacy database as possible
Ok, so here are multiple issues:
1. In cypher, `NOT NULL` will return `NULL` not `FALSE`. If we want
`FALSE` to be set in the database import, we should use `COAELESCE`
to find the first not-null value.
See:
https://neo4j.com/docs/cypher-manual/current/syntax/working-with-null/https://markhneedham.com/blog/2017/02/22/neo4j-null-values-even-work/
2. I removed the `disabled` and `deleted` checks on the commented
counter. With `neo4j-graphql-js` it is not possible to filter on the
join models (at least not without a lot of complexity) for disabled or
deleted items. Let's live with the fact that the list of commented posts
will include those posts, where the user has deleted his comment or where
the user's comment was disabled. It's being displayed as "not available"
so I think this is OK for now.
3. De-couple the pagination counters from the "commented", "shouted"
etc. counters. It might be that the list of posts is different for
different users. E.g. if the user has blocked you, the "posts" list
will be empty. The "shouted" or "commented" list will not have the
posts of the author. If you are a moderator, the list will include
disabled posts. So the counters are not in sync with the actual list
coming from the backend. Therefore I implemented "fetch and check if
resultSet < pageSize" instead of a global counter.
- I think this does what you want @Tirokk, but let me know if I missed something
- this removes all white spaces in the hashtag, then removes any hashtags that have special characters, or that start with a number, but do not have any letters after
- it does not check that the size is greater or equal to 1 because the regex already removes anything that does not have at least one character
@ulfgebhardt: The docs at `man cypher-shell` say that you can pass
`NEO4J_USERNAME` and `NEO4J_PASSWORD` per environment variable. So the
command line arguments are obsolete here.
@ulfgebhardt: I wondered about the list of tags after importing the
legacy db. It seems, each tag has at most 1 contribution. I guess it's
because we create a unique id for each tag, so two tags with the same
`name` e.g. `#hashtag` and `#hashtag` are not de-duplicated.
I'm currently sitting in the train and cannot run the data import myself, could
you double-check?
- include all collections (commented out)
- refactored neo4j import script
- use of .env file for (additional) configurations / configuration overrides
- lots of fiddeling with neo4j cql files and cypher shell
After endless try/error I found the way to share volumes between
multiple docker-compose.ymls: You have to place those files in the same
folder. Also the import scripts must be adapted.
@appinteractive thanks for pointing out `split`. You just saved me some
days of work to refactor the import statements to use CSV instead of
JSON files.
@Tirokk when I enter `:schema` in Neo4J web UI, I see the following:
```
:schema
Indexes
ON :Badge(id) ONLINE
ON :Category(id) ONLINE
ON :Comment(id) ONLINE
ON :Post(id) ONLINE
ON :Tag(id) ONLINE
ON :User(id) ONLINE
No constraints
```
So I temporarily removed the unique constraints on `slug` and added
plain indices on `id` for all relevant node types. We cannot omit the
`:Label` unfortunately, neo4j does not allow this. So I had to add all
indices for all known node labels instead.
With indices the import finishes in:
```
Time elapsed: 351 seconds
```
🎉
@appinteractive when I keep the unique indices on slug, I get an error
during import that a node with label `:User` and slug `tobias` already
exists. Ie. we have unqiue constraint violations in our production data.
@mattwr18 @ulfgebhardt @ogerly I started the application on my machine
on the production data and it turns out that the index page
http://localhost:3000/ takes way to long. Visiting my profile page at
http://localhost:3000/profile/5b1693daf850c11207fa6109/robert-schafer
is fine, though. Even pagination works. When I visit a post page with
not too many comments, the application is fast enough, too:
http://localhost:3000/post/5bbf49ebc428ea001c7ca89c/neues-video-format-human-connection-tech-news