<div align="center"><img height="180" src="https://gitlab.iscpif.fr/gargantext/main/images/logo.png"></div>

&nbsp;
# Gargantext with Haskell (Backend instance)

![Haskell](https://img.shields.io/badge/Code-Haskell-informational?style=flat&logo=haskell&color=6144b3)&nbsp;&nbsp;![Stack](https://img.shields.io/badge/Tools-Stack-informational?style=flat&logo=&color=6144b3)&nbsp;&nbsp;![GHC](https://img.shields.io/badge/Tools-GHC-informational?style=flat&logo=&color=2E677B)&nbsp;&nbsp;![Nix](https://img.shields.io/badge/Package%20manager-Nix-informational?style=flat&logo=debian&color=6586c8)&nbsp;&nbsp;![Docker](https://img.shields.io/badge/Tools-Docker-informational?style=flat&logo=docker&color=003f8c)

#### Table of Contents
1. [About the project](#about)
2. [Installation](#installation)
3. [Use Cases](#use-cases)
4. [GraphQL](#graphql)
5. [PostgreSQL](#postgresql)

## About the project <a name="about"></a>

GarganText is a collaborative, decentralized, web-based macro-service
platform for the exploration of unstructured texts. It combines tools
from natural language processing, text-data-mining bricks, complex
network analysis algorithms, and interactive data visualization tools
to pave the way toward new kinds of interactions with your textual and
digital corpora.

This software is free (as in "Libre" in French) software, developed by the
CNRS Complex Systems Institute of Paris Île-de-France (ISC-PIF) and its
partners.

GarganText Project: this repository,
[haskell-gargantext](https://gitlab.iscpif.fr/gargantext/haskell-gargantext),
builds the backend for the GarganText frontend.


## Installation

Disclaimer: since this project is still in development, this document
remains a work in progress. Please report any issues you encounter and
help improve this documentation.

### Prerequisite

Clone the project:
```shell
git clone https://gitlab.iscpif.fr/gargantext/haskell-gargantext.git
cd haskell-gargantext
```
### 1. Install Stack

Install [Stack (or Haskell Tool Stack)](https://docs.haskellstack.org/en/stable/):

```shell
curl -sSL https://get.haskellstack.org/ | sh
```

Verify the installation is complete with
```shell
stack --version
Version 2.9.1
```

### 2. Install Nix

Install [Nix](https://nixos.org/download.html):

```shell
$ sh <(curl -L https://nixos.org/nix/install) --daemon
```

Verify the installation is complete with
```shell
$ nix-env --version
nix-env (Nix) 2.12.0
```

> **NOTE (upgrade/downgrade if needed)**
> GarganText works with Nix 2.12.0 (older than the current 2.13.2). To downgrade your Nix version:
> `nix-channel --update; nix-env -iA nixpkgs.nixVersions.nix_2_12 nixpkgs.cacert; systemctl daemon-reload; systemctl restart nix-daemon`
> To upgrade Nix, see https://nixos.org/manual/nix/unstable/installation/upgrading.html

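The version check above can also be scripted. The following is a minimal sketch, assuming the `nix-env --version` output has the form shown above, with the version as the last space-separated field. `nix_version_matches` is a hypothetical helper name, not part of Nix or GarganText.

```shell
# Hypothetical helper: succeed only if the reported Nix version is the expected one.
# $1: expected version (e.g. 2.12.0)
# $2: output of `nix-env --version` (assumed format: "nix-env (Nix) X.Y.Z")
nix_version_matches() {
  expected="$1"
  reported="${2##* }"   # strip everything up to the last space, keeping the last field
  [ "$reported" = "$expected" ]
}

# Possible usage (requires Nix to be installed):
#   nix_version_matches 2.12.0 "$(nix-env --version)" || echo "unexpected Nix version" >&2
```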

### 3. Build Core Code

NOTE: The default build (with optimizations) requires a large amount of RAM
(at least 16 GB). To avoid long compilation times and heavy swapping on your
machine, it is recommended to run `stack build` with the `--fast` flag,
i.e.:

``` sh
stack --nix build --fast
```
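
As a rough illustration of the RAM recommendation above, the sketch below picks the `--fast` flag from the amount of installed memory. It assumes a Linux host where `/proc/meminfo` reports `MemTotal` in kB. `build_flags_for_mem` is a hypothetical helper, not part of Stack or GarganText.

```shell
# Hypothetical helper: print "--fast" when less than 16 GB of RAM is installed,
# and print nothing otherwise.
# $1: total memory in kB (the MemTotal value from /proc/meminfo)
build_flags_for_mem() {
  threshold_kb=$((16 * 1024 * 1024))
  if [ "$1" -lt "$threshold_kb" ]; then
    echo "--fast"
  fi
}

# Possible usage on Linux:
#   mem_kb=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)
#   stack --nix build $(build_flags_for_mem "$mem_kb")
```

Note that a machine sold as "16 GB" usually reports slightly less than that in `MemTotal`, so this sketch would still choose `--fast` there, which is on the safe side.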

If the build finishes without errors, you are ready to launch
GarganText! See the next step.

### Initialization

Docker Compose will configure your database and some NLP bricks (such as CoreNLP):

``` sh
# If docker is not installed:
# curl -sSL https://gitlab.iscpif.fr/gargantext/haskell-gargantext/raw/dev/devops/docker/docker-install | sh
cd devops/docker
docker compose up
```
The initialization schema should be loaded automatically (from `devops/postgres/schema.sql`).

Then install:
``` sh
stack --nix install
```

Copy the configuration file:
``` sh
cp gargantext.ini_toModify gargantext.ini
```
Do not worry: `.gitignore` prevents this file from being added to the
repository by mistake, so you can safely change the passwords in
`gargantext.ini`.

Users have to be created first (`user1` is created as an example):
``` sh
~/.local/bin/gargantext-init "gargantext.ini"
```

Launch GarganText:
``` sh
stack --nix exec gargantext-server -- --ini gargantext.ini --run Prod
```


## Use Cases

### Multi-User with Graphical User Interface (Server Mode)

``` sh
~/.local/bin/stack --docker exec gargantext-server -- --ini "gargantext.ini" --run Prod
```

Then you can log in with `user1` / `1resu`.


### Command Line Mode tools

#### Simple cooccurrences computation and indexing from a list of Ngrams

``` sh
stack --docker exec gargantext-cli -- CorpusFromGarg.csv ListFromGarg.csv Output.json
```

### Analyzing the ngrams table repo

We store the repository in the `repos` directory in the [CBOR](https://cbor.io/)
file format. To decode it to JSON and analyze it, for example with
[jq](https://shapeshed.com/jq-json/), use the following command:

``` sh
cat repos/repo.cbor.v5 | stack --nix exec gargantext-cbor2json | jq .
```
### Documentation

To build the documentation, run:

```sh
stack --nix build --haddock --no-haddock-deps --fast
```

The documentation is generated in `.stack-work/dist/x86_64-linux-nix/Cabal-3.2.1.0/doc/html/gargantext`.

## GraphQL

Some introspection information.

The Playground is located at http://localhost:8008/gql

### List all GraphQL types in the Playground

```
{
  __schema {
    types {
      name
    }
  }
}
```

### List details about a type in GraphQL

```
{
  __type(name:"User") {
    fields {
      name
      description
      type {
        name
      }
    }
  }
}
```
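
The same introspection queries can presumably be sent from the command line. This is a sketch under assumptions: that the `/gql` URL above also accepts GraphQL queries as JSON over HTTP POST (the usual convention, not confirmed here), and that introspection needs no authentication. `gql_body` is a hypothetical helper.

```shell
# Hypothetical helper: wrap a GraphQL query in a JSON request body.
# $1: the query; it must not contain double quotes, backslashes or newlines,
#     since no JSON escaping is done here.
gql_body() {
  printf '{"query":"%s"}' "$1"
}

# Possible usage (assumes a running server and a POST-capable /gql endpoint):
#   curl -s -H 'Content-Type: application/json' \
#        -d "$(gql_body '{ __schema { types { name } } }')" \
#        http://localhost:8008/gql | jq .
```

The `__type(name:"User")` query contains double quotes, so it would need proper JSON escaping (for example with `jq -n --arg`) instead of this helper.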

## PostgreSQL

### Upgrading using Docker

Reference: https://www.cloudytuts.com/tutorials/docker/how-to-upgrade-postgresql-in-docker-and-kubernetes/

To upgrade PostgreSQL in Docker containers, for example from 11.x to 14.x, first dump all databases:
```sh
docker exec -i <container-id> pg_dumpall -U gargantua > 11-db.dump
```
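
Before going further, it may be worth checking that the dump was written completely. This is a minimal sketch, assuming `pg_dumpall` ends its plain-text output with a `PostgreSQL database cluster dump complete` comment. `dump_looks_complete` is a hypothetical helper, not a PostgreSQL tool.

```shell
# Hypothetical helper: succeed if the dump file is non-empty and its last lines
# contain the completion comment that pg_dumpall is assumed to write at the end.
# $1: path to the dump file
dump_looks_complete() {
  [ -s "$1" ] && tail -n 5 "$1" | grep -q 'PostgreSQL database cluster dump complete'
}

# Possible usage:
#   dump_looks_complete 11-db.dump || echo "dump is empty or truncated" >&2
```

This only detects an empty or truncated file. It does not prove that the dump can be restored.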

Then shut down the container and replace the `image` section in
`devops/docker/docker-compose.yaml` with `postgres:14`. It is also good practice to create a new volume, say `garg-pgdata14`, and bind the new container to it. If you want to keep the same volume name, remember to remove the old volume first, like so:
```sh
docker-compose rm postgres
docker volume rm docker_garg-pgdata
```

Now start the container and execute:
```sh
# the empty DB must be dropped first, since the schema will be created when restoring the dump
docker exec -i <new-container-id> dropdb -U gargantua gargandbV5
# recreate the DB, empty and without a schema
docker exec -i <new-container-id> createdb -U gargantua gargandbV5
# now the dump can be restored
docker exec -i <new-container-id> psql -U gargantua -d gargandbV5 < 11-db.dump
```

### Upgrading using

There is a solution using `pg_upgradecluster`, but it requires managing both the
version 13 and version 14 clusters. Hence, here is a simpler way to upgrade.

First, save your data:
```
sudo su postgres
pg_dumpall > gargandb.dump
```

Upgrade PostgreSQL:
```
sudo apt install postgresql-14 postgresql-client-14
sudo apt remove --purge postgresql-13
```
Restore your data:
```
sudo su postgres
psql < gargandb.dump
```

You may need to restore the gargantua password:
```
ALTER ROLE gargantua PASSWORD 'yourPasswordIn_gargantext.ini';
```
You may also need to change the database connection port to 5433 in
your `gargantext.ini` file.