# Gargantext with Haskell (Backend instance)

## About the project

GarganText is a collaborative, decentralized, web-based macro-service
platform for the exploration of unstructured texts. It combines tools
from natural language processing, text and data mining techniques,
complex network analysis algorithms and interactive data visualization
tools to pave the way toward new kinds of interactions with your
digital corpora.

This software is free software, developed and offered by the CNRS
Complex Systems Institute of Paris Île-de-France (ISC-PIF) and its
partners.

GarganText Project: this repository builds the backend that serves the
frontend. The backend is hosted at
[haskell-gargantext](https://gitlab.iscpif.fr/gargantext/haskell-gargantext).

## Installation

Disclaimer: this project is still in development and is a work in
progress. Please report any issues you encounter and help improve this documentation.

### Stack setup

You need to install stack first:

```shell
curl -sSL https://get.haskellstack.org/ | sh
```

Verify that the installation is complete with:

```shell
stack --version
```
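
The same kind of check can be scripted for the other tools used in the later steps. The following is a minimal sketch and not part of the project: `missing_tools` is a hypothetical helper that prints every command it cannot find on `PATH`.

```shell
# missing_tools TOOL...: print each TOOL that is not found on PATH, one per
# line, and return non-zero if at least one is missing.
# Hypothetical helper for illustration; it is not shipped with Gargantext.
missing_tools() {
    status=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            printf '%s\n' "$tool"
            status=1
        fi
    done
    return "$status"
}

# Example call (pick the tools that match your setup):
#   missing_tools stack docker docker-compose
```

On a machine where every listed tool is installed, the call prints nothing and succeeds, so it can guard the build steps in a script.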

### With Nix setup

First install [nix](https://nixos.org/guides/install-nix.html):

```shell
curl -sSL https://nixos.org/nix/install | sh
```

Verify that the installation is complete:

```shell
$ nix-env --version
nix-env (Nix) 2.3.12
```

Then build:

``` sh
stack --nix build --fast
```

### Build Core Code

NOTE: The default build (with optimizations) requires a large amount of
RAM (at least 16GB). To avoid long compilation times and heavy swapping
on your machine, it is recommended to run `stack build` with the
`--fast` flag, i.e.:

``` sh
stack --nix build --fast
```

or

``` sh
stack --docker build --fast
```
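
As an illustration of the note above, the choice of flag could be automated. This is only a sketch under stated assumptions: `choose_build_flags` is a hypothetical helper, the 16GB threshold is the one quoted in the note, and reading `MemTotal` from `/proc/meminfo` works on Linux only.

``` sh
# choose_build_flags MEM_KB: print "--fast" when MEM_KB (memory in kB) is
# below 16 GiB, and print nothing otherwise (keep the optimized build).
# Hypothetical helper for illustration; it is not shipped with Gargantext.
choose_build_flags() {
    mem_kb=$1
    threshold_kb=$((16 * 1024 * 1024))
    if [ "$mem_kb" -lt "$threshold_kb" ]; then
        printf '%s\n' "--fast"
    fi
}

# Example (Linux only; MemTotal is reported in kB):
#   mem_kb=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)
#   stack --nix build $(choose_build_flags "$mem_kb")
```

A machine sold as "16GB" usually reports slightly less than 16 GiB of `MemTotal`, so this sketch still selects `--fast` there, which errs on the safe side.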

#### Docker

``` sh
curl -sSL https://gitlab.iscpif.fr/gargantext/haskell-gargantext/raw/dev/devops/docker/docker-install | sh
```

#### Debian

``` sh
curl -sSL https://gitlab.iscpif.fr/gargantext/haskell-gargantext/raw/dev/devops/debian/install | sh
```

#### Ubuntu

``` sh
curl -sSL https://gitlab.iscpif.fr/gargantext/haskell-gargantext/raw/dev/devops/ubuntu/install | sh
```

### Add dependencies

1. CoreNLP is needed (EN and FR); this dependency should no longer be required in the near future.

``` sh
./devops/install-corenlp
```

### Initialization

#### Docker

Run PostgreSQL first:

``` sh
cd devops/docker
docker-compose up
```

The initialization schema should be loaded automatically (from `devops/postgres/schema.sql`).

#### Gargantext

##### Fix the passwords

Change the passwords in `gargantext.ini_toModify`, then rename it:

``` sh
mv gargantext.ini_toModify gargantext.ini
```

(`.gitignore` keeps this file from being added to the repository by mistake.)
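
If you prefer to script the password change rather than edit the file by hand, something along these lines may work. It is a sketch only: `set_ini_value` is a hypothetical helper, it assumes the file uses `KEY = value` lines, and `DB_PASS` is an assumed key name, so check the template for the keys it really contains.

``` sh
# set_ini_value FILE KEY VALUE: replace the value of KEY in an ini-style FILE
# whose entries look like "KEY = value". KEY must not contain regex
# metacharacters. Hypothetical helper; not shipped with Gargantext.
set_ini_value() {
    file=$1
    key=$2
    value=$3
    # Escape the characters that are special in a sed replacement:
    # backslash, ampersand and the "|" used as delimiter below.
    escaped=$(printf '%s' "$value" | sed 's/[\\&|]/\\&/g')
    tmp=$(mktemp) || return 1
    # Write to a temporary file first, then move it over the original.
    sed "s|^${key}[[:space:]]*=.*|${key} = ${escaped}|" "$file" > "$tmp" &&
        mv "$tmp" "$file"
}

# Example (DB_PASS is an assumed key name):
#   set_ini_value gargantext.ini DB_PASS 'a-long-random-password'
```

Lines whose key does not match are left unchanged, so the helper can be called once per password entry.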

##### Run Gargantext

Users have to be created first (`user1` is created as an example):

``` sh
stack install
~/.local/bin/gargantext-init "gargantext.ini"
```

For a Docker environment, first create the appropriate image:

``` sh
cd devops/docker
docker build -t cgenie/stack-build:lts-18.12-garg .
```

then run:

``` sh
stack --docker exec gargantext-init -- gargantext.ini
```

### Importing data

You can import some data with the commands below. The `docker run` command stays in the foreground, so run the import command from a second terminal:

``` sh
docker run --rm -it -p 9000:9000 cgenie/corenlp-garg
stack exec gargantext-import -- "corpusCsvHal" "user1" "IMT3" gargantext.ini 10000 ./1000.csv
```

### Nix

It is also possible to build everything with [Nix](https://nixos.org/) instead of Docker:

``` sh
stack --nix build
stack --nix exec gargantext-import -- "corpusCsvHal" "user1" "IMT3" gargantext.ini 10000 ./1000.csv
stack --nix exec gargantext-server -- --ini gargantext.ini --run Prod
```

## Use Cases

### Multi-User with Graphical User Interface (Server Mode)

``` sh
~/.local/bin/stack --docker exec gargantext-server -- --ini "gargantext.ini" --run Prod
```

Then you can log in with `user1` / `1resu`.

### Command Line Mode tools

#### Simple cooccurrence computation and indexing from a list of Ngrams

``` sh
stack --docker exec gargantext-cli -- CorpusFromGarg.csv ListFromGarg.csv Output.json
```

### Analyzing the ngrams table repo

We store the repository in the `repos` directory, in the [CBOR](https://cbor.io/)
file format. To decode it to JSON and analyze it, for example with
[jq](https://shapeshed.com/jq-json/), use the following command:

``` sh
cat repos/repo.cbor.v5 | stack --nix exec gargantext-cbor2json | jq .
```

### Documentation

To build the documentation, run:

```sh
stack --docker build --haddock --no-haddock-deps --fast
```