# Gargantext Haskell

## About this project

Gargantext is a collaborative web platform for the exploration of sets
of unstructured documents. It combines tools from natural language
processing, text mining, complex network analysis and interactive data
visualization to pave the way toward new kinds of interactions with your
digital corpora.

This software is free software, developed by the CNRS Complex Systems
Institute of Paris Île-de-France (ISC-PIF) and its partners.

## Installation

Disclaimer: this project is still in development and is a work in
progress. If you encounter issues, please report them and help improve this documentation.

### Build Core Code

NOTE: The default build (with optimizations) requires a large amount of RAM (at least 16 GB). To avoid long compilation times and heavy swapping, it is recommended to run `stack build` with the `--fast` flag, i.e.:
``` sh
stack --docker build --fast
```
or
``` sh
stack --nix build --fast
```
The high memory usage might be related to the [broken Swagger `-O2` issue](https://github.com/haskell-servant/servant/issues/986).
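
The 16 GB guideline above can be turned into a small pre-build check. The sketch below is only an illustration: the helper name `build_flags_for_kb` is hypothetical, and reading `/proc/meminfo` assumes a Linux host. It prints a suggested command and does not start a build.

```sh
#!/bin/sh
# Hypothetical helper (not part of Gargantext): suggest stack build flags
# from the total RAM, using the 16 GB guideline from the note above.
build_flags_for_kb() {
    # $1: total memory in kB
    threshold_kb=$((16 * 1024 * 1024))
    if [ "$1" -lt "$threshold_kb" ]; then
        echo "--fast"   # below 16 GB: skip optimizations
    else
        echo ""         # enough RAM for the default optimized build
    fi
}

# Assumption: Linux, where /proc/meminfo reports MemTotal in kB.
if [ -r /proc/meminfo ]; then
    total_kb=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)
    echo "suggested: stack build $(build_flags_for_kb "$total_kb")"
fi
```

Add `--docker` or `--nix` to the suggested command as in the examples above. Building with `--fast` remains a reasonable default even on larger machines, since it also shortens compilation time.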

#### Docker

``` sh
curl -sSL https://gitlab.iscpif.fr/gargantext/haskell-gargantext/raw/dev/devops/docker/docker-install | sh
```

#### Debian

``` sh
curl -sSL https://gitlab.iscpif.fr/gargantext/haskell-gargantext/raw/dev/devops/debian/install | sh
```

#### Ubuntu

``` sh
curl -sSL https://gitlab.iscpif.fr/gargantext/haskell-gargantext/raw/dev/devops/ubuntu/install | sh
```

### Add dependencies

1. CoreNLP is needed (EN and FR). This dependency will soon no longer be required.

``` sh
./devops/install-corenlp
```

2. The Louvain C++ clustering code is needed to draw the socio-semantic graphs.

NOTE: This is already included in the Docker build.

``` sh
git clone https://gitlab.iscpif.fr/gargantext/clustering-louvain-cplusplus.git
cd clustering-louvain-cplusplus
./install
```

### Initialization

#### Docker

Run PostgreSQL first:

``` sh
cd devops/docker
docker-compose up
```

The initialization schema should be loaded automatically (from `devops/postgres/schema.sql`).

#### Gargantext

##### Fix the passwords

Change the passwords in `gargantext.ini_toModify`, then rename it:

``` sh
mv gargantext.ini_toModify gargantext.ini
```
(`.gitignore` prevents this file from being added to the repository by mistake.)

##### Run Gargantext

Users have to be created first (`user1` is created as an example):

``` sh
stack install
~/.local/bin/gargantext-init "gargantext.ini"
```

For the Docker environment, first create the appropriate image:

``` sh
cd devops/docker
docker build -t fpco/stack-build:lts-14.27-garg .
```

Then run:

``` sh
stack --docker run gargantext-init -- gargantext.ini
```

### Importing data

You can import some data with:
``` sh
# Start the CoreNLP container; it runs in the foreground, so keep it open in its own terminal
docker run --rm -it -p 9000:9000 cgenie/corenlp-garg
# In another terminal, import the CSV corpus
stack exec gargantext-import -- "corpusCsvHal" "user1" "IMT3" gargantext.ini 10000 ./1000.csv
```

### Nix

It is also possible to build everything with [Nix](https://nixos.org/) instead of Docker:
``` sh
stack --nix build
stack --nix exec gargantext-import -- "corpusCsvHal" "user1" "IMT3" gargantext.ini 10000 ./1000.csv
stack --nix exec gargantext-server -- --ini gargantext.ini --run Prod
```

## Use Cases

### Multi-User with Graphical User Interface (Server Mode)

``` sh
~/.local/bin/stack --docker exec gargantext-server -- --ini "gargantext.ini" --run Prod
```

Then you can log in with `user1` / `1resu`.

### Command Line Mode tools

#### Simple co-occurrence computation and indexing from a list of ngrams

``` sh
stack --docker exec gargantext-cli -- CorpusFromGarg.csv ListFromGarg.csv Output.json
```

### Analyzing the ngrams table repo

The repository is stored in the `repos` directory in the [CBOR](https://cbor.io/)
file format. To decode it to JSON and analyze it, for example with
[jq](https://shapeshed.com/jq-json/), use the following command:

``` sh
cat repos/repo.cbor.v5 | stack --nix exec gargantext-cbor2json | jq .
```
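
Since `jq .` prints the whole decoded document, a narrower filter can make a first inspection easier. The sketch below is an illustration only: the helper name `top_level_keys` is hypothetical, and it assumes the decoded repo is a JSON object at the top level, which this README does not confirm.

```sh
#!/bin/sh
# Hypothetical helper: list the top-level keys of a JSON document read from stdin.
# Assumption: the input is a JSON object (for an array, jq prints its indices instead).
top_level_keys() {
    jq -r 'keys[]'
}

# Intended use (path and executable taken from the command above):
#   cat repos/repo.cbor.v5 | stack --nix exec gargantext-cbor2json | top_level_keys
```

From there, individual fields can be explored with ordinary jq path expressions once their names are known.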