{-|
Module      : Gargantext.Text.Metrics.TFICF
Description : TFICF Ngrams tools
Copyright   : (c) CNRS, 2017
License     : AGPL + CECILL v3
Maintainer  : team@gargantext.org
Stability   : experimental

Definition of TFICF: Term Frequency - Inverse of Context Frequency

TFICF is a generalization of [TFIDF](https://en.wikipedia.org/wiki/Tf%E2%80%93idf).

-}
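
{-# LANGUAGE NoImplicitPrelude #-}
{-# LANGUAGE OverloadedStrings #-}

module Gargantext.Text.Metrics.TFICF ( TFICF
                                     , TficfContext(..)
                                     , Total(..)
                                     , Count(..)
                                     , tficf
                                     )
  where
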
import Data.Text (Text)
import Gargantext.Prelude

-- | Module path, used as a prefix in error messages.
path :: Text
path = "Gargantext.Text.Metrics.TFICF"

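-- | A count and its total, tagged with the context (infra or supra)
-- in which they were measured.
data TficfContext n m = TficfInfra n m
                      | TficfSupra n m

data Total = Total {unTotal :: !Double}
data Count = Count {unCount :: !Double}
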
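-- | TFICF score; assumed here to be a 'Double', matching the arithmetic in 'tficf'.
type TFICF = Double

-- | Frequency in the infra context divided by the log of the frequency in
-- the supra context. For positive counts with @st >= sc@, @log (sc/st)@ is
-- never positive, and the result is not finite when @sc == st@.
tficf :: TficfContext Count Total
      -> TficfContext Count Total
      -> TFICF
tficf (TficfInfra (Count ic) (Total it) )
      (TficfSupra (Count sc) (Total st) )
            | it >= ic && st >= sc = (ic/it) / log (sc/st)
            | otherwise            = panic $ "[ERR]" <> path <> " Frequency impossible"
tficf _ _ = panic $ "[ERR]" <> path <> " Undefined for these contexts"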
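
-- Usage sketch with hypothetical counts (not taken from the original
-- module): a term seen 1 time out of 10 in the infra context and 5 times
-- out of 100 in the supra context.
--
-- > tficf (TficfInfra (Count 1) (Total 10)) (TficfSupra (Count 5) (Total 100))
--
-- This evaluates (1/10) / log (5/100), roughly -0.0334. Passing the two
-- contexts in the opposite order (supra first, infra second) falls through
-- to the last clause and panics.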