Hi Olya, Lucie, and Wikidatans,
Very interesting projects. And thanks for publishing, Lucie - very helpful!
2.1 Encoding the Triples
The encoder part of the model is a feed-forward architecture that encodes the set of input triples into a fixed-dimensionality vector, which is subsequently used to initialise the decoder. Given a set of un-ordered triples $F_E = \{f_1, f_2, \ldots, f_R : f_j = (s_j, p_j, o_j)\}$, where $s_j$, $p_j$ and $o_j$ are the one-hot vector representations of the respective subject, property and object of the $j$-th triple, we compute an embedding $h_{f_j}$ for the $j$-th triple by forward propagating as follows:

$$h_{f_j} = q\big(W_h [W_{in} s_j ; W_{in} p_j ; W_{in} o_j]\big) \quad (1)$$

$$h_{F_E} = W_F [h_{f_1} ; \ldots ; h_{f_{R-1}} ; h_{f_R}] \quad (2)$$

where $h_{f_j}$ is the embedding vector of each triple $f_j$, and $h_{F_E}$ is a fixed-length vector representation for all the input triples $F_E$. $q$ is a non-linear activation function, and $[\ldots ; \ldots]$ represents vector concatenation. $W_{in}$, $W_h$, $W_F$ are trainable weight matrices.

Unlike (Chisholm et al., 2017), our encoder is agnostic with respect to the order of input triples. As a result, the order of a particular triple $f_j$ in the triples set does not change its significance towards the computation of the vector representation of the whole triples set, $h_{F_E}$.
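For anyone following along, here is a minimal NumPy sketch of equations (1)-(2) from the quoted section. All sizes (VOCAB, EMB, HID, R) and the choice of q = tanh are illustrative assumptions on my part, not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed sizes: vocabulary, embedding dim, hidden dim, number of triples R.
VOCAB, EMB, HID, R = 1000, 64, 128, 3

W_in = rng.standard_normal((EMB, VOCAB)) * 0.01    # shared input embedding matrix
W_h  = rng.standard_normal((HID, 3 * EMB)) * 0.01  # per-triple projection
W_F  = rng.standard_normal((HID, R * HID)) * 0.01  # combines all R triple embeddings

def one_hot(idx, size=VOCAB):
    v = np.zeros(size)
    v[idx] = 1.0
    return v

def encode_triple(s, p, o):
    # Eq. (1): h_fj = q(W_h [W_in s_j ; W_in p_j ; W_in o_j]); q = tanh is an assumption.
    concat = np.concatenate([W_in @ s, W_in @ p, W_in @ o])
    return np.tanh(W_h @ concat)

def encode_triple_set(triples):
    # Eq. (2): h_FE = W_F [h_f1 ; ... ; h_fR] -- the fixed-length vector
    # that would initialise the decoder.
    hs = [encode_triple(s, p, o) for (s, p, o) in triples]
    return W_F @ np.concatenate(hs)

# Example: three (subject, property, object) triples as one-hot vectors.
triples = [(one_hot(1), one_hot(2), one_hot(3)),
           (one_hot(4), one_hot(5), one_hot(6)),
           (one_hot(7), one_hot(8), one_hot(9))]
print(encode_triple_set(triples).shape)  # (128,)
```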
... whether this would address streaming triples through GNMT? Would it? And since Swahili, Arabic, and Esperanto are all active languages in https://translate.google.com/, no further coding on the GNMT side would be necessary. (I'm curious how WUaS could best grow small languages not yet in either Wikipedia/Wikidata's 287-301 languages or in GNMT's ~100+ languages?)
How could your Wikidata / Wikibabel work interface with Google GNMT more fully over time, building on your great Wikidata coding and papers?