Hi!
I am now working on Lexeme fulltext search. One point that is still unclear to me is how to display Lexemes as search results. I am working on the assumption that we want to match both Lemmas and Forms (please tell me if I'm wrong). For a Lemma match, I plan to display the result like this:
title (LN) Synthetic description
e.g.
color/colour (L123) English noun
That is, the first line, with the link, would be the standard lexeme link generated by the Lexeme code (which also deals with multiple lemmas), and the description line would be the generated description of the Lexeme, just like in completion search. The problem here, however, is that since the link is generated by the Lexeme code, which has no idea about search, we cannot properly highlight it. This can probably be solved with some trickery, e.g. by locating search matches inside the generated string and highlighting them, but first I'd like to make sure this is how it should look.
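The "trickery" mentioned above could look roughly like the following. This is only a sketch in Python, not the PHP that the Lexeme code is written in. The function name `highlight_matches` and the `<b>` markers are made up for illustration, and the sketch treats the generated link text as plain text (real link HTML would need extra care not to touch tags or attributes).

```python
import re


def highlight_matches(text, query, open_tag="<b>", close_tag="</b>"):
    """Wrap every case-insensitive occurrence of a query term found in text.

    Hypothetical helper: it post-processes an already generated string,
    so the code that produced the string does not need to know about search.
    """
    # Try longer terms first so a long term is preferred over a shorter
    # term that starts at the same position.
    terms = sorted(query.split(), key=len, reverse=True)
    if not terms:
        return text
    pattern = re.compile("|".join(re.escape(t) for t in terms), re.IGNORECASE)
    return pattern.sub(lambda m: f"{open_tag}{m.group(0)}{close_tag}", text)


print(highlight_matches("color/colour (L123)", "colour"))
# prints: color/<b>colour</b> (L123)
```

Because the highlighting runs on the finished string, it stays independent of how the link itself was produced.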
Displaying a Form (representation) match is trickier. I could display the same as above, but I feel this might be confusing. Another option is to display Form data, e.g. for "colors":
color/colour (L123) colors: plural for color (L123): English noun
The description line features the matched Form's representation and a synthetic description of this Form. Right now the matched part is not highlighted, because otherwise it would always be highlighted (it is taken from the match itself), so I am not sure whether it should be or not.
So, does this display look like what we want to produce for Lexemes? Is there anything that needs to be changed or improved? I would like to hear some feedback.
Thanks,
Hi Stas!
Your proposal is pretty much what I envision.
On 14.06.2018 at 19:39, Stas Malyshev wrote:
> For a Lemma match, I plan to display the result like this:
> title (LN) Synthetic description
> e.g.
> color/colour (L123) English noun
> That is, the first line, with the link, would be the standard lexeme link generated by the Lexeme code (which also deals with multiple lemmas), and the description line would be the generated description of the Lexeme, just like in completion search.
Sounds perfect to me.
> The problem here, however, is that since the link is generated by the Lexeme code, which has no idea about search, we cannot properly highlight it. This can probably be solved with some trickery, e.g. by locating search matches inside the generated string and highlighting them, but first I'd like to make sure this is how it should look.
Do we really need the highlight? It does not seem critical to me for this use case. Just "nice to have".
> Displaying a Form (representation) match is trickier. I could display the same as above, but I feel this might be confusing. Another option is to display Form data, e.g. for "colors":
> color/colour (L123) colors: plural for color (L123): English noun
I'd rather have this:
colors/colours (L123-F2) plural of color (L123): English noun
Note that in place of "plural", you may have something like "3rd person, singular, past, conjunctive", derived from multiple Q-ids.
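To make the suggested layout concrete, here is a minimal sketch in Python of how such a result line could be assembled. The function names, and the assumption that the labels for the grammatical-feature Q-ids have already been resolved to strings, are illustrative only and not taken from WikibaseLexeme.

```python
def describe_form(feature_labels, lemma, lexeme_id, language, category):
    """Build the description part, e.g. 'plural of color (L123): English noun'.

    feature_labels holds the already resolved labels of the grammatical
    features (one label per Q-id); resolving them is out of scope here.
    """
    features = ", ".join(feature_labels)
    return f"{features} of {lemma} ({lexeme_id}): {language} {category}"


def format_form_result(representations, form_id, description):
    """Join all representations of the Form and prepend them to the description."""
    return f"{'/'.join(representations)} ({form_id}) {description}"


line = format_form_result(
    ["colors", "colours"],
    "L123-F2",
    describe_form(["plural"], "color", "L123", "English", "noun"),
)
print(line)
# prints: colors/colours (L123-F2) plural of color (L123): English noun
```

With several features, e.g. `["3rd person", "singular", "past", "conjunctive"]`, the same code yields a comma-separated list in place of "plural".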
> The description line features the matched Form's representation and a synthetic description of this Form. Right now the matched part is not highlighted, because otherwise it would always be highlighted (it is taken from the match itself), so I am not sure whether it should be or not.
Again, I don't think any highlighting is needed.
But, as you know, it's all up to Lydia to decide :)
-- daniel
Hi!
>> color/colour (L123) colors: plural for color (L123): English noun
> I'd rather have this:
> colors/colours (L123-F2) plural of color (L123): English noun
This part is a bit trickier, since the title is still L123, so the system currently generates the link for L123. I could override that, but I see two questions:
1. What will the link point to? I haven't found the code that generates a link to a specific Form. I could write new code, but if it sits outside the main classes, it may be a fragile design.
2. This means overriding the standard linking code and possibly reimplementing part of it (depending on whether this code supports generating a Form link instead of a Lexeme link), which again may be a bit fragile, unless I find a standard way to do it.
> Note that in place of "plural", you may have something like "3rd person, singular, past, conjunctive", derived from multiple Q-ids.
Yes, of course.
> Again, I don't think any highlighting is needed.
Not strictly speaking needed, but might be nice.
On 18.06.2018 at 19:25, Stas Malyshev wrote:
> 1. What will the link point to? I haven't found the code that generates a link to a specific Form.
You can use an EntityTitleLookup to get the Title object for an EntityId. In case of a Form, it will point to the appropriate section. You can use the LinkRenderer service to make a link. Or you can use an EntityIdHtmlLinkFormatter, which should do the right thing. You can get one from an OutputFormatValueFormatterFactory.
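As an illustration of what "point to the appropriate section" means for a Form, the following Python sketch maps a Form ID to a link target of the shape quoted later in this thread (/wiki/Lexeme:L2#L2-F1). It is not the Wikibase implementation. The function name, the ID pattern and the "Lexeme" namespace default are assumptions made for the example.

```python
import re

# Assumed shape of a Form ID: lexeme part, dash, form part (e.g. "L2-F1").
FORM_ID = re.compile(r"(L[1-9]\d*)-(F[1-9]\d*)")


def form_link_target(form_id, namespace="Lexeme"):
    """Return the page of the owning Lexeme with the Form ID as the fragment."""
    match = FORM_ID.fullmatch(form_id)
    if match is None:
        raise ValueError(f"not a Form ID: {form_id!r}")
    lexeme_id = match.group(1)
    return f"/wiki/{namespace}:{lexeme_id}#{form_id}"


print(form_link_target("L2-F1"))
# prints: /wiki/Lexeme:L2#L2-F1
```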
-- daniel
Hi!
> You can use an EntityTitleLookup to get the Title object for an EntityId. In case of a Form, it will point to the appropriate section. You can use the
OK, I see it's just adding form id as a fragment, so it's easy I guess.
> LinkRenderer service to make a link. Or you can use an EntityIdHtmlLinkFormatter, which should do the right thing. You can get one from an OutputFormatValueFormatterFactory.
I can reimplement it manually, but I would be largely duplicating what HtmlPageLinkRendererBeginHookHandler is supposed to do. The problem seems to be that it is not doing it right. When the code works on a link like /wiki/Lexeme:L2#L2-F1, it does this:
$entityId = $foreignEntityId ?: $this->entityIdLookup->getEntityIdForTitle( $target );
This returns the Lexeme ID instead of the Form ID. It can't return the Form ID, since a Form does not have a content model, and getEntityIdForTitle uses the content model to get from a Title to an ID. So I could duplicate all this code, but I don't particularly like that. Could we maybe fix HtmlPageLinkRendererBeginHookHandler instead?
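The difference between a page-only lookup and a fragment-aware one can be sketched as follows. This is a Python illustration of the idea only, with made-up function names. It does not reproduce the logic of HtmlPageLinkRendererBeginHookHandler or getEntityIdForTitle, and it assumes the link shape from the example above.

```python
import re

LEXEME_TITLE = re.compile(r"Lexeme:(L[1-9]\d*)")
FORM_ID = re.compile(r"L[1-9]\d*-F[1-9]\d*")


def entity_id_from_page(link):
    """Page-only lookup: the fragment is ignored, so only a Lexeme ID can come back."""
    page, _, _fragment = link.partition("#")
    title = page.rsplit("/", 1)[-1]
    match = LEXEME_TITLE.fullmatch(title)
    return match.group(1) if match else None


def entity_id_from_link(link):
    """Fragment-aware lookup: prefer the Form ID if the fragment names a Form of this Lexeme."""
    lexeme_id = entity_id_from_page(link)
    _, _, fragment = link.partition("#")
    if (
        lexeme_id is not None
        and FORM_ID.fullmatch(fragment)
        and fragment.startswith(lexeme_id + "-")
    ):
        return fragment
    return lexeme_id


print(entity_id_from_page("/wiki/Lexeme:L2#L2-F1"))  # prints: L2
print(entity_id_from_link("/wiki/Lexeme:L2#L2-F1"))  # prints: L2-F1
```

The prefix check keeps a mismatched fragment (for example `#L22-F1` on the page of L2) from being treated as a Form of that Lexeme.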
Hi!
> I can reimplement it manually, but I would be largely duplicating what HtmlPageLinkRendererBeginHookHandler is supposed to do. The problem seems to be that it is not doing it right. When the code works on a link like /wiki/Lexeme:L2#L2-F1, it does this:
> $entityId = $foreignEntityId ?: $this->entityIdLookup->getEntityIdForTitle( $target );
> This returns the Lexeme ID instead of the Form ID. It can't return the Form ID, since a Form does not have a content model, and getEntityIdForTitle uses the content model to get from a Title to an ID. So I could duplicate all this code, but I don't particularly like that. Could we maybe fix HtmlPageLinkRendererBeginHookHandler instead?
Also, it looks like Form doesn't actually have a link-formatter-callback or its own link formatter code. So I wonder if there's an existing facility to format links to Forms? Leszek, do you have any information on this?
Thanks,
On 19 June 2018 at 01:35, Stas Malyshev smalyshev@wikimedia.org wrote:
> Hi!
>> I can reimplement it manually, but I would be largely duplicating what HtmlPageLinkRendererBeginHookHandler is supposed to do. The problem seems to be that it is not doing it right. When the code works on a link like /wiki/Lexeme:L2#L2-F1, it does this:
>> $entityId = $foreignEntityId ?: $this->entityIdLookup->getEntityIdForTitle( $target );
>> This returns the Lexeme ID instead of the Form ID. It can't return the Form ID, since a Form does not have a content model, and getEntityIdForTitle uses the content model to get from a Title to an ID. So I could duplicate all this code, but I don't particularly like that. Could we maybe fix HtmlPageLinkRendererBeginHookHandler instead?
> Also, it looks like Form doesn't actually have a link-formatter-callback or its own link formatter code. So I wonder if there's an existing facility to format links to Forms? Leszek, do you have any information on this?
Hi Stas, would \Wikibase\Lexeme\PropertyType\FormIdHtmlFormatter be of any help for you in this case, at least temporarily?
In any case, thanks for bringing the issue up. Regarding link formatting: this is the very topic we're actively working on right now, in a broader context, and all you need should be there rather soon. Hence I'd prefer to avoid duplicating effort now. If this works for you, Stas, my suggestion would be that you assume that the link formatting code is there, and make sure the data needed to display the link is in the search results etc. Once the groundwork we're doing now is finished, and Lydia has defined how the search result link etc. should look, we'll add it there.
Best
> Thanks,
> Stas Malyshev smalyshev@wikimedia.org
Hi!
> If this works for you, Stas, my suggestion would be that you assume that the link formatting code is there, and make sure the data needed to display the link is in the search results etc. Once the groundwork we're doing now is finished, and Lydia has defined how the search result link etc. should look, we'll add it there.
Well, I can't "assume" it, since I have to put something in the code and check it. I am not sure what "the data needed to display the link" is, though, since I haven't seen that code and it looks like it doesn't exist yet :) I can implement something else in the meantime and replace it with the new code later.
Hi!
> In any case, thanks for bringing the issue up. Regarding link formatting: this is the very topic we're actively working on right now, in a broader context, and all you need should be there rather soon. Hence I'd prefer to avoid duplicating effort now.
Could you send me the Phabricator task ID so I can subscribe to it and know when it's ready? I've paused the fulltext patch for now, waiting for it.
On 21 June 2018 at 21:05, Stas Malyshev smalyshev@wikimedia.org wrote:
> Hi!
>> In any case, thanks for bringing the issue up. Regarding link formatting: this is the very topic we're actively working on right now, in a broader context, and all you need should be there rather soon. Hence I'd prefer to avoid duplicating effort now.
> Could you send me the Phabricator task ID so I can subscribe to it and know when it's ready? I've paused the fulltext patch for now, waiting for it.
Hi Stas,
it is going to take some time until this is all ready, so in order to unblock your work I went ahead and had a look at the link formatter and HtmlPageLinkRendererBeginHookHandler, as it seemed pretty straightforward to get a basic working (hopefully) thing done on a Friday afternoon. You can see what I came up with at https://gerrit.wikimedia.org/r/#/c/mediawiki/extensions/WikibaseLexeme/+/441... .
I hope that helps with the search work. Feel free to mention any missing bits or issues. The formatter is very rudimentary for now (e.g. the "title" attribute intentionally just shows the form ID), but I hope this is an implementation detail that can be improved later (we'll take care of the formatter).
Have a good weekend!
> -- Stas Malyshev smalyshev@wikimedia.org
On Mon, Jun 18, 2018 at 1:49 PM Daniel Kinzler daniel.kinzler@wikimedia.de wrote:
> Hi Stas!
> Your proposal is pretty much what I envision.
Yeah, it looks good for Lexemes. And I agree with Daniel's suggestion for Forms, if that's possible:
colors/colours (L123-F2) plural of color (L123): English noun
We can live without highlighting for now.
Thanks for moving this forward! Leszek will say more about the linking.
Cheers Lydia
-- Lydia Pintscher - http://about.me/lydia.pintscher Product Manager for Wikidata
wikidata-tech@lists.wikimedia.org