- Abstract-Wikipedia - lists.wikimedia.org

Wikimedia Clinic #009 with a feature on Abstract Wikipedia
by Denny Vrandečić 20 Aug '20


Tomorrow (Friday, August 21st) at 19:00 UTC (Noon Pacific, 9pm Central Europe) at the Wikimedia Clinic #009 I will be presenting about Abstract Wikipedia. If you are interested, feel free to join! There will be time for questions. https://meta.wikimedia.org/wiki/Wikimedia_Clinics#Upcoming_calls


Reference by Description
by Thad Guidry 20 Aug '20


I think that Abstract Wikipedia, and more specifically WikiLambda functions, could be used for generating "References by Description" and, later, for helping to understand and reconcile them. One of the long-term goals of Data Commons (https://datacommons.org/faq) is to help resolve ambiguous entities by using "Reference by Description". Denny, are you familiar with Guha's paper? https://arxiv.org/pdf/1511.06341.pdf

> ‘John McCarthy, Pioneer in Artificial Intelligence...’ the term ‘John McCarthy’ alone is ambiguous. It could refer to a computer scientist, a politician or even a novel or film. In order to disambiguate the reference, the head-line includes the description “Pioneer in Artificial Intelligence”.

Thad
https://www.linkedin.com/in/thadguidry/
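As a rough illustration of the disambiguation step in the quoted passage, here is a minimal sketch that picks a candidate by word overlap between the given description and each candidate's description. The candidate keys and their description strings are made up for this example, and word overlap is only a stand-in; it is not the method of the paper or of Data Commons.

```python
import re

# Hypothetical candidates for the name "John McCarthy"; the descriptions
# are invented for this sketch and are not taken from any real dataset.
CANDIDATES = {
    "computer scientist": "pioneer in artificial intelligence and computer science",
    "politician": "politician and member of a legislature",
    "novel": "novel published as a book",
    "film": "film adapted from a novel",
}


def tokens(text):
    """Lower-case a string and return its set of alphabetic words."""
    return set(re.findall(r"[a-z]+", text.lower()))


def resolve(description, candidates):
    """Return the candidate whose description shares most words with the
    given description, or None when nothing overlaps at all."""
    wanted = tokens(description)
    scores = {key: len(wanted & tokens(desc)) for key, desc in candidates.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


print(resolve("Pioneer in Artificial Intelligence", CANDIDATES))
```

A real resolver would presumably compare structured properties rather than words, but the shape is the same: a name plus a description narrows a set of candidates to one.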


Great video on Denny talking about Abstract Wikipedia
by Thad Guidry 11 Aug '20


https://www.youtube.com/watch?v=9vYOpV-1ipU

If you haven't seen it.

Thad
https://www.linkedin.com/in/thadguidry/


Keeping it simple for Abstract Wikipedia
by Gerard Meijssen 10 Aug '20


Hoi,

I am amazed by all the competing ideas and notions I have read on the mailing list so far. It is bewildering, and it does not give me a notion of what is to be done. I have thought about it, and for me it is simple.

For every Wikipedia article there are two Wikidata items that have no Wikipedia article. It follows that the first item of business is to make these knowable in any language. The best way to do this is by providing automated descriptions that aid in disambiguation.

When a Wikipedia article exists, it links to many articles. All of them have their own Wikidata item, and all can be described either in a Wikidata triple or in structured text. When sufficient data is available, a text can be generated. This has been demonstrated by LSJBOT, and it is why the Cebuano Wikipedia has so many articles. A template as used by LSJBOT can be adapted for every language.

My point is that all the research in the world makes no difference when we do not apply what we know.

Thanks,
GerardM
https://ultimategerardm.blogspot.com/2020/08/keeping-it-simple-for-abstract…
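To make the template idea concrete, here is a minimal sketch of description generation from a few facts with one template per language. The templates, the label table and the sample facts are all invented for illustration; this is not how LSJBOT is implemented, and real templates would also need grammatical agreement (gender, case, articles).

```python
# Hypothetical per-language templates; the wording is illustrative only.
TEMPLATES = {
    "en": "{occupation} from {country}, born {year}",
    "de": "{occupation} aus {country}, geboren {year}",
}

# Hypothetical label table standing in for per-language Wikidata labels.
LABELS = {
    "painter": {"en": "painter", "de": "Maler"},
    "France": {"en": "France", "de": "Frankreich"},
}


def describe(facts, lang):
    """Fill the template for `lang`, localizing each value when a label exists."""
    localized = {
        key: LABELS.get(value, {}).get(lang, value)
        for key, value in facts.items()
    }
    return TEMPLATES[lang].format(**localized)


# Sample facts, as they might be read from three triples of one item.
facts = {"occupation": "painter", "country": "France", "year": "1840"}
print(describe(facts, "en"))
print(describe(facts, "de"))
```

Adding a language then means adding one template and the missing labels, which is the sense in which a template "can be adapted for every language".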


Re: [Abstract-wikipedia] Comprehension questions
by Grounder UK 06 Aug '20


Thank you for joining in, Adam. I think I'll leave outputting mathematics for now. Perhaps you could start a new topic on that later? I'd also like to think about outputting functions from our anonymous wiki (aka "wikilambda"), which I'll come onto here.

I definitely agree about reading-comprehension questions. For now, I encourage everyone who is not already familiar with it to check out https://www.deepl.com/translator. It does work on a tablet, but I found it a bit fiddly to interact with. What it allows you to do is play with its translation, choosing alternative words etc., and it reworks the rest of the translation as you go along.

When it comes to Q&A mode, that brings us back to Charles's difficult problem of re-use. (This seems to be missing here, but it's in https://lists.wikimedia.org/pipermail/abstract-wikipedia/2020-July/000233.html.) It also brings us back to "Reasoning over ontologies" (https://lists.wikimedia.org/pipermail/abstract-wikipedia/2020-July/000206.html). Reuse is never easy, and "all the facts in all the languages at all levels" is rather a broad scope... So I certainly accept the wisdom of Charles's suggestion of a pilot project.

I started with a single quiz question: "What is the atomic number of oxygen?" For many Wikipedias, if not most, the article on oxygen gives you the answer in the first sentence or two. But we don't just want the answer, we want a quiz! Equally, maybe we don't just want the question and the answer, we want some wrong answers and some tips.

We begin with a multiple-choice question, as above: "What is the atomic number of oxygen? Is it (a) "Z"; (b) "O"; (c) 16; (d) 6; (e) 8; (f) ..." The first thing to notice is that the wrong answers are not fictitious; they are values you will find in Wikidata Item Q629 (well, "Z" is nearby, in the label of the atomic number property, but no more clues!), and we can explore this further by choosing a wrong answer and provoking a response:

(a) No, "Z" is the symbol for "atomic number", not the value for oxygen.
(b) That's not oxygen's atomic number, that's its symbol.
(c) Oxygen is in group 16 on the periodic table. See if you can find it there.
(d) Well, oxygen is the sixth element in period 2...
(e) That's right! Try choosing a wrong answer, just for fun...

So we have a small number of somewhat connected facts to explore. Any of them can be turned into a question, which can pull in facts connected to that, like other elements in group 16 or period 2. This is enabled by the "quiZiverse" function that can turn a Wikidata Statement into a related set of statements, for any one of which a new related set can be produced. In an interactive context, choosing the option displays the text and could call the function again with the chosen statement in place of the original. Or the function is embedded in a link, so that, for example, clicking "group 16" calls the function with Q104567 as well as Q629, extending the quiz.

If used to set a quiz, the teacher can suppress some answers and add alternatives, tailoring the quiZiverse to the students' level and the material to cover. It is here that we need to see how to capture the teacher's thinking, the rationale for changing the "focused ontology". Was it too advanced or too simple? Difficult to understand? Irrelevant? More interesting connection to something else... and so on. That sort of metadata can then be used to guide future quiZiverse instantiations, with target values for a range of dimensions, such as "reading comprehension".

Best regards,
Al.
On Sunday, 2 August 2020, <abstract-wikipedia-request(a)lists.wikimedia.org> wrote:

> Date: Sun, 2 Aug 2020 01:15:10 +0000
> From: Adam Sobieski <adamsobieski(a)hotmail.com>
> Subject: Re: [Abstract-wikipedia] Comprehension questions
>
> Educational technology is also interesting here.
>
> Generating reading comprehension questions while generating natural language articles is an interesting topic. I think that the matter would be one of refining the set of possible questions and selecting the best questions for a particular reader in a particular context. One might also find interesting the topics of intelligent tutoring systems [1] and automatic item generation [2].
>
> We can view the automatic generation of encyclopedia articles in response to search engine queries as a type of Q&A system. Articles and their related content hyperlinks sections could be generated in search result contexts, contexts which include the question(s) that users asked a search engine to find the content. Articles, when produced with this search engine referrer information, could, in addition to highlighting relevant content, recommend follow-up questions for readers to select in a related content section, each follow-up question a hyperlink to another article (resembling a hypertext-based dialogue system). Hopefully, these related content hyperlink sections (perhaps resembling a recommender system) would entice readers to further self-directed learning.
>
> I would like to also indicate that we should explore outputting mathematics when automatically generating encyclopedia articles. For wikitext, this could involve outputting LaTeX for MathJax to process.
>
> Best regards,
> Adam
>
> [1] https://en.wikipedia.org/wiki/Intelligent_tutoring_system
> [2] https://en.wikipedia.org/wiki/Automatic_Item_Generation
>
> From: Grounder UK <grounderuk(a)gmail.com>
> Sent: Friday, July 31, 2020 12:32 PM
> To: abstract-wikipedia(a)lists.wikimedia.org
> Subject: Re: [Abstract-wikipedia] Comprehension questions
>
> > Thanks, Charles.
> >
> > I can certainly see the possibility of many interesting use cases there. True or false questions would be an interesting game for our natural-language renderers to play, for example. Given an inferred statement supposed to be true, negate it. Test-setters might be expected to correct errors of fact or expression, but that's up to them. It would be interesting to monitor which statements they preferred to choose as True and which as False, in any event.
> >
> > Questions of the form: "choose the best answer from the following" could also be a win-win if our renderers face difficulties selecting or expressing some combination of facts.
> >
> > Then there is the grading of information. Questions chosen for more basic tests might be supposed to be more generally relevant than those chosen for more advanced tests, which might feed back into the emphasis in the general Wikipedia article (now complete with a slider bar for the reader's current and/or target level of understanding, as well as competence in the language).
> >
> > And finally, renderer, given the pedagogue's valuable input into what is an appropriate statement of fact here, please turn it into questions in many languages!
> >
> > Loving it...
> >
> > Thank you again, Charles
> >
> > Best regards,
> > Al.
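The quiz idea in the reply above can be sketched as a function that takes the facts recorded for one item and turns them into a multiple-choice question, using the item's other values as wrong answers with an explanation each. The function name `make_question`, the property names and the feedback wording are assumptions made for this sketch; the "quiZiverse" function is only described informally in the message, and the facts below are hand-copied from it rather than fetched from Wikidata.

```python
# Facts about oxygen as stated in the message, standing in for Wikidata
# item Q629. The property names are simplified for this sketch.
OXYGEN = {
    "atomic number": "8",
    "element symbol": "O",
    "group": "16",
    "position in period 2": "6",
}


def make_question(label, facts, target):
    """Build a question about `target`, with every recorded value of the
    item as an option and a short response for each choice."""
    options = []
    for prop, value in facts.items():
        if prop == target:
            feedback = "That's right!"
        else:
            feedback = f"No, {value} is the {prop} of {label}, not its {target}."
        options.append((value, feedback))
    question = f"What is the {target} of {label}?"
    return question, options


question, options = make_question("oxygen", OXYGEN, "atomic number")
print(question)
for value, feedback in options:
    print(f"  {value}: {feedback}")
```

Calling the function again with a different `target` (or with the facts of a linked item such as group 16) gives the follow-up questions the message describes; a teacher's adjustments would amount to filtering or extending `options`.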


Re: [Abstract-wikipedia] Loose notes
by Grounder UK 05 Aug '20


Just one huge Thank You for Ordia, Finn Årup Nielsen! It's really coming along nicely now we have so many more Lexemes.

You are quite right, of course; we're not quite up to 325,000. I overlooked the possibility of a Lexeme having multiple lemmas. A few have as many as six, it seems! Sorry for that slight overstatement. I hope you didn't think you had lost some.

While I'm apologizing, it seems that I got the link to your aclweb.anthology paper wrong when I included it earlier! (It should be "2020.ldl" not "2020.idl", of course.) Sorry for that, too. I assume that https://www.aclweb.org/anthology/2020.ldl-1.12.pdf [corrected link] is identical to https://people.compute.dtu.dk/faan/ps/Nielsen2020Lexemes.pdf.

Thank you again for your great work. I hope my mistakes did not inconvenience you too much.

Best regards,
Al.

On Tuesday, 4 August 2020, <abstract-wikipedia-request(a)lists.wikimedia.org> wrote:

> Date: Tue, 4 Aug 2020 17:49:03 +0200
> From: Andy <borucki.andrzej(a)gmail.com>
> Subject: Re: [Abstract-wikipedia] Loose notes
>
> Is there any road map on https://meta.wikimedia.org/ with estimated points of time for Abstract Wikipedia?
>
> On Mon, 3 Aug 2020 at 18:43, Grounder UK <grounderuk(a)gmail.com> wrote:
>
> > Plenty more work to be done!


Re: [Abstract-wikipedia] Loose notes
by Grounder UK 04 Aug '20


Andrzej,

Yes, there are over 325,000 lexemes in Wikidata now, over 40,000 for English. "Abstract" definitions are a little tricky, but it is not Lexemes themselves that are defined, it is their Senses, and Senses can be linked to Wikidata Items, which connects Lexemes into the abstract graph of "knowledge".

Translations are still very incomplete but, as with definitions, it is the Sense that should have the translation. The difficulty is that translation cannot imply identity, which means that you cannot assume that a Sense-to-Sense translation allows you to acquire translations from the Sense you translate into. If you think of each Sense as a set, you cannot tell whether the translated Sense is a subset or a superset. What we need for that is the concept of the intersection between the two sets, which would be part of each Sense but not necessarily the whole of either Sense.

So, broadly, your example of "zamek" is not a problem; you can connect the "lock" Sense to the Sense of the English word "lock" (L1132-S1) as well as to the identifier for the encyclopedic concept Q228039 and/or Q24644118 (claimed to be a subclass of Q228039). But you should not connect it to L1132-S2 (which connects to Q105731 pl:"Śluza wodna") or to L1132-S3 (Q1134386 pl:"Zamek (broń)", assuming that's a different Sense of "zamek" too). (I say this without knowing enough Polish to know if it makes sense; I'm living in Searle's Chiński pokój!)[1]

I don't know whether the lexical data is in the dumps now, but it will be pretty huge just by itself. It is also quite dependent on the main Wikidata pages. For our natural-language generation, that's a great strength, because we can move naturally from the concept to the word and related vocabulary in any language without doing any translation. The extra context we need to be able to choose the right Form of the Lexeme for the Sense... that will need more work on the data, as will characterising thesaurus relations (hypernymy, synonymy, hyponymy, antonymy etc.) so that good alternative Lexemes can be found. In an "abstract" context, these can be thought of as "translations" into overlapping Senses, but the extent to which we represent and consult (or navigate within) the broader compound Sense domain (the set union of the Senses) is... an interesting challenge.

As for a fully "abstract" dictionary that can be read in any language... We'll be better able to think about that once we have built a few renderers for our "abstract" encyclopedic content, in my view. Machine translation and natural-language understanding are not our primary goal. I think we will make progress on both, if we remember to pay attention to inverse functions as we evolve our NLG renderers, but we have a very long way to go in all directions (and all languages).

Best regards,
Al.

[1] https://pl.wikipedia.org/wiki/Chi%C5%84ski_pok%C3%B3j

On Monday, 3 August 2020, <abstract-wikipedia-request(a)lists.wikimedia.org> wrote:

> Date: Mon, 3 Aug 2020 18:23:03 +0000
> From: Adam Sobieski <adamsobieski(a)hotmail.com>
> To: Charles Matthews <charles.r.matthews(a)ntlworld.com>, "General public mailing list for the discussion of Abstract Wikipedia (aka Wikilambda)" <abstract-wikipedia(a)lists.wikimedia.org>
> Subject: Re: [Abstract-wikipedia] Natural Language and Mathematics Generation
>
> Charles,
>
> There is also MathML to consider. Work is underway at the W3C with respect to a new version of MathML, MathML4 [1][2]. Work is underway with respect to adding MathML support to Chromium [3][4].
>
> Instead of LaTeX, MathML could be the way to go.
>
> Best regards,
> Adam
>
> [1] https://www.w3.org/community/mathml4/
> [2] https://mathml-refresh.github.io/mathml/
> [3] https://www.chromestatus.com/feature/5240822173794304
> [4] https://mathml.igalia.com/
>
> From: Charles Matthews via Abstract-Wikipedia <abstract-wikipedia(a)lists.wikimedia.org>
> Sent: Monday, August 3, 2020 1:53 PM
> To: General public mailing list for the discussion of Abstract Wikipedia (aka Wikilambda) <abstract-wikipedia(a)lists.wikimedia.org>
> Subject: Re: [Abstract-wikipedia] Natural Language and Mathematics Generation
>
> > On 03 August 2020 at 16:50 Adam Sobieski <adamsobieski(a)hotmail.com> wrote:
> >
> > > By utilizing <math>LaTeX</math> elements in an XML-based intermediate output format, one could simply copy that mathematical content to the resultant output wikitext [3]. Wikitext utilizes this same convention for mathematical expressions [3].
> > >
> > > Whether or not to include mathematics in Abstract Wikipedia is an important decision to make at a future point. Choosing to include mathematics would entail discussions about representing mathematical knowledge on Wikidata. It would entail discussions about how specific senses of certain words have mathematical meaning. It would entail discussions about how algorithms should determine when to use mathematical and scientific notations and when they should, instead, use paraphrases with the semantic content expressed using natural language. These are just some of the discussion topics which would arise should we desire to include mathematical and scientific notations in Abstract Wikipedia articles.
> >
> > I'm disagreeing with much of this.
> >
> > On LaTeX: while it is "industry standard", I'd like to draw attention to a point made in https://en.wikipedia.org/wiki/Help:Displaying_a_formula#Rendering: "Latex does not have full support for Unicode characters, and not all characters render."
> >
> > It goes on to suggest that Vietnamese, for example, would not be well catered for, in terms of its diacritics.
> >
> > I appreciate that we are only talking currently about scoping, and high-level initial planning. But given AW's objectives, this is not a good sign, and I don't think we should just assume that LaTeX as an incumbent gets waved through. It is pre-Web, and something closer to HTML would be preferable, in my view.
> >
> > My background is in mathematics, and I began my Wikipedia career writing mathematics articles. There are certainly issues, such as prose/notation balance. Mathematical language is heavily overloaded, from the disambiguation aspect. But I'm not really recognising the landscape of issues set out there.
> >
> > Charles
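The "Senses as sets" argument in the reply above can be checked with a few lines of set arithmetic. This is only a sketch: the three Senses and their members (`r1` to `r5`) are abstract placeholders, not real Wikidata Senses, and the function name is made up.

```python
def compare_senses(a, b):
    """Classify how two Senses, modelled as sets of referents, relate."""
    if a == b:
        return "identical"
    if a < b:
        return "subset"
    if a > b:
        return "superset"
    if a & b:
        return "overlap"
    return "disjoint"


# Placeholder Senses: each is the set of things the Sense can refer to.
sense_a = frozenset({"r1", "r2", "r3"})
sense_b = frozenset({"r2", "r3", "r4"})
sense_c = frozenset({"r4", "r5"})

print(compare_senses(sense_a, sense_b))  # a translates to b only in part
print(sorted(sense_a & sense_b))         # the shared part of both Senses
print(compare_senses(sense_b, sense_c))  # b also overlaps with c
print(compare_senses(sense_a, sense_c))  # but a and c share nothing
```

The last line is the point about not acquiring translations transitively: a translates (partly) to b, and b (partly) to c, yet a and c have no common referent. Recording the intersection explicitly is what would make such chains safe to follow.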


Re: [Abstract-wikipedia] Loose notes
by Grounder UK 04 Aug '20


Hi, Andrzej

The assumption at the moment is, I think, that we will be using the Wikidata lexicographical data [1]. This is not yet as extensive as Wiktionary data [2], but it addresses many of the integrity issues. As far as I understand it, the modelling of Sense still suffers from the flaw that a Sense is presented as a "child" of a Lexeme. So, for example, L1883-S1 is a Sense of Lexeme L1883, representing the English verb to "be" with a gloss of "exist" and a "synonym" relationship to L2148-S1, a Sense of Lexeme L2148, representing the English verb to "exist". I could be wrong, but the simple idea of a word-free Sense to which all languages can link is implemented only through a possible link to a concrete Wikidata Item, so both L1883-S1 and L2148-S1 are linked to Q468777 (existence) and Q203872 (being). Apart from that, a separate translation of each Sense into each corresponding Sense in each language seems to be the intent, at present.

Wikidata also has Forms of Lexemes (but I didn't find "widziałem"). The Lexeme L185 ("see") has a Form L185-F3 ("saw") but this has no link to Form L18498-F1, the uninflected form of the verb to "saw" (unlike Wiktionary, which supports homographs implicitly). Each form has "grammatical features", showing that L185-F3 is the "simple past" of L185 but the same string, "saw", is the "simple present" of L18498. It does not explicitly say that this is not the case in the third person singular, but there is a different form, L18498-F2, which is both "simple present" and "third-person singular", so there may be a presumption that the more particular overrides the more general.

For "abstract" Senses, we could think of "abstract" as a new language, and then have translations between "abstract" "language" and Senses in all natural (and synthetic) languages. This would give you your "senses dictionary" (and allow implied translations between any Senses linked to the "abstract" Sense). When we need to generate a word in a particular language, we would need to translate the "abstract" Sense to the target language Lexeme and then consult the Forms of that Lexeme to identify which ones are applicable, given the "grammatical features" of the context.

Plenty more work to be done!

Best regards,
Al.

[1] https://www.wikidata.org/wiki/Wikidata:Lexicographical_data/Documentation
[2] https://www.aclweb.org/anthology/2020.idl-1.12.pdf

On Monday, 3 August 2020, <abstract-wikipedia-request(a)lists.wikimedia.org> wrote:

> Date: Mon, 3 Aug 2020 12:29:03 +0200
> From: Andy <borucki.andrzej(a)gmail.com>
> To: abstract-wikipedia(a)lists.wikimedia.org
> Subject: [Abstract-wikipedia] Loose notes
>
> Hi,
>
> Abstract Wikipedia gives these benefits:
>
> - First, it creates a multi-language corpus for machine translation learning. The big disadvantage of the existing multi-language corpora is that most of the data is from movie subtitles, which are very inaccurate.
>
> - Second, it will provide data for Word Sense Disambiguation learning, and WSD in many languages(!).
>
> The abstract form should contain a graph of senses. Will senses be chosen from English WordNet/UNL or from English Wiktionary? UNL is a piece of good work, but it has been inactive for years and does not evolve. Wiktionary senses have the plus that they are grouped by etymology – quite different senses are in other etymology groups. Will Abstract Wikipedia be linked with Wiktionary? Wiktionary sense numbers should now be persistent or, better, have unique identifiers. Wiktionary has the advantage that senses are translated to other languages, with the disadvantage that the translations point to words, not senses, in the other language. Alternatively, Abstract Wikipedia can have its own sense list with identifiers, but how would it link with Wiktionary?
>
> Graph: it should be possible to create text in many/all languages. For example, English has “I saw”, Polish has “widziałem/widziałam” – Polish needs gender, so the abstract form should carry the gender for the verb, even though some languages do not use it.
>
> The senses dictionary can grow gradually with the abstract text. If I edit abstract text, the editor should make me add the word with its senses to the dictionary if it does not exist, and enable me to add a new sense if it does not exist.
>
> What is needed:
>
> - abstract text = corpus
> - a growing dictionary of senses
> - a growing dictionary mapping senses to national-language senses
> - possibly a link with Wiktionaries
>
> Best regards,
>
> Andrzej
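The presumption in the reply above that "the more particular overrides the more general" can be sketched as a small selection rule: keep the Forms whose grammatical features are all present in the context, then prefer the one with the most features. The `Form` class and `select_form` function are hypothetical names for this sketch, and the string "saws" for L18498-F2 is an assumption, since the message gives only its features.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Form:
    form_id: str
    text: str
    features: frozenset


def select_form(forms, context):
    """Return the most specific Form whose features all hold in the
    context, or None when no Form is applicable."""
    applicable = [form for form in forms if form.features <= context]
    if not applicable:
        return None
    return max(applicable, key=lambda form: len(form.features))


# Forms of the verb "to saw" (L18498) as described in the message.
# The text "saws" is assumed here; the message does not spell it out.
SAW_FORMS = [
    Form("L18498-F1", "saw", frozenset({"simple present"})),
    Form("L18498-F2", "saws", frozenset({"simple present", "third-person singular"})),
]

context = frozenset({"simple present", "third-person singular"})
print(select_form(SAW_FORMS, context).text)
```

With a first-person context the rule falls back to the general Form "saw", and with a "simple past" context it returns nothing, which signals that the data for that Lexeme is incomplete rather than guessing.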


Advanced user interface idea
by Andy 04 Aug '20


Note: graphs in XML form are too wordy; it may be better to use a form with braces, dots and @.

The user opens a web editor and pastes raw English text, for example the first paragraph of https://en.wikipedia.org/wiki/Linux (maybe with a restriction to 1000 characters?). The text is tokenized by spaCy and divided into sentences and words. For words and phrases, lemmas and parts of speech are found. Words change color and become clickable. The user can choose a sense for a lexeme, add a sense, or add a lexeme. Next, the structure of the sentence graph is shown; the user can change it and add properties.

In the first stage, before this editor is made, the user can edit the graph code in a special language, which must not be too wordy.

Best regards,
Andrzej
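The first steps of this editor idea can be sketched without the web front end. The sketch below uses plain regular expressions as a stand-in for spaCy, so it does no lemmatization or part-of-speech tagging, and the sense inventory and its identifiers are invented for illustration. It shows only the data an editor would need in order to make each word clickable.

```python
import re

# Hypothetical sense inventory keyed by lower-cased word; the sense
# identifiers and glosses are made up for this sketch.
SENSES = {
    "kernel": [
        "kernel-1: core of an operating system",
        "kernel-2: edible part of a nut",
    ],
}


def annotate(text, max_chars=1000):
    """Split text into sentences and words (regex stand-in for spaCy) and
    attach the candidate senses the user could choose from for each word."""
    text = text[:max_chars]  # the suggested input-length restriction
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    annotated = []
    for sentence in sentences:
        words = re.findall(r"[A-Za-z]+", sentence)
        annotated.append([(word, SENSES.get(word.lower(), [])) for word in words])
    return annotated


for sentence in annotate("Linux is a kernel. It is free."):
    for word, senses in sentence:
        print(word, senses)
```

A word with an empty sense list is where the editor would offer "add a sense" or "add a lexeme"; a word with several candidates is where the user would pick one.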


Re: [Abstract-wikipedia] Natural Language and Mathematics Generation
by Grounder UK 03 Aug '20


More good points, Adam...

At this stage, I can't say that formats bother me greatly, although clearly we need to think about them. We do have to start with Wikidata but I wonder whether we should also be looking at our wiki of functions. Could we consider a mathematical expression as a symbolic representation of an executable function? I like the idea of a Wikipedia that will actually compute the result of a function it is telling you about, not least because editors could verify that the syntax is correct by testing the function.

But if some expressions are executable, that broadens the question of format. To have a string that could be copied into a spreadsheet, for example, would be an interesting function for many. So I'm wondering how far you can get by "labelizing" JSON objects with computer language labels rather than natural language ones. So our "multiply" function is "labelized" "=PRODUCT" and E=PRODUCT(m,POWER(c,2))... or E=m*c^2...?

Thinking only about text, I think we are bound to take a broader WMF-wide view because we should at least consider how we can meet the requirements of each and every Wikipedia, without ignoring sister projects like Wikiversity. That's not to advocate a free-for-all, but if we increasingly represent the semantics of mathematical expressions, rather than their typography, this gives us something that can be represented more meaningfully in Wikidata and, from there, expressed in natural language as well as in a variety of symbolic and even functional forms. I happen to think it will also aid reuse of functions from the wiki, but I haven't given that idea much thought.

Best regards,
Al.
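The "labelizing" idea can be sketched by storing the expression once as a JSON-like tree and rendering it with different label tables, plus an evaluator for the executable reading. The tree layout, the function keys ("equals", "multiply", "power") and the helper names are assumptions made for this sketch; they are not an agreed Wikilambda format.

```python
import math

# E = m * c^2 as a nested, JSON-like structure (hypothetical layout).
EXPR = {
    "fn": "equals",
    "args": ["E", {"fn": "multiply", "args": ["m", {"fn": "power", "args": ["c", 2]}]}],
}

SPREADSHEET_LABELS = {"multiply": "PRODUCT", "power": "POWER"}
INFIX_LABELS = {"multiply": "*", "power": "^"}


def to_spreadsheet(node):
    """Render with spreadsheet-style function labels."""
    if not isinstance(node, dict):
        return str(node)
    args = [to_spreadsheet(arg) for arg in node["args"]]
    if node["fn"] == "equals":
        return "=".join(args)
    return f"{SPREADSHEET_LABELS[node['fn']]}({','.join(args)})"


def to_infix(node):
    """Render with infix operator labels. No parentheses are added, so this
    is only safe where operator precedence already matches the tree."""
    if not isinstance(node, dict):
        return str(node)
    args = [to_infix(arg) for arg in node["args"]]
    if node["fn"] == "equals":
        return "=".join(args)
    return INFIX_LABELS[node["fn"]].join(args)


def evaluate(node, env):
    """Compute the value of an expression tree given variable bindings."""
    if isinstance(node, str):
        return env[node]
    if not isinstance(node, dict):
        return node
    values = [evaluate(arg, env) for arg in node["args"]]
    if node["fn"] == "multiply":
        return math.prod(values)
    if node["fn"] == "power":
        return values[0] ** values[1]
    raise ValueError(f"cannot evaluate {node['fn']!r}")


print(to_spreadsheet(EXPR))
print(to_infix(EXPR))
print(evaluate(EXPR["args"][1], {"m": 2, "c": 3}))
```

Because the tree carries the semantics rather than the typography, a LaTeX or MathML renderer, or a natural-language one, would be one more label table or function over the same structure.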

