Oops, sent before done...

Wikidata property to indicate a location
<https://www.wikidata.org/wiki/Q18615777>
  "evokes" : "non-containment"
  "facet of" : "geographic location"

located in the administrative territorial entity
<https://www.wikidata.org/wiki/Property:P131>
  "evokes" : "containment"
  "external superproperty" :
On Tue, Jun 7, 2022 at 4:31 PM Thad Guidry <thadguidry(a)gmail.com> wrote:
To further the idea I had...
Looking at Wikidata property to indicate a location
<https://www.wikidata.org/wiki/Q18615777>, its "facet of"
<https://www.wikidata.org/wiki/Property:P1269> statement seems to be just
one aspect of providing some context.
If we had a property something like "evokes", then we could additionally
supply quite a bit of subtle context for Abstract Wikipedia's usage in
modeling.
Wikidata property to indicate a location
  "evokes" :
  "facet of" : "geographic location"
Thad
https://www.linkedin.com/in/thadguidry/
https://calendly.com/thadguidry/
On Tue, Jun 7, 2022 at 4:20 PM Thad Guidry <thadguidry(a)gmail.com> wrote:
> Hi Denny,
>
> It quickly appears to me that much more metadata will likely need to be
> added to our existing Wikidata properties in order to surface their full
> contextual application when curating and generating appropriate models
> for specific types of items. (A "predicates are king" sorta thing.)
>
> For instance, "in" found in a sentence could be contextualized as
> containment, or more generally as "grouping" into a set... or not.
> Example: is it "in" something (a container, an ocean, a pot, etc.), is
> it "in" a location/place, or is it "in" a set of things (one element in
> a set or group, a specific chemical bond in a chemical compound, etc.)?
>
> It will be interesting to see what additional metadata is going to be
> needed for Wikidata properties beyond our current rudimentary "instance
> of" (P31). I can imagine adding much more knowledge in general
> (metadata) about Wikidata property to indicate a location
> <https://www.wikidata.org/wiki/Q18615777>, as an example.
>
> Keep up the great work and knowledge sharing, team!
>
> Thad
>
> https://www.linkedin.com/in/thadguidry/
> https://calendly.com/thadguidry/
>
>
> On Tue, Jun 7, 2022 at 3:38 PM Denny Vrandečić <dvrandecic(a)wikimedia.org>
> wrote:
>
>> The on-wiki version of this newsletter can be found here:
>> https://meta.wikimedia.org/wiki/Abstract_Wikipedia/Updates/2022-06-07
>>
>> --
>>
>> Communities will create (at least) two different types of articles using
>> Abstract Wikipedia: on the one hand, we will have highly standardised
>> articles based entirely on Wikidata; and on the other hand, we will have
>> bespoke, hand-crafted content, assembled sentence by sentence. Today we
>> will discuss the first type, and we will discuss the second type in an
>> upcoming newsletter.
>>
>> Articles of the first type can be created very quickly and will likely
>> constitute the vast majority of articles for a long time to come. For that
>> we can use models, *i.e.* texts with variables. Put differently, a
>> text with gaps which get filled from a different source such as a list,
>> along the lines of the Mad Libs <https://en.wikipedia.org/wiki/Mad_Libs>
>> game.
>> A model can be created once for a specific type of item and then used for
>> every single item of this type that has enough data in Wikidata. The
>> resulting articles are similar to many bot-created articles that already
>> exist in various Wikipedias.
>>
>> For example, in many languages, bots were used to create or maintain the
>> articles about years (such as the articles about 1313
>> <https://www.wikidata.org/wiki/Q5735>, 1428
>> <https://www.wikidata.org/wiki/Q6315>, or 1697
>> <https://www.wikidata.org/wiki/Q7702>, each of which is available in
>> more than a hundred languages). In English Wikipedia, many articles for US
>> cities were created by a bot
>> <https://en.wikipedia.org/wiki/List_of_Wikipedia_controversies#2002> based
>> on the US census, and later updated after the 2010 census. Lsjbot
>> <https://en.wikipedia.org/wiki/Lsjbot> by Sverker Johansson is a
>> well-known example of a bot that has created millions of articles about
>> locations or species across a few languages such as Swedish, Waray Waray,
>> or Cebuano. Comparable activities, although not as prolific, have been
>> going on in quite a few other languages.
>>
>> How do these approaches work? Assume you have a dataset such as the
>> following list of countries:
>> Country      Continent        Capital    Population
>> Jordan       Asia             Amman      10428241
>> Nicaragua    Central America  Managua    5142098
>> Kyrgyzstan   Asia             Bishkek    6201500
>> Laos         Asia             Vientiane  6858160
>> Lebanon      Asia             Beirut     6100075
>>
>> Now we can create a model that can generate a complete text from this
>> data, such as
>>
>> “*<Country>* is a country in *<Continent>* with a population of
>> *<Population>*. The capital of *<Country>* is *<Capital>*.”
>>
>> With this text and the above dataset, we would have created the
>> following five proto-articles (references not shown for simplicity):
>>
>> *Jordan* is a country in Asia with a population of 10,428,241. The
>> capital of Jordan is Amman.
>>
>> *Nicaragua* is a country in Central America with a population of
>> 5,142,098. The capital of Nicaragua is Managua.
>>
>> *Kyrgyzstan* is a country in Asia with a population of 6,201,500. The
>> capital of Kyrgyzstan is Bishkek.
>>
>> *Laos* is a country in Asia with a population of 6,858,160. The capital
>> of Laos is Vientiane.
>>
>> *Lebanon* is a country in Asia with a population of 6,100,075. The
>> capital of Lebanon is Beirut.
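>>
>> In code, this mail-merge step is tiny. Here is a minimal sketch in
>> Python using the dataset and model above; it illustrates the general
>> technique only, not the planned Wikifunctions implementation:
>>
>>     # The five rows of the dataset above, as
>>     # (country, continent, capital, population) tuples.
>>     countries = [
>>         ("Jordan", "Asia", "Amman", 10428241),
>>         ("Nicaragua", "Central America", "Managua", 5142098),
>>         ("Kyrgyzstan", "Asia", "Bishkek", 6201500),
>>         ("Laos", "Asia", "Vientiane", 6858160),
>>         ("Lebanon", "Asia", "Beirut", 6100075),
>>     ]
>>
>>     # The model: a text with gaps. "{population:,}" also inserts the
>>     # thousands separators shown in the proto-articles.
>>     model = ("{country} is a country in {continent} with a population "
>>              "of {population:,}. The capital of {country} is {capital}.")
>>
>>     for country, continent, capital, population in countries:
>>         print(model.format(country=country, continent=continent,
>>                            capital=capital, population=population))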
>>
>> Classical textbooks on that topic such as *“Building natural language
>> generation systems”
>> <https://en.wikipedia.org/wiki/Special:BookSources/978-0-521-02451-8>*
>> call this method *“mail merge”* (even though it is used for more than
>> mail).
>> A model is combined with a dataset, often from a spreadsheet or a database.
>> This has been used for decades to create bulk mailings
>> <https://en.wikipedia.org/wiki/Mail_merge> and other bulk content, and
>> is a form of mass customisation
>> <https://en.wikipedia.org/wiki/Mass_customization>. The methods have
>> become increasingly complex over time and are able to answer more
>> questions: How to deal with missing or optional information? How to adapt
>> part of the text to the data, *e.g.* use plurals or grammatical gender
>> or noun classes where appropriate, *etc.*? The bots that were mentioned
>> above, which created millions of articles in various languages on
>> Wikipedia, have mostly worked along these lines.
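>>
>> To make those questions concrete, here is a small sketch of how a model
>> might skip missing information and adapt grammatical number to the
>> data; the function and its field names are invented for illustration:
>>
>>     def describe(name, continent, population=None, languages=None):
>>         """Render a country description from whatever data exists."""
>>         parts = [f"{name} is a country in {continent}."]
>>         if population is not None:  # optional datum: skip if absent
>>             parts.append(f"It has a population of {population:,}.")
>>         if languages:  # adapt singular/plural to the data
>>             if len(languages) == 1:
>>                 parts.append(f"Its official language is {languages[0]}.")
>>             else:
>>                 parts.append("Its official languages are "
>>                              + ", ".join(languages) + ".")
>>         return " ".join(parts)
>>
>>     print(describe("Laos", "Asia", population=6858160))
>>     print(describe("Lebanon", "Asia", languages=["Arabic"]))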
>>
>> For a great example of how far the model approach can be pushed,
>> consider Magnus Manske’s Reasonator
>> <https://meta.wikimedia.org/wiki/Reasonator>, which, based on the data
>> in Wikidata, creates the following automatic description for Douglas
>> Adams <https://reasonator.toolforge.org/?q=Q42>:
>>
>> *Douglas Adams* was a British playwright, screenwriter, novelist,
>> children's writer, science fiction writer, comedian, and writer. He was
>> born on March 11, 1952 in Cambridge to Christopher Douglas Adams and Janet
>> Adams. He studied at St John's College from 1971 until 1974 and Brentwood
>> School from 1959 until 1970. His field of work included science fiction,
>> comedy, satire, and science fiction. He was a member of Groucho Club and
>> Footlights. He worked for The Digital Village from 1996 and for BBC. He
>> married Jane Belson on November 25, 1991 (married until on May 11, 2001 ),
>> Jane Belson on November 25, 1991 (married until on May 11, 2001 ), and Jane
>> Belson on November 25, 1991 (married until on May 11, 2001 ). His children
>> include Polly Adams, Polly Adams, and Polly Adams. He died of myocardial
>> infarction on May 11, 2001 in Santa Barbara. He was buried at Highgate
>> Cemetery.
>>
>> If we were to say that this is merely better than nothing, I think we
>> would undersell the achievement of Reasonator. The above text, together
>> with the appealing display of the structured data in Reasonator, leads to
>> more comprehensive access to knowledge than many of the individual language
>> Wikipedias provide for Douglas Adams. For comparison, check out the
>> articles in Azerbaijani <https://az.wikipedia.org/wiki/Duqlas_Adams>,
>> Urdu
>> <https://ur.wikipedia.org/wiki/%DA%88%DA%AF%D9%84%D8%B3_%D8%A7%DB%8C%DA%88%D9%85%D8%B3>,
>> Malayalam
>> <https://ml.wikipedia.org/wiki/%E0%B4%A1%E0%B4%97%E0%B5%8D%E0%B4%B2%E0%B4%B8%E0%B5%8D%E0%B4%86_%E0%B4%A1%E0%B4%82%E0%B4%B8%E0%B5%8D>,
>> Korean
>> <https://ko.wikipedia.org/wiki/%EB%8D%94%EA%B8%80%EB%9F%AC%EC%8A%A4_%EC%95%A0%EB%8D%A4%EC%8A%A4>,
>> or Danish <https://da.wikipedia.org/wiki/Douglas_Adams>. At the same
>> time, it shows errors that most contributors wouldn’t know how to fix (such
>> as the repetition of the names of the children, or the spaces inside the
>> brackets, *etc.*).
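>>
>> Once the model is editable, some of these errors become easy to fix.
>> For instance, the repeated children could be collapsed before
>> rendering; here is a minimal sketch of such a de-duplication step (not
>> Reasonator's actual code):
>>
>>     def render_list(values):
>>         """Drop duplicates (keeping order), then join as an English
>>         enumeration."""
>>         unique = list(dict.fromkeys(values))
>>         if len(unique) == 1:
>>             return unique[0]
>>         return ", ".join(unique[:-1]) + ", and " + unique[-1]
>>
>>     print(render_list(["Polly Adams", "Polly Adams", "Polly Adams"]))
>>     # -> Polly Adams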
>>
>> The ArticlePlaceholder
>> <https://www.mediawiki.org/wiki/Extension:ArticlePlaceholder> project
>> has partially fulfilled the role of filling content gaps, but the
>> developers have intentionally shied away from the results looking too much
>> like an article. They display structured data from Wikidata within the
>> context of a language Wikipedia. For example, here is the generated page
>> about *triceratops* in Haitian Creole
>> <https://ht.wikipedia.org/wiki/Espesyal:AboutTopic/Q14384>.
>>
>> One large disadvantage of using bots to create articles in Wikipedia has
>> been that this content was mostly controlled by a very small subset of the
>> community — often a single person. Many of the bots and datasets have not
>> been open sourced in a way that someone else could easily come in, make a
>> change, and re-run the bot. (Reasonator avoids this issue, because the text
>> is generated dynamically and is not incorporated into the actual Wikipedia
>> article.)
>>
>> With Wikifunctions and Wikidata, we will be able to give control over
>> all these steps to the wider community. Both the models and the data will
>> be edited on wiki, with all the usual advantages of having a wiki: there is
>> a clear history, everyone can edit through the Web, people can discuss,
>> *etc.* The data used to populate the models will be maintained in
>> Wikidata, and the models themselves in Wikifunctions. This will allow us to
>> collaborate on the texts, unleash the creativity of the community, spot and
>> correct errors and edge cases together, and slowly extend the types of
>> items and the coverage per type.
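>>
>> As a rough sketch of that division of labour (it uses Wikidata's public
>> entity-data endpoint and real property IDs, but is otherwise invented
>> code: it naively takes the first statement of each property and assumes
>> Q810 is the item for Jordan):
>>
>>     import requests
>>
>>     API = "https://www.wikidata.org/wiki/Special:EntityData/{}.json"
>>
>>     def entity(qid):
>>         return requests.get(API.format(qid)).json()["entities"][qid]
>>
>>     def first_value(ent, prop):
>>         # Real code would respect statement ranks, not just take [0].
>>         return ent["claims"][prop][0]["mainsnak"]["datavalue"]["value"]
>>
>>     def label(qid):
>>         return entity(qid)["labels"]["en"]["value"]
>>
>>     jordan = entity("Q810")                               # Jordan
>>     continent = label(first_value(jordan, "P30")["id"])   # continent
>>     capital = label(first_value(jordan, "P36")["id"])     # capital
>>     population = int(first_value(jordan, "P1082")["amount"])
>>
>>     print(f"Jordan is a country in {continent} with a population of "
>>           f"{population:,}. The capital of Jordan is {capital}.")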
>>
>> In a follow-up essay, we will discuss a different approach to creating
>> abstract content, where the content is not the result of a model based on
>> the type of the described item, but rather a manually constructed article,
>> built up sentence by sentence.
>>
>> *Development update from the week of May 27:*
>>
>> - The team had a session at the Hackathon, which was well attended
>>   (about 30 people). Thanks to everyone for being there and for your
>>   questions and comments!
>> - We also had follow-up meetings with User:Mahir256 to improve
>>   alignment on the NLG stream.
>> - Below is the brief weekly summary highlighting the status of each
>>   workstream:
>>   - Performance:
>>     - Observability document drafted.
>>     - Updated Helm charts for getting function-* services in staging.
>>     - Completed the performance metrics design and shared it for review.
>>   - NLG:
>>     - Scoped out necessary changes to Wikifunctions post-launch.
>>   - Metadata:
>>     - Started recording and passing up some function-evaluator timing
>>       metrics to the orchestrator.
>>   - Experience:
>>     - The WikiLambda (PHP) layer has been migrated to the new format
>>       of typed lists.
>>     - Improved the mobile experience of the function view page.
>>     - Transitioned the Tabs component to use Codex's, thanks to the
>>       Design Systems Team.
>>   - Design: Carried out end-to-end user flow testing in Bangla.
>>
>> *(Apologies for this update being late. We plan to send out another
>> update this week)*
>> _______________________________________________
>> Abstract-Wikipedia mailing list --
>> abstract-wikipedia(a)lists.wikimedia.org
>> List information:
>>
https://lists.wikimedia.org/postorius/lists/abstract-wikipedia.lists.wikime…
>>
>