This breaking change is relevant for anyone who consumes Wikidata RDF data
through Special:EntityData (rather than the dumps) without using the “dump”
flavor.
When an Item references other entities (e.g. the statement P31:Q5), the
non-dump RDF output of that Item (i.e. without ?flavor=dump) currently
includes the labels and descriptions of the referenced entities (e.g. P31
and Q5) in all
languages. That bloats the output drastically and causes performance
issues. See Special:EntityData/Q1337.rdf
<https://www.wikidata.org/wiki/Special:EntityData/Q1337.rdf> as an example.
We will change this so that for referenced entities, only labels and
descriptions in the request language (set e.g. via ?uselang=) and its
fallback languages are included in the response. For the main entity being
requested, labels, descriptions and aliases are still included in all
languages available, of course.
If you don’t actually need this “stub” data of referenced entities at all,
and are only interested in data about the main entity being requested, we
encourage you to use the “dump” flavor instead (include flavor=dump in the
URL parameters). In that case, this change will not affect you at all,
since the dump flavor includes no stub data, regardless of language.
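To make the two request styles concrete, here is a rough sketch using curl
(the URL and parameters are exactly those mentioned above; the precise set of
fallback languages in the output depends on the wiki's language fallback
chains):

# Non-dump output: after the change, referenced entities (e.g. P31, Q5)
# carry labels/descriptions only in the request language and its fallbacks.
curl 'https://www.wikidata.org/wiki/Special:EntityData/Q1337.rdf?uselang=de'

# Dump flavor: no stub data about referenced entities at all,
# so this output is unaffected by the change.
curl 'https://www.wikidata.org/wiki/Special:EntityData/Q1337.rdf?flavor=dump'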
This change is currently available for testing at test.wikidata.org. It
will be deployed on Wikidata on August 23rd. You are welcome to give us
general feedback by leaving a comment in this ticket
<https://phabricator.wikimedia.org/T285795>.
If you have any questions please do not hesitate to ask.
Cheers,
--
Mohammed Sadat
*Community Communications Manager for Wikidata/Wikibase*
Wikimedia Deutschland e. V. | Tempelhofer Ufer 23-24 | 10963 Berlin
Phone: +49 (0)30 219 158 26-0
https://wikimedia.de
Keep up to date! Current news and exciting stories about Wikimedia,
Wikipedia and Free Knowledge in our newsletter (in German): Subscribe now
<https://www.wikimedia.de/newsletter/>.
Imagine a world in which every single human being can freely share in the
sum of all knowledge. Help us to achieve our vision!
https://spenden.wikimedia.de
Wikimedia Deutschland – Gesellschaft zur Förderung Freien Wissens e. V.
Registered in the register of associations at the Amtsgericht
Berlin-Charlottenburg under number 23855 B. Recognised as a non-profit by the
Finanzamt für Körperschaften I Berlin, tax number 27/029/42207.
Hi everyone,
We have our next Wikibase Live Session on Thursday, August 26th at 16:00 UTC
(18:00 Berlin).
What are you working on around Wikibase? You're welcome to come and share
with the Wikibase community.
*Details about how to participate are below:*
Time: 16:00 UTC (18:00 Berlin), 1 hour, Thursday 26th August 2021
Google Meet: https://meet.google.com/nky-nwdx-tuf
Join by phone:
https://meet.google.com/tel/nky-nwdx-tuf?pin=4267848269474&hs=1
Notes: https://etherpad.wikimedia.org/p/WBUG_2021.08.26
If you have any questions, please do not hesitate to ask.
Talk to you soon!
--
Mohammed Sadat
*Community Communications Manager for Wikidata/Wikibase*
Hey all,
Henry (in CC) and I have been looking into the possibility of importing
a dataset on the order of 10-20 million items and maybe around 50 million
claims into Wikibase. Wikibase would be perfect for our needs, but we have
been struggling quite a lot to load the data.
We are using the Docker version. Initial attempts on a small sample of
10-20 thousand items were not promising, with the load taking a very
long time. We found that RaiseWikibase helped to considerably speed up
the initial load:
https://github.com/UB-Mannheim/RaiseWikibase
but on a small sample of 10-20 thousand items, the secondary indexing
process was taking several hours. This is the building_indexing()
process here (which just calls maintenance scripts):
https://github.com/UB-Mannheim/RaiseWikibase/blob/main/RaiseWikibase/raiser…
This seems to be necessary for labels to appear correctly in the wiki,
and for search to work.
Rather than call that method, we have been trying to invoke the
maintenance scripts directly and play with arguments that might help,
such as batch size. However, some of the scripts still take a long time,
even considering the small size of what we are loading. For example:
docker exec wikibase-docker_wikibase_1 bash -c "php \
  extensions/Wikibase/repo/maintenance/rebuildItemTerms.php \
  --sleep 0.1 --batch-size 10000"
This takes around 2 hours on the small sample; scaling by a factor of a
thousand for the full dataset (2 hours × 1,000 ≈ 2,000 hours) gives an
estimate of roughly 83 days.
Investigating the MySQL database, the script seems to be populating four
tables: wbt_item_terms, wbt_term_in_lang, wbt_text, and wbt_text_in_lang, but
these are on the order of 20,000 tuples when finished, so it is surprising
that the process takes so long. My guess is that the PHP code is looking up
pages per item, generating thousands of random accesses on disk, when it
would seem better to just stream tuples/pages contiguously from the
table/disk.
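For anyone who wants to reproduce that check, something along these lines
should work (a sketch only: the container, user and database names are
assumptions based on the default wikibase-docker compose file and may differ
in your setup):

# Row counts of the four term tables; you will be prompted for the DB password.
docker exec -it wikibase-docker_mysql_1 mysql -u root -p my_wiki -e "
  SELECT 'wbt_item_terms' AS tbl, COUNT(*) AS n FROM wbt_item_terms
  UNION ALL SELECT 'wbt_term_in_lang', COUNT(*) FROM wbt_term_in_lang
  UNION ALL SELECT 'wbt_text', COUNT(*) FROM wbt_text
  UNION ALL SELECT 'wbt_text_in_lang', COUNT(*) FROM wbt_text_in_lang;"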
Later on, the CirrusSearch indexing is also taking a long time for the
small sample, generating jobs for batches that take a long time to
clear. In previous experience, Elasticsearch will happily eat millions
of documents in an hour. We are still looking at how batch sizes might
help, but it feels like it is taking much longer than it should.
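One way to watch that backlog is via MediaWiki's stock maintenance scripts,
for example (container name as in the command above):

# Show the pending job backlog, broken down by job type
docker exec wikibase-docker_wikibase_1 php maintenance/showJobs.php --group

# Drain the queue with a dedicated runner (several can be run in parallel)
docker exec wikibase-docker_wikibase_1 php maintenance/runJobs.php --maxjobs 10000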
Overall, we were wondering whether we are approaching this bulk import in the
right way. It seems that the PHP scripts are not optimised for
performance/scale? Does anyone have experience, tips or pointers on
converting and loading large-ish scale legacy data into Wikibase? Is there no
complete solution (envisaged) for this right now?
Best,
Aidan
Hi everyone,
We have our next Wikibase Live Session on Thursday, July 29th at 16:00 UTC
(18:00 Berlin) (convert local time <https://zonestamp.toolforge.org/1627567205>).
This month we'll have a presentation by Luca Mauri explaining the basics of
setting up a Wikibase instance from scratch. We will also be discussing the
outcome of the Usergroup affiliate 2021 contact election.
*Details about how to participate are below:*
Time: 16:00 UTC (18:00 Berlin), 1 hour, Thursday 29th July 2021
Google Meet: https://meet.google.com/nky-nwdx-tuf
Join by phone:
https://meet.google.com/tel/nky-nwdx-tuf?pin=4267848269474&hs=1
Notes: https://etherpad.wikimedia.org/p/WBUG_2021.07.29
If you have any questions, please do not hesitate to ask.
Talk to you soon!
--
Mohammed Sadat
*Community Communications Manager for Wikidata/Wikibase*
Hi,
we also see performance (at ingestion time) as one of the biggest bottlenecks
of the current Wikibase infrastructure, so any progress on this is very
welcome from our side.
I'm far from being an expert on this, so what follows is more of a feeling.
My guess is that one large overhead is the fact that we can edit only one
entity at a time. This means that for each entity we have: one HTTP request,
one parsing process, at least one call to the database, one job for
Elasticsearch, and one job for updating the cache. My feeling is that if the
API could support bulk imports over multiple entities (not saying this is
easy), one could see large improvements.
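To make that per-entity overhead concrete, creating a single item via the
standard wbeditentity module looks roughly like this (a sketch: the endpoint
host and the CSRF token are placeholders, and error handling is omitted):

# One POST per entity; each call goes through the full parse/store/
# secondary-data/job pipeline, so a 10-million-item import means
# 10 million of these round trips.
curl -X POST 'https://your-wikibase.example/w/api.php' \
  --data-urlencode 'action=wbeditentity' \
  --data-urlencode 'new=item' \
  --data-urlencode 'format=json' \
  --data-urlencode 'token=YOUR_CSRF_TOKEN' \
  --data-urlencode 'data={"labels":{"en":{"language":"en","value":"Example item"}}}'

A bulk variant of this module (or a dedicated import endpoint) is exactly what
the Phabricator ticket quoted further down in this digest (T287164, "Improve
bulk import via API") is about.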
BTW: Great work by everyone
Salut
D063520
> On 23. Jul 2021, at 14:01, wikibaseug-request(a)lists.wikimedia.org wrote:
>
>
> Today's Topics:
>
> 1. Re: Experiences/doubts regarding bulk imports into Wikibase
> (Aidan Hogan)
> 2. Re: Experiences/doubts regarding bulk imports into Wikibase
> (Renat Shigapov)
>
>
> ----------------------------------------------------------------------
>
> Message: 1
> Date: Thu, 22 Jul 2021 22:07:58 -0400
> From: Aidan Hogan <aidhog(a)gmail.com>
> Subject: [Wikibase] Re: Experiences/doubts regarding bulk imports into
> Wikibase
> To: wikibaseug(a)lists.wikimedia.org
>
> Indeed, yes! Seeks on a traditional disk are in the order of 10ms, while
> on a solid state disk they rather tend to be around 0.1 ms, so one could
> expect a 100x speed-up when doing a lot of random accesses. One could
> also memory-map the database files, perhaps for even lower latency,
> assuming the data fit in RAM.
>
> So an SSD should help a lot within the current framework, but it should
> not be necessary in general. Plus an ETL load framework should still be
> much more efficient for bulk loads than an SSD with the current process.
> SSDs have lower latency, but they still have latency. One million seeks
> on a hard disk takes around 3 hours, one million seeks on an SSD takes
> around 2 minutes, reading one million records (around 100MB)
> sequentially on a standard hard disk should take around one second.
>
> Still I think that trying with an SSD for now is a good idea!
>
> Best,
> Aidan
>
> On 2021-07-22 19:02, Laurence Parry wrote:
>> I wondered earlier if you were trying to use hard disks when talking
>> about random access time. I'm interested because I may be deploying
>> Wikibase on HDD in the near future. But I'm only considering that
>> because I know size and usage will be relatively low; I won't be doing
>> bulk loads.
>>
>> The reality is, Wikibase is not designed for HDD, and it doesn't work
>> well on it. It may work slightly better with hardware RAID, but
>> realistically any high-speed Wikibase implementation needs to be on SSD.
>> Wikidata uses several, just for WDQS:
>> https://mediawiki.org/wiki/Wikidata_Query_Service/Implementation#Hardware and
>> https://phabricator.wikimedia.org/T221632 which mentions they are used
>> in RAID 0. I think the MediaWiki side of things may be even larger,
>> because more must be stored.
>>
>> While testing optimal load strategy, Adam notes "SSDs should obviously
>> be used for storage."
>> https://addshore.com/2021/02/testing-wdqs-blazegraph-data-load-performance/
>>
>> This is in part a limitation of Blazegraph, for which fast disks are the
>> primary requirement, after which CPU and RAM matter (RAM may matter
>> more if you have to use HDD, and for reads; but for loading it has to
>> save the results):
>> https://github.com/blazegraph/database/wiki/Hardware_Configuration
>> https://sourceforge.net/p/bigdata/discussion/676946/thread/5b4acb02/#2810
>> https://github.com/blazegraph/database/issues/94#issuecomment-399141685
>>
>> If they are not available to you, it might be worth using a cloud system
>> to perform the load and take a compressed dump off of it afterwards.
>> They are not necessarily cheap, but you might only need it for a
>> relatively brief time. (Oracle is even giving away ARM VMs and SSD
>> storage for free now. Regrettably, Wikibase does not yet have an arm64
>> distribution, but I'm not aware of a specific reason why it could not
>> eventually.)
>>
>> It might also be worth looking into whether Blazegraph (and MySQL) has
>> any degraded reliability nodes which you can enable during the loading
>> process. From my own experience I know database tuning can be key when
>> attempting to use HDDs as a platform. Some ideas for MySQL:
>> https://mariadb.com/kb/en/innodb-system-variables/#innodb_flush_log_at_trx_…
>> https://mariadb.com/kb/en/replication-and-binary-log-system-variables/#binl…
>>
>> You can also try 'nobarrier' as a filesystem mount option to disable
>> write barriers, and 'data=writeback' for ext4. Be aware this violates ACID
>> expectations and should only be used for test purposes or when you are
>> willing to rebuild the entire database/FS if something goes wrong.
>> Similarly, RAID 0 might be useful if you bear in mind losing one disk
>> means losing all the data.
>>
>> Best regards,
>> --
>> Laurence "GreenReaper" Parry
>> https://GreenReaper.co.uk
>> ------------------------------------------------------------------------
>> *From:* Aidan Hogan <aidhog(a)gmail.com>
>> *Sent:* Thursday, July 22, 2021 10:50:06 PM
>> *To:* wikibaseug(a)lists.wikimedia.org <wikibaseug(a)lists.wikimedia.org>
>> *Subject:* [Wikibase] Re: Experiences/doubts regarding bulk imports into
>> Wikibase
>> Thanks Adam for the detailed blog post!
>>
>> From your mail, I understand that the current Wikibase framework is
>> based on continuous updates, and in that sense it can process a high
>> rate of manual edits without problem. I think though that for high
>> performance bulk updates, the best option would be an "ETL" style
>> framework that loads data directly into the underlying databases using
>> custom tools. What seems to be hurting performance in the bulk load
>> scenario is:
>>
>> 1) The creation of massive amounts of requests/jobs all at once.
>> 2) Random disk accesses that require about 10ms per go on a conventional
>> disk if the data are not cached or read sequentially.
>>
>> These are not a problem per se for the typical running and maintenance
>> of a Wikibase instance, but rather only occur when one tries to import
>> millions of items at once. Given that we can assume that an "admin" is
>> importing the data, we can also bypass a lot of the typical processes
>> that are run to verify edit permissions, rate limits, etc.
>>
>> Overall, with the goal of importing a legacy dataset of around 10
>> million items, with maybe 100 million "values", in one day, on
>> conventional hardware, these two problems would need to be resolved. The
>> underlying databases should not have a problem processing data at this
>> scale/rate using bulk update/transaction methods, but they will have a
>> problem if each item is updated as a separate transaction; even with
>> batching, it can still lead to hundreds of thousands of transactions
>> that would choke up any persistent database running on a traditional
>> hard-disk. So I think that the most performant solution would be to
>> bypass Wikibase, and to prepare the data (e.g., a user-provided JSON
>> dump) to be loaded directly into the underlying databases that Wikibase
>> can later query. The issue with this approach is that it is not so
>> trivial to understand what data need to be loaded where (this requires a
>> deep understanding of Wikibase), and that such a bulk loader would need
>> to be "synchronised" with changes to the main Wikibase software, so it
>> would require maintenance over the years.
>>
>> Best,
>> Aidan
>>
>> On 2021-07-19 17:50, Addshore wrote:
>>> This thread and my earlier email prompted me to write a deep dive blog
>>> post on exactly what happens when you send a create item request to
>>> wbeditentity.
>>>
>>> You can find this newly published (and thus probably with some typos) below
>>> https://addshore.com/2021/07/what-happens-in-wikibase-when-you-make-a-new-i…
>>>
>>> Hopefully this can prompt some more questions, thoughts, changes etc for
>>> the bulk import case both for RAISE and for the API etc.
>>>
>>> Adam / Addshore
>>> Wikidata Wikibase Tech Lead
>>>
>>> On Mon, 19 Jul 2021 at 17:45, Addshore <addshorewiki(a)gmail.com> wrote:
>>>
>>> Hi all,
>>>
>>> Some general thoughts on this thread.
>>>
>>> > Overall, we were wondering if we are approaching this bulk import
>>> in the
>>> > right way? It seems that the PHP scripts are not optimised for
>>> > performance/scale?
>>>
>>> In general I would say that the "right" thing to do is to speed up
>>> the app, rather than try to write directly to the SQL db.
>>> Lots / all of the work currently being done in these maintenance
>>> scripts can be / is done by a deferred job queue already within
>>> Wikibase (if configured)
>>> If this is done, then having one or more separate job runners would
>>> be beneficial, see https://www.mediawiki.org/wiki/Manual:Job_queue
>>> The other key things could be a fast primary SQL database, and fast
>>> host (or hosts) serving MediaWiki/Wikibase.
>>> Wikidata.org runs on a very large shared cluster (shared with all
>>> other Wikimedia sites) with multiple web, db, cache, and job hosts etc.
>>>
>>> If you are running these maintenance scripts as part of an import
>>> where performance of the wikibase does not matter to users then
>>> using sleep 0 will provide a performance increase.
>>>
>>> Something else that might make all of this slightly simpler would be
>>> the ability to have this secondary-index work go straight to the
>>> job queue, rather than run as part of the maintenance scripts
>>> themselves?
>>> (I see that runJobs is called as the last step at
>>> https://github.com/UB-Mannheim/RaiseWikibase/blob/main/RaiseWikibase/raiser…
>>>
>>> Currently it looks like in the building_indexing method in Raise all
>>> items will be loaded from the DB 2 times, and all properties will be
>>> loaded 3 times.
>>> This pattern for large imports is likely to get slower and slower,
>>> and also not take advantage of any caching etc that exists.
>>> The regular API sequence for example would 1) store data in the DB
>>> 2) write to any quick secondary stores 3) add the entity to a shared
>>> cache and 4) schedule jobs to populate the secondary indexes.
>>> These jobs would run fairly soon after the initial write, leading to
>>> better performance.
>>>
>>> > > It seems that the PHP scripts are not optimised for performance/scale?
>>> > It seems so.
>>>
>>> Scale yes (these scripts ran across the whole of Wikidata.org)
>>> Performance not so much, these scripts were primarily designed to be
>>> a long running task between Wikibase updates, not as part of an
>>> import process.
>>>
>>> Regarding rebuildItemTerms
>>> > Takes around 2 hours on the small sample (which we could multiply
>>> by a
>>> > thousand for the full dataset, i.e., 83 days as an estimate).
>>> Indeed this script will only go so fast, it processes all entities
>>> in series, so you will have the bottleneck of a single process
>>> iterating through all entities.
>>> The job queue comparison, again, is that jobs get queued and executed by
>>> any number of job queue runners that you wish.
>>> The next bottleneck would then be the speed of your SQL database.
>>>
>>> There is probably lots more to discuss in this thread but I found
>>> navigating it quite hard.
>>> Hopefully the above will prompt some more discussion.
>>> It would be great to be able to chat somewhere more realtime (not an
>>> email thread) on this topic too.
>>>
>>> Adam / Addshore
>>> Wikidata Wikibase Tech Lead
>>>
>>> On Sat, 17 Jul 2021 at 22:54, Aidan Hogan <aidhog(a)gmail.com> wrote:
>>>
>>> Hi Renat,
>>>
>>> On 2021-07-16 15:54, Renat Shigapov wrote:
>>> > Hi Aidan,
>>> >
>>> > I am on holidays for a few days with limited access to the
>>> internet, but my quick reply to "Do you have further plans for
>>> extending RaiseWikibase?" is yes. I'll try to handle with those
>>> secondary tables.
>>>
>>> Go enjoy your holidays. :)
>>>
>>> I can quickly respond to one point:
>>>
>>> > Regarding (2): can that be done in MariaDB?
>>>
>>> Since 10.2.3, MariaDB has support for JSON:
>>>
>>> https://mariadb.com/kb/en/json-functions/
>>>
>>> I have no experience with handling JSON in a relational
>>> database, so not
>>> sure overall whether it's a good solution, but I think it should be
>>> considerably more performant so long as the pages are iterated
>>> over,
>>> rather than each one being searched in the index.
>>>
>>> Best,
>>> Aidan
>>>
>>> > If you can make (3) done, the whole Wikibase community would
>>> be very thankful.
>>> >
>>> > Regarding (4): yeah, too radical and maintenance problems.
>>> >
>>> > Sorry for a short answer, I'll add on my return.
>>> >
>>> > Kind regards,
>>> > Renat
>
> ------------------------------
>
> Message: 2
> Date: Fri, 23 Jul 2021 08:07:17 -0000
> From: "Renat Shigapov" <renat.shigapov(a)bib.uni-mannheim.de>
> Subject: [Wikibase] Re: Experiences/doubts regarding bulk imports into
> Wikibase
> To: wikibaseug(a)lists.wikimedia.org
>
> Dear all,
>
> We have the ticket "Improve bulk import via API" at phabricator now: https://phabricator.wikimedia.org/T287164. It's aimed to unite the related tickets and to discuss further development around bulk import in Wikibase. Your contributions are very welcome.
>
> Kind regards,
> Renat
>
(Apologies for cross-posting)
Hello,
To help us better understand your experience installing or updating the
Wikibase software, please answer a few questions so we can identify areas of
improvement for users.
Below is a survey in two parts, consisting of 4 and 5 questions respectively;
each part takes about 5 minutes to complete.
If you would like to participate, please use these links (LimeSurvey):
- Wikibase Installation Survey
  <https://lime.wikimedia.de/index.php/925279?lang=en>
- Wikibase Software Updating Survey
  <https://lime.wikimedia.de/index.php/797463?lang=en>
You can also send me a private email with your answers and I’ll incorporate
them in the survey results.
We kindly request your participation prior to Friday, July 30th at 23:59
UTC.
If you have any questions, please do not hesitate to ask.
-----
Wikibase Installation survey questions
- How did you install Wikibase?
- Approximately when did you install Wikibase?
- How was the installation of Wikibase?
- Do you have any additional comments or feedback on the installation process?

Wikibase Software Updating survey questions

- How did you install your Wikibase instance?
- Approximately when did you last update your Wikibase software?
- What version of MediaWiki is your Wikibase instance running?
- How was the process of updating Wikibase?
- Do you have any additional comments or feedback on the update process?
-----
Mohammed Sadat
*Community Communications Manager for Wikidata/Wikibase*
Dear list and GLAM-Wiki professionals and enthusiasts in particular,
The Luxembourg Competency Network on Digital Cultural Heritage is
developing a Shared Authority File, to combine the knowledge from national
heritage institutions and to increase the impact of the member
institutions' digitised collections.
The Shared Authority File uses Wikibase as its main data store and internal
user interface, and the project has created an extension that might be of
relevance to the broader GLAM sector: a datatype for Extended Date Time
Format (EDTF) compliant statements.
Wikibase contains a datetime representation based on ISO 8601:2004
<https://en.wikipedia.org/wiki/ISO_8601>. The project extended the Wikibase
software with an Extended Date Time Format extension, which supports ISO
8601:2019 <https://www.iso.org/obp/ui/#iso:std:iso:8601:-1:ed-1:v1:en> and its
extensions in a new datatype that can then be used within a Wikibase
instance.
This resulted in two open-source products: a library
<https://github.com/ProfessionalWiki/EDTF> for the validation and
humanisation of EDTF dates in PHP, and an extension
<https://github.com/ProfessionalWiki/WikibaseEdtf> for Wikibase that uses
this library. Both are available as open-source software for all to use and
improve. The humanisation is translated into various languages via
translatewiki.net <https://translatewiki.net/wiki/FAQ>.
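For readers who have not met EDTF before, a few illustrative strings of the
kind the standard adds on top of plain ISO 8601 dates (examples follow the
EDTF specification itself, not this extension's documentation):

  1984?         year 1984, uncertain
  2004-06~      June 2004, approximate
  1984-06-XX    June 1984, day unspecified
  1964/2008     interval from 1964 to 2008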
With this email I'm asking for your support for the inclusion of this work in
the fall release of the Wikibase Docker bundle. You can voice your support
through this ticket: Include the EDTF Datatype extension in the Fall 2021
Wikibase Docker release <https://phabricator.wikimedia.org/T280656>
This would allow other GLAM initiatives using Wikibase to easily make use
of the EDTF standard within their own Wikibase deployments.
Feedback and contributions to the repositories for this extension as a
stand-alone product are also welcome!
Thanks for your attention and best wishes,
Maarten Brinkerink
Hello,
The next Wikidata+Wikibase office hours will take place on Wednesday 28th
July 2021 at 16:00 UTC (18:00 Berlin time) in the Wikidata Telegram group
<https://t.me/joinchat/IeCRo0j5Uag1qR4Tk8Ftsg>.
*The Wikidata and Wikibase office hours are online events where the
development team present what we have been working on over the past
quarter, and the community is welcome to ask questions and discuss
important issues related to the development of Wikidata and Wikibase.*
This month we will also be having a guest presentation about Toolhub
<https://meta.wikimedia.org/wiki/Toolhub> by Srishti Sethi from the
Wikimedia Foundation.
Looking forward to seeing you all.
--
Mohammed Sadat
*Community Communications Manager for Wikidata/Wikibase*
Dear all,
Apologies! The original email I sent was meant for the Wikibase Stakeholder
Group, and not the User Group mailing list!
In any case, I assume many of you are members of both lists. If this email
doesn't make sense to you, please ignore. If you are curious and want to
learn more, feel free to reach out to me.
Kind regards,
Lozana
On Thu, Jul 8, 2021 at 5:17 PM Goeke-Smith, Jeffrey <GOEKESMI(a)msu.edu>
wrote:
> Please enjoy a PDF of my slides.
>
>
> ________________________________________
> From: Lozana Rossenova <lozana.rossenova(a)rhizome.org>
> Sent: Thursday, July 8, 2021 11:14 AM
> To: Goeke-Smith, Jeffrey
> Cc: Wikibase Community User Group
> Subject: Follow up from meeting 2021-07-08
>
> Dear all,
>
> Thanks to everyone who attended the meeting – great presentation and
> discussion!
> For those who missed it, check out the notes in the agenda doc:
> https://notepad.rhizome.org/wbsg-2021-07-08?view
>
> @Jeff – would you be willing to share your slides with us – as a PDF? You
> can respond to this email and make sure to CC the group email so everyone
> gets it.
>
> @ Everyone else – if you want to keep group discussions going or reach out
> to specific group members for follow up – feel free to use the google group
> email address as a mailing list, it should work (I hope). Or consult the
> notepad members list for contact info:
> https://notepad.rhizome.org/wbsg-membership
>
> Many thanks & have a great summer break!
>
> Looking forward to continuing the discussions in September.
> All best,
> Lozana
>
> --
> Lozana Rossenova (PhD, London South Bank University)
> Digital Archives Designer and Researcher
>
>
--
Lozana Rossenova (PhD, London South Bank University)
Digital Archives Designer and Researcher