Hi,
I'm thinking about writing an extension...and I haven't really done one totally from scratch yet, so please forgive the waste of bandwidth if I'm missing something in the documentation...and the length of this message.
Here's what I want to do and why. I'm running some wikis related to biology/genomics. We use Cite.php, and I'd like to automatically create wiki pages for the references cited through Cite.php. I've already modified Cite so that if someone enters "PMID:<id number>" as either the input text or the name= part, a block of code (essentially an extension of Cite, called through a "hook" I've added in Cite.php) goes to PubMed, retrieves the detailed information about the citation, and substitutes the reference text.
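The PMID check described above might look roughly like the following. This is only an illustrative sketch in Python, not the PHP that Cite.php is written in; the pattern, its tolerance for whitespace and case, and the function name `extract_pmid` are assumptions rather than anything taken from the modified Cite code.

```python
import re

# Assumed pattern: "PMID:" followed by digits, allowing stray whitespace
# and any letter case. The real check in the modified Cite may differ.
PMID_RE = re.compile(r"^\s*PMID:\s*(\d+)\s*$", re.IGNORECASE)


def extract_pmid(ref_name=None, ref_text=None):
    """Return the PubMed ID as a string, or None if neither input has one.

    ref_name stands for the name= attribute of a <ref> tag and ref_text
    for the text between the tags; the name is checked first.
    """
    for candidate in (ref_name, ref_text):
        if candidate:
            match = PMID_RE.match(candidate)
            if match:
                return match.group(1)
    return None
```

A caller would only contact PubMed when this returns an ID, and would leave every other <ref> untouched.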
For example, if you have
<ref name='PMID:2167309'/>
as the third citation, the reference that shows up is
3. ↑ Rathod PK & Khatri A (1990) Synthesis and antiproliferative activity of threo-5-fluoro-L-dihydroorotate. J Biol Chem 265:14242-9 PMID:2167311
I like this because it reduces the markup clutter that Cite is often criticized for. The next thing I want to do is change the external link to PubMed into an internal link to a page in the wiki, where users can add commentary about the reference. These pages will be stubbed with more information from PubMed, including a template, the abstract, and the reference. I'm thinking of two possible strategies:
1) Run something that creates the page (if needed) when the parser renders a page with a <ref> tag.
2) Run something that creates the page (if needed) when a user clicks on the link in the references section.
These aren't mutually exclusive, of course, but what I like about #2 is that I don't create pages unless someone actually wants to look at them (upon reflection, this may create pages when the search engines hit the links, but that may be OK) and, more importantly, I think this can be done so that the page will be created if someone searches for the reference and the citation doesn't already exist.
The way I'm thinking of doing this is to hook into AlternateEdit and branch off to something that grabs the desired template from the Template namespace, populates the wikitext with data, saves it, and redirects to the saved page. Questions:
Q1) Am I nuts? (default=true) Is there anything that I'm forgetting that will make this blow up?
Q2) Is that the right place to hook?
Q3) What's the best way to do the create/save/redirect step?
Q4) Has anyone already done something like this that I can adapt?
Q5) Would this be useful to anyone else?
For Q2 and Q3, I don't want the user to see the edit form. I want it to look like the page was there all along. For Q3 I have a kludgy way to do it: create the page as XML and run maintenance/importDump.php. There must be a better way, right? Thanks for any advice.
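To make the create/save/redirect idea concrete, here is a minimal language-neutral sketch in Python. It is not MediaWiki code: the dict `pages` merely stands in for the wiki's page store, `fetch_record` is a hypothetical caller-supplied function that would query PubMed, and the stub layout (a `Reference` template with `pmid` and `citation` fields) is invented for illustration.

```python
from string import Template

# Hypothetical stub layout. In the real wiki this text would be read from
# a page in the Template namespace rather than hard-coded.
STUB = Template(
    "{{Reference\n|pmid=$pmid\n|citation=$citation\n}}\n\n"
    "== Abstract ==\n$abstract\n"
)


def ensure_reference_page(title, pages, fetch_record):
    """Create a stub for `title` if it is missing, then return the title.

    pages        -- dict standing in for the wiki's page store
    fetch_record -- function taking the title and returning a dict with
                    'pmid', 'citation' and 'abstract' keys
    The returned title is where the caller would redirect the user.
    """
    if title not in pages:
        record = fetch_record(title)
        pages[title] = STUB.substitute(record)
    return title
```

Because the external lookup only happens when the page is missing, a second visit (by a user or a search engine) returns the existing page without refetching or overwriting any commentary added since.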
Jim
=====================================
Jim Hu
Associate Professor
Dept. of Biochemistry and Biophysics
2128 TAMU
Texas A&M Univ.
College Station, TX 77843-2128
979-862-4054
Hi Jim,
You mention that you might want to have the reference pages created 'on demand' - if so, how do you know what content to put into those pages? Is it simply static text as read from a Template? And if so, what would that Template say?
-- Jim R. Wilson
Wikitech-l mailing list Wikitech-l@lists.wikimedia.org http://lists.wikimedia.org/mailman/listinfo/wikitech-l
Hi Jim,
The content will come from a query to an external web service, which will be used to fill the template. This is based on the page title being parsed to look for an accession to the external database.
Does that make sense?
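A rough sketch of the title parsing, in Python for illustration only: it assumes page titles follow a "<DATABASE>:<accession>" convention such as "PMID:2167309". The pattern and the function name `accession_from_title` are assumptions, not existing MediaWiki or Cite code.

```python
import re

# Assumed title convention: a database prefix made of letters, a colon,
# then the accession, e.g. "PMID:2167309".
TITLE_RE = re.compile(r"^(?P<db>[A-Za-z]+):(?P<acc>\w+)$")


def accession_from_title(title):
    """Return (database, accession) parsed from a page title, or None.

    The database prefix is upper-cased so that "pmid:42" and "PMID:42"
    are routed to the same external service.
    """
    match = TITLE_RE.match(title.strip())
    if match is None:
        return None
    return match.group("db").upper(), match.group("acc")
```

Titles that do not match, such as ordinary article names, return None and would fall through to the normal edit form.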
Jim
On Feb 7, 2007, at 12:06 PM, Jim Wilson wrote:
Hi Jim,
You mention that you might want to have the reference pages created 'on demand' - if so, how do you know what content to put into those pages? Is it simply static text as read from a Template? And if so, what would that Template say?
That makes sense, though I wonder: does the web service already have a web interface? I guess what I'm getting at is, if the information already exists on another website, why do you want to duplicate it in your own wiki?
(Not sure I understand all the problem parameters - I want to help, I'm just trying to get a grasp of what's involved. Thanks in advance.)
The new page would allow wiki-based annotation of the information from the other site, which is PubMed...so, yes it has a very nice and heavily trafficked web interface indeed!
For example, the web service at PubMed provides the abstract and links to full text (at yet another website) for a publication. My users would want to add things like: "This paper describes a resource that turned out to be useful for doing X" or "Figure 1 in this paper shows this thing that the authors didn't notice" or "The xxx gene described in this paper is also known as yyy; they were shown to be the same 10 years later" etc.
These are kinds of knowledge that are scattered throughout the scientific community. My wikis are hoping to gather some of that distributed wisdom. Also, the page in my wiki would collect what other pages in the wiki refer to that item via the "what links here" capability.
If you'd like to see the kind of content I'm talking about, check out PubMed. There's a long URL, but I usually just use http://pubmed.com, which redirects. If you're in the US, your tax dollars support PubMed, and we're grateful for that! MediaWiki already recognizes this as a major resource, since typing PMID <idnum> in a page automatically creates an external link to PubMed...it's an Easter egg that we stumbled across.
http://en.wikipedia.org/wiki/Wikipedia:PMID
On Feb 7, 2007, at 3:40 PM, Jim Wilson wrote:
That makes sense, though I wonder, does the web-service already have a web-interface? I guess what I'm getting at is, if the information already exists on another web site, why do you want to duplicate it in your own wiki?
(Not sure I understand all the problem parameters - I want to help, I'm just trying to get a grasp of what's involved. Thanks in advance.)
Jim Hu wrote:
For example, the web service at PubMed provides the abstract and links to full text (at yet another website) for a publication. My users would want to add things like: "This paper describes a resource that turned out to be useful for doing X" or "Figure 1 in this paper shows this thing that the authors didn't notice" or "The xxx gene described in this paper is also known as yyy; they were shown to be the same 10 years later" etc.
I have a similar problem. At http://runeberg.org/ I digitize old books, among them several encyclopedias. For the sake of familiarity, you can think about scanned books in Wikisource rather than my website.
In many cases an encyclopedia from 1889 is useful for knowing the population of Aberdeen in 1889. It could be nice to report what the current population is, but in some cases it is also important to point out that the reported number for 1889 was indeed wrong. But if scanning and OCRing one page takes 3 seconds and proofreading takes 3 minutes, how long does it take to check all the facts? Not knowing how this should best be addressed, it seemed like a stupid idea to digitize more old works that are full of errors.
When Wikipedia was started in 2001 and started to get off the ground, this became the obvious place to put information on the current and historic population of Aberdeen. The scanning of old texts no longer had to carry this role. It was really only in 2002 and 2003 that I got the energy to scan more works for my own site, and in 2005 I scanned this for Wikisource, http://en.wikisource.org/wiki/The_New_Student%27s_Reference_Work
Turns out Aberdeen's population in 1911 was 163,084, http://en.wikisource.org/wiki/The_New_Student%27s_Reference_Work/1-0016 http://en.wikisource.org/wiki/The_New_Student%27s_Reference_Work/Aberdeen but this bit of information is not linked to or included in http://en.wikipedia.org/wiki/Aberdeen#Population
So one problem still exists: From the scanned book page, there is no link to the Wikipedia article that provides more up-to-date information. The reader of the scanned page can of course use a search engine, and will often find the Wikipedia article. But is this really the ultimate solution? And even if the Wikipedia article is found, the other scanned pages that link to the same article are not found from there.
Should each scanned book page include a list of links to Wikipedia articles that are relevant for the page? Could such lists be compiled (or suggested) automatically?
Should Wikisource have a [[category:Aberdeen]] that collects all pages, chapters and books that pertain to this town? Today the English Wikisource has one [[Category:Works by subject]], but under this is a very small tree, compared to all articles in Wikipedia. There is no category for Aberdeen, but one for Scotland that has 15 links of which 4 are to articles in the 1911 Encyclopaedia Britannica. The 1911 EB article "Aberdeen (burgh)" is not among these four, http://en.wikisource.org/wiki/1911_Encyclop%C3%A6dia_Britannica/Aberdeen_%28...
Wikisource also has a [[Category:Ottoman Empire]] that contains four articles from the 1911 Encyclopaedia Britannica, one other chapter and two other works. But the corresponding category on the English Wikipedia has 56 pages and 12 immediate subcategories. Even the sub-subcategory Ottoman railways has 6 Wikipedia articles. On Wikisource there seem to be 6 mentions of the "Orient Express", but these are found through Google and not through links on the website, http://www.google.com/search?q=%22orient+express%22+site%3Aen.wikisource.org