Ran into some issues with downloading large files and forgot to post this earlier.
http://paws-public.wmflabs.org/paws-public/6877667/projects/headings/datasets/enwiki_20160204_headings.tsv.bz2
Columns:
- "page_id" : int - The identifier of the article
- "page_title" - The title of the article
- "heading_level" - The level of the heading in question
- "heading_text" - The text of the heading
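
A minimal sketch of reading the file in Python once it has been downloaded. The four column names come from the list above. Everything else is an assumption: that the file is UTF-8, that fields are tab-separated with no quoting, and that the first line is a header row (pass has_header=False if it is not). The function names read_headings and count_heading_text are hypothetical helpers, not part of any existing tool.

```python
import bz2
import csv
from collections import Counter

# Column order as described in the list above.
COLUMNS = ["page_id", "page_title", "heading_level", "heading_text"]


def read_headings(path, has_header=True):
    """Yield one dict per heading row from a bz2-compressed TSV file.

    Assumes UTF-8 text and unquoted, tab-separated fields.
    """
    with bz2.open(path, mode="rt", encoding="utf-8", newline="") as f:
        reader = csv.reader(f, delimiter="\t", quoting=csv.QUOTE_NONE)
        if has_header:
            next(reader, None)  # skip the assumed header row
        for row in reader:
            if len(row) != len(COLUMNS):
                continue  # skip lines that do not have exactly four fields
            record = dict(zip(COLUMNS, row))
            record["page_id"] = int(record["page_id"])
            yield record


def count_heading_text(path, has_header=True):
    """Count how often each heading text appears across all articles."""
    counts = Counter()
    for record in read_headings(path, has_header):
        counts[record["heading_text"]] += 1
    return counts
```

Because read_headings is a generator, the file is streamed row by row and never held in memory all at once, which matters for a dump of this size. For example, count_heading_text("enwiki_20160204_headings.tsv.bz2").most_common(20) would list the twenty most frequent headings.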
Enjoy!
-Aaron
On Mon, Mar 7, 2016 at 6:52 PM, Yuvi Panda yuvipanda@gmail.com wrote:
Just also wanted to note that these paws-public URLs will break in the near-to-mid future :)
On Mon, Mar 7, 2016 at 4:22 PM, Aaron Halfaker ahalfaker@wikimedia.org wrote:
Got some work done here. I'm using this as an opportunity to test out PAWS.
See http://paws-public.wmflabs.org/paws-public/EpochFail/projects/headings/extra...
It's still running right now, but I should have an output file that we can download and/or load into MySQL soon.
-Aaron
Wiki-research-l mailing list Wiki-research-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wiki-research-l
-- Yuvi Panda T http://yuvi.in/blog