Appropedia:Stubs from Wikipedia articles/bot work

In the XML file from Wikipedia:Special:Export, replace "</title>" with " (from Wikipedia)</title>" (including the leading space in the replacement text).
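The same replacement can be scripted rather than done in an editor. This is only a minimal sketch: the function name `add_title_suffix` and the sample title are made up for illustration, and it assumes the export file is small enough to process as a single string.

```python
def add_title_suffix(xml_text, suffix=" (from Wikipedia)"):
    """Append suffix to every page title by rewriting each closing </title> tag.

    The suffix starts with a space, matching the replacement text described above.
    """
    return xml_text.replace("</title>", suffix + "</title>")


# Hypothetical one-page sample standing in for a real Special:Export file.
sample = "<page><title>Solar cooker</title></page>"
print(add_title_suffix(sample))
```

Running it twice on the same file would add the suffix twice, so it is safest to work from a fresh copy of the export each time.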

A file with the pagenames (inside double square brackets) is needed for the bot work, each time a batch is uploaded.

  • Save the list from the Wikipedia:Special:Export page to a text file.
  • Add a line break at the beginning and end.
  • Make a copy.
  • Replace \n with " (from Wikipedia)]]\n[[", including the leading space so the names match the renamed titles (this works in Gedit, the *nix text editor; "\n" means line break). Don't worry about the spare ]] and [[ left at the beginning and end; delete them if needed.
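For anyone who prefers a script to the editor steps above, the page list could be built roughly like this. It is a sketch, not a tested part of the workflow: `make_pagelist` and the sample titles are hypothetical, and it assumes one title per line and a suffix with a leading space to match the title replacement in the XML file.

```python
def make_pagelist(titles_text, suffix=" (from Wikipedia)"):
    """Turn one title per line into one [[Title (from Wikipedia)]] link per line."""
    # Skip blank lines so no empty [[ ]] entries are produced.
    titles = [line.strip() for line in titles_text.splitlines() if line.strip()]
    return "".join("[[" + title + suffix + "]]\n" for title in titles)


# Hypothetical titles standing in for the list saved from Special:Export.
print(make_pagelist("Solar cooker\nBiogas\n"), end="")
```

Because each link is built whole, there are no spare brackets to clean up at the start or end of the file.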

Tag each page:

  • python replace.py -regex "^" "{{attrib wikipedia raw}}\n" -file:sfwn -log:sfwlog (currently being debugged)
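To see what that regex is meant to do, here is a small illustration using Python's own re module. It is not replace.py itself, and it assumes the pattern is applied without multiline mode, so "^" matches only the very start of the page text. The name `tag_page` and the sample text are made up.

```python
import re


def tag_page(text, template="{{attrib wikipedia raw}}"):
    """Put the template on its own line at the top of the page text."""
    # Without re.MULTILINE, "^" matches only at the very start of the text,
    # so exactly one copy of the template is inserted.
    return re.sub(r"^", template + "\n", text)


print(tag_page("First line of the stub.\nSecond line."))
```

If the pattern were applied in multiline mode instead, the template would be inserted before every line, which is one thing worth checking while debugging.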

I'm using "sfw" ("stubs from Wikipedia") to mark filenames in this project. E.g. "sfwn" is the file listing the page names. The name isn't important, as long as it's consistent and hard to get mixed up.

Move pages - set up the commands using a spreadsheet:

  • Use the -log option - when a move fails, it's entered in the log, giving a record of pages to be merged.
  • Each command has the form python movepages.py -from:"foo (from Wikipedia)" -to:"foo"
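As an alternative to the spreadsheet, the move commands could be generated with a few lines of Python. This is a sketch under assumptions: `move_commands` and the sample titles are hypothetical, and only the -from: and -to: arguments shown above are used. It quotes with shlex.quote, so the output uses single quotes where needed rather than the double quotes shown above; the shell passes the same arguments either way.

```python
import shlex


def move_commands(titles, suffix=" (from Wikipedia)"):
    """Build one movepages.py command line per title."""
    commands = []
    for title in titles:
        # shlex.quote protects spaces, parentheses and other shell-special
        # characters that may appear in page titles.
        source = shlex.quote("-from:" + title + suffix)
        target = shlex.quote("-to:" + title)
        commands.append("python movepages.py " + source + " " + target)
    return commands


# Hypothetical titles; in practice these would come from the saved page list.
for command in move_commands(["abc", "xyz"]):
    print(command)
```

The printed lines can be pasted into a shell script when there are many pages to move.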

When there are many commands to run, it's best to make a shell script. I'm still learning to do this, but it seems to be of this form:

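#!/bin/sh
# Change to the pywikipedia directory ("(pywikipedia)" is a placeholder for its path).
cd "(pywikipedia)/" || exit 1
# Timestamp for this run, e.g. for naming a log file (not used further in this sketch).
rundate=$(date +"%FT%T%z")
python movepages.py -from:"abc (from Wikipedia)" -to:"abc"
python movepages.py -from:"xyz (from Wikipedia)" -to:"xyz"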

...and so on, listing all the commands.

Replace templates not used in Appropedia. Make a script, then run this with an updated pagelist file for each new group of uploaded pages:

  • python replace.py... -file: -log: (working on it)

Templates to be replaced:

To find all pages that need culling/adapting, see Special:WhatLinksHere/Template:From wikipedia raw. (For the obsessive geek: Uniwiki slows things down for this kind of editing - major culling of lots of articles - so it can be faster to open a few tabs, then tab through clicking "Classic" in each.)