Appropedia:Porting/Websites

To help:

  1. Pick a blog (from the section #To be ported or any suitable open-licensed source).
  2. Follow the steps at #Conversion

To be ported

These are useful blogs and simple HTML sites that can be easily converted to MediaWiki (see sections below).

Note: Only sites that are CC-BY, CC-BY-SA or public domain may be listed here.


Priorities

Start with these quick, easy pages (basic HTML), which are easily converted. This list is transcluded from Appropedia:Porting/Websites - see that page for porting instructions and other info.

Tech development

If there's a way to download and turn whole sites into MediaWiki markup in one go, that would be even better - i.e. scr*ping as the first step. (That word triggers certain filters, hence the censorship.) Once they're in MediaWiki format, they can then be sifted and moved manually to suitable pages.

  • It's probably just as easy (and safer) to do it page by page, as we have an easy conversion tool in the Wikedbox.

Blogs - start here

Non-blog sites

These are not in a simple chronological order, so they will need a plan for copying the pages.

  • Paste directly into pages with descriptive names - based on name of original page, but according to Appropedia naming conventions, i.e. lower case except for proper names. Modify page name when needed to avoid clashes or ambiguity.
  • For pages that are a series, add a table of contents, like {{TTH chapter links}}, at the top of each page. Make a list of all pages as you create them, in order, for this table of contents.
  • Add an attribution ("attrib") notice at the bottom.
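
The naming rule in the first bullet can be sketched in Python. This is only an illustration, not an existing Appropedia tool: the function name `appropedia_title` and the `proper_names` set are hypothetical, and deciding what counts as a proper name still needs human judgement. It also assumes the wiki capitalises the first letter of a page name, as MediaWiki typically does.

```python
def appropedia_title(original_title, proper_names=frozenset()):
    """Return a lower-case page name, keeping listed proper names as given.

    `proper_names` is a hypothetical, hand-maintained set of words that
    must keep their capitalisation (e.g. {"Kenya", "FEMA"}).
    """
    result = []
    for index, word in enumerate(original_title.split()):
        if word in proper_names:
            # Proper names are left exactly as written.
            result.append(word)
        elif index == 0:
            # Assumption: the first letter of a page name is capitalised.
            result.append(word[:1].upper() + word[1:].lower())
        else:
            result.append(word.lower())
    return " ".join(result)
```

For example, `appropedia_title("Building A Rocket Stove In Kenya", {"Kenya"})` gives `Building a rocket stove in Kenya`. Clashes and ambiguous names still have to be resolved by hand.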

Specifics:

Check these

These sites might have good relevant content, or have good relevant content that you need to dig for. Note that just because it's a great site doesn't mean it's suitable for porting to Appropedia. Scan through quickly and pick what's suitable. (Upcoming events may be good; past events are only interesting if they have useful info, e.g. about practices, designs, or networks.)

Places to look for more

  • Public domain info mentioned on Appropedia:Porting. Finding well-organized online manuals/guides is perfect, though these are often in PDF. (The Public Domain Search will be useful, but it needs to be restarted. Ping me if I haven't done this by mid-March 2010. --Chriswaterguy 14:56, 20 February 2010 (UTC))

Already converted, now needing more processing, breaking up and editing into suitable pages. Occasionally these will need updating with new posts:


Conversion

Requirements: Firefox web browser (other Mozilla browsers should also work).

  • Choose a blog from the list above.
  • Start a new page with a "staging area" in the title, e.g. "Blog name staging area"
  • At the top of the page, paste this "code":
{{content staging area}}

== Content ==

  • Above the "Content" header, add the URL of the site you are porting.
  • You have created the target page - leave it open and we'll come back to it later.
  • Open Wikedbox in a separate window.
  • Copy and paste the formatted text from the blog into the box at Wikedbox. It should keep much of its formatting.
  • Continue to copy all the content from the blog - the front page and following pages.
    • If the standard blog view shows the complete posts, select the body of each page, and paste it in. Then go to the next page and do the same, pasting it below the previous content in the Wikedbox. ("Body" means everything except headers, sidebars and footers - just the blog posts themselves.)
    • If the standard view only displays the beginning of each post (i.e. the blog can't be displayed in this way), it will be more work. Open each blog post in a separate window and copy the body of the post to Wikedbox.
  • If it is not a blog, but is divided by topic:
    • Find where the pages are all listed e.g. sitemap or navbar.
    • You may wish to add a "Progress" header above "Content", especially if you're not doing it all at once.
    • When you copy in each page, start with four dashes ---- then the name of the page, e.g.
----

Rocket stoves

  • When you have a good amount of content (maybe you're afraid of losing the content if the browser crashes) click the wikify button above the edit box, which looks like: [w] - and wait until conversion is complete (should be a few seconds, depending on page size).
  • Copy the converted wiki text (ctrl+a to select all, then ctrl+c) and paste into the target page, below the "Content" header.
  • Click "Save."
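
The layout of the finished staging page can be sketched in Python. The steps above are done by hand with Wikedbox; this sketch only shows how the pieces fit together on the target page. The function name `build_staging_page` and the example URL are hypothetical.

```python
def build_staging_page(site_url, pages):
    """Assemble wiki text for a "staging area" page.

    `pages` is a list of (page_name, converted_wikitext) tuples, in the
    order they should appear under the "Content" header.
    """
    # Template at the top, then the source URL above the "Content" header.
    parts = ["{{content staging area}}", "", site_url, "", "== Content =="]
    for page_name, wikitext in pages:
        # Four dashes, then the page name, mark the start of each copied page.
        parts.extend(["", "----", "", page_name, "", wikitext.strip()])
    return "\n".join(parts) + "\n"
```

Calling `build_staging_page("http://example.org", [("Rocket stoves", "Some text.")])` produces the template line, the URL, the "Content" header, then `----`, `Rocket stoves` and the converted text, each separated by a blank line.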

Thank you for your help! You'll end up with a long page that looks very messy - this is an important first step in porting this content to Appropedia. Don't try to fix up the broken links & formatting - leave that for the next step.

Basic fixing up

A bot (probably ChriswaterguyBot) will then add templates to each post, convert tags to categories, turn some of the broken image links into external links, and make other repetitive fixes.

For reference, these are some of the commands used on one of the blogs:

python replace.py -regex "\[(\S*).*\[ IMAGE_LINK_HERE.*\d*px\|([^\]]*)]*" "[\\1 Image: \\2]" -page:"Afrigadget/content staging area" -summary:"fix image links (make into proper external links)"
python replace.py -regex "\[(\S*).*\[ IMAGE_LINK_HERE.*\d*px\|(.*)]" "" -page:"Afrigadget/content staging area" -summary:"making break between posts"
python replace.py -regex "(Tags: ]|, ]|SHARETHIS.addEntry\({.*\n| \| \[\S*#(comments|respond)\s*\d* .*omment.*])" "" -page:"Afrigadget/content staging area" -summary:"removing misc bits of code from converted HTML"
python replace.py -regex "(Filed in:[\n\s]*|, )\[http://www.afrigadget.com/category/\S* ([^\]]*)]" "[[Category:\\2]]\n" -page:"Afrigadget/content staging area" -summary:"Change blog categories to wiki categories"
python replace.py -regex "\[(\S*).*Posted: [^\[]*(\[.*])" "{{attrib afrigadget | url=\\1| author=\\2}}" -page:"Afrigadget/content staging area" -summary:"replacing header with 'attrib afrigadget' template"
python replace.py -regex "(\[http://www.afrigadget.com/tag/)(\S* )([\w\-\s]*])" "]\n[[Category:\\3]" -page:"Afrigadget/content staging area" -summary:"changing tags to categories"
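
To see what one of these replacements does before running a bot, the pattern can be tried on a short string with Python's `re` module. The sketch below reuses the "Change blog categories to wiki categories" pattern and replacement from the command above, written as Python raw strings. It assumes the bot applies them as ordinary Python regular expressions; the sample line is made up for illustration and is not taken from the real staging page.

```python
import re

# Pattern and replacement copied from the "Change blog categories to wiki
# categories" command, with the shell escaping (\\2) reduced to \2.
CATEGORY_PATTERN = r"(Filed in:[\n\s]*|, )\[http://www.afrigadget.com/category/\S* ([^\]]*)]"
CATEGORY_REPLACEMENT = r"[[Category:\2]]\n"


def blog_categories_to_wiki(text):
    """Turn blog category links into wiki category tags, one per line."""
    return re.sub(CATEGORY_PATTERN, CATEGORY_REPLACEMENT, text)


# Made-up sample input (hypothetical category names and URLs).
sample = ("Filed in: [http://www.afrigadget.com/category/energy/ Energy], "
          "[http://www.afrigadget.com/category/water/ Water]")
converted = blog_categories_to_wiki(sample)
print(converted)
```

With this sample, each category link becomes its own `[[Category:...]]` line. Testing on a copy of the text like this is a cheap way to catch an over-greedy pattern before it edits the wiki.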

For sand filter pages:

  • python replace.py -regex "(<[\]*div>)*(\t)*" "" -page:"Slow Sand Filter staging area" -page:"Shared Source Initiative biosand staging area "
  • python replace.py -regex "{\|.*border=\"0\"" -page:"Slow Sand Filter staging area" -page:"Shared Source Initiative biosand staging area "
  • python replace.py -regex "\| colspan=\"5\" \|" -page:"Slow Sand Filter staging area" -page:"Shared Source Initiative biosand staging area "
  • python replace.py -regex "\| (
    )*(
    )* )*" "" -page:"Slow Sand Filter staging area" -page:"Shared Source Initiative biosand staging area "
  • python replace.py -regex "<br([ ]*[/]*)>" "\n\n" -page:"Slow Sand Filter staging area" -page:"Shared Source Initiative biosand staging area "
  • python replace.py -regex "\r\n\r\n-\r\n\r\n" "\n\n" -page:"Slow Sand Filter staging area" -page:"Shared Source Initiative biosand staging area "

\| bgcolor=\"#......="[ ]*(rowspan=\"2\")*[ ]*\|


For FEMA pages:

  • remove (<[\]*div>)*(\t)*
  • replace ''' with == for headings
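
The two FEMA fixes can be sketched as a small Python function. This is an illustration under stated assumptions, not the bot's actual code: the name `clean_fema_text` is hypothetical, the div pattern is a simplified stand-in that only handles plain `<div>` and `</div>` tags, and a heading is assumed to be a line consisting only of bold text.

```python
import re


def clean_fema_text(text):
    """Strip div tags and tabs, then turn bold-only lines into headings."""
    # Remove plain opening/closing div tags and tab characters.
    # Assumption: tags with attributes (e.g. <div class="x">) are left alone.
    text = re.sub(r"</?div>|\t", "", text)
    # A line that is nothing but '''bold text''' becomes == bold text ==.
    # Bold text inside a longer line is not treated as a heading.
    return re.sub(r"^'''([^'\n]+)'''[ ]*$", r"== \1 ==", text, flags=re.MULTILINE)
```

For example, `"<div>'''Flood safety'''</div>"` becomes `== Flood safety ==`, while `Stay '''calm''' indoors.` is left unchanged.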

If you're also interested in running a bot, that would be really appreciated. Contact me. --Chriswaterguy 08:51, 19 February 2010 (UTC)

  • Preview and fix problems as you are able to (or save first then do the fixing).
  • Note that the conversion works for formatted text, but not for images. The URLs of images are replaced with their filenames, so you must decide what to do:
    • Upload the images and ensure the links are correct - only if the images are open licensed; or
    • Manually change the image links to point to the original images - where they are very relevant, but not open licensed; or
    • Strip out the links entirely (easiest option)

WikEd instructions are found at Wikipedia:User:Cacycle/wikEd.

Making articles

After all that is done and the green light is given, it's time to convert to articles. Remove the text from the "staging area" and

Where there are a lot of borderline cases, it's best done by or with someone who is familiar with Appropedia, who can make judgements about which content is suitable.
