Public Domain Search/Included and excluded sites
- Last update: 06:20, 10 January 2008 (PST)
Currently the engine searches only 4 domains, but these are the big ones, and more (smaller) sites will be added soon. Some subdomains are not suitable (e.g. local or state government sites, or sites which reprint significant amounts of copyrighted material), so these are compiled into an exclusion list, as described at APDS - How the search is built.
The included sites are:
- *.gutenberg.org (removed temporarily - see APDS - Searching Gutenberg.org.)
The emphasis so far is on US federal government content - this is because:
- The engine is only a couple of months old - it will get broader in time
- US federal government sites form a vast body of content which dwarfs any other public domain content that we are aware of. (Other governments and large institutions generally do not release their work as public domain.)
- These sites include many pages relevant to Appropedia, on international aid (e.g. USAID), public health (e.g. CDC) and appropriate technology. While literature sites and the like will be added, they are unlikely to add value to the search engine for Appropedia's purposes.
 Searching for the term "public domain"
Finding true public domain content is difficult. Many web pages use the term incorrectly, and when I follow the links and look for a clear statement on permissions/copyright, it turns out the content is usually only open access, or perhaps is re-usable but with a non-commercial clause and/or a no-derivatives clause.
As this approach was not very fruitful, it was suspended in favor of the more productive task of refining the US federal sites (by building an exclusion list of non-federal sites in the .gov domain - see APDS - How the search is built).
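The include-plus-exclusion-list idea can be sketched in a few lines of Python. This is only an illustration of the principle, not how APDS itself is configured: the patterns below are hypothetical examples rather than the real APDS lists, the function names are made up for this sketch, and it assumes that an exclusion always overrides an inclusion.

```python
from fnmatch import fnmatch
from urllib.parse import urlparse

# Hypothetical patterns for illustration only; NOT the actual APDS lists.
INCLUDED = ["*.gov"]
EXCLUDED = ["*.ca.gov", "*.nyc.gov"]  # examples of state and local government sites


def host_matches(host, pattern):
    """True if host matches a pattern such as '*.ca.gov'.

    A '*.domain' pattern is also treated as matching the bare domain itself.
    """
    if fnmatch(host, pattern):
        return True
    return pattern.startswith("*.") and host == pattern[2:]


def is_searchable(url):
    """True if the URL's host is on the include list and not on the exclusion list."""
    host = (urlparse(url).hostname or "").lower()
    included = any(host_matches(host, p) for p in INCLUDED)
    excluded = any(host_matches(host, p) for p in EXCLUDED)
    # Assumption: exclusions take precedence over inclusions.
    return included and not excluded


if __name__ == "__main__":
    for url in ["https://www.cdc.gov/page", "https://www.ca.gov/", "https://example.com/"]:
        print(url, is_searchable(url))
```

Keeping the exclusions as a separate list means a broad pattern such as the whole .gov domain can stay in place while unsuitable subdomains are removed one at a time as they are found.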
 Other sites
Explore the public domain status of the sites of US federal institutes. There appear to be conflicting claims, for example surrounding content from the Smithsonian Institution (see US federal government websites and public domain).
US Military educational institutions - are these PD?:
 Problems with some US govt sites
APDS will not index a site unless it is very clear that the great majority of content on the site is public domain, and if the site itself does not state this, it is hard to be certain. Even if we are certain that a site is wrongly claiming copyright over its content, we can't expect its owners to be helpful in highlighting which content on the site is genuinely under copyright held by a third party.