Distributed Proofreaders: Difference between revisions

From Wikipedia, the free encyclopedia
imported>Gonnym
top: fix date template; cleanup
 
imported>OAbot
m Open access bot: url-access=subscription updated in citation with #oabot.
 
Line 1: Line 1:
{{Short description|Web-based proofreading project}}
{{Short description|Web-based proofreading project}}
{{More citations needed |date=May 2024}}
{{Infobox website
{{Infobox website
| name = Distributed Proofreaders
| name = Distributed Proofreaders
Line 14: Line 13:
| screenshot_alt = Screenshot of the proofreading interface on Distributed Proofreaders.
| screenshot_alt = Screenshot of the proofreading interface on Distributed Proofreaders.
| caption = Screenshot of the proofreading interface on Distributed Proofreaders.
| caption = Screenshot of the proofreading interface on Distributed Proofreaders.
| parent = [[Distributed Proofreaders Foundation]] (DPF)
| url = {{URL|https://www.pgdp.net}}
| url = {{URL|https://www.pgdp.net}}
| commercial = No<!-- "Yes", "No" or leave blank -->
| commercial = No<!-- "Yes", "No" or leave blank -->
| type = Not-for-profit
| type = Not-for-profit
| language = [[English language|English]], [[French language|French]]
| language = [[English language|English]], [[French language|French]], [[German language|German]]
| language_count = 2
| language_count = 3
| registration = Optional
| registration = Optional
| num_users =  
| num_users =  
Line 25: Line 23:
| programming_language = [[PHP]]<ref>{{cite web |url=https://github.com/DistributedProofreaders/dproofreaders |title=Distributed Proofreaders|publisher=github.com |access-date=2022-02-01 }}</ref>
| programming_language = [[PHP]]<ref>{{cite web |url=https://github.com/DistributedProofreaders/dproofreaders |title=Distributed Proofreaders|publisher=github.com |access-date=2022-02-01 }}</ref>
| country_of_origin = [[United States of America]]
| country_of_origin = [[United States of America]]
| owner = [[Distributed Proofreaders Foundation]]
| owner = Distributed Proofreaders Foundation (DPF)
| author = <!-- or: creator / authors / creators -->
| author = <!-- or: creator / authors / creators -->
| founder = Charles Franks
| founder = Charles Franks
Line 36: Line 34:
}}
}}


'''Distributed Proofreaders''' (commonly abbreviated as '''DP''' or '''PGDP''') is a web-based project that supports the development of [[e-text]]s for [[Project Gutenberg]] by allowing many people to work together in [[proofreading]] drafts of e-texts for errors. {{as of|July 2024|post=,}} the site had digitized 48,000 titles.<ref>{{cite web |url=https://blog.pgdp.net/2021/03/05/celebrating-41000-titles// |title=Celebrating 30,000 Titles &#124; Hot off the Press |publisher=Blog.pgdp.net |date=2015-07-07 |access-date=2016-09-15 |archive-url=https://web.archive.org/web/20161220140431/https://blog.pgdp.net/2015/07/07/celebrating-30000-titles/ |archive-date=2016-12-20 |url-status=live }}</ref><ref>{{cite web |url=https://blog.pgdp.net/2020/04/27/celebrating-39000-titles/ |title=Celebrating 39,000 Titles |publisher=Blog.pgdp.net |date=2020-11-08 |access-date=2020-04-27 |archive-url=https://web.archive.org/web/20200603215240/https://blog.pgdp.net/2020/04/27/celebrating-39000-titles/ |archive-date=2020-06-03 |url-status=live }}</ref><ref name=47000th /><ref name=48000th />
'''Distributed Proofreaders''' (commonly abbreviated as '''DP''' or '''PGDP''') is a web-based project that supports the development of [[e-text]]s for [[Project Gutenberg]] by allowing many people to work together in [[proofreading]] drafts of e-texts for errors. {{as of|April 2025|post=,}} the site had digitized 49,000 titles.<ref>{{cite web |last=Cantoni |first=Linda |date=2025-04-12 |title=Celebrating 49,000 Titles &#124; Hot off the Press |url=https://blog.pgdp.net/2025/04/12/celebrating-49000-titles/ |access-date=2025-10-24 |website=Hot off the Press: Book Reviews and Notes from Distributed Proofreaders |publisher=Distributed Proofreaders}}</ref>


== History ==
== History ==


Distributed Proofreaders was founded by Charles Franks in 2000 as an independent site to assist [[Project Gutenberg]].<ref>{{cite book
Distributed Proofreaders was founded by Charles Franks in 2000 as an independent site to assist [[Project Gutenberg]].<ref name=":0">{{cite book |last=Lessig |first=Lawrence |url=https://books.google.com/books?id=7eRPKIvEo9gC&pg=PT167 |title=Remix: Making Art and Commerce Thrive in the Hybrid Economy |publisher=Penguin |year=2009 |isbn=978-0-14-311613-4 |page=167}}</ref><ref name="brit1">{{cite web |date=11 August 2025 |title=Project Gutenberg |url=https://www.britannica.com/topic/Project-Gutenberg |access-date=20 October 2025 |website=Encyclopedia Britannica}}</ref> Distributed Proofreaders became an official Project Gutenberg site in 2002.<ref name=brit1 />
| first=Lawrence | last=Lessig | year=2009
| title=Remix: Making Art and Commerce Thrive in the Hybrid Economy
| page=109 | publisher=Penguin | isbn=978-0-14-311613-4
| url=https://books.google.com/books?id=7eRPKIvEo9gC&pg=PT109
}}</ref> Distributed Proofreaders became an official Project Gutenberg site in 2002.


On 8 November 2002, Distributed Proofreaders was [[Slashdot effect|slashdotted]],<ref>{{cite web|url=http://www.gutenberg.org/wiki/Gutenberg:Volunteers%27_Voices#Suzanne_Shell|title=Gutenberg:Volunteers' Voices|publisher=[[Project Gutenberg]]|access-date=2008-07-12|archive-url=https://web.archive.org/web/20080918202824/http://www.gutenberg.org/wiki/Gutenberg:Volunteers%27_Voices#Suzanne_Shell|archive-date=2008-09-18|url-status=dead}}</ref><ref>{{cite web|url=http://www.boingboing.net/2002/11/12/distributed-proofrea.html|title=Distributed Proofreading's slashdotting|date=12 November 2002 |publisher=[[Boing Boing]]|access-date=2008-07-12|archive-url=https://web.archive.org/web/20071109001542/http://www.boingboing.net/2002/11/12/distributed-proofrea.html|archive-date=2007-11-09|url-status=live}}</ref> and more than 4,000 new members joined in one day, causing an influx of new proofreaders and software developers, which helped to increase the quantity and quality of e-text production. In July 2015, the 30,000th Distributed Proofreaders produced e-text was posted to Project Gutenberg. DP-contributed e-texts comprised more than half of works in Project Gutenberg, {{as of|2015|07|lc=on}}.
On 8 November 2002, Distributed Proofreaders was [[Slashdot effect|slashdotted]],<ref>{{cite web|url=http://www.gutenberg.org/wiki/Gutenberg:Volunteers%27_Voices#Suzanne_Shell|title=Gutenberg:Volunteers' Voices|publisher=[[Project Gutenberg]]|access-date=2008-07-12|archive-url=https://web.archive.org/web/20080918202824/http://www.gutenberg.org/wiki/Gutenberg:Volunteers%27_Voices#Suzanne_Shell|archive-date=2008-09-18|url-status=dead}}</ref><ref>{{cite web|url=http://www.boingboing.net/2002/11/12/distributed-proofrea.html|title=Distributed Proofreading's slashdotting|date=12 November 2002 |publisher=[[Boing Boing]]|access-date=2008-07-12|archive-url=https://web.archive.org/web/20071109001542/http://www.boingboing.net/2002/11/12/distributed-proofrea.html|archive-date=2007-11-09|url-status=live}}</ref> and more than 4,000 new members joined in one day, causing an influx of new proofreaders and software developers, which helped to increase the quantity and quality of e-text production.  


On 31 July 2006, the [[Distributed Proofreaders Foundation]] was formed to provide Distributed Proofreaders with its own legal entity and [[Non-profit organization|not-for-profit]] status. [[Internal Revenue Service|IRS]] approval of section [[501(c)|501(c)(3)]] status was granted retroactive to 7 April 2006.
In 2006, the Distributed Proofreaders Foundation was formed to provide Distributed Proofreaders with its own legal entity and [[Non-profit organization|not-for-profit]] status, separate from Project Gutenberg.<ref name=":1">{{cite web |date=2025-06-18 |title=Distributed Proofreaders Foundation History |url=https://www.pgdp.net/wiki/DPFoundation:Distributed_Proofreaders_Foundation_History |access-date=2025-10-20 |website=DPWiki |language=en}}</ref><ref name="globemail">{{Cite web |last=John |first=Last |date=October 31, 2025 |title=Obituary: Project Gutenberg CEO Greg Newby helped put a trove of literature online |url=https://www.theglobeandmail.com/canada/article-greg-newby-project-gutenberg-ceo-literature-public-domain-hacker/ |access-date=November 4, 2025 |website=The Globe and Mail |language=en}}</ref> The founding trustees were Charles Franks, Juliet Sutherland, and [[Gregory B. Newby]].<ref name=":1" />
 
In July 2015, the 30,000th Distributed Proofreaders produced e-text was posted to Project Gutenberg. DP-contributed e-texts comprised more than half of works in Project Gutenberg by 2009.<ref name="brit1" />


== Proofreading process ==
== Proofreading process ==


[[Public domain]] works, typically books with expired copyright, are scanned by volunteers, or sourced from digitization projects and the images are run through [[optical character recognition]] (OCR) [[software]]. Since OCR software is far from perfect, many errors often appear in the resulting text. To correct them, pages are made available to volunteers via the Internet; the original page image and the recognized text appear side by side.<ref>{{cite conference
DP servers are located in the United States, and therefore works must be cleared by Project Gutenberg as being in the [[public domain]] according to [[Copyright law of the United States|United States copyright law]] before they can be proofread and eventually published.<ref name=":2">{{Cite journal |last=Newby |first=G. B. |last2=Franks |first2=C. |date=2003 |title=Distributed proofreading |url=http://ieeexplore.ieee.org/document/1204888/ |journal=2003 Joint Conference on Digital Libraries, 2003. Proceedings. |publisher=IEEE Comput. Soc |pages=361–363 |doi=10.1109/JCDL.2003.1204888 |isbn=978-0-7695-1939-5|url-access=subscription }}</ref>
 
[[Public domain]] works, typically books with expired copyright, are scanned by volunteers or sourced from digitization projects, and the images are run through [[optical character recognition]] (OCR) [[software]].<ref name=":2" /> Since OCR software is far from perfect, the resulting text always includes errors.<ref>{{Cite book |last=Piotrowski |first=Michael |url=https://www.google.com/books/edition/Natural_Language_Processing_for_Historic/vYhyEAAAQBAJ?hl=en&gbpv=1&pg=PA43&printsec=frontcover |title=Natural Language Processing for Historical Texts |date=2022-05-31 |publisher=Springer Nature |isbn=978-3-031-02146-6 |pages=43 |language=en}}</ref> To correct them, pages are made available to volunteers via the Internet; the original page image and the recognized text appear side by side.<ref>{{cite conference
  |author1=Gentry, Craig |author2=Ramzan, Zulfikar |author3=Stuart Stubblebine | title=Secure Distributed ''Human'' Computation
  |author1=Gentry, Craig |author2=Ramzan, Zulfikar |author3=Stuart Stubblebine | title=Secure Distributed ''Human'' Computation
  | book-title=Financial cryptography and data security: 9th International Conference
  | book-title=Financial cryptography and data security: 9th International Conference
Line 61: Line 58:
  | publisher=Springer | isbn=3-540-26656-9
  | publisher=Springer | isbn=3-540-26656-9
  | url=https://books.google.com/books?id=JegO2ly7IccC&pg=PA329
  | url=https://books.google.com/books?id=JegO2ly7IccC&pg=PA329
  | doi=10.1145/1064009.1064026 }}</ref> This process thereby distributes the time-consuming error-correction process, akin to [[distributed computing]].
  | doi=10.1145/1064009.1064026 }}</ref> Each set is presented to multiple volunteers to enter corrections, which results in a combined dataset that minimizes errors.<ref>{{Cite book |last=Christianson |first=Bruce |url=https://www.google.com/books/edition/Security_Protocols/NwGNVmUN84MC?hl=en&gbpv=1&pg=PA178&printsec=frontcover |title=Security Protocols: 14th International Workshop, Cambridge, UK, March 27-29, 2006, Revised Selected Papers |last2=Crispo |first2=Bruno |last3=Malcolm |first3=James A. |last4=Roe |first4=Michael |date=2009-10-15 |publisher=Springer Science & Business Media |isbn=978-3-642-04903-3 |pages=178 |language=en}}</ref> This process distributes the time-consuming error-correction process with a method akin to [[distributed computing]].<ref name=":0" />


Each page is proofread and formatted several times, and then a post-processor combines the pages and prepares the text for uploading to Project Gutenberg.
A post-processor combines the pages and prepares the text for uploading to Project Gutenberg.<ref name=":2" />


Besides custom software created to support the project, DP also runs a forum and a wiki for project coordinators and participants.
Besides custom software created to support the project, DP also runs a forum and a wiki for project coordinators and participants.
Line 71: Line 68:
=== DP Europe ===
=== DP Europe ===
In January 2004, Distributed Proofreaders Europe started, hosted by [[Project Rastko]], Serbia.<ref>{{cite web | first=Marie | last=Lebert | date=November 4, 2010 | title=Distributed Proofreaders, producteur des livres du Projet Gutenberg, a 10 ans | language=fr | work=Actualitté | url=http://www.actualitte.com/dossiers/1197-ebooks-projet-gutenberg-distributed-proofreaders.htm | access-date=2011-06-30 | archive-url=https://web.archive.org/web/20111005175932/http://www.actualitte.com/dossiers/1197-ebooks-projet-gutenberg-distributed-proofreaders.htm | archive-date=October 5, 2011 | url-status=live }}</ref> This site had the ability to process text in [[Unicode]] [[UTF-8]] encoding. Books proofread centered on European culture, with a considerable proportion of non-English texts including Hebrew, Arabic, Urdu, and many others. {{As of|2013|alt=As of October 2013}}, DP Europe had produced 787 e-texts, the last of these in November 2011.
In January 2004, Distributed Proofreaders Europe started, hosted by [[Project Rastko]], Serbia.<ref>{{cite web | first=Marie | last=Lebert | date=November 4, 2010 | title=Distributed Proofreaders, producteur des livres du Projet Gutenberg, a 10 ans | language=fr | work=Actualitté | url=http://www.actualitte.com/dossiers/1197-ebooks-projet-gutenberg-distributed-proofreaders.htm | access-date=2011-06-30 | archive-url=https://web.archive.org/web/20111005175932/http://www.actualitte.com/dossiers/1197-ebooks-projet-gutenberg-distributed-proofreaders.htm | archive-date=October 5, 2011 | url-status=live }}</ref> This site had the ability to process text in [[Unicode]] [[UTF-8]] encoding. Books proofread centered on European culture, with a considerable proportion of non-English texts including Hebrew, Arabic, Urdu, and many others. {{As of|2013|alt=As of October 2013}}, DP Europe had produced 787 e-texts, the last of these in November 2011.
The original DP is sometimes referred to as "DP International" by members of DP Europe. However, DP servers are located in the United States, and therefore works must be cleared by Project Gutenberg as being in the [[public domain]] according to U.S. [[copyright]] law before they can be proofread and eventually published at DP.


=== DP Canada ===
=== DP Canada ===
In December 2007, [[Distributed Proofreaders Canada]] launched to support the production of e-books for [[Project Gutenberg Canada]] and take advantage of shorter [[Copyright law of Canada|Canadian copyright]] terms. Although it was established by members of the original Distributed Proofreaders site, it is a separate entity. All its projects are posted to [[Faded Page]], their book archive website. In addition, it supplies books to Project Gutenberg Canada (which launched on [[Canada Day]] 2007) and (where copyright laws are compatible) to the original Project Gutenberg.
In December 2007, [[Distributed Proofreaders Canada]] launched to support the production of e-books for [[Project Gutenberg Canada]] and take advantage of shorter [[Copyright law of Canada|Canadian copyright]] terms. Although it was established by members of the original Distributed Proofreaders site, it is a separate entity.<ref>{{cite web |last=Lebert |first=Marie |date=November 5, 2010 |title=Distributed Proofreaders just celebrated its 10th anniversary |url=http://teleread.com/distributed-proofreaders-just-celebrated-its-10th-anniversary-by-marie-lebert/ |access-date=12 November 2025 |website=Teleread |publisher=}}</ref> All its projects are posted to [[Faded Page]], their book archive website. In addition, it supplies books to Project Gutenberg Canada, and, where copyright laws are compatible, to the original Project Gutenberg.
 
In addition to preserving [[Canadiana]], DP Canada is notable because it is the first major effort to take advantage of Canada's copyright laws which may allow more works to be preserved. Unlike copyright law in some other countries, Canada has a "life plus 50" copyright term. This means that works by authors who died more than fifty years ago may be preserved in Canada, whereas in other parts of the world those works may not be distributed because they are still under copyright.
 
Notable authors whose works may be preserved in Canada but not in other parts of the world include [[Clark Ashton Smith]], [[Dashiell Hammett]], [[Ernest Hemingway]], [[Carl Jung]], [[A. A. Milne]], [[Dorothy Sayers]], [[Nevil Shute]], [[Walter de la Mare]], [[Sheila Kaye-Smith]] and [[Amy Carmichael]].


== Milestones ==
== Milestones ==
The source for many of these entries is the DP Timeline.<ref>{{cite web |title=DP Timeline |url=https://www.pgdp.net/wiki/DP_Official_Documentation:General/DP_Timeline |access-date=2020-08-13 |website=DPWiki}}</ref>
{| class="wikitable"
{| class="wikitable"
|-
|-
! Milestone
!Milestone
! Date
!Date
! e-text
!e-text
!Source
|-
|-
! First
! First
| 1 Oct 2000
| 1 Oct 2000
| The Odyssey, [[Homer]], Lang tr. (first pages for proofreading)
| The Odyssey, [[Homer]], Lang tr. (first pages for proofreading)
| rowspan="30" |<ref>{{Cite web|title=DP Timeline - DPWiki|url=https://www.pgdp.net/wiki/DP_Official_Documentation:General/DP_Timeline|access-date=2020-08-13|website=www.pgdp.net}}</ref>
|-
|-
! 1,000th
! 1,000th
Line 113: Line 103:
| 24 Aug 2004
| 24 Aug 2004
| A Short Biographical Dictionary of English Literature, [[John William Cousin]]
| A Short Biographical Dictionary of English Literature, [[John William Cousin]]
|-
! 6,000th
| 2 Feb 2005
| [[The Journal of Sir Walter Scott]], [[Sir Walter Scott]]
|-
! 7,000th
| 23 Jun 2005
| Opúsculos por Alexandre Herculano (Vol. I), [[Alexandre Herculano]]; <br> Viage al Parnaso, [[Miguel de Cervantes]]; <br> Leabhráin an Irisleabhair-III, Various.
|-
! 8,000th
| 8 Feb 2006
| The Suppression of the African slave-trade to the United States of America, 1638-1870, [[W. E. B. Du Bois]]
|-
! 9,000th
| 8 Sep 2006
| History of the World War for Human Rights, [[Kelly Miller (scientist)|Kelly Miller]];<br> Poems, [[Christina Rossetti]];<br> Hey Diddle Diddle and Baby Bunting, [[Randolph Caldecott]]
|-
|-
! 10,000th
! 10,000th
| 9 Mar 2007
| 9 Mar 2007
| (See [[#10,000th E-book|10,000th E-book]] below)
| (See [[#10,000th E-book|10,000th E-book]] below)
|-
! 11,000th
| 12 Sep 2007
| Northern Nut Growers Association Thirty-Fourth Annual Report 1943, Northern Nut Growers Association
|-
! 12,000th
| 26 Jan 2008
| Zur Psychopathologie des Alltagslebens, [[Sigmund Freud]]
|-
! 13,000th
| 24 Jun 2008
| A World of Girls, [[L. T. Meade]]
|-
! 14,000th
| 1 Dec 2008
| The Art of Stage Dancing, [[Ned Wayburn]]
|-
|-
! 15,000th
! 15,000th
| 12 May 2009
| 12 May 2009
| Philosophical Transactions of the Royal Society - Vol 1 - 1666, Various. [[Henry Oldenburg]] (editor)
| Philosophical Transactions of the Royal Society - Vol 1 - 1666, Various. [[Henry Oldenburg]] (editor)
|-
! 16,000th
| 1 Oct 2009
| ABC Petits Contes, [[Jules Lemaître]]
|-
! 17,000th
| 4 Mar 2010
| The Position of Woman in Primitive Society, C. Gasquoine Hartley
|-
! 18,000th
| 15 Jun 2010
| Area Handbook for Romania, Eugene K. Keefe, et al.
|-
! 19,000th
| 10 Nov 2010
| [[Reynard cycle|Vanden Vos Reinaerde]] Uitgegeven en Toegelicht (anonymous)
|-
|-
! 20,000th
! 20,000th
| 10 April 2011
| 10 April 2011
| (See [[#20,000th E-book|20,000th E-book]] below)
| (See [[#20,000th E-book|20,000th E-book]] below)
|-
! 22,000th
| 2 Jan 2012
| "[[The Nibelungenlied]]", William Nanson Lettsom's translation
|-
|-
! 25,000th
! 25,000th
| 10 April 2013
| 10 April 2013
| The Art and Practice of Silver Printing, [[Henry Peach Robinson|H. P. Robinson]] and [[William de Wiveleslie Abney|Capt. Abney]]
| The Art and Practice of Silver Printing, [[Henry Peach Robinson|H. P. Robinson]] and [[William de Wiveleslie Abney|Capt. Abney]]<ref>{{cite web
|url=https://blog.pgdp.net/2013/04/10/a-silver-anniversary-25000-titles-posted/
|title=A Silver Anniversary—25,000 Titles posted to Project Gutenberg!
|date=10 April 2013
|publisher=Pgdp.net
|access-date=20 October 2025}}</ref>
|-
|-
! 30,000th
! 30,000th
| 7 July 2015
| 7 July 2015
| Graded Literature Readers: Fourth Book
| Graded Literature Readers: Fourth Book<ref>{{cite web
|url=https://blog.pgdp.net/2015/07/07/celebrating-30000-titles/
|title=Celebrating 30,000 Titles
|date=7 July 2015
|publisher=Pgdp.net
|access-date=20 October 2025}}</ref>
|-
|-
! 35,000th
! 35,000th
| 26 Jan 2018
| 26 Jan 2018
| Shores of the Polar Sea, a Narrative of the Arctic Expedition of 1875–1876
| Shores of the Polar Sea, a Narrative of the Arctic Expedition of 1875–1876<ref>{{cite web
|-
|url=https://blog.pgdp.net/2018/01/26/celebrating-35000-titles/
!36,000th
|title=Celebrating 35,000 Titles
|7 September 2018
|date=26 January 2018
|American Missionary
|publisher=Pgdp.net
|-
|access-date=20 October 2025}}</ref>
!37,000th
|16 April 2019
|French Painting of the 19th Century in the National Gallery of Art
|-
!38,000th
|8 November 2019
|The Birds of Australia (Vol. 3 of 7)
|-
!39,000th
|27 April 2020
|Wilhelm Hauffs sämtliche Werke in sechs Bänden. Bd. 6
|-
|-
!40,000th
!40,000th
|10 October 2020
|10 October 2020
|All four volumes of [[London Labour and the London Poor]]<ref>{{cite web|url=https://blog.pgdp.net/2020/10/10/celebrating-40000-titles |title=Celebrating 40,000 Titles |date=10 October 2020 |publisher=Pgdp.net}}</ref>
|All four volumes of [[London Labour and the London Poor]]<ref>{{cite web
|-
|url=https://blog.pgdp.net/2020/10/10/celebrating-40000-titles
!41,000th
|title=Celebrating 40,000 Titles
|5 March 2021
|date=10 October 2020
|[[Clara Barton]]'s The story of my childhood<ref>{{cite web|url=https://blog.pgdp.net/2021/03/05/celebrating-41000-titles |title=Celebrating 41,000 Titles |date=5 March 2021 |publisher=Pgdp.net}}</ref>
  |publisher=Pgdp.net}}</ref>
|-
!42,000th
|3 August 2021
|Carry On, Jeeves
|-
!43,000th
|31 January 2022
|Die Sitten der Völker, Zweiter Band<ref>{{cite web
| url = https://blog.pgdp.net/2022/02/01/celebrating-43000-titles/
| title = Celebrating 43,000 Titles
  | date = February 1, 2022
| website = blog.pgdp.net
| access-date = February 1, 2022}}</ref>
|-
!44,000th
|19 July 2022
|The trial of [[Émile Zola]]<ref>{{cite web |url=https://blog.pgdp.net/2022/07/19/celebrating-44000-titles/ |title=Celebrating 44,000 Titles |date=19 July 2022 |publisher=Pgdp.net}}</ref>
|-
|-
!45,000th
!45,000th
|18 January 2023
|18 January 2023
|Elihu Stewart's Down the Mackenzie and Up the Yukon in 1906<ref>{{cite web |url=https://blog.pgdp.net/2023/01/18/celebrating-45000-titles/ |title=Celebrating 45,000 Titles |date=18 January 2023 |publisher=Pgdp.net}}</ref>
|Elihu Stewart's Down the Mackenzie and Up the Yukon in 1906<ref>{{cite web
|-
|url=https://blog.pgdp.net/2023/01/18/celebrating-45000-titles/
!46,000th
|title=Celebrating 45,000 Titles
|3 July 2023
|date=18 January 2023
|The English and Scottish Popular Ballads (the [[Child Ballads]])<ref>{{cite web |url=https://blog.pgdp.net/2023/07/03/celebrating-46000-titles/ |title=Celebrating 46,000 Titles |date=3 July 2021 |publisher=Pgdp.net}}</ref>
|publisher=Pgdp.net}}</ref>
|-
!47,000th
|20 December 2023
|The [[Betty Crocker]] Picture Cooky Book<ref name=47000th>{{cite web |url=https://blog.pgdp.net/2023/12/20/celebrating-47000-titles/ |title=Celebrating 47,000 Titles |date=20 December 2023 |publisher=Pgdp.net}}</ref>
|-
!48,000th
|19 July 2024
|The Reign of King Oberon<ref name=48000th>{{cite web |url=https://blog.pgdp.net/2024/07/19/celebrating-48000-titles/ |title=Celebrating 48,000 Titles |date=19 July 2024 |publisher=Pgdp.net}}</ref>
|-
|}
|}


Line 286: Line 197:
On 7 July 2015, the 30,000th book milestone was celebrated with a group of thirty texts. One was numbered 30,000:<ref>{{cite web|url=http://www.pgdp.net/phpBB2/viewtopic.php?t=59120 |title=Distributed Proofreaders • View topic - 30,000 Unique Titles Preserved! |publisher=Pgdp.net |access-date=2016-09-15}}</ref>
On 7 July 2015, the 30,000th book milestone was celebrated with a group of thirty texts. One was numbered 30,000:<ref>{{cite web|url=http://www.pgdp.net/phpBB2/viewtopic.php?t=59120 |title=Distributed Proofreaders • View topic - 30,000 Unique Titles Preserved! |publisher=Pgdp.net |access-date=2016-09-15}}</ref>
*''Graded literature readers - Fourth book'', editors: [[Harry Pratt Judson]] and Ida C. Bender, 1900
*''Graded literature readers - Fourth book'', editors: [[Harry Pratt Judson]] and Ida C. Bender, 1900
=== 40,000th E-book ===
On 10 October 2020, the 40,000th book milestone was celebrated with the completion of a four-volume work, ''[[London Labour and the London Poor]]'', by [[Henry Mayhew]].<ref>{{cite web |url=https://blog.pgdp.net/2020/10/10/celebrating-40000-titles/ |title=Celebrating 40,000 Titles |publisher=Pgdp.net |access-date=2025-09-01}}</ref>


==See also==
==See also==

Latest revision as of 06:19, 17 November 2025

Distributed Proofreaders (commonly abbreviated as DP or PGDP) is a web-based project that supports the development of e-texts for Project Gutenberg by allowing many people to work together in proofreading drafts of e-texts for errors. As of April 2025, the site had digitized 49,000 titles.[1]

History

Distributed Proofreaders was founded by Charles Franks in 2000 as an independent site to assist Project Gutenberg.[2][3] Distributed Proofreaders became an official Project Gutenberg site in 2002.[3]

On 8 November 2002, Distributed Proofreaders was slashdotted,[4][5] and more than 4,000 new members joined in one day, causing an influx of new proofreaders and software developers, which helped to increase the quantity and quality of e-text production.

In 2006, the Distributed Proofreaders Foundation was formed to provide Distributed Proofreaders with its own legal entity and not-for-profit status, separate from Project Gutenberg.[6][7] The founding trustees were Charles Franks, Juliet Sutherland, and Gregory B. Newby.[6]

In July 2015, the 30,000th e-text produced by Distributed Proofreaders was posted to Project Gutenberg. DP-contributed e-texts comprised more than half of works in Project Gutenberg by 2009.[3]

Proofreading process

DP servers are located in the United States, and therefore works must be cleared by Project Gutenberg as being in the public domain according to United States copyright law before they can be proofread and eventually published.[8]

Public domain works, typically books with expired copyright, are scanned by volunteers or sourced from digitization projects, and the images are run through optical character recognition (OCR) software.[8] Since OCR software is far from perfect, the resulting text always includes errors.[9] To correct them, pages are made available to volunteers via the Internet; the original page image and the recognized text appear side by side.[10] Each page is presented to multiple volunteers to enter corrections, which results in a combined dataset that minimizes errors.[11] This approach distributes the time-consuming error-correction work among many volunteers, in a manner akin to distributed computing.[2]

A post-processor combines the pages and prepares the text for uploading to Project Gutenberg.[8]

Besides custom software created to support the project, DP also runs a forum and a wiki for project coordinators and participants.

Related projects

DP Europe

In January 2004, Distributed Proofreaders Europe started, hosted by Project Rastko, Serbia.[12] This site had the ability to process text in Unicode UTF-8 encoding. Books proofread centered on European culture, with a considerable proportion of non-English texts including Hebrew, Arabic, Urdu, and many others. As of October 2013, DP Europe had produced 787 e-texts, the last of these in November 2011.

DP Canada

In December 2007, Distributed Proofreaders Canada launched to support the production of e-books for Project Gutenberg Canada and take advantage of shorter Canadian copyright terms. Although it was established by members of the original Distributed Proofreaders site, it is a separate entity.[13] All its projects are posted to Faded Page, its book archive website. In addition, it supplies books to Project Gutenberg Canada and, where copyright laws are compatible, to the original Project Gutenberg.

Milestones

The source for many of these entries is the DP Timeline.[14]

Milestone | Date | e-text
First | 1 Oct 2000 | The Odyssey, Homer, Lang tr. (first pages for proofreading)
1,000th | 19 Feb 2003 | Tales of St. Austin's, P. G. Wodehouse
2,000th | 3 Sep 2003 | Hamlet — the 'Bad Quarto', William Shakespeare
3,000th | 14 Jan 2004 | The Anatomy of Melancholy, Robert Burton
4,000th | 6 Apr 2004 | Aventures du Capitaine Hatteras, Jules Verne
5,000th | 24 Aug 2004 | A Short Biographical Dictionary of English Literature, John William Cousin
10,000th | 9 Mar 2007 | (See 10,000th E-book below)
15,000th | 12 May 2009 | Philosophical Transactions of the Royal Society - Vol 1 - 1666, Various. Henry Oldenburg (editor)
20,000th | 10 April 2011 | (See 20,000th E-book below)
25,000th | 10 April 2013 | The Art and Practice of Silver Printing, H. P. Robinson and Capt. Abney[15]
30,000th | 7 July 2015 | Graded Literature Readers: Fourth Book[16]
35,000th | 26 Jan 2018 | Shores of the Polar Sea, a Narrative of the Arctic Expedition of 1875–1876[17]
40,000th | 10 October 2020 | All four volumes of London Labour and the London Poor[18]
45,000th | 18 January 2023 | Elihu Stewart's Down the Mackenzie and Up the Yukon in 1906[19]

10,000th E-book

On 9 March 2007, Distributed Proofreaders announced the completion of more than 10,000 titles. In celebration, a collection of fifteen titles was published:

20,000th E-book

On 10 April 2011, the 20,000th book milestone was celebrated as a group release of bilingual books:[20]

30,000th E-book

On 7 July 2015, the 30,000th book milestone was celebrated with a group of thirty texts. One was numbered 30,000:[21]

  • Graded literature readers - Fourth book, editors: Harry Pratt Judson and Ida C. Bender, 1900

40,000th E-book

On 10 October 2020, the 40,000th book milestone was celebrated with the completion of a four-volume work, London Labour and the London Poor, by Henry Mayhew.[22]

See also

References

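  1. Cantoni, Linda (2025-04-12). "Celebrating 49,000 Titles | Hot off the Press". Hot off the Press: Book Reviews and Notes from Distributed Proofreaders. Distributed Proofreaders. https://blog.pgdp.net/2025/04/12/celebrating-49000-titles/. Retrieved 2025-10-24.
  2. Lessig, Lawrence (2009). Remix: Making Art and Commerce Thrive in the Hybrid Economy. Penguin. p. 167. ISBN 978-0-14-311613-4. https://books.google.com/books?id=7eRPKIvEo9gC&pg=PT167.
  3. "Project Gutenberg". Encyclopedia Britannica. 11 August 2025. https://www.britannica.com/topic/Project-Gutenberg. Retrieved 20 October 2025.
  4. "Gutenberg:Volunteers' Voices". Project Gutenberg. http://www.gutenberg.org/wiki/Gutenberg:Volunteers%27_Voices#Suzanne_Shell. Archived from the original on 2008-09-18. Retrieved 2008-07-12.
  5. "Distributed Proofreading's slashdotting". Boing Boing. 12 November 2002. http://www.boingboing.net/2002/11/12/distributed-proofrea.html. Archived on 2007-11-09. Retrieved 2008-07-12.
  6. "Distributed Proofreaders Foundation History". DPWiki. 2025-06-18. https://www.pgdp.net/wiki/DPFoundation:Distributed_Proofreaders_Foundation_History. Retrieved 2025-10-20.
  7. Last, John (October 31, 2025). "Obituary: Project Gutenberg CEO Greg Newby helped put a trove of literature online". The Globe and Mail. https://www.theglobeandmail.com/canada/article-greg-newby-project-gutenberg-ceo-literature-public-domain-hacker/. Retrieved November 4, 2025.
  8. Newby, G. B.; Franks, C. (2003). "Distributed proofreading". 2003 Joint Conference on Digital Libraries, 2003. Proceedings. IEEE Comput. Soc. pp. 361–363. doi:10.1109/JCDL.2003.1204888. ISBN 978-0-7695-1939-5. http://ieeexplore.ieee.org/document/1204888/.
  9. Piotrowski, Michael (2022-05-31). Natural Language Processing for Historical Texts. Springer Nature. p. 43. ISBN 978-3-031-02146-6. https://www.google.com/books/edition/Natural_Language_Processing_for_Historic/vYhyEAAAQBAJ?hl=en&gbpv=1&pg=PA43&printsec=frontcover.
  10. Gentry, Craig; Ramzan, Zulfikar; Stubblebine, Stuart. "Secure Distributed Human Computation". Financial cryptography and data security: 9th International Conference. Springer. ISBN 3-540-26656-9. doi:10.1145/1064009.1064026. https://books.google.com/books?id=JegO2ly7IccC&pg=PA329.
  11. Christianson, Bruce; Crispo, Bruno; Malcolm, James A.; Roe, Michael (2009-10-15). Security Protocols: 14th International Workshop, Cambridge, UK, March 27-29, 2006, Revised Selected Papers. Springer Science & Business Media. p. 178. ISBN 978-3-642-04903-3. https://www.google.com/books/edition/Security_Protocols/NwGNVmUN84MC?hl=en&gbpv=1&pg=PA178&printsec=frontcover.
  12. Lebert, Marie (November 4, 2010). "Distributed Proofreaders, producteur des livres du Projet Gutenberg, a 10 ans" (in French). Actualitté. http://www.actualitte.com/dossiers/1197-ebooks-projet-gutenberg-distributed-proofreaders.htm. Archived from the original on October 5, 2011. Retrieved 2011-06-30.
  13. Lebert, Marie (November 5, 2010). "Distributed Proofreaders just celebrated its 10th anniversary". Teleread. http://teleread.com/distributed-proofreaders-just-celebrated-its-10th-anniversary-by-marie-lebert/. Retrieved 12 November 2025.
  14. "DP Timeline". DPWiki. https://www.pgdp.net/wiki/DP_Official_Documentation:General/DP_Timeline. Retrieved 2020-08-13.
  15. "A Silver Anniversary—25,000 Titles posted to Project Gutenberg!". Pgdp.net. 10 April 2013. https://blog.pgdp.net/2013/04/10/a-silver-anniversary-25000-titles-posted/. Retrieved 20 October 2025.
  16. "Celebrating 30,000 Titles". Pgdp.net. 7 July 2015. https://blog.pgdp.net/2015/07/07/celebrating-30000-titles/. Retrieved 20 October 2025.
  17. "Celebrating 35,000 Titles". Pgdp.net. 26 January 2018. https://blog.pgdp.net/2018/01/26/celebrating-35000-titles/. Retrieved 20 October 2025.
  18. "Celebrating 40,000 Titles". Pgdp.net. 10 October 2020. https://blog.pgdp.net/2020/10/10/celebrating-40000-titles.
  19. "Celebrating 45,000 Titles". Pgdp.net. 18 January 2023. https://blog.pgdp.net/2023/01/18/celebrating-45000-titles/.
  20. Distributed Proofreaders celebrates 20,000 books posted, Distributed Proofreaders, April 10, 2011
  21. "Distributed Proofreaders • View topic - 30,000 Unique Titles Preserved!". Pgdp.net. http://www.pgdp.net/phpBB2/viewtopic.php?t=59120. Retrieved 2016-09-15.
  22. "Celebrating 40,000 Titles". Pgdp.net. https://blog.pgdp.net/2020/10/10/celebrating-40000-titles/. Retrieved 2025-09-01.

External links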