On digitizing public-domain Chinese books or other resources

For example, de:wikisource receives 2000 Euro each year from the German chapter for digitising. Each digitising project costs between 30 Euro and 100 Euro. Source

Goldzahn 05:25, 8 April 2010

With a ballpark estimate of 1 million older Chinese works to be digitized - figure on 30 million Euro?

Collect 10:11, 8 April 2010
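As a rough check of the figures quoted above, the arithmetic can be sketched as follows. The function names are hypothetical, and the inputs are only the numbers mentioned in this thread (1 million works, 30 to 100 Euro per project, 2000 Euro per year); treating one work as one project is an assumption.

```python
def cost_range(works, low_per_work, high_per_work):
    """Return (low, high) total cost in Euro for digitising `works` items."""
    return works * low_per_work, works * high_per_work


def years_at_budget(total_cost, yearly_budget):
    """Return how many years a fixed yearly budget would take to cover total_cost."""
    return total_cost / yearly_budget


# Figures from the thread; one work == one project is an assumption.
low, high = cost_range(1_000_000, 30, 100)
print(f"Estimated cost: {low:,} to {high:,} Euro")
print(f"Years at 2000 Euro/year (low estimate): {years_at_budget(low, 2000):,.0f}")
```

On these assumptions, the 30 million Euro figure is the low end of the range, since it uses the cheapest per-project cost.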

That is not the point. The problem is to find people who will do the work for free, and there is far more work to do than just placing a book on a scanner. I guess that one person can proofread no more than 10 pages a day. There is a second problem. I know that there are maybe 50,000 Chinese characters, but most people know just a few thousand. If you don't know a character, you can't tell whether the scan is correct. That means you need highly skilled people. Those people don't like to work for free, and there are not many of them.

Goldzahn 11:39, 8 April 2010

As for the problem of Chinese characters, that is not the case. With some input methods (for example, the Wubi method), you can input characters by their structure rather than their pronunciation. Wubi is popular in China.

We are just at the stage of moving from ideas to a concrete plan. I will draft the details of the plan soon, and I don't think it is a big project.

Mountain 12:18, 8 April 2010

Just be sure not to underestimate what is involved - I am sure Amazon and Google have a strong interest here.

For Goldzahn - I already raised the issue of older words. The Kangxi dictionary has under 50,000 characters - which likely covers most ones to be found in out-of-copyright material. Apparently people are considered proficient with a knowledge of about 7,000 characters (which combine to form a much larger number of words). I therefore would defer to Mountain that the task is doable.

Collect 13:09, 8 April 2010