On digitizing public-domain Chinese books or other resources
Fragment of a discussion from Talk:Task force/Strategy
Just be sure not to underestimate what is involved - I am sure Amazon and Google have a strong interest here.
For Goldzahn - I already raised the issue of older words. The Kangxi dictionary has under 50,000 characters, which likely covers most of those found in out-of-copyright material. Apparently people are considered proficient with a knowledge of about 7,000 characters (which combine to form a much larger number of words). I would therefore defer to Mountain's view that the task is doable.