Proposal talk:Implement and deploy checksum revision table
|Thread title||Replies||Last modified|
|Hash in revision or text ?||1||13:17, 4 September 2011|
|Index||1||13:11, 4 September 2011|
Right now the proposal title and the committed patch implement this in the revision table. Why is that? In my opinion it makes more sense in the text table (which the introduction paragraph of the proposal also mentions as the target table).
It's the hash of the text, not of the revision metadata. There can (and should) be multiple revisions with the same hash of the revision text. Right now MediaWiki only re-uses a text-table row if a revision is a direct revert of an earlier revision (using the "rollback" feature). If a normal undo takes place, or if several editors edited after the vandalism and a user had to dig back manually and save an old revision, then MediaWiki stores a second copy of the text.
Anyway, just to bring this up. Do we want it in the text table?
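To make the re-use idea above concrete, here is a minimal sketch in Python. It is not MediaWiki code: the `TextStore` class, its fields, and the choice of SHA-1 are all assumptions made for illustration. It only shows how keying stored text by its hash would let a manual revert point at the existing row instead of storing a second copy.

```python
import hashlib


class TextStore:
    """Toy stand-in for a text table: one row per distinct text blob.

    Hypothetical illustration only; names and the SHA-1 choice are assumptions.
    """

    def __init__(self):
        self.rows = {}      # text_id -> text
        self.by_hash = {}   # hex digest -> text_id of the first row with that digest
        self.next_id = 1

    def save(self, text: str) -> int:
        """Return the id of the row holding `text`, creating a row only if needed."""
        digest = hashlib.sha1(text.encode("utf-8")).hexdigest()
        existing = self.by_hash.get(digest)
        # Compare the full text too, so a hash collision cannot alias different texts.
        if existing is not None and self.rows[existing] == text:
            return existing
        text_id = self.next_id
        self.next_id += 1
        self.rows[text_id] = text
        self.by_hash.setdefault(digest, text_id)
        return text_id


store = TextStore()
good = store.save("Good article text")
vandalised = store.save("VANDALISM")
restored = store.save("Good article text")  # manual revert, not a rollback
```

In this sketch `restored` ends up equal to `good`, and only two rows are stored for the three saves.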
No index needed? If we want to re-use text-table rows and query by a generated hash when saving the revision text, we would need an index, right?
Yes, if you want to query by hash then you would obviously need an index, but I haven't yet heard a use case where we would really want to query the hash column often. In addition, the checksum will not always be unique across different pages: if two different pages have been blanked, they will have the same hash. So we might need a compound (multi-column) index in that case, but I would like to hear more use cases before we decide on including an index.
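For illustration, the blanked-page case can be reproduced with a small in-memory SQLite database from Python. Everything here is an assumption made for the sketch: the `revision_demo` table, its column names, the `page_checksum` index, and the use of SHA-1 are hypothetical and are not the actual MediaWiki schema or the committed patch.

```python
import hashlib
import sqlite3


def sha1_hex(text: str) -> str:
    """Hex SHA-1 of the revision text (the hash algorithm is an assumption)."""
    return hashlib.sha1(text.encode("utf-8")).hexdigest()


conn = sqlite3.connect(":memory:")
# Hypothetical table, not the real revision table.
conn.execute(
    "CREATE TABLE revision_demo ("
    " rev_id INTEGER PRIMARY KEY,"
    " page_id INTEGER NOT NULL,"
    " checksum TEXT NOT NULL)"
)
# Compound index: lookups are scoped to a page first, then to the checksum.
conn.execute("CREATE INDEX page_checksum ON revision_demo (page_id, checksum)")

# (page_id, text) pairs: both pages get blanked once, page 1 is later restored.
edits = [
    (1, "Some article text"),
    (1, ""),
    (2, "Other text"),
    (2, ""),
    (1, "Some article text"),
]
conn.executemany(
    "INSERT INTO revision_demo (page_id, checksum) VALUES (?, ?)",
    [(page_id, sha1_hex(text)) for page_id, text in edits],
)

# The checksum of the empty text shows up on more than one page.
pages_with_blank = conn.execute(
    "SELECT COUNT(DISTINCT page_id) FROM revision_demo WHERE checksum = ?",
    (sha1_hex(""),),
).fetchone()[0]

# A per-page lookup by checksum, the kind of query the compound index would serve.
matches_on_page_1 = conn.execute(
    "SELECT rev_id FROM revision_demo"
    " WHERE page_id = ? AND checksum = ? ORDER BY rev_id",
    (1, sha1_hex("Some article text")),
).fetchall()
```

Here the blank checksum appears on both pages, while the per-page query finds only the first and the restored revision of page 1. Whether such a query is common enough to justify the index is exactly the open question above.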