On Fri, Aug 26, 2011 at 4:44 PM, Thomas Krichel <krichel@openlib.org> wrote:
Berkeley DB level and I do not know how to fix it. The corruption is reported by the db4.6_verify tool. Berkeley DB documentation is not clear on how to fix it, or I'm not looking hard enough.
I do a
db_dump foo | db_load foo_clean
when I am desperate. On the 20 million PubMed records, that takes several days to do.
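A slightly more defensive version of that dump-and-reload step might look like the sketch below. It assumes the stock Berkeley DB utilities db_dump, db_load and db_verify are on the PATH. Some systems ship versioned names such as the db4.6_verify mentioned above, so the tool names can be overridden. The rebuild_db helper and its file layout are hypothetical, not part of ACIS.

```shell
# Sketch: rebuild a Berkeley DB file by dumping it to text and reloading it.
# Default tool names are assumptions; override them for versioned binaries,
# e.g. DB_VERIFY=db4.6_verify.
DB_VERIFY="${DB_VERIFY:-db_verify}"
DB_DUMP="${DB_DUMP:-db_dump}"
DB_LOAD="${DB_LOAD:-db_load}"

# rebuild_db SRC DEST  (hypothetical helper name)
rebuild_db() {
    src="$1"
    dest="$2"
    # Refuse to reuse an existing file, so old and new records never mix.
    if [ -e "$dest" ]; then
        echo "refusing to overwrite $dest" >&2
        return 1
    fi
    # Dump to an intermediate text file first, so that a failed dump
    # never feeds a half-finished stream into the load step.
    dump="$dest.dump"
    "$DB_DUMP" "$src" > "$dump" || return 1
    # -f makes db_load read from the dump file instead of stdin.
    "$DB_LOAD" -f "$dump" "$dest" || return 1
    # Check the rebuilt copy before trusting it.
    "$DB_VERIFY" "$dest"
}

# Usage: rebuild_db foo foo_clean
```

If the plain dump fails on a damaged file, db_dump also has a salvage mode (-r), though how much it recovers depends on the damage.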
wow. you are right and i didn't think of this method. luckily, we do not need it, since i've already removed the corrupted part and started over.
Which leads me to a whole other topic: in the longer run we should avoid Berkeley DB and use something else instead. Luckily, this shouldn't be too hard to do, since all the Berkeley DB-related code is concentrated in a couple of modules or so. And there are alternatives, Kyoto Cabinet (http://fallabs.com/kyotocabinet/) being one of them.
or mongo or couch....
I have had *tons* of trouble with BDB. I hate it. But still I don't think we should work on this now. CZ will kick our butts.
You need to look at the ACIS code to find where the problem is. Blaming the problems on BDB is not the way forward, I think.
All the corruption happened in the ACIS part of the RI db, not in the other parts of it. Most of the records that had value and key mixed up seem to come from January 2011. My guess is that it was from the time when nebka ran out of disk space or something similar. So, i do not blame BDB itself, but the whole set of circumstances. But the idea is that we need more viable storage.
Now, I've moved out the corrupted BDB file ~/acis/RI/data/ACIS/records (renamed it).
and now I've actually removed it.
I think the data is now ready to be migrated to the new server (again). The changes in ~/acis/ need to be copied over too, as well as the (partially) fixed ~/acis/RI/data contents. Dan, I suggest you do that.
I think he should not do this at all. Instead he should start with the text data, and the most recent ACIS release, and build a clear dataset. I have done this today and I will forward my notes on it.
Thomas, sorry, i see your point, but please let me do my job here.
It may be a good idea to wait for the nearest run of the nightly script to create the RI database snapshots in ~/backup/2011/08/26 and to use those.
With these data and code changes, it should all run on the new server, but we'll do our tests and checks as soon as it is copied.
Any comments? Questions?
I bet that this will not work. Dan will still not be able to read your storables, and he will not be able to read his own storables when he updates Perl. Get rid of Storable. Keep BDB, for now.
You can bet, that's ok, but please not on the RAS-run list. Storable has brought you enough trouble, and now some to RAS, but that's a separate topic also. -ivan