Ivan Kurmanov writes
I've been working on nebka for the last week or more, on the RAS database.
The first thing I did was modify the ACIS code to use Storable's nfreeze() function when storing data into the db. At that point I also found that part of the ACIS code on nebka was already using nfreeze().
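For reference, the change is just which Storable function does the writing. A minimal sketch (the record contents here are made up):

```perl
use strict;
use warnings;
use Storable qw(nfreeze thaw);

# nfreeze() writes in network (big-endian) byte order, so the frozen
# string can be thawed on a machine with a different native byte
# order; plain freeze() does not guarantee that.
my $record = { id => 'pkr1', name => 'Thomas Krichel' };
my $frozen = nfreeze( $record );

# thaw() reads both formats, so data already written with freeze()
# stays readable while new writes use the portable encoding.
my $copy = thaw( $frozen );
print $copy->{name}, "\n";
```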
Then I wrote a test script which runs through the specified database tables (MySQL) and checks that their values were written with Storable correctly.
Then for the last 5 days or so I was working on the update daemon database (Berkeley DB) of RAS on nebka. It also contains Storable-encoded strings, and it is important. First, I wrote a script (actually, a version of the same script mentioned above) to check the values in it (by a full scan, checking each one). That found some issues.
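The full-scan check could look roughly like this, assuming the daemon database is tied through DB_File as a btree (the file name here is hypothetical; the real file lives under ~/acis/RI/data):

```perl
use strict;
use warnings;
use Fcntl qw(O_RDONLY);
use DB_File;
use Storable qw(thaw);

# Hypothetical file name, for illustration only.
my %db;
tie %db, 'DB_File', 'records.db', O_RDONLY, 0644, $DB_BTREE
    or die "cannot open records.db: $!";

# Full scan: a value that thaw() cannot decode (it croaks, caught by
# eval) is either not Storable data at all or a corrupted frozen string.
while ( my ( $key, $value ) = each %db ) {
    my $data = eval { thaw( $value ) };
    print "bad value at key: $key\n" unless defined $data;
}
untie %db;
```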
In fact, that check found some serious data corruption: some keys were mixed with values, and some values were hopelessly broken.
I did suspect things were not OK, because when I do fewer updates (timeout longer than the default 1 week) things are not updated properly. I got serious complaints from CZ about it. I wanted to reduce the load on the box by setting a higher TOO_OLD.
Second, I wrote a script to correct the values that need correction (via nfreeze) and to remove the ones that cannot be corrected.
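A self-contained sketch of that repair pass, under the same DB_File assumption (the file name and seeded records are made up for the demo; the real script presumably handles more cases):

```perl
use strict;
use warnings;
use Fcntl qw(O_RDWR O_CREAT);
use DB_File;
use Storable qw(nfreeze thaw);

# Demo file; the real database lives under ~/acis/RI/data.
my $file = '/tmp/records-fix-demo.db';
my %db;
tie %db, 'DB_File', $file, O_RDWR|O_CREAT, 0644, $DB_BTREE
    or die "cannot open $file: $!";

# Seed one good frozen value and one broken one, for the demo.
$db{good} = nfreeze( { id => 1 } );
$db{bad}  = 'not a Storable string';

# The repair pass: re-freeze whatever still thaws, drop the rest.
for my $key ( keys %db ) {
    my $data = eval { thaw( $db{$key} ) };
    if ( defined $data ) {
        $db{$key} = nfreeze( $data );   # rewrite in portable format
    }
    else {
        delete $db{$key};               # hopeless: remove it
    }
}
print join( ',', sort keys %db ), "\n";  # only 'good' should survive
untie %db;
unlink $file;
```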
That work is now done. BUT at least part of the update daemon database there is still corrupted. The good news is that this part is not the RePEc part, and it would not (should not) cause any trouble to rebuild it. The bad news is that it is corrupted at the internal Berkeley DB level, and I do not know how to fix it. The corruption is reported by the db4.6_verify tool. The Berkeley DB documentation is not clear on how to fix it, or I'm not looking hard enough.
I do a db_dump foo | db_load foo_clean when I am desperate. On PubMed's 20 million records, that takes several days. I have been desperate many times.
Which leads me to a whole other topic: in the longer run we should avoid Berkeley DB and use something else instead. Luckily, this shouldn't be too hard to do, since all the Berkeley DB-related code is concentrated in a couple of modules or so. And there are alternatives, Kyoto Cabinet http://fallabs.com/kyotocabinet/ being one of them.
Or mongo, or couch.... I have had *tons* of trouble with BDB. I hate it. But still I don't think we should work on this now. CZ will kick our butts. You need to look at the ACIS code to find where the problem is. Blaming the problems on BDB is not the way forward, I think.
Now, I've moved out the corrupted BDB file ~/acis/RI/data/ACIS/records (renamed it).
I think the data is now ready to be migrated to the new server (again). The changes in ~/acis/ need to be copied over too, as well as the (partially) fixed ~/acis/RI/data contents. Dan, I suggest you do that.
I think he should not do this at all. Instead he should start with the text data and the most recent ACIS release, and build a clean dataset. I have done this today and I will forward my notes on it.
It may be a good idea to wait for the nearest run of the nightly script to create the RI database snapshots in ~/backup/2011/08/26 and to use those.
With these data and code changes, it should all run on the new server, but we'll do our tests and checks as soon as it is copied.
Any comments? Questions?
I bet that this will not work. Dan will still not be able to read your storables, and he will not read his own storables when he updates perl. Get rid of Storable. Keep BDB, for now. Cheers, Thomas Krichel http://openlib.org/home/krichel http://authorprofile.org/pkr1 skype: thomaskrichel