...
- Check the leveldb cache store directory in the backup using a leveldb client.
- Verify that the leveldb opens.
- Verify by iterating through the keys (to expose any corruption).
Based on the flow of the background compaction process in the leveldb implementation ([1] and [2]), the manifest file is updated at the end of a compaction, and obsolete files are deleted only after that update. To verify a backup, we can therefore copy the manifest file first, followed by the rest of the files, and at the end of the backup compare the backed-up manifest with the current manifest. If the manifest is unchanged, the backup can be considered successful; if it has changed, a compaction may have finished during the copy, so the backup may be inconsistent and should be retried.
[1] https://github.com/google/leveldb/blob/master/db/db_impl.cc#L655
[2] https://github.com/google/leveldb/blob/master/db/version_set.cc#L811
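The manifest-first check described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not one of the scripts used on this page: the helper name `copy_with_manifest_check` and its arguments are hypothetical, and it assumes a flat leveldb directory (regular files only) whose manifest is named `MANIFEST-*`.

```python
import filecmp
import glob
import os
import shutil


def copy_with_manifest_check(src_dir, dst_dir):
    """Copy a leveldb directory, manifest first.

    Returns True if the manifest is unchanged after the copy,
    False otherwise. Hypothetical helper, for illustration only.
    """
    os.makedirs(dst_dir)  # raises if dst_dir already exists

    # 1. Note which manifest file(s) exist before copying anything.
    manifests = sorted(glob.glob(os.path.join(src_dir, "MANIFEST-*")))
    if not manifests:
        return False

    try:
        # 2. Copy the manifest first, then the remaining regular files.
        for path in manifests:
            shutil.copy2(path, dst_dir)
        for name in os.listdir(src_dir):
            path = os.path.join(src_dir, name)
            if path not in manifests and os.path.isfile(path):
                shutil.copy2(path, dst_dir)
    except (IOError, OSError):
        # A file vanished mid-copy, e.g. deleted as obsolete after a compaction.
        return False

    # 3. Compare the backed-up manifest with the current one.
    current = sorted(glob.glob(os.path.join(src_dir, "MANIFEST-*")))
    if current != manifests:
        return False  # the manifest file was replaced during the copy
    return all(
        filecmp.cmp(path, os.path.join(dst_dir, os.path.basename(path)), shallow=False)
        for path in manifests
    )
```

A caller would retry with a fresh destination directory whenever the function returns False, much like the attempt loop in the backup script.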
The following script can be used to perform a hot backup with verification (it requires the verify and repair scripts shown later on this page):
Code Block | ||||
---|---|---|---|---|
| ||||
#!/bin/bash
# Location of the fcrepo home directory
FCREPO_HOME=/var/lib/tomcat7/fcrepo4-data
# Destination directory
BACKUP_TO=/home/vagrant/backup
# Keep an exact copy of the backup (without the structural changes introduced by the Python verification script).
# (Additional temporary storage is used if true.)
BACKUP_EXACT=true
# Maximum number of backup attempts on failure
ATTEMPTS=10
echo "$(date) $0: fcrepo home dir: $FCREPO_HOME"
echo "$(date) $0: backup dir: $BACKUP_TO"
echo "$(date) $0: max attempts on failure: $ATTEMPTS"
# Create the destination directory if it does not exist yet
mkdir -p "$BACKUP_TO"
LEVELDB_DIR=fcrepo.ispn.repo.cache
DATA_DIR=dataFedoraRepository
backup_succeeded=false
attempts=$ATTEMPTS
echo "$(date) $0: Backing up leveldb."
while [ "$attempts" -gt 0 ]; do
    # Checksum the manifest before the copy
    MANIFEST_MD5=$(md5sum "$FCREPO_HOME/$LEVELDB_DIR/$DATA_DIR"/MANIFEST-*)
    rm -rf "$BACKUP_TO/${LEVELDB_DIR}-tmp"
    cp -r "$FCREPO_HOME/$LEVELDB_DIR" "$BACKUP_TO/${LEVELDB_DIR}-tmp"
    copy_success=$?
    # Checksum the manifest again after the copy
    MANIFEST_MD5_POST_BACKUP=$(md5sum "$FCREPO_HOME/$LEVELDB_DIR/$DATA_DIR"/MANIFEST-*)
    if [ "$MANIFEST_MD5" = "$MANIFEST_MD5_POST_BACKUP" ] && [ "$copy_success" = "0" ]; then
        backup_succeeded=true
        break
    fi
    attempts=$((attempts - 1))
    echo "$(date) $0: Copy failed or leveldb manifest changed during backup process! $attempts attempts remaining."
done
if [ "$backup_succeeded" = false ]; then
    echo "$(date) $0: Failed to backup with a consistent leveldb manifest!"
else
    echo "$(date) $0: Backup created and verified leveldb manifest consistency!"
fi
# Keep an untouched copy, because the verification script renames files in the backup
if [ "$BACKUP_EXACT" = true ]; then
    rm -rf "$BACKUP_TO/${LEVELDB_DIR}-unchanged"
    cp -r "$BACKUP_TO/${LEVELDB_DIR}-tmp" "$BACKUP_TO/${LEVELDB_DIR}-unchanged"
fi
backup_repaired=false
# Verify, and repair if necessary, using the Python scripts
if ! python verify_leveldb.py "$BACKUP_TO/${LEVELDB_DIR}-tmp/$DATA_DIR"; then
    echo "$(date) $0: Discovered backup corruption! Attempting to repair!"
    if ! python repair_leveldb.py "$BACKUP_TO/${LEVELDB_DIR}-tmp/$DATA_DIR"; then
        echo "$(date) $0: Backup repair failed!"
    elif ! python verify_leveldb.py "$BACKUP_TO/${LEVELDB_DIR}-tmp/$DATA_DIR"; then
        echo "$(date) $0: Backup repair failed!"
    else
        echo "$(date) $0: Backup repair succeeded!"
        backup_repaired=true
        backup_succeeded=true
    fi
else
    echo "$(date) $0: Backup passed corruption verification!"
fi
# Replace the previous backup only if the new one is usable
if [ "$backup_succeeded" = true ]; then
    rm -rf "$BACKUP_TO/$LEVELDB_DIR"
    if [ "$BACKUP_EXACT" = true ] && [ "$backup_repaired" = false ]; then
        mv "$BACKUP_TO/${LEVELDB_DIR}-unchanged" "$BACKUP_TO/$LEVELDB_DIR"
    else
        mv "$BACKUP_TO/${LEVELDB_DIR}-tmp" "$BACKUP_TO/$LEVELDB_DIR"
    fi
fi |
The following script (verify_leveldb.py) can be used to check a leveldb cache for corruption:
Code Block | ||||
---|---|---|---|---|
| ||||
#!/usr/bin/python
import sys
import leveldb
from datetime import datetime
import subprocess
if len(sys.argv) != 2:
    print('Usage:')
    print('python verify_leveldb.py /path/to/leveldb/cache/')
    sys.exit(1)
dbpath = sys.argv[1]
try:
    startTime = datetime.now()
    print('%s: Inspecting db: %s' % (sys.argv[0], dbpath))
    db = leveldb.LevelDB(dbpath, create_if_missing=False, paranoid_checks=True)
    # Iterate over the full key range, with values included and checksum verification enabled
    it = db.RangeIter(None, None, True, True, True)
    records = 0
    for key, value in it:
        records = records + 1
    print('%s: Backup verification successful!' % sys.argv[0])
    print('%s: Total records: %d' % (sys.argv[0], records))
    print('%s: Time taken: %s' % (sys.argv[0], datetime.now() - startTime))
except Exception as e:
    print('%s: Backup verification failed!' % sys.argv[0])
    print('%s: Unexpected error: %r' % (sys.argv[0], e))
    # Exit with a non-zero status so the calling backup script can detect the failure
    sys.exit(1)
# Rename ldb files to sst (Python API uses ldb extension, whereas infinispan expects sst)
script = 'for f in "' + dbpath + '"/*.ldb; do mv "$f" "${f%.ldb}.sst"; done'
subprocess.call(['/bin/bash', '-c', script]) |
Repairing Corrupt LevelDB
...
The following script (repair_leveldb.py) can be used to repair a corrupt leveldb cache:
Code Block | ||||
---|---|---|---|---|
| ||||
#!/usr/bin/python
import leveldb
import sys
import os
import subprocess
if len(sys.argv) != 2:
    print('Usage:')
    print('python repair_leveldb.py /path/to/leveldb/cache')
    sys.exit(1)
path = sys.argv[1]
if not os.path.isdir(path):
    print('%s: Directory %s does not exist!' % (sys.argv[0], path))
    # Exit with a non-zero status so the calling backup script can detect the failure
    sys.exit(1)
# Use the leveldb module to repair the files
leveldb.RepairDB(path)
# Rename ldb files to sst (Python API uses ldb extension, whereas infinispan expects sst)
script = 'for f in "' + path + '"/*.ldb; do mv "$f" "${f%.ldb}.sst"; done'
subprocess.call(['/bin/bash', '-c', script]) |
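Both Python scripts hand the `.ldb` to `.sst` rename to bash. As a possible alternative, the same step can be done without a shell, which sidesteps quoting problems when the path contains spaces or other special characters. A minimal sketch; `rename_ldb_to_sst` is a hypothetical helper name and not part of the scripts above:

```python
import glob
import os


def rename_ldb_to_sst(db_dir):
    """Rename every *.ldb file in db_dir to *.sst and return the new paths."""
    renamed = []
    for old_path in sorted(glob.glob(os.path.join(db_dir, "*.ldb"))):
        # Swap the ".ldb" suffix for ".sst", keeping the rest of the path.
        new_path = old_path[: -len(".ldb")] + ".sst"
        os.rename(old_path, new_path)
        renamed.append(new_path)
    return renamed
```

The helper could replace the last two lines of either script, for example `rename_ldb_to_sst(path)` after `leveldb.RepairDB(path)`.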