Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.

...

  1. Check leveldb cache store directory from the backup using a leveldb client.
    1. Verify the leveldb opens.
    2. Verify by iterating through the keys. (To expose any corruption)
  2. Based on the flow of the background compaction process in the leveldb implementation ([1] and [2]), the manifest file is updated at the end of compaction, which is followed by the deletion of obsolete files. To verify successful backups, we can begin the backup with copying the manifest file followed by the rest of the files. And, at the end of the backup, the backed up manifest file can be compared with the current manifest. An unchanged manifest can be considered as a successful backup, and vice versa.


    [1] https://github.com/google/leveldb/blob/master/db/db_impl.cc#L655
    [2] https://github.com/google/leveldb/blob/master/db/version_set.cc#L811

     

The following script can be used to perform hot backup with verification (requires repair and verify scripts):

Code Block
languagebash
titlebackup_leveldb.sh
#!/bin/bash

# Location of the fcrepo home directory
FCREPO_HOME=/var/lib/tomcat7/fcrepo4-data
# Destination Directory
BACKUP_TO=/home/vagrant/backup
# Backup exact (without the structural changes introduces by Python verfication script).
# (Additional temporary storage is used, if true)
BACKUP_EXACT=true
# Max Backup Attempts on failure
ATTEMPTS=10

echo `date`" $0: fcrepo home dir: $FCREPO_HOME"
echo `date`" $0: backup dir: $BACKUP_TO"
echo `date`" $0: max attempts on failure: $ATTEMPTS"

if [ ! -d $BACKUP_TO ]; then
  mkdir $BACKUP_TO
fi

LEVELDB_DIR=fcrepo.ispn.repo.cache
DATA_DIR=dataFedoraRepository
backup_succeeded=false
attempts=$ATTEMPTS

echo `date`" $0: Backing up leveldb."
while [ $attempts -gt 0 ]; do
	MANIFEST_FILE=`ls $FCREPO_HOME/$LEVELDB_DIR/$DATA_DIR/MANIFEST-*`
  MANIFEST_MD5=`md5sum $MANIFEST_FILE`
  rm -rf $BACKUP_TO/$LEVELDB_DIR"-tmp"
  cp -r $FCREPO_HOME/$LEVELDB_DIR $BACKUP_TO/$LEVELDB_DIR"-tmp"
  copy_success=$?
  MANIFEST_FILE_POST_BACKUP=`ls $FCREPO_HOME/$LEVELDB_DIR/$DATA_DIR/MANIFEST-*`
  MANIFEST_MD5_POST_BACKUP=`md5sum $MANIFEST_FILE_POST_BACKUP`
  if [ "$MANIFEST_MD5" = "$MANIFEST_MD5_POST_BACKUP" ] && [ "$copy_success" = "0" ]; then
  	backup_succeeded=true
  	break;
  fi
  attempts=$((attempts - 1))
  echo `date`" $0: leveldb manifest changed during backup process! $attempts attempts remaining."
done

if [ "$backup_succeeded" = false ]; then
	echo `date`" $0: Failed to backup with a consistent leveldb manifest!"
else
	echo `date`" $0: Backup created and verified leveldb manifest consistency!"
fi

if [ "$BACKUP_EXACT" = true ]; then
	rm -rf $BACKUP_TO/$LEVELDB_DIR"-unchanged"
	cp -r $BACKUP_TO/$LEVELDB_DIR"-tmp" $BACKUP_TO/$LEVELDB_DIR"-unchanged"
fi

backup_repaired=false
# Verify and repair using python script
python verify_leveldb.py $BACKUP_TO/$LEVELDB_DIR"-tmp"/$DATA_DIR
if [ "$?" != "0" ]; then
	echo `date`" $0: Discovered backup corruption! Attempting to repair!"
	python repair_leveldb.py $BACKUP_TO/$LEVELDB_DIR"-tmp"/$DATA_DIR
	if [ "$?" != "0" ]; then
		echo `date`" $0: Backup repair failed!"
	else
		python verify_leveldb.py $BACKUP_TO/$LEVELDB_DIR"-tmp"/$DATA_DIR
		if [ "$?" != "0" ]; then
			echo `date`" $0: Backup repair failed!"
		else
			echo `date`" $0: Backup repair succeeded!"
			backup_repaired=true
			backup_succeeded=true
		fi
	fi
else
	echo `date`" $0: Backup passed corruption verification!"
fi

if [ "$backup_succeeded" = true ]; then
	rm -rf $BACKUP_TO/$LEVELDB_DIR
	if [ "$BACKUP_EXACT" = true ] && [ "$backup_repaired" = false ]; then
		mv $BACKUP_TO/$LEVELDB_DIR"-unchanged" $BACKUP_TO/$LEVELDB_DIR
	else
		mv $BACKUP_TO/$LEVELDB_DIR"-tmp" $BACKUP_TO/$LEVELDB_DIR
	fi
fi

 

The following script can be used to verify a leveldb cache for corruptions:

Code Block
languagepy
titleverify_leveldb.py
#!/usr/bin/python
import sys
import leveldb
from datetime import datetime
import subprocess
if (len(sys.argv) != 2) :
  print 'Usage:'
  print 'python verify_leveldb.py /path/to/leveldb/cache/'
  sys.exit(1)
dbpath = sys.argv[1]
try:
  startTime = datetime.now()
  print sys.argv[0], ": Inspecting db: ", dbpath
  db = leveldb.LevelDB(dbpath,  create_if_missing=False, paranoid_checks=True)
  it = db.RangeIter(None, None, True, True, True)
  records = 0;
  for key, value in it:
    records = records + 1
  print sys.argv[0], ": Backup verfiication successful!"
  print sys.argv[0], ": Total records: ", records
  print sys.argv[0], ": Time taken:", datetime.now() - startTime
except:
  print sys.argv[0], ": Backup verfication failed!"
  print sys.argv[0], ": Unexpected error:", sys.exc_info()[0]
  raise
  exit(1)
# Rename ldb files to sst (Python API uses ldb extension, whereas infinispan expects sst)
script = 'for f in ' + dbpath + '/*.ldb; do mv $f "${f%.ldb}.sst"; done'
subprocess.call(['/bin/bash', '-c', script])

 

Repairing Corrupt LevelDB

...

The below script can be used to repair corrupt leveldb cache:

Code Block
languagepy
titlerepair_leveldb.py
#!/usr/bin/python
import leveldb
import sys
import os
import subprocess

if (len(sys.argv) != 2) :
  print 'Usage:'
  print 'python repair_leveldb.py /path/to/leveldb/cache'
  sys.exit(1)

path = sys.argv[1]
if not os.path.isdir(path) :
  print sys.argv[0], ": Directory %s does not exist!" % path
  sys.exit()
 
# Use the leveldb module to repair the files
leveldb.RepairDB(path)

# Rename ldb files to sst (Python API uses ldb extension, whereas infinispan expects sst)
script = 'for f in ' + path + '/*.ldb; do mv $f "${f%.ldb}.sst"; done'
subprocess.call(['/bin/bash', '-c', script])