Configuration
Fedora 4 is built on top of the JCR implementation Modeshape. Modeshape uses Infinispan as a distributed datastore, which in turn uses the messaging toolkit JGroups to transfer state between nodes.
The following resources therefore contain important information about configuring Fedora 4's underlying projects:
- Fedora configuration inventory
- Modeshape documentation
- Modeshape configuration section
- Infinispan documentation
- Infinispan configuration section
- JGroups documentation
- JGroups configuration section
Step-By-Step guides for deploying Fedora 4 clusters
Deploy cluster using the UDP Multicast protocol for node discovery and the TCP protocol for replication
A few configuration options have to be set in order to have Fedora 4 work as a cluster on a local machine:
- -Xmx1024m Set the Java heap to 1GB
- -XX:MaxPermSize=256m Set the Java PermGen size to 256MB
- -Dfcrepo.modeshape.configuration=file:///path/to/repository.json The Modeshape configuration used for clustering
- -Djava.net.preferIPv4Stack=true Tell Java to use IPv4 rather than IPv6
- -Dfcrepo.ispn.jgroups.configuration=/path/to/jgroups-fcrepo-tcp.xml Set the JGroups configuration file holding the TCP transport definitions
- -Djgroups.udp.mcast_addr=239.42.42.42 Set the UDP multicast address for the JGroups cluster
- -Dfcrepo.ispn.configuration=/path/to/infinispan.xml Set the Infinispan configuration file holding the Infinispan cluster configuration
JGroups configuration
In order to use UDP multicasting for node discovery and TCP for replication, the following JGroups example configuration can be used:
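The example file itself is not reproduced here. A minimal sketch of such a stack, assuming JGroups 3.x (the protocol names and attributes below should be verified against the JGroups version actually shipped with your Fedora 4 release), could look like:

```xml
<config xmlns="urn:org:jgroups">
    <!-- TCP transport: replication traffic goes over TCP -->
    <TCP bind_addr="${jgroups.tcp.address:127.0.0.1}" bind_port="7800"/>
    <!-- MPING: node discovery via UDP multicast -->
    <MPING mcast_addr="${jgroups.udp.mcast_addr:239.42.42.42}" mcast_port="45588"/>
    <MERGE2/>
    <FD_SOCK/>
    <FD timeout="3000" max_tries="3"/>
    <VERIFY_SUSPECT timeout="1500"/>
    <pbcast.NAKACK2 use_mcast_xmit="false"/>
    <UNICAST2/>
    <pbcast.STABLE/>
    <pbcast.GMS join_timeout="3000"/>
    <MFC/>
    <FRAG2 frag_size="60000"/>
</config>
```

The bind_port must match the port opened in your firewall rules (see the Firewall Notes below, which use 7800 as the TCP example port).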
Infinispan configuration
The following example configuration has its replication timeout set to 10 minutes in order to mitigate SyncTimeout problems when a single transaction spans a large number of operations:
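The example file is not reproduced here. A sketch of the relevant part, assuming the Infinispan 5.x XML schema (element and attribute names are assumptions to check against the Infinispan configuration reference), might look like:

```xml
<infinispan xmlns="urn:infinispan:config:5.2">
    <global>
        <transport clusterName="modeshape-cluster">
            <properties>
                <!-- Points at the JGroups stack configured above -->
                <property name="configurationFile" value="${fcrepo.ispn.jgroups.configuration}"/>
            </properties>
        </transport>
    </global>
    <namedCache name="FedoraRepository">
        <clustering mode="replication">
            <!-- 600000 ms = 10 minutes, to avoid SyncTimeouts on long transactions -->
            <sync replTimeout="600000"/>
        </clustering>
    </namedCache>
</infinispan>
```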
Modeshape configuration
The following configuration has indexing disabled completely in order to increase ingest performance:
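The referenced repository.json is not included here. A fragment showing the relevant settings, assuming a ModeShape 3.x-style repository.json (the exact keys are assumptions to check against the Modeshape documentation), could be:

```json
{
    "name" : "repo",
    "clustering" : {
        "clusterName" : "modeshape-cluster"
    },
    "query" : {
        "enabled" : false
    }
}
```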
Using Tomcat7
- Build the fcrepo WAR file or download the prebuilt fcrepo WAR file
- Build the WAR file as described on this page
- Fetch the WAR file from the download page
- Get Tomcat
Download Tomcat 7 and unpack it (Tomcat 7.0.50 is used in this example)
#> wget http://mirror.synyx.de/apache/tomcat/tomcat-7/v7.0.50/bin/apache-tomcat-7.0.50.tar.gz
#> tar -zxvf apache-tomcat-7.0.50.tar.gz
#> mv apache-tomcat-7.0.50 tomcat7
- Put the WAR file into Tomcat's webapps directory or create a symbolic link
Copy the fcrepo-webapp-VERSION.war file:
#> cp fcrepo-webapp-VERSION.war tomcat7/webapps/fcrepo.war
- Set the send/recv buffer sizes if necessary
Use the following commands to set the buffer size. To persist these settings between reboots, also put them in /etc/sysctl.conf.
#> sysctl net.core.rmem_max=26214400
#> sysctl net.core.wmem_max=5242880
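To persist these values across reboots, the corresponding /etc/sysctl.conf entries are:

```
net.core.rmem_max = 26214400
net.core.wmem_max = 5242880
```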
- Start instances
Using a custom configuration by pointing Fedora 4 to custom configuration files:
#> CATALINA_OPTS="-Xmx1024m -XX:MaxPermSize=256m -Djava.net.preferIPv4Stack=true -Djgroups.udp.mcast_addr=239.42.42.42 -Dfcrepo.modeshape.configuration=file:///path/to/repository.json -Dfcrepo.ispn.jgroups.configuration=/path/to/jgroups-fcrepo-tcp.xml -Dfcrepo.ispn.configuration=/path/to/infinispan.xml" bin/catalina.sh run
Deploy cluster using the UDP Multicast protocol for node discovery and replication
Issues with UDP Multicasting
Currently there are still issues using UDP multicasting for replication, while using UDP for node discovery works as intended.
A couple of configuration options have to be set in order to have Fedora 4 work as a cluster on a local machine:
- -Xmx1024m Set the Java heap to 1GB
- -XX:MaxPermSize=256m Set the Java PermGen size to 256MB
- -Dfcrepo.modeshape.configuration=file:///path/to/repository.json The Modeshape configuration used for clustering
- -Djava.net.preferIPv4Stack=true Tell Java to use IPv4 rather than IPv6
- -Dfcrepo.ispn.jgroups.configuration=/path/to/jgroups-fcrepo-udp.xml Set the JGroups configuration file holding the UDP transport definitions
- -Djgroups.udp.mcast_addr=239.42.42.42 Set the UDP multicast address for the JGroups cluster
- -Dfcrepo.ispn.configuration=/path/to/infinispan.xml Set the Infinispan configuration file holding the Infinispan cluster configuration
Using Tomcat7
- Build the fcrepo WAR file or download the prebuilt fcrepo WAR file
- Build the WAR file as described on this page
- Fetch the WAR file from the download page
- Get Tomcat
Download Tomcat 7 and unpack it (Tomcat 7.0.50 is used in this example)
#> wget http://mirror.synyx.de/apache/tomcat/tomcat-7/v7.0.50/bin/apache-tomcat-7.0.50.tar.gz
#> tar -zxvf apache-tomcat-7.0.50.tar.gz
#> mv apache-tomcat-7.0.50 tomcat7
- Put the WAR file into Tomcat's webapps directory or create a symbolic link
Copy the fcrepo-webapp-VERSION.war file:
#> cp fcrepo-webapp-VERSION.war tomcat7/webapps/fcrepo.war
- Set up the cluster configuration (optional)
- This GitHub project contains a sample configuration for a cluster in distributed mode.
- Change the configuration as required. Descriptions are available in the JGroups, Infinispan and Modeshape documentation.
- Make sure to point Fedora 4 to the configuration files by updating the file $TOMCAT_HOME/bin/setenv.sh (create it if necessary) using the properties fcrepo.modeshape.configuration, fcrepo.ispn.jgroups.configuration and fcrepo.ispn.configuration.
- Set the send/recv buffer sizes if necessary
Use the following commands to set the buffer size:
#> sysctl net.core.rmem_max=5242880
#> sysctl net.core.wmem_max=5242880
- Start instance
Using the default clustered configuration (Replication mode):
#> CATALINA_OPTS="-Xmx1024m -XX:MaxPermSize=256m -Dfcrepo.modeshape.configuration=config/clustered/repository.json -Djava.net.preferIPv4Stack=true -Djgroups.udp.mcast_addr=239.42.42.42" bin/catalina.sh run
Using a custom configuration by pointing Fedora 4 to custom configuration files:
#> CATALINA_OPTS="-Xmx1024m -XX:MaxPermSize=256m -Djava.net.preferIPv4Stack=true -Djgroups.udp.mcast_addr=239.42.42.42 -Dfcrepo.modeshape.configuration=file:///path/to/repository.json -Dfcrepo.ispn.jgroups.configuration=/path/to/jgroups-fedora-udp.xml -Dfcrepo.ispn.configuration=/path/to/infinispan.xml" bin/catalina.sh run
Deploy cluster using the UDP Multicast protocol for node discovery and replication on a single machine
Issues with UDP Multicasting
Currently there are still issues using UDP multicasting for replication, while using UDP for node discovery works as intended.
A couple of configuration options have to be set in order to have Fedora 4 work as a cluster on a local machine:
- -Xmx1024m Set the Java heap to 1GB
- -XX:MaxPermSize=256m Set the Java PermGen size to 256MB
- -Dfcrepo.modeshape.configuration=file:///home/ruckus/dev/tomcat7-8081/repository.json The Modeshape configuration used for clustering
- -Djava.net.preferIPv4Stack=true Tell Java to use IPv4 rather than IPv6
- -Dfcrepo.ispn.jgroups.configuration=/home/ruckus/dev/tomcat7-8081/jgroups-fcrepo-udp.xml Set the JGroups configuration file holding the UDP transport definitions
- -Djgroups.udp.mcast_addr=239.42.42.42 Set the UDP multicast address for the JGroups cluster
- -Dfcrepo.ispn.configuration=/home/ruckus/dev/tomcat7-8081/infinispan.xml Set the Infinispan configuration file holding the Infinispan cluster configuration
- -Dfcrepo.jms.port=61617 The port used by ActiveMQ's JMS protocol. This needs to be distinct for every instance of fcrepo4
- -Dfcrepo.stomp.port=61614 The port used by ActiveMQ's STOMP protocol. This needs to be distinct for every instance of fcrepo4
Using Tomcat7
- Build the fcrepo WAR file or download the prebuilt fcrepo WAR file
- Build the WAR file as described on this page
- Fetch the WAR file from the download page
- Get Tomcat
Download Tomcat 7 and unpack it (Tomcat 7.0.50 is used in this example)
#> wget http://mirror.synyx.de/apache/tomcat/tomcat-7/v7.0.50/bin/apache-tomcat-7.0.50.tar.gz
#> tar -zxvf apache-tomcat-7.0.50.tar.gz
#> mv apache-tomcat-7.0.50 tomcat7-8080
- Put the WAR file into Tomcat's webapps directory or create a symbolic link
Copy the fcrepo-webapp-VERSION.war file:
#> cp fcrepo-webapp-VERSION.war tomcat7-8080/webapps/fcrepo.war
- Get the Infinispan XML configuration file
Download it from GitHub and put it into tomcat7-8080:
#> wget -O infinispan.xml https://gist.github.com/fasseg/8646707/raw
#> mv infinispan.xml tomcat7-8080/
- Get the Modeshape JSON configuration file
Download it from GitHub and put it into tomcat7-8080:
#> wget -O repository.json https://gist.github.com/fasseg/8646727/raw
#> mv repository.json tomcat7-8080/
- Get the JGroups UDP Transport configuration file
Download it from GitHub and put it into tomcat7-8080:
#> wget -O jgroups-fedora-udp.xml https://gist.github.com/fasseg/8646743/raw
#> mv jgroups-fedora-udp.xml tomcat7-8080/
- Set the send/recv buffer sizes if necessary
Use the following commands to set the buffer size:
#> sysctl net.core.rmem_max=5242880
#> sysctl net.core.wmem_max=5242880
- Copy the whole tomcat folder in order to create a second instance
#> cp -R tomcat7-8080/ tomcat7-8081
- Change the connector ports of the second Tomcat instance to 8081, 8006 and 8010
#> sed -i 's/8080/8081/g' tomcat7-8081/conf/server.xml
#> sed -i 's/8005/8006/g' tomcat7-8081/conf/server.xml
#> sed -i 's/8009/8010/g' tomcat7-8081/conf/server.xml
- Start the first instance
#> CATALINA_OPTS="-Xmx1024m -XX:MaxPermSize=256m -Dfcrepo.modeshape.configuration=file:///path/to/repository.json -Djava.net.preferIPv4Stack=true -Dfcrepo.ispn.jgroups.configuration=/path/to/jgroups-fedora-udp.xml -Djgroups.udp.mcast_addr=239.42.42.42 -Dfcrepo.ispn.configuration=/path/to/infinispan.xml" tomcat7-8080/bin/catalina.sh run
- Start the second instance
#> CATALINA_OPTS="-Xmx1024m -XX:MaxPermSize=256m -Dfcrepo.modeshape.configuration=file:///path/to/repository.json -Djava.net.preferIPv4Stack=true -Dfcrepo.ispn.jgroups.configuration=/path/to/jgroups-fedora-udp.xml -Djgroups.udp.mcast_addr=239.42.42.42 -Dfcrepo.ispn.configuration=/path/to/infinispan.xml -Dfcrepo.jms.port=61617 -Dfcrepo.stomp.port=61614" tomcat7-8081/bin/catalina.sh run
- Check that the instances are reachable and that the cluster size is '2':
#> wget http://localhost:8080/fcrepo/rest
#> wget http://localhost:8081/fcrepo/rest
- Navigate to http://localhost:8080/fcrepo/rest
Using TCP for Discovery and Sync
UDP is recommended for larger clusters, but if you cannot use it, TCP can be used instead. There are some things to watch out for when configuring TCP.
TCPPING Element
Each host has a different TCPPING configuration. This can be tricky to configure:
- The initial_hosts attribute should only include remote hosts for the local node, not itself.
- The num_initial_members should match the count of remote hosts, i.e. it does not include the local node either.
- If you are running just one port on each host, then port_range will be 0.
- IP resolution is important if you use DNS names. The locally resolved IP of each remote host must match the TCP@bind_addr in the remote host config.
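As an illustration of the first three points, the TCPPING element on a node host-a in a three-node cluster (host names are hypothetical) would list only its two peers:

```xml
<!-- Configuration on host-a: host-a itself is not listed -->
<TCPPING initial_hosts="host-b[7800],host-c[7800]"
         num_initial_members="2"
         port_range="0"/>
```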
TCP Element
If you use a hostname for bind_addr, make sure that it resolves to the IP you want, probably an external one and not the loopback IP. This is easy to miss.
Deploying in AWS
Java options for Tomcat or Jetty
JAVA_OPTS="$JAVA_OPTS -Xmx1024m -XX:MaxPermSize=256m -Dfcrepo.modeshape.configuration=file:///config/clustered/repository.json"
JAVA_OPTS="$JAVA_OPTS -Dfcrepo.ispn.configuration=config/infinispan/clustered/infinispan.xml"
JAVA_OPTS="${JAVA_OPTS} -Dfcrepo.home=/tmp/wherever"
JAVA_OPTS="${JAVA_OPTS} -Djgroups.tcp.address=<private-ip-address-of-ec2-instance>"
JAVA_OPTS="${JAVA_OPTS} -Dfcrepo.ispn.numOwners=2 -Djava.net.preferIPv4Stack=true"
# The jgroups-ec2.xml file is included in ispn's jars
JAVA_OPTS="${JAVA_OPTS} -Dfcrepo.ispn.jgroups.configuration=jgroups-ec2.xml"
# This property overwrites the S3 bucketname variable in jgroups-ec2.xml
JAVA_OPTS="${JAVA_OPTS} -Djgroups.s3.bucket=<some-bucket-that-you-have-already-created>"
See this example on Fedora Cluster Installation in AWS
Load Balancing Fedora 4 using Apache and mod_jk
Load balancing can be achieved by putting an Apache server with mod_jk in front of the Fedora 4 cluster. With mod_jk, one worker has to be created in the workers.properties configuration file for each Fedora 4 node.
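A minimal workers.properties for a two-node cluster could look like the following (host names are placeholders; 8009 is the AJP connector port of a default Tomcat installation):

```
worker.list=balancer
worker.fcrepo1.type=ajp13
worker.fcrepo1.host=fcrepo-node1
worker.fcrepo1.port=8009
worker.fcrepo2.type=ajp13
worker.fcrepo2.host=fcrepo-node2
worker.fcrepo2.port=8009
worker.balancer.type=lb
worker.balancer.balance_workers=fcrepo1,fcrepo2
```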
See this example on the RedHat pages
Firewall Notes
If your cluster has a firewall between nodes, you'll need to open the following ports:
UDP multicast source address and port for JGroups messaging
Example: if your nodes are on the 192.168.47.x network, then this would be your iptables rule:
-A INPUT -m pkttype --pkt-type multicast -s 192.168.47.0/24 -j ACCEPT
TCP replication: TCP bind address and port (source)
Example: if your nodes are on the 192.168.47.x network, and the TCP bind_port is 7800, then this would be your iptables rule:
-A INPUT -m state --state NEW -m tcp -p tcp -s 192.168.47.0/24 --dport 7800 -j ACCEPT
UDP replication: UDP bind address and port (source)
Example: if your nodes are on the 192.168.47.x network, and the UDP bind_port is 47788, then this would be your iptables rule:
-A INPUT -m state --state NEW -m udp -p udp -s 192.168.47.0/24 --dport 47788 -j ACCEPT
Simple Shell script to coordinate a cluster
This small script is used on the FIZ cluster for pushing configurations and WARs/JARs and for starting, stopping, restarting and purging the Ubuntu 12.04 LTS cluster.
In order to make this work without having to enter passwords for every sudo and ssh call on the cluster nodes, I distributed a public ssh key to the cluster nodes for ssh authentication and allowed the fcrepo user to execute sudo calls to rm, cp and service without a password.
- The configuration of the FIZ cluster can be accessed here: https://github.com/fasseg/fiz-fcrepo-cluster-config. There is also a setenv.sh file in there, which we symlinked to $TOMCAT_HOME/bin/setenv.sh, that sets the environment variables for the repository, JGroups and Infinispan configuration.
- So on each node the layout on the file system looks like this:
- /data/fcrepo (the exploded WAR file, owned by fcrepo)
- /home/fcrepo/fiz-cluster-config (the configuration and setenv.sh file, owned by fcrepo)
- /var/lib/tomcat7/webapps/fedora (owned by root), a symlink to /data/fcrepo
- Using this setup, JAR updates can be pushed by the shell script directly to /data/fcrepo/WEB-INF/lib.
- Pushing a new WAR file to the nodes requires unpacking the WAR to /data/fcrepo; therefore access to /tmp is required.
- The script is set up for six nodes with known IPs, so the node[] array will have to change for different configurations, as will the ranges defined in the for statements in start_cluster(), purge_cluster() and stop_cluster().
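The script itself is not reproduced here. A stripped-down sketch of its structure (the IPs, paths and service names are placeholders, and the actual ssh calls are commented out so the sketch is safe to run) might look like:

```shell
#!/bin/bash
# Hypothetical sketch of a cluster control script; adjust the nodes array
# and the commented-out ssh commands to match your environment.
nodes=(192.168.42.11 192.168.42.12 192.168.42.13)

start_cluster() {
    for node in "${nodes[@]}"; do
        echo "starting tomcat7 on ${node}"
        # ssh fcrepo@"${node}" "sudo service tomcat7 start"
    done
}

stop_cluster() {
    for node in "${nodes[@]}"; do
        echo "stopping tomcat7 on ${node}"
        # ssh fcrepo@"${node}" "sudo service tomcat7 stop"
    done
}

case "$1" in
    start) start_cluster ;;
    stop)  stop_cluster ;;
    *)     echo "usage: $0 {start|stop}" ;;
esac
```

A purge_cluster() function would follow the same loop pattern, removing the repository data directory on each node before a fresh deployment.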