Solr Replication on a Single Solr Instance

Apache Solr can run in master/slave mode using its built-in replication. Normally, a dedicated server acts as the master and another server as the slave. Load balancing is not the only reason to replicate, though: since indexing many documents can take some time, it is useful to have one core for searching (which might be outdated) while indexing into another core, which is incomplete during the indexing process.

We can set up this scenario on a single Solr instance with two cores, one for indexing (the master core) and one for searching (the slave core), and use Solr replication to transfer the data from the master to the slave.

In the "standard TYPO3" installation of Solr, these are the steps to set up this scenario:

  1. Open the solr.xml file in your /opt/tomcat-solr/solr/ directory.
  2. Set up two cores like this:

<?xml version="1.0" encoding="UTF-8" ?>
<solr persistent="true">
	<cores adminPath="/admin/cores" shareSchema="true">
		<core name="master-de_DE" instanceDir="typo3cores" schema="german/schema.xml" dataDir="data/master-de_DE">
			<property name="enable.master" value="true" />
		</core>
		<core name="slave-de_DE" instanceDir="typo3cores" schema="german/schema.xml" dataDir="data/slave-de_DE" >
			<property name="enable.slave" value="true" />
		</core>
	</cores>
</solr>
  3. Set up the replication handler in your solrconfig.xml file:
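<requestHandler name="/replication" class="solr.ReplicationHandler" >
    <!-- Active only in the core that sets enable.master=true in solr.xml -->
    <lst name="master">
        <str name="enable">${enable.master:false}</str>
        <!-- Publish a new index version to slaves only after an optimize -->
        <str name="replicateAfter">optimize</str>
        <!-- Take a backup after each optimize and keep at most two of them -->
        <str name="backupAfter">optimize</str>
        <int name="maxNumberOfBackups">2</int>
        <str name="commitReserveDuration">00:00:10</str>
    </lst>
    <!-- Active only in the core that sets enable.slave=true in solr.xml -->
    <lst name="slave">
        <str name="enable">${enable.slave:false}</str>
        <str name="masterUrl">http://localhost:8080/solr/master-de_DE/replication</str>
        <!-- Poll the master every 20 seconds (HH:mm:ss) -->
        <str name="pollInterval">00:00:20</str>
        <!-- HTTP timeouts in milliseconds -->
        <str name="httpConnTimeout">5000</str>
        <str name="httpReadTimeout">10000</str>
    </lst>
</requestHandler>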
  4. Add a scheduler task "Optimize Solr Index" that is executed once, after a complete re-indexing.

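With this configuration, the master core makes a new index version available for replication only after the index has been optimized. The slave core polls the master every 20 seconds and fetches the new index once it is available, so searches keep running against the previous complete index while re-indexing is in progress.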
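
One limitation of this setup is that masterUrl is hard-coded to master-de_DE. If further slave cores load the same solrconfig.xml (as both cores here presumably do, since they share instanceDir="typo3cores"), they would all pull from the German master. A possible way around this, sketched here under that assumption, is to pass the URL in as a core property, the same way enable.master and enable.slave are passed. The property name master.url is made up for illustration and is not part of the original setup.

```xml
<!-- solr.xml: the slave core names its own master.
     "master.url" is a hypothetical property name chosen for this sketch. -->
<core name="slave-de_DE" instanceDir="typo3cores" schema="german/schema.xml" dataDir="data/slave-de_DE">
	<property name="enable.slave" value="true" />
	<property name="master.url" value="http://localhost:8080/solr/master-de_DE/replication" />
</core>

<!-- solrconfig.xml: the slave section reads the URL from that property.
     The value after the colon is the fallback used when the property is not set. -->
<lst name="slave">
    <str name="enable">${enable.slave:false}</str>
    <str name="masterUrl">${master.url:http://localhost:8080/solr/master-de_DE/replication}</str>
    <str name="pollInterval">00:00:20</str>
</lst>
```

A second master/slave pair for another language would then only need its own cores in solr.xml, with master.url on the slave pointing at the matching master core.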

Further information on Replication can be found here:

http://wiki.apache.org/solr/SolrConfigXml

http://wiki.apache.org/solr/SolrReplication

http://wiki.apache.org/solr/Solr.xml

http://docs.lucidworks.com/display/solr/Index+Replication


 
Content © Michael Knoll 2009-2017