How to set up a clustered repository for Magnolia in 5 minutes.


One of the big advantages of using Magnolia is the reliability of the underlying database, provided by Apache Jackrabbit.

Since version 1.6, Jackrabbit makes it possible to build a “clustered” repository. Combined with the Author/Public split of Magnolia CMS, this option brings several advantages:

  • changes saved on public are reflected instantly on author, and vice versa
  • no need to activate content

Let's see how to configure a clustered repository between a Magnolia Author instance and a Magnolia Public instance.

Ingredients:

  • Magnolia > 4.3
  • MySQL database
  • 5 minutes of time

Let’s start.

First, install Magnolia as usual. Once it is up and running, stop the web server.

Magnolia defines its repositories in WEB-INF/config/default/repositories.xml.
Open it and add the following:

<!-- magnolia shared repository -->
<Repository name="magnoliaShared" provider="info.magnolia.jackrabbit.ProviderImpl" loadOnStartup="true">
    <param name="configFile" value="${magnolia.repositories.jackrabbit.shared.config}" />
    <param name="repositoryHome" value="${magnolia.repositories.home}/magnoliaShared" />
    <!-- the default node types are loaded automatically
    <param name="customNodeTypes" value="WEB-INF/config/repo-conf/nodetypes/magnolia_nodetypes.xml" />
    -->
    <param name="contextFactoryClass" value="org.apache.jackrabbit.core.jndi.provider.DummyInitialContextFactory" />
    <param name="providerURL" value="localhost" />
    <param name="bindName" value="${magnolia.webapp}Shared" />
    <workspace name="usersShared" />
</Repository>

and add a Map element inside RepositoryMapping:

<Map name="usersShared" repositoryName="magnoliaShared" workspaceName="usersShared" />
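
For orientation, here is a minimal sketch of where that entry sits. It assumes the stock repositories.xml layout, in which RepositoryMapping already holds the Map entries for the default workspaces; those existing entries are only hinted at by a comment and should be left untouched.

```xml
<RepositoryMapping>
    <!-- existing Map entries for the default workspaces stay as they are -->

    <!-- new entry: routes the usersShared workspace to the shared repository -->
    <Map name="usersShared" repositoryName="magnoliaShared" workspaceName="usersShared" />
</RepositoryMapping>
```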

Now open the magnolia.properties file, located in the WEB-INF/config/default folder, and add this line:

magnolia.repositories.jackrabbit.shared.config=WEB-INF/config/repo-conf/jackrabbit-bundle-mysql-shared-search.xml

The last step is to create a MySQL database and put the right connection details in the jackrabbit-bundle-mysql-shared-search.xml file. Assuming MySQL is running on localhost:3306 with userShared as the schema name, user and password, the config file should be:

 
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE Repository PUBLIC "-//The Apache Software Foundation//DTD Jackrabbit 1.4//EN" "http://jackrabbit.apache.org/dtd/repository-1.4.dtd">
<Repository>
	<FileSystem class="org.apache.jackrabbit.core.fs.local.LocalFileSystem">
		<param name="path" value="${rep.home}/repository" />
	</FileSystem>
	<Security appName="Jackrabbit">
		<AccessManager class="org.apache.jackrabbit.core.security.SimpleAccessManager"></AccessManager>
		<LoginModule class="org.apache.jackrabbit.core.security.SimpleLoginModule">
			<param name="anonymousId" value="anonymous" />
		</LoginModule>
	</Security>
	<Workspaces rootPath="${rep.home}/workspaces"
		defaultWorkspace="default"></Workspaces>
	<Workspace name="default">
		<FileSystem class="org.apache.jackrabbit.core.fs.local.LocalFileSystem">
			<param name="path" value="${wsp.home}/default" />
		</FileSystem>
		<PersistenceManager class="org.apache.jackrabbit.core.persistence.bundle.MySqlPersistenceManager">
			<param name="driver" value="com.mysql.jdbc.Driver" />
			<param name="url" value="jdbc:mysql://localhost:3306/userShared" />
			<param name="schema" value="mysql" /><!-- warning, this is not the schema name, it's the db type -->
			<param name="user" value="userShared" />
			<param name="password" value="userShared" />
			<param name="schemaObjectPrefix" value="${wsp.name}_" />
			<param name="externalBLOBs" value="false" />
		</PersistenceManager>
		<SearchIndex class="org.apache.jackrabbit.core.query.lucene.SearchIndex">
			<param name="path" value="${wsp.home}/index" />
			<param name="useCompoundFile" value="true" />
			<param name="minMergeDocs" value="100" />
			<param name="volatileIdleTime" value="3" />
			<param name="maxMergeDocs" value="100000" />
			<param name="mergeFactor" value="10" />
			<param name="maxFieldLength" value="10000" />
			<param name="bufferSize" value="10" />
			<param name="cacheSize" value="1000" />
			<param name="forceConsistencyCheck" value="false" />
			<param name="autoRepair" value="true" />
			<param name="analyzer"
				value="org.apache.lucene.analysis.standard.StandardAnalyzer" />
			<param name="queryClass" value="org.apache.jackrabbit.core.query.QueryImpl" />
			<param name="respectDocumentOrder" value="true" />
			<param name="resultFetchSize" value="2147483647" />
			<param name="extractorPoolSize" value="3" />
			<param name="extractorTimeout" value="100" />
			<param name="extractorBackLogSize" value="100" />
			<param name="textFilterClasses"
				value="org.apache.jackrabbit.extractor.MsWordTextExtractor,
               org.apache.jackrabbit.extractor.MsExcelTextExtractor,
               org.apache.jackrabbit.extractor.MsPowerPointTextExtractor,
               org.apache.jackrabbit.extractor.PdfTextExtractor,
               org.apache.jackrabbit.extractor.OpenOfficeTextExtractor,
               org.apache.jackrabbit.extractor.RTFTextExtractor,
               org.apache.jackrabbit.extractor.HTMLTextExtractor,
               org.apache.jackrabbit.extractor.PlainTextExtractor,
               org.apache.jackrabbit.extractor.XMLTextExtractor" />
		</SearchIndex>
	</Workspace>
	<Versioning rootPath="${rep.home}/version">
		<FileSystem class="org.apache.jackrabbit.core.fs.local.LocalFileSystem">
			<param name="path" value="${rep.home}/workspaces/version" />
		</FileSystem>
		<PersistenceManager class="org.apache.jackrabbit.core.persistence.bundle.MySqlPersistenceManager">
			<param name="driver" value="com.mysql.jdbc.Driver" />
			<param name="url" value="jdbc:mysql://localhost:3306/userShared" />
			<param name="schema" value="mysql" /><!-- warning, this is not the schema name, it's the db type -->
			<param name="user" value="userShared" />
			<param name="password" value="userShared" />
			<param name="schemaObjectPrefix" value="version_" />
			<param name="externalBLOBs" value="false" />
		</PersistenceManager>
	</Versioning>

	<!-- clustering: author: node0, public1: node1, public2: node2, ... -->
	<Cluster id="node0" syncDelay="2000">
		<Journal class="org.apache.jackrabbit.core.journal.DatabaseJournal">
			<param name="revision" value="${rep.home}/revision.log" />
			<param name="driver" value="com.mysql.jdbc.Driver" />
			<param name="url" value="jdbc:mysql://localhost:3306/userShared" />
			<param name="schema" value="mysql" /><!-- warning, this is not the schema name, it's the db type -->
			<param name="user" value="userShared" />
			<param name="password" value="userShared" />
			<param name="schemaObjectPrefix" value="journal_" />
		</Journal>
	</Cluster>
</Repository>

IMPORTANT: each cluster node MUST have a unique ID.
In our example we use id="node0" for author and id="node1" for public.
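
As a sketch of what that means in practice, the public instance's copy of jackrabbit-bundle-mysql-shared-search.xml would differ from the author's only in the cluster id. The fragment below assumes the public instance reaches the same MySQL server under the same URL and credentials; if it runs on another machine, localhost would need to be replaced with the database host. Element names are capitalized as the Jackrabbit DTD expects.

```xml
<!-- public instance: same shared database and journal tables, different cluster id -->
<Cluster id="node1" syncDelay="2000">
	<Journal class="org.apache.jackrabbit.core.journal.DatabaseJournal">
		<!-- local file where this node records the last journal revision it has seen -->
		<param name="revision" value="${rep.home}/revision.log" />
		<param name="driver" value="com.mysql.jdbc.Driver" />
		<!-- assumption: same database as the author instance -->
		<param name="url" value="jdbc:mysql://localhost:3306/userShared" />
		<param name="schema" value="mysql" />
		<param name="user" value="userShared" />
		<param name="password" value="userShared" />
		<param name="schemaObjectPrefix" value="journal_" />
	</Journal>
</Cluster>
```

The journal settings stay identical on every node so that all instances read and write the same journal; only the id changes.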

This should be enough! Let me know if anything is missing!

Written by: Matteo Pelucco

4 Comments

  1. Will
    November 30

    Hi Matteo

    Thanks for the post. 2 remarks:
    – Watch out for typos: Often you write element names in lower case where they should be upper case (e.g. should be
    – Too bad your blog is not well indexed by Google. I would never have found your article if you hadn’t posted about it on the Magnolia mailing list.

    BR,
    Will

  2. Matteo Pelucco
    November 30

    Hi Will, yes, my blog needs some SEO work 😉 I’m working on it! And as for the element names, you are right. I assume you were referring to “Instance” and others; I should have fixed them.
    Anyway, thanks for your interest!
    M.

  3. Jonathan Man
    May 5

    Great article thanks.

    How would the following work if Author and Public each sat on their own Tomcat instance, as they cannot read each other’s file systems:

  4. Jonathan Man
    May 5

    <filesystem class="org.apache.jackrabbit.core.fs.local.LocalFileSystem">
    <param name="path" value="${rep.home}/repository" />
