Database Sharding with Liferay – Part 1
Liferay Portal can host completely unrelated portals through a feature called Portal Instances. Multiple portal instances can be created from the Liferay Control Panel, and each one has its own users, sites, organizations, user groups and so on. Internally, a portal instance is also referred to as a company, and Liferay keeps a company id in almost all tables to distinguish data by portal instance. If many portal instances are created and each one holds a large amount of data, database performance will eventually deteriorate under the sheer volume. In that situation it is a good idea to partition the data by instance id (company id). Distributing the rows of a table across multiple databases in this way is called database sharding, and Liferay Portal supports it. This post explains how to use the database sharding feature with Liferay 6.1.
Let's learn how to configure database sharding with an example. In our scenario we want to create three portal instances and store the data of each instance in a separate database. It is assumed that MySQL is installed on the local environment and that the MySQL driver jar has been copied to the Tomcat lib directory. These steps refer to Liferay 6.1 CE GA2.
Here are the steps to configure database sharding on a local Liferay environment.
Step 1 As explained above, the database sharding feature in Liferay works with portal instances, so on the local environment we need to create multiple portal instances. Each portal instance needs its own domain. On a local environment we can add domains by editing the hosts file, so add the following entries to the hosts file.
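Assuming the two domains are domain1.com and domain2.com (the names used for the portal instances later in this post) and that the local host IP is the standard loopback address 127.0.0.1, the entries would look roughly like this:

```
# Assumed hosts entries: both example domains resolve to the local machine
127.0.0.1    domain1.com
127.0.0.1    domain2.com
```

On Linux and macOS the hosts file is usually /etc/hosts; on Windows it is usually C:\Windows\System32\drivers\etc\hosts. Editing it normally requires administrator rights.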
The step above maps the local host IP to two domains. By default Liferay creates one portal instance, which is mapped to localhost. We will therefore create two more instances and map them to these two domains respectively.
Step 2 Add the following properties to the portal-ext.properties file.
#SHARD 1 Configuration
#SHARD 2 Configuration
#SHARD 3 Configuration
#Spring configuration files to be loaded. Adding shard-data-source-spring.xml to the list
#enables the database sharding feature
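As a minimal sketch of what these properties might look like, the fragment below assumes three MySQL databases named shard1, shard2 and shard3 on localhost, and the shard names default, one and two that Liferay 6.1 ships with in portal.properties. The user name and password are placeholders, and the property keys should be checked against the portal.properties file of your own Liferay version before use.

```properties
# Assumed values throughout; replace the credentials with your own.

# SHARD 1 configuration (the default shard)
jdbc.default.driverClassName=com.mysql.jdbc.Driver
jdbc.default.url=jdbc:mysql://localhost/shard1?useUnicode=true&characterEncoding=UTF-8&useFastDateParsing=false
jdbc.default.username=your_db_user
jdbc.default.password=your_db_password

# SHARD 2 configuration
jdbc.one.driverClassName=com.mysql.jdbc.Driver
jdbc.one.url=jdbc:mysql://localhost/shard2?useUnicode=true&characterEncoding=UTF-8&useFastDateParsing=false
jdbc.one.username=your_db_user
jdbc.one.password=your_db_password

# SHARD 3 configuration
jdbc.two.driverClassName=com.mysql.jdbc.Driver
jdbc.two.url=jdbc:mysql://localhost/shard3?useUnicode=true&characterEncoding=UTF-8&useFastDateParsing=false
jdbc.two.username=your_db_user
jdbc.two.password=your_db_password

# Shards available to the portal and the algorithm used to assign them
shard.available.names=default,one,two
shard.selector=com.liferay.portal.dao.shard.RoundRobinShardSelector

# spring.configs: copy the full list from portal.properties of your Liferay
# version and include META-INF/shard-data-source-spring.xml in it. The list is
# version specific, so it is not reproduced here.
```

The three databases must exist before the portal starts; in this sketch each shard simply points at a different schema on the same MySQL server.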
Step 3 Now start the Liferay portal server and access it at http://localhost:8080. Liferay will show the setup wizard to configure the portal. After you have signed in to the portal, check the user_ table in the shard1 database and you will find the test user, which means the first portal instance is mapped to the default shard.
Step 4 Now add two portal instances from the Server Administration | Portal Instances section of the Control Panel. While adding the instances, make sure you use domain1.com and domain2.com as the web ids. Here are sample screenshots for adding a portal instance.
Step 5 Now access the other two instances at http://domain1.com:8080 and http://domain2.com:8080. You can sign in to the two instances with the email@example.com and firstname.lastname@example.org user ids respectively. The default password is “test”. If we now check the user_ table in the shard2 and shard3 databases, we will find the records of the email@example.com and firstname.lastname@example.org users respectively.
With the above example we have seen that the data of each portal instance is stored in a separate database. By default we can have three shards, and Liferay assigns a shard to each new portal instance on a round-robin basis. This means that if we create another portal instance in the above example, its data will be stored in the shard1 database.
Liferay also supports a manual shard selection algorithm, and more than three shards can be configured. I will cover both in my next post.
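To make the round-robin behaviour concrete, here is a small stand-alone Java sketch. It is an illustration of the idea only, not Liferay's actual shard selector code; the class name RoundRobinShardSketch, the method assignNext and the fourth domain domain3.com are hypothetical names introduced for this example.

```java
import java.util.Arrays;
import java.util.List;

// Illustrative sketch only: hands out shard names in a fixed cycle.
class RoundRobinShardSketch {
    private final List<String> shardNames;
    private int assigned = 0; // number of portal instances assigned so far

    RoundRobinShardSketch(List<String> shardNames) {
        if (shardNames.isEmpty()) {
            throw new IllegalArgumentException("At least one shard is required");
        }
        this.shardNames = shardNames;
    }

    // Returns the shard for the next new portal instance, wrapping around
    // to the first shard once every shard has been used.
    String assignNext() {
        String shard = shardNames.get(assigned % shardNames.size());
        assigned++;
        return shard;
    }

    public static void main(String[] args) {
        RoundRobinShardSketch selector =
            new RoundRobinShardSketch(Arrays.asList("shard1", "shard2", "shard3"));
        // domain3.com is a hypothetical fourth instance, not part of the steps above.
        String[] instances = {"localhost", "domain1.com", "domain2.com", "domain3.com"};
        for (String instance : instances) {
            System.out.println(instance + " -> " + selector.assignNext());
        }
    }
}
```

Running the sketch assigns localhost, domain1.com and domain2.com to shard1, shard2 and shard3, and the fourth instance wraps around to shard1, which matches the behaviour described above.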
I am Samir Bhatt. I have more than 12 years of experience in the IT industry and have been working as a Software Architect for the last four years. I am focused on open source technologies. I have specialized knowledge of Liferay Portal and have written two books on the subject. I lead the Liferay practice in my organization, which has a strength of more than 140 people. I am also a Certified Liferay Trainer and have conducted many public and private Liferay trainings across the world. Apart from Liferay, I have good knowledge of Big Data technologies such as MongoDB and Apache Hadoop. Throughout my career I have worked on a variety of technologies and frameworks, including Java, J2EE, Visual Basic, Oracle PL/SQL, Spring Framework, Struts, Hibernate, Liferay, Pentaho and so on.