Some notes on developing an IPB-based website with big data

Recently, I have been supporting a development team that is customizing a high-traffic social network website with integrated IPB forum software. The main problems the team faced were (1) the website was very slow to access, and (2) the database often crashed at peak time (about 1000 online users). If you are in the same scenario, the checklist below might save you some effort when searching for solutions:

  1. By default, IPB stores its cache in the database. This is not a problem when there are few concurrent online users, but on a high-traffic website it puts considerable extra load on the database server. So, remember to choose an alternative caching method when using IPB.
  2. With very large data sets, the search feature in IPB (and in most other forum software) can be painful. The problem is that IPB uses MySQL full-text search by default, and this operation consumes most of the server's resources when the data is big (in our case, a search took 20 to 60 seconds on a 10 GB posts table). A good solution is to use a dedicated search engine instead of full-text search. Fortunately, IPB supports Sphinx as a search back end out of the box. The tutorial on installing and configuring Sphinx with IPB can be found [here].
  3. Try to offload the database by using replication (to separate read and write operations) and MySQL Proxy to load-balance the slave nodes. You will also need the replication driver for IPB.
  4. Use server hardware that is as powerful as possible. MySQL loves RAM, so adding more RAM lets you raise some of the major configuration parameters (buffer and cache sizes). Always remember to calculate how much memory your server will need with MySQL Calculator.
  5. Another common mistake when integrating IPB with other software is duplicating the whole forum data into the built-in forum feature of the target software, and adding a trigger that writes new content twice whenever a new topic is posted. This not only creates redundant data, but also puts more load on the db server, which has to perform every write twice. It also increases the memory the database engine needs for fast querying. So, try to write a connector between the two applications without duplicating forum data.
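
To make the memory calculation in item 4 more concrete, here is a minimal sketch of the commonly cited rule of thumb that such calculators are based on: worst-case memory is roughly the global buffers plus max_connections times the per-connection buffers. The variable names are real MySQL settings, but every value below is an illustrative assumption, not a figure from our server and not a recommendation.

```python
# Rough worst-case estimate of MySQL memory use:
#   global buffers + max_connections * per-connection buffers
# All sizes are assumed sample values for illustration only.

MB = 1024 * 1024
KB = 1024

# Buffers allocated once for the whole server.
global_buffers = {
    "key_buffer_size": 256 * MB,
    "innodb_buffer_pool_size": 4096 * MB,
    "innodb_log_buffer_size": 8 * MB,
    "query_cache_size": 64 * MB,  # only exists in older MySQL versions
    "tmp_table_size": 64 * MB,
}

# Buffers that each connection may allocate.
per_thread_buffers = {
    "sort_buffer_size": 2 * MB,
    "read_buffer_size": 1 * MB,
    "read_rnd_buffer_size": 1 * MB,
    "join_buffer_size": 1 * MB,
    "thread_stack": 256 * KB,
    "binlog_cache_size": 32 * KB,
}


def estimate_max_memory(global_bufs, thread_bufs, max_connections):
    """Return the estimated worst-case memory use in bytes."""
    return sum(global_bufs.values()) + max_connections * sum(thread_bufs.values())


# Assume one connection per online user at peak time (about 1000 users).
max_connections = 1000
total = estimate_max_memory(global_buffers, per_thread_buffers, max_connections)
print(f"Estimated worst case: {total / 1024 ** 3:.2f} GiB")
```

The estimate is an upper bound: not every connection allocates every buffer at the same time. It is still useful for checking that the configured values cannot exceed the RAM actually installed in the server.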
