Optimizing Magma's Performance
Last updated at 2:08 am UTC on 12 January 2007
Efficiency was always a goal when building Magma. The workload is heavy for the client but relatively light for the server, especially with short transactions, which in theory allows performance to stay consistent as more client sessions connect.
If profiling your program reveals that a lot of time is spent in Magma, the following guidelines may help.
Use ReadStrategies
- Read strategies can be used to optimize how many objects are accessed within a single call to the server.
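As a rough sketch of what tuning a read strategy might look like: the Customer class and its 'orders' variable are hypothetical, and the MaReadStrategy selectors are written as I recall them from Magma's read-strategy documentation, so check them against your installed version before relying on them.

```smalltalk
"Assumption: Customer is a hypothetical persistent class with an
 'orders' instance variable that is usually needed right away."
| strategy |
strategy := MaReadStrategy minimumDepth: 1.
"Ask the server to read three levels deep whenever it follows 'orders'."
strategy forVariableNamed: 'orders' onAny: Customer readToDepth: 3.
mySession readStrategy: strategy
```

The idea is to fetch, in one server call, the objects you know you will touch next, without dragging in branches you will not.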
Keep your commits medium-small
Commits should be put in your program as close to the mutations to the persistent model as possible. Commits are serialized on the server, so large commits that take several seconds will most likely cause requests to queue in the server.
At the same time, you don't want commits to be so microscopic that you end up smothering the network with requests. For example, if building an OrderedCollection of 100 medium-sized objects, you should do those in one commit instead of 100 commits. However, if the objects are very large and completely non-persistent, you may want to do 100 commits.
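The OrderedCollection case above could look something like the following. This is only a sketch: MyOrder and its number: setter are hypothetical, and the session root is assumed to be a Dictionary.

```smalltalk
"One commit for all 100 objects instead of one commit per object.
 Assumptions: MyOrder is a hypothetical class; root is a Dictionary."
mySession commit: [
	| orders |
	orders := OrderedCollection new.
	1 to: 100 do: [:i | orders add: (MyOrder new number: i)].
	mySession root at: #orders put: orders]
```

The server then processes a single medium-sized commit rather than 100 tiny requests.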
Keep your cachedObjectCount as low as possible
With a connected Magma session, evaluate:
mySession cachedObjectCount
This number represents how many entries Magma has in its IdentityDictionaries. Magma tries to avoid the performance issues related to Squeak's IdentityDictionaries, but it can still slow down if you allow tens of thousands of objects to be cached in memory.
If you're not sure why your cachedObjectCount is growing, you can use cachedObjectCountByClass to see which classes are the most prolific (they are sorted with the most occurrences at the top). If you see "UndefinedObject" near the top of the list, you need to send #finalizeOids to your session. This is because Squeak can be lazy about finalizing the entries in its weak dictionaries.
As you traverse parts of the model, you should stubOut: objects you no longer need. For example, after iterating over a collection of large objects, stubOut: the collection if you no longer need its contents. MagmaSession>>stubOut: chops off large branches of objects so the memory they consume can be reclaimed by the garbage collector.
But avoid too many calls to stubOut:. For example, after you've enumerated the collection of large objects, stubOut: the collection object itself, not each object in the collection. The unfortunate irony is that stubOut: relies on one of Squeak's most inefficient methods, Dictionary>>#removeKey:. While fast in other Smalltalks, this method is VERY slow in Squeak.
Finally, after you've stubbed out a large object, you may find it necessary to call "mySession finalizeOids". Unfortunately, Squeak's WeakIdentityKeyDictionary does not always remove finalized entries in a timely fashion, resulting in, once again, these very important Dictionaries slowing everything down.
mySession finalizeOids
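Put together, the pattern described in this section might look like the sketch below. The #largeObjects key is hypothetical, the root is assumed to be a Dictionary, and the loop body is only a stand-in for real work.

```smalltalk
"Assumption: the root Dictionary holds a collection under #largeObjects."
| bigCollection total |
bigCollection := mySession root at: #largeObjects.
total := 0.
bigCollection do: [:each | total := total + each printString size].
"Stub out the collection itself, once, rather than each element."
mySession stubOut: bigCollection.
"Nudge Squeak into dropping the finalized weak-dictionary entries."
mySession finalizeOids.
mySession cachedObjectCount
```

Evaluating cachedObjectCount before and after is a simple way to see whether the stubOut: actually released anything.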
Other optimizations
- Don't use MagmaSession>>refreshPersistentObjectsEvenWhenChangedOnlyByMe.
- Use commitAndBegin for bulk-load programs.
- Experiment with and optimize the key and record sizes of your MagmaCollections. Avoid too many duplicate keys (e.g., don't index the word "the").
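For the commitAndBegin item, a bulk-load loop could be shaped roughly like this. It is a sketch under assumptions: MyItem and its id: setter are hypothetical, the root is assumed to be a Dictionary, and the batch size of 1000 is an arbitrary starting point to tune.

```smalltalk
"Assumptions: MyItem is hypothetical; root is a Dictionary."
| items |
mySession begin.
items := OrderedCollection new.
mySession root at: #items put: items.
1 to: 100000 do: [:i |
	items add: (MyItem new id: i).
	"Commit every 1000 additions and immediately start the next transaction."
	i \\ 1000 = 0 ifTrue: [mySession commitAndBegin]].
"Commit whatever is left in the final transaction."
mySession commit
```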
Know thy indexes
MagmaCollections have good read performance, but adding and removing objects is very slow. In theory, starting with an empty MagmaCollection, the rate-of-insertion will deteriorate a little bit before settling on a relatively fixed rate, IF you have a good key-dispersal.
If you put in a lot of duplicate keys, it will gradually get more costly to keep adding more of that key because a linear search for the "end" of that chain of keys is performed to find the point of insertion. So, for example, when you build a simple keyword index, consider eliminating very common words such as "the" and "at".
Removing from MagmaCollections is even more expensive than inserting. Avoid removals in performance-critical code.
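One way to keep such words out of a keyword index is to filter them before anything reaches the MagmaCollection. The sketch below uses only plain Squeak collections; the stop-word list is an illustrative assumption, not something Magma provides.

```smalltalk
"Filter common words out of the text before indexing its keywords."
| stopWords keywords |
stopWords := #('the' 'a' 'an' 'at' 'of' 'in') asSet.
keywords := 'The cat sat at the door' substrings
	reject: [:word | stopWords includes: word asLowercase].
"keywords now holds only 'cat', 'sat' and 'door'."
keywords
```

Only the surviving words would then be used as index keys, which keeps the chains of duplicate keys short.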