
I have an application running on a WildFly 10.0.0 server. I store my entities in a database, but to improve performance I decided to use an Infinispan cache (cache-aside pattern). Application performance has increased, but I still have a problem with resolving the parent-child relationship (Foo and SubFoos).

This is the entity code:

@Entity
class Foo
{
    @Id
    Long id;
}

@Entity
class SubFoo
{
    @Id
    Long id;

    Long fooId;
}

This is the service code:

public class FooService
{
    @Inject
    EntityManager em;

    @Inject
    Cache<Long, Foo> fooCache;

    @Inject
    Cache<Long, SubFoo> subFooCache;

    public void save(Foo foo)
    {
        em.persist(foo);
        fooCache.put(foo.id, foo);
    }

    public void save(SubFoo subFoo)
    {
        em.persist(subFoo);
        subFooCache.put(subFoo.id, subFoo);
    }

    public void remove ....

    public void update ....

    public Foo getFoo(Long fooId)
    {
        return fooCache.get(fooId);
    }

    public SubFoo getSubFoo(Long subFooId)
    {
        return subFooCache.get(subFooId);
    }


    public List<SubFoo> findSubFoo(Long fooId)
    {
        return subFooCache.values().stream()
            .filter( subFoo -> fooId.equals( subFoo.fooId ) )
            .collect( Collectors.toList() );
    }
}

The problem is the findSubFoo method: every call has to scan the whole SubFoo collection, and this still has a large impact on application performance.

Is it possible in Infinispan to simulate database indexes, or to resolve this problem in another way?

Approach 1

I tried to use a TreeCache to store the list as a cache value while keeping concurrency and transactional support. Each tree node uses the fooId as the node path and the SubFoo IDs as leaves. This approach was fine while the number of requests was small. Under many concurrent requests, however, there were short periods without consistency: it looked as if the transaction had been committed and one cache (the regular SubFoo entity cache) was already refreshed, but the second one (Foo2SubFoos) was not yet. After a short period of time everything was OK again and the data became consistent.

Source code:

Cache provider with producers:

@ApplicationScoped
public class CacheProvider
{
    private EmbeddedCacheManager cacheManager;

    @PostConstruct
    public void init()
    {
        final GlobalConfiguration globalConfig =
            new GlobalConfigurationBuilder().nonClusteredDefault().globalJmxStatistics()
                .allowDuplicateDomains( true ).build();

        final Configuration entityDefaultConfig =
            new ConfigurationBuilder().transaction().transactionMode( TransactionMode.TRANSACTIONAL )
                .lockingMode( LockingMode.OPTIMISTIC )
                .eviction().strategy( EvictionStrategy.NONE ).build();

        final Configuration indexDefaultConfig = new ConfigurationBuilder()
            .transaction().transactionMode( TransactionMode.TRANSACTIONAL )
            .eviction().strategy( EvictionStrategy.NONE )
            .invocationBatching().enable()
            .build();

        cacheManager = new DefaultCacheManager( globalConfig  );

        cacheManager.defineConfiguration( "Foos", entityDefaultConfig );
        cacheManager.defineConfiguration( "SubFoos", entityDefaultConfig );
        cacheManager.defineConfiguration( "Foo2SubFoos", indexDefaultConfig );
    }

    @Produces
    public Cache<Long, Foo> createFooCache()
    {
        final Cache<Long, Foo> entityCache = cacheManager.getCache( "Foos" );
        return entityCache;
    }

    @Produces
    public Cache<Long, SubFoo> createSubFooCache()
    {
        final Cache<Long, SubFoo> entityCache = cacheManager.getCache( "SubFoos" );
        return entityCache;
    }

    @Produces
    public TreeCache<Long, Boolean> createFoo2SubFoos()
    {
        Cache<Long, Boolean> cache = cacheManager.getCache("Foo2SubFoos");

        final TreeCacheFactory treeCacheFactory = new TreeCacheFactory();
        final TreeCache<Long, Boolean> treeCache = treeCacheFactory.createTreeCache( cache );

        return treeCache;       
    }
}

And I extended my FooService with support for the TreeCache: when a SubFoo is added, updated or removed, the Foo2SubFoos cache is refreshed too.

public class FooService
{
    @Inject
    EntityManager em;

    @Inject
    Cache<Long, Foo> fooCache;

    @Inject
    Cache<Long, SubFoo> subFooCache;

    @Inject
    TreeCache<Long, Boolean> foo2SubFoosCache;  


    public void save(Foo foo)
    {
        em.persist(foo);
        fooCache.put(foo.id, foo);
    }

    public void save(SubFoo subFoo)
    {
        em.persist(subFoo);
        subFooCache.put(subFoo.id, subFoo);

        Fqn fqn = Fqn.fromElements( subFoo.fooId );
        foo2SubFoosCache.put( fqn, subFoo.id, Boolean.TRUE );
    }

    public void remove ....

    public void update ....

    public Foo getFoo(Long fooId)
    {
        return fooCache.get(fooId);
    }

    public SubFoo getSubFoo(Long subFooId)
    {
        return subFooCache.get(subFooId);
    }


    public List<SubFoo> findSubFoo(Long fooId)
    {
        Fqn fqn = Fqn.fromElements( fooId );
        return foo2SubFoosCache.getKeys( fqn ).stream()
            .map( subFooId -> subFooCache.get( subFooId ) )
            .collect( Collectors.toList() );
    }
}
ostry

1 Answer


Infinispan has indexing capabilities; look for "Infinispan Query".
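For illustration, a minimal sketch of how an indexed query could look, assuming SubFoo is annotated for indexing (e.g. Hibernate Search's @Indexed on the class and @Field on fooId) and the SubFoos cache is configured with indexing enabled; the exact query DSL calls vary between Infinispan versions:

import org.infinispan.query.Search;
import org.infinispan.query.dsl.Query;
import org.infinispan.query.dsl.QueryFactory;

public List<SubFoo> findSubFoo(Long fooId)
{
    // Assumes SubFoo is indexed; lets Infinispan use its index instead of scanning all values.
    QueryFactory qf = Search.getQueryFactory( subFooCache );
    Query query = qf.from( SubFoo.class )
        .having( "fooId" ).eq( fooId )
        .toBuilder().build();
    return query.list();
}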

Personally, I am not a big fan of indexing in a cache. The features differ a lot between vendors, if they exist at all. For a problem like yours I don't see a justification for indexing; I'd suggest trying the simplest possible solution(s) that work with any cache.

Solution 1:

Add another cache that is essentially an index from the Foo IDs to the SubFoo IDs: Cache<Long, List<Long>>

Since this cache only stores IDs, there is no duplication of the data.
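A rough sketch of how the service could maintain such an index cache (names like fooToSubFooIdsCache are illustrative; the read-modify-write on the list still needs the transactional/locking care discussed in the comments below):

@Inject
Cache<Long, SubFoo> subFooCache;

// Index cache: parent fooId -> IDs of its SubFoos, no entity data duplicated.
@Inject
Cache<Long, List<Long>> fooToSubFooIdsCache;

public void save(SubFoo subFoo)
{
    em.persist( subFoo );
    subFooCache.put( subFoo.id, subFoo );

    // Keep the index entry of the parent in sync.
    List<Long> ids = fooToSubFooIdsCache.get( subFoo.fooId );
    List<Long> updated = ids == null ? new ArrayList<>() : new ArrayList<>( ids );
    updated.add( subFoo.id );
    fooToSubFooIdsCache.put( subFoo.fooId, updated );
}

public List<SubFoo> findSubFoo(Long fooId)
{
    List<Long> ids = fooToSubFooIdsCache.get( fooId );
    if ( ids == null )
    {
        return Collections.emptyList();
    }
    return ids.stream().map( subFooCache::get ).collect( Collectors.toList() );
}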

Solution 2:

Add the subFoo IDs to the cache of the parent entity:

class CachedFoo {
  Foo foo;
  List<Long> subFooIds;
}

Cache<Long, CachedFoo> fooCache;
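With that, findSubFoo could look roughly like this (a sketch; it assumes the Foo cache now holds CachedFoo values and that subFooIds is kept in sync on every SubFoo change):

public List<SubFoo> findSubFoo(Long fooId)
{
    CachedFoo cachedFoo = fooCache.get( fooId );
    if ( cachedFoo == null )
    {
        return Collections.emptyList();
    }
    // Resolve the child IDs stored on the parent against the SubFoo cache.
    return cachedFoo.subFooIds.stream()
        .map( subFooCache::get )
        .collect( Collectors.toList() );
}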

Solution 3:

If you do not have much data and findSubFoo is rarely used, consider an additional in-process cache with the signature Cache<Long, List<SubFoo>>. An in-process cache stores object references, not copies of the data, so there is no redundancy.
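A minimal sketch of such an in-process cache, assuming a plain ConcurrentHashMap and a hypothetical loadSubFoosFromDb helper that reads from the database:

// In-process cache: fooId -> list of SubFoo references (no copies, no serialization).
private final ConcurrentHashMap<Long, List<SubFoo>> subFoosByFoo = new ConcurrentHashMap<>();

public List<SubFoo> findSubFoo(Long fooId)
{
    // loadSubFoosFromDb is a hypothetical helper that queries the database.
    return subFoosByFoo.computeIfAbsent( fooId, id -> loadSubFoosFromDb( id ) );
}

// Invalidate the entry whenever a SubFoo of this parent is added, updated or removed.
public void onSubFooChanged(Long fooId)
{
    subFoosByFoo.remove( fooId );
}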

cruftex
  • Thank you for the answer. I had tried to implement the idea from Solution 1 before I decided to use the TreeCache, but I failed to ensure consistency between transactions. Suppose two parallel transactions Tx1 and Tx2 modify the List from Cache<Long, List<Long>> for the same key: Tx1 adds a new value and Tx2 removes a value from the list. The final result will be the List from the last committed transaction (sometimes Tx1, sometimes Tx2). To avoid this it is necessary to use PESSIMISTIC locking mode. – ostry Aug 29 '16 at 08:33
  • I don't see how a TreeCache is the solution here. In general: if you mutate your caches directly, you need to do proper transactions/locking if you want to ensure consistency. Another approach is just to invalidate the caches on mutation and always populate the caches by reading from the database. Additionally, see my remarks here: http://stackoverflow.com/questions/25969983/guava-cache-how-to-block-access-while-doing-removal/26011908#26011908 – cruftex Aug 29 '16 at 08:49
  • I had hoped that the TreeCache was some kind of workaround for this case. Of course, the concurrency problem remained, especially when one Tx deletes the node and a second one adds an element to the list. The solution to this was pessimistic locking mode, and it seemed that everything worked. Unfortunately, with a large number of requests there were moments of inconsistency. I think that after a Tx commits, each cache applies its changes independently. – ostry Aug 29 '16 at 10:28