22

I have 3 entities. Branch,Subject,Topic. Branch has list of subjects and Subject has list of topics. Also subjectList and topicList both are lazy. I want to fetch all branch including its subjects and topics in single query.

1.

@Entity
public class Branch implements Serializable 
{
    private static final long serialVersionUID = 1L;
    @Id
    @GeneratedValue(strategy = GenerationType.AUTO)
    private Integer id;
    private String name;

    @OneToMany(mappedBy = "branch")
    private List<Subject> subjectList;
    //Getters and Setters
}

2.

@Entity
public class Subject implements Serializable 
{
    private static final long serialVersionUID = 1L;
    @Id
    @GeneratedValue(strategy = GenerationType.AUTO)
    private Integer id;

    private String name;

    @ManyToOne()
    private Branch branch;

    @OneToMany(mappedBy = "subject")
    private List<Topic> topicList;
    //Getters and Setters       
}

3.

@Entity
public class Topic implements Serializable 
{
    private static final long serialVersionUID = 1L;
    @Id
    @GeneratedValue(strategy = GenerationType.AUTO)
    private Integer id;
    private String name;

    @ManyToOne()
    private Subject subject;
    //Getters and Setters
}

I tried the method below but it didn't work.

@NamedEntityGraph(name="branch_subject", 
attributeNodes = {
    @NamedAttributeNode(value="name"),
    @NamedAttributeNode(value="subjectList", subgraph = "subjectListGraph")
},
subgraphs = {
    @NamedSubgraph(name="subjectListGraph",
            attributeNodes = {
                @NamedAttributeNode(value="name"),
                @NamedAttributeNode(value = "topicList", subgraph = "topicListGraph")
            }
    ),
    @NamedSubgraph(name="topicListGraph",
            attributeNodes = {
                    @NamedAttributeNode("name")
            }
    )
}
)

Also following code is used to fetch data from database, I am using JPQL as follows

    EntityGraph branchEntityGraph = entityManager
                .getEntityGraph("branch_subject");

        Branch branch = entityManager
                .createQuery("SELECT b from Branch b WHERE b.id=:ID",
                        Branch.class)
                .setHint("javax.persistence.loadgraph", branchEntityGraph)
                .setParameter("ID", branch1.getId()).getResultList().get(0);

This gives below exception

org.hibernate.loader.MultipleBagFetchException: cannot simultaneously fetch multiple bags
masber
  • 2,875
  • 7
  • 29
  • 49
Rajendra Thorat
  • 3,248
  • 3
  • 15
  • 24

3 Answers3

20

Hibernate doesn't allow you to fetch multiple Bags because it would end up fetching a Cartesian Product.

M → N → P one-to-many or many-to-many relations

For a multi-nested hierarchy, you can use JOIN FETCH on multiple collections as long as your lists are mapped as Set.

M → N and M → P one-to-many or many-to-many relations

For sibling collections, like M → N and M → P, don't switch to using Set instead of List.

Using a Set instead of a List to avoid the MultipleBagFetchException is a very bad idea since you will still end up with a Cartesian Product, and that's going to cause performance issues because you are going to fetch M x N x P records.

In this case, a much better approach is to fetch one collection with the first query and use additional queries for the remaining collections:

List<Post> _posts = entityManager.createQuery("""
    select distinct p
    from Post p
    left join fetch p.comments
    where p.id between :minId and :maxId
    """, Post.class)
.setParameter("minId", 1L)
.setParameter("maxId", 50L)
.setHint(QueryHints.PASS_DISTINCT_THROUGH, false)
.getResultList();

_posts = entityManager.createQuery("""
    select distinct p
    from Post p
    left join fetch p.tags t
    where p in :posts
    """, Post.class)
.setParameter("posts", _posts)
.setHint(QueryHints.PASS_DISTINCT_THROUGH, false)
.getResultList();

This strategy allows you to avoid the M x N x P result set by fetching M x (N + P) records instead.

Fetching from the child-side to the parent

If you have to use INNER JOIN when fetching the child collection, then you can simply [fetch from the inner-most Child up to the root][3] and reassemble the structure afterward. This is much more efficient since the query goes like this:

select t 
from Topic t
join t.subject s
join s.branch b
Vlad Mihalcea
  • 142,745
  • 71
  • 566
  • 911
  • I don't understand why it has to be that complex. What if I set all `@OneToMany` from the example as `fetchType=EAGER` - it will fetch everything in one query. – kolobok Nov 19 '18 at 11:18
  • Agree with @VladMihalcea. I run into a similar situation where I had a Restaurant object with a list of Menu objects. Then a Menu has a list of MenuItem. I was initially trying to fetch the entire graph for a specific restaurant and later realized this isn't even needed since my first query lazy loads a list of Restaurant already. Then for my second query if I need details about a specific Restaurant, I can just get all the MenuItem object for this selected Restaurant and pass it back to the client. And the client already has the necessary info for the selected Restaurant like id and so forth. – Taf Jun 27 '20 at 18:04
  • @vlad-mihalcea I am not finding Cartesian Product in OP's relation. Yes, there is one in your example as you are fetching 2 collections of same entity and these collections can be completely unrelated. But when we fetch nested 1-n collections, there will not be any Cartesian Product, at least that's what I found. – THe_strOX Dec 18 '20 at 08:22
  • @THe_strOX If you select 100 `Branch` rows, each `Branch` having 90 `Subject` children, and each `Subject` having 80 `Topic` children, how many rows do you think the final result set will contain? If it's 100x80x90, then doesn't it look like a Cartesian Product? – Vlad Mihalcea Dec 18 '20 at 08:53
  • @vlad-mihalcea Yes you are correct. I guess I had a Brain Fart in thinking that is not Cartesian product. However, we will be fetching the same number of rows even if we fire multiple queries because there are 100*800*90 Topics in 1-n relation. So, why not fetch in single query for OP's example? – THe_strOX Dec 18 '20 at 10:08
  • @THe_strOX If you do 2 queries, the first result set will contain 100x90 and the second one 100x80. So, basically, instead of fetching `M*N*P` records, you fetch `M*(N+P)`, which is way more efficient. – Vlad Mihalcea Dec 18 '20 at 10:26
  • @vlad-mihalcea Math is correct only for your example, but not for OP's example. You cannot Join Branch to Topic directly. Can you? The fact that there will be 100x80x90 topics shows that we need to fetch 100x80x90 rows in any case. – THe_strOX Dec 18 '20 at 10:35
  • The first query fetches Branches and Subjects. The second query fetches the Subjects with Topics where the `subject.branch in :branches` binds the list of branches fetched by the first query. – Vlad Mihalcea Dec 18 '20 at 11:00
  • @vlad-mihalcea Your second query will result in same no. of rows as you would otherwise get by firing single query. Where clause will filter nothing. Also, are you saying that in your example there will be less than 100x80x90 topics? Because I do not see that happening. So, we need to fetch 100x80x90 rows in all case. – THe_strOX Dec 18 '20 at 11:56
  • A test is worth 1000 words. Run [this test case](https://github.com/vladmihalcea/high-performance-java-persistence/blob/master/core/src/test/java/com/vladmihalcea/book/hpjp/hibernate/fetching/multiple/MultiLevelCollectionFethingTest.java#L108), and you'll see exactly what I'm talking about. – Vlad Mihalcea Dec 18 '20 at 13:04
  • @vlad-mihalcea I ran your test. I ignored 2nd query due to out of scope. 1st query outputs 1000 rows. 3rd query outputs 5000 rows. If I run a single query by adding another join fetch `left join fetch pc.votes v` in your 1st query, I get 5000 rows. Your test also gets passed running this single query and removing 2nd and 3rd query. Note that I need to make 1 of the collection Set and ignore tags test case as it is out of scope. My question from beginning was we are not gaining anything using 2 queries when we fetch nested 1-n collections, instead losing. What am I missing? – THe_strOX Dec 18 '20 at 14:34
  • 1
    I think you might be talking about entities that are returned by the queries while I'm talking about JDBC `ResultSet` rows. Hibernate deduplicates the joined records, of course, but the underlying query result set could still be huge when fetching multiple collections at once. – Vlad Mihalcea Dec 18 '20 at 20:57
  • @vlad-mihalcea I am taking about the resultset. Fetching nested 1-n collection can have huge resultset, however this huge resultset is necessary and you need to fetch them even if you split the queries. Note that fetching multiple collection of same entity is out of scope for my argument as splitting the query reduces the resultset in this scenario. In your testcase there are 50*20*5 votes, so the final result should have 5000 rows. And since post, postcomment and vote is nested 1-n relation we do not get any benefit by splitting the query. OP's scenario is also nested 1-n. – THe_strOX Dec 19 '20 at 03:00
  • That's correct. I updated the answer accordingly. – Vlad Mihalcea Dec 19 '20 at 08:25
  • @VladMihalcea If we go by two queries, then isn't there a benefit that Hibernate will have to deduplicate less entities, as opposed to single query approach where more number of entities needs to be deduplicated. – JavaLearner Dec 20 '20 at 10:52
  • 1
    Yes, that's basically what this solution does. – Vlad Mihalcea Dec 20 '20 at 13:13
  • Hi @Vlad, Sorry, maybe I didn't understand correctly. If there are 100 branches and each branch has 90 subjects, we have 100x90 = 9000 subjects. So doesn't cartesian product (`branch` `CROSS JOIN` `subject`) means 100x9000? (there is no join-condition) – Arash May 05 '22 at 18:23
  • 2
    A CROSS JOIN creates a Cartesian Product while a JOIN creates a filtered Cartesian Product. However, the answer is about replacing an unwanted Cartesian Product with two SQL queries, so instead of retrieving 100 x 50 x 50 = 250,000 you retrieve 2 x (100 x 50) = 10,000. So, you end up fetching the desired data, but without paying a 96% overhead. – Vlad Mihalcea May 06 '22 at 07:50
0

The solution is fairly simple use JPA 2.1 where you need add entiry grapgh at JPA repository only

@Repository
public interface BranchRepo extends JpaRepository< Branch, Long> {

@EntityGraph(attributePaths = {"subjectList" , 
"subjectList.topicList"})
List<Product> findAll();
}

This will create Join between the bi-directional mapping of Branch-> SubjectList-> TopicList

Blockquote you need to override equals method in each of these entity classes with ID else this will not work. You can get rid of complex multilevel NamedEntityGraph at each entity class which is not needed

AtomicAJ
  • 11
  • 1
-1

I guess I had some similar issue. All my Device entities have a Transaction object which in my case stores the datetime and user and computer information for that particular object. Also have a DeviceType entity that has transaction object and has a Many to one relation with my Device entity. So Actually I had both 1 level and 2 levels of nested relations with this Transaction Entity. I got some nested exception because of that. Using Subgraph fixed that issue for me. Here is my NamedEntityGraphs definition code. Hope it helps:

@NamedEntityGraphs(value = {
    @NamedEntityGraph(name = "Device.AllRelations",
            attributeNodes = {
                    @NamedAttributeNode("transaction"),
                    @NamedAttributeNode(value = "deviceType", subgraph = "DeviceType.Transaction")
            },
            subgraphs = {
                    @NamedSubgraph(name = "DeviceType.Transaction", attributeNodes = {
                            @NamedAttributeNode("transaction")
                    })
            }
    ),

    @NamedEntityGraph(name = "Device.TransactionOnly", attributeNodes = {
            @NamedAttributeNode("transaction")
    }),
})
shizhen
  • 12,251
  • 9
  • 52
  • 88
Kamyar Miremadi
  • 169
  • 2
  • 7