6

I have an organization table like this:

OrgID | OrgInviteCode | OrgName       | Status | ProjectTableName | InsertOn
-------------------------------------------------------------------------------------------
 1    | RC12T67       | Organization1 | Active |  Project1        | 2015-12-19 15:37:43.333
 2    | BC56uI7       | Organization2 | Active |  Project2        | 2015-12-19 15:37:43.333
 3    | ORG1456       | Organization3 | Active |  Project3        | 2015-12-19 15:37:43.333
 4    | ORG2856       | Organization4 | Active |  Project4        | 2015-12-19 15:37:43.333

And I have a stored procedure to create a dynamic table for project.

If any new organization is created successfully then we call the stored procedure to create the project table for that organization.

Each organization has its own project table. So the project table name is dynamic for each org and it's name stored in organization table.

Organization1 --> Project table

ProjectID | OrgID | ProjectName | ProjectInvideCode |   Address1        | Address2 |  City      |State    |   ZIP      |  Country
-------------------------------------------------------------------------------------------------------------------------------
1         |   1   |  Org1Proj1   |  XJ34590         | 235 Harrison St. |            | Syracuse  | AK      |  23456234    |  US
2         |   1   |  Org1Proj2   |  JKI8907         | 35 Sterling  St. |            | Syracuse  | NY      |  23456456    |  US

Organization2 --> Project table

ProjectID | OrgID | ProjectName | ProjectInvideCode |   Address1 | Address2 |  City       |State    |   ZIP    |  Country
-------------------------------------------------------------------------------------------------------------------------------
1         |   2   |  Org2Proj1   |  RUIO90          | 90 Ram St. |          | Los Angeles | CA    |  23456234    |  US
2         |   2   |  Org2Proj2   |  KLOP907         | 35 Wide St.|          | Chicago     | IL    |  23456456    |  US

I'm currently working in integrating the search feature. Users or Anonymous users may search data based on the below logic:

  • Search with organization name or organization invite code.
  • Search with project name or project invite code.
  • Search with project address, city, state, country

I know it's very simple to find the search result for organization name and invite code, because all the content is resides in same table.

But it's more complicated to get the search result for the projects(name or invite code) due to dynamic table name. I found this link in How to fetch data from dynamic multiple tables?, so I think it's not a better solution because search needs to very fast.

The reason why we separate the project tables based on organization because in our requirement they clearly mention that "We have 1000000 organization, but each organization having more then 1 Million projects". Hope you understand the concept we don't want to dump the 1000000(Organization) * 1 Million = XXXXXX projects in single table.

Questions:

  • How we can search the projects data in effective way?
  • Does our project separate table concept is worst? Do you have any suggestion to handle in much better?
  • Is there any effective way in C#?

Tools and Technology:

  • Asp.Net 4.5, C#
  • MVC 5
  • Entity Framework code first
  • SQL Server 2012
Community
  • 1
  • 1
  • Out of curiosity, why don't you want to dump 1M organizations with 1M projects each into one table? I mean, this sounds like a lot of data, so instead of hacking away with SQL Server and 1T tables, perhaps you need a distributed high-performance database architecture instead? – Keith Payne Dec 19 '15 at 14:25
  • @KeithPayne AFAIK it doesn't make sense of keeping those data's in single table.. We already know the data is much big so while building from scratch we have to design the database with respect the data, scalability, performance etc.. – Chandrasekar Kesavan Dec 19 '15 at 15:06
  • How many tables do you plan on having? 1 million organizations x 1 million projects = 1 trillion. Do you want to design the system to handle 1 trillion tables? Have you checked the specs on SQL Server maximums? And then, how will you index your list of tables? A sorted index has to be maintained somewhere so that you can associate your table with the org + project. Where will the index be stored? – Keith Payne Dec 19 '15 at 15:28
  • @KeithPayne I think you misunderstood the concept, If we have one table for all projects then we have 1 trillion data. For 1M organization we create 1M project tables. In each organization project table we may have 1M projects or more then that. – Chandrasekar Kesavan Dec 19 '15 at 16:17
  • @KeithPayne Good catch about the index. While creating the dynamic project table, we will create the foreign key reference between Organization(OrgID) and Project(OrgID). Please have a look into the table. – Chandrasekar Kesavan Dec 19 '15 at 16:34
  • @KeithPayne,@Ingaz Do you think that, it's much more easier in NoSQL like mongodb, cassandra. – Chandrasekar Kesavan Dec 19 '15 at 16:36
  • @Chandru, although I don't have practical experience working with terabyte sized databases, I'd prefer to work with a database that has one table with trillion rows, rather than a database that has million tables. SQL Server (and SQL language in general) is designed to work with static schema. Most likely it will be non-trivial in any case with this volume of data. – Vladimir Baranov Dec 26 '15 at 23:29

2 Answers2

5

I think you can create VIEW combining all project tables

SELECT
    REPLACE(
        REPLACE( 
            REPLACE(
            (
                SELECT DISTINCT 'SELECT * FROM Organization O JOIN ' 
                    + ProjectTableName  
                    + ' PT ON O.OrgId = PT.OrgId WHERE O.OrgId = ' + convert(varchar(10), OrgId) 
                    as [text()] FROM Organization
                FOR XML PATH ('DELIMITER')  
            ), '</DELIMITER><DELIMITER>', ' 
            UNION ALL 
            '), '</DELIMITER>', '')
    ,'<DELIMITER>', 'CREATE VIEW Organization_Projects 
AS
')

it must be quite near in efficency to queries on source tables.

Alex Yu
  • 3,412
  • 1
  • 25
  • 38
  • Do you think that, it's much more easier in NoSQL like mongodb, cassandra. – Chandrasekar Kesavan Dec 19 '15 at 16:37
  • In general, I don't think so. Document orient storages are benificial for data with dynamic schemas. In your case: if every organization had a project table with unique column set than it could be beneficial to use this storages. – Alex Yu Dec 19 '15 at 16:46
2

If you want faster lookup, you can do this by compromising during write/insertation time. Out of following three points:

  • Search with organization name or organization invite code.
  • Search with project name or project invite code.
  • Search with project address, city, state, country

First is straigt forward as you said. Lets look at 2 and 3. I would suggest a generic way by which you can handle both the points, lets take address as example.

  1. Create table Address with column AddressId and Address, store only unique address in this table.You can make address as unique key. Store addressId in Organisation and Project table and not complete address. This will save space and avoid spelling error
  2. Create table OrganisationAddressMap with column id, AddressId, ProjectId and OrganisationId. Store only unique row in this table
  3. When you search for address, look in above two table to identify relevant projectId and OrganisationId. And then based on these two ids query correct table.

In above scenario you will be searcing for very less no of rows for address. Similarly create two tables for each query term like project name, project invite code, city, state and country.

techExplorer
  • 810
  • 7
  • 16
  • Thanks for your reply. It helps me. So you suggesting me to have all the projects in one table. I'm I right? We have one search box and user's may search(Orgname or OrglnviteCode or ProjName or ProjlnviteCode or Address etc..) and we have to get the result based on the logics which I mentioned. According to your solution you are splitting the columns not the rows. – Chandrasekar Kesavan Dec 23 '15 at 16:03
  • Yes, having all projects in one table with unique entry will be very fast to search. Along with search box you can give drop down to select on which field user want to search this will make your search faster by approx. 8 times as you will search in only one of the eight field and not all. You can also have an option as "all" in dropdown in which case you will search in all tables one by one. If you feel this is correct answer please mark it as answer . – techExplorer Dec 24 '15 at 03:13