Azure Cosmos DB Hierarchical Partition Keys

Selecting a partition key is one of the most important choices you will make for your Cosmos DB project. Take your time and have a plan. Where will this application be in 1 year? In 5 years? How much data are you planning to store? If your application becomes popular and you start to have users all over the country or the world, can your partition key handle that kind of growth? These are some of the questions you need to ask yourself. Selecting a partition key is like selecting a life partner for your project: you need a good one that will grow together with it.

Sometimes it does not matter how much time you spend looking for a good partition key; your document simply does not have a good one. In those cases, usually the best thing you can do is combine multiple properties into a single custom property, called a synthetic key.
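
As a rough sketch of what a synthetic key looks like in practice, the snippet below joins two properties into one string before the document is saved. The Visit class, its property names, the SyntheticKey helper, and the "|" separator are all assumptions made for illustration; none of them come from the Cosmos DB SDK.

```csharp
using System;

// Hypothetical document type, used only for this illustration.
public class Visit
{
    public string id { get; set; }
    public string CountryCode { get; set; }
    public string ClientId { get; set; }
    // Synthetic property that the container would use as its partition key path.
    public string PartitionKey { get; set; }
}

public static class SyntheticKey
{
    // Joins the parts with a separator that should not occur in the values themselves.
    public static string Build(params string[] parts) => string.Join("|", parts);
}

public static class Program
{
    public static void Main()
    {
        var visit = new Visit
        {
            id = Guid.NewGuid().ToString(),
            CountryCode = "US",
            ClientId = "client-42"
        };
        // Compute the synthetic key from the existing properties before saving.
        visit.PartitionKey = SyntheticKey.Build(visit.CountryCode, visit.ClientId);
        Console.WriteLine(visit.PartitionKey); // prints "US|client-42"
    }
}
```

The downside of this approach is that the application has to compute the combined value on every write and rebuild it for every read.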

You might have a great partition key today, but when your business starts to grow and you gain users from different states and countries, you may realize that your partition key cannot handle the data anymore. What do you do then?

The Azure Cosmos DB team has been working on a better solution for problems like this: a game-changing feature called Hierarchical Partition Keys. With this new feature, you can select up to three properties as the partition key for your data. The feature is still in private preview. The logical partition size limit does not change with this feature, but it now applies to the full hierarchical key. For example, if your partition key is Country and State, the 20 GB limit applies at the State level.

The front-end changes are not ready yet, so I cannot show you how to create a hierarchical partition key in the Azure Portal. I can show you how to do it with the .NET SDK. In the example below, I create a new container with a hierarchical partition key.

Let’s say we have a marketing company that offers tracking for email campaigns. We are a small company with big dreams, and we want to sell this solution in other states and countries one day. For today, nobody would blame us if we selected ClientId as the partition key. But as I said, we want to be a global company one day, and clients from all over the world mean we might have problems with that partition key later. So we want to make the partition key as unique as we can without generating a synthetic key. Thanks to the hierarchical partition key feature, we can make CountryCode, City, and ClientId the partition key.

public class Transaction
{
    public string id { get; set; }
    public string CountryCode { get; set; }
    public string State { get; set; }
    public string City { get; set; }
    public string ClientId { get; set; }
    public string Email { get; set; }
    public string Ip { get; set; }
    public string TransactionId { get; set; }
    public DateTime TransactionDt { get; set; }
    public string ClickedAction { get; set; }
    public bool Completed { get; set; }
    public bool Failed { get; set; }
    public string FailReason { get; set; }
}

Since the front end does not yet let us select three properties for the partition key, we create the container with a hierarchical partition key from the .NET SDK. (You need to be in the private preview program to run the following code.)

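// Requires: using System.Collections.Generic; using System.Threading.Tasks;
//           using Microsoft.Azure.Cosmos;
// Assumes "client" is an initialized CosmosClient field and that the
// "Marketing" database already exists.
static async Task<bool> CreateNewContainer()
{
    // Order matters: the first path is the top level of the hierarchy.
    var hiepartition = new List<string> { "/CountryCode", "/City", "/ClientId" };
    Database db = client.GetDatabase("Marketing");
    var containerprops = new ContainerProperties(
        id: "HiePartContainer",
        partitionKeyPaths: hiepartition);
    var newcontainer = await db.CreateContainerAsync(containerprops, throughput: 400);
    // 201 Created means the container was newly created.
    return newcontainer.StatusCode == System.Net.HttpStatusCode.Created;
}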
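
Once such a container exists, every point read or write has to supply a value for each level of the key. The sketch below assumes the PartitionKeyBuilder type from the Microsoft.Azure.Cosmos package, which may only be present in preview or newer builds of the SDK, so check the version you have installed. It reuses the Transaction class shown earlier; the HierarchicalKeyExample class, its method names, and the container parameter are hypothetical. It needs a live Cosmos DB account to run, so treat it as an illustration rather than a tested implementation.

```csharp
using System.Threading.Tasks;
using Microsoft.Azure.Cosmos;

public static class HierarchicalKeyExample
{
    // Builds the key values in the same order as the container's paths:
    // /CountryCode, /City, /ClientId.
    public static PartitionKey BuildKey(string countryCode, string city, string clientId)
    {
        return new PartitionKeyBuilder()
            .Add(countryCode)
            .Add(city)
            .Add(clientId)
            .Build();
    }

    // Writes one Transaction; the key values are taken from the item's own properties.
    public static async Task SaveAsync(Container container, Transaction item)
    {
        PartitionKey key = BuildKey(item.CountryCode, item.City, item.ClientId);
        await container.CreateItemAsync(item, key);
    }

    // Point read: needs the id plus all three levels of the key.
    public static async Task<Transaction> LoadAsync(
        Container container, string id, string countryCode, string city, string clientId)
    {
        ItemResponse<Transaction> response =
            await container.ReadItemAsync<Transaction>(id, BuildKey(countryCode, city, clientId));
        return response.Resource;
    }
}
```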
   

The CreateNewContainer method above shows how to do that. In the code, the partition key becomes a list of strings rather than a single string. I am sure you will find many ways to use hierarchical partition keys in your current Cosmos DB solutions once the feature is available. In private preview, hierarchical partition keys are available only for new containers, and honestly (in my opinion) I do not think this will change when the feature goes to GA, since there is no repartitioning option in Cosmos DB. You can always create a new container and move your data into it if you are up to it. My next post will be about the performance of hierarchical partition keys.

This blog was featured as part of Azure Week.

About the Author:

Hasan is a Subject Matter Expert on Azure Cosmos DB and is recognized by Microsoft as a Data Platform MVP. He is the owner of SavranWeb Consulting and works at Progressive Insurance as a Business Intelligence Manager. Hasan spends his days architecting cutting-edge business solutions using the latest web and database technologies. He has more than 15 years of experience in the software industry as a developer, software architect, manager, and CEO. He has spoken at many conferences worldwide and is an active member of the HTML5 and WebAssembly W3C groups. Hasan likes to write about SQL, Azure Cosmos DB, C#, and front-end development on his blog.

Reference:

Savran, H. (2021). Azure Cosmos DB Hierarchical Partition Keys. Available at: https://h-savran.blogspot.com/2021/12/azure-cosmos-db-hierarchical-partition.html [Accessed: 28 June 2022]
