Windows BranchCache deep dive
BranchCache is a WAN optimization technology that is included in some editions of Windows starting from Windows 7 and Windows Server 2008 R2.
How does it work? And what are the modes of operation for such technology? That is what we will be discussing in this post.
First, from the client perspective: some editions of Windows already include the BranchCache client. If you open the Services console on your Windows machine, you will see a service called BranchCache.
BranchCache is an enhancement of the (Peer Distribution) technology that existed in Windows Vista. In some TechNet pages, you will see this technology referred to as (Peer Distribution) instead of BranchCache.
It is important to know that BranchCache will not speed up your YouTube browsing for example or any Internet browsing experience, instead it will speed up your access to internal portals like SharePoint, corporate portals, and it can speed up your access to remote File Shares on your main site. WSUS, SCCM, and SharePoint are famous candidates for this technology.
BranchCache works only with two main protocols: HTTP (including HTTPS traffic) and SMB (or CIFS), which is the protocol used to access file shares.
So far, we know that you don’t need to install anything on your Windows machines if you picked the right edition, and that BranchCache will only work for Internal HTTP (HTTPS) sites and file shares. Now, let us talk about what is required from the server perspective. Your servers that you want your clients to cache content from are called [Content Servers]. Those content servers should be running BranchCache supported Windows server.
BranchCache modes of operation
BranchCache Technology operates in two modes:
- Distributed Cache Mode
- Hosted Cache Mode
In BranchCache Distributed Cache mode, each machine in the branch will cache content in its local disk, and when another machine in the same site needs to access the same file, it will pull it from neighbor machines.
In BranchCache Hosted Cache mode, you need to install a Windows Server (BranchCache supported Server editions only) in the branch, called Hosted Cache Server. Whenever a client downloads a file from a BranchCache capable server, it will copy the file to that hosted cache server. When another machine wants to download the same file, it will contact the hosted cache server and get it from there.
Let us walk through Distributed Cache mode with an example. Bob walks into the office at 7 AM, fires up his Windows PC and wants to download the company newsletter that is hosted on a remote BranchCache-enabled server in the main office. Since he is configured to use BranchCache, his PC will start downloading identifiers or hashes that describe the data instead of the data itself, and those identifiers are very small [Step 1].
So, he pulls those down very quickly (a couple of kilobytes), and he uses them to do something called a multicast shout, searching for a peer on the local subnet in the branch that has already downloaded that data [Step 2]. But since he is the first one to view this content, he shouts and nobody answers, so he goes back to the WAN link and does a complete download of the data [Step 3].
It takes a little bit longer, but once he is done, his computer keeps that data locally so that it can be made available to other peers.
Now, 15 minutes later, Alice arrives at the office, fires up her Windows computer, and tries to access the same newsletter. She downloads the identifiers [Step 4], does a multicast shout [Step 5]. Bam… finds Bob's PC, and downloads the content very quickly [Step 6].
You can see that you don’t need to have any servers in the branch office. All that needs to be done is to enable BranchCache on those branch clients, which can be done easily via group policies.
Keep in mind that you need some disk space on client machines to host cache files, and some extra processing to reply to cache requests from neighbor machines. Also, cache availability in the branch will drop when laptops go offline or hibernate. The more computers in the branch, the more content available on site, but also the more multicast shouts and replies. When you have a large site with a lot of machines, going to Hosted Cache mode makes more sense.
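The six steps above can be sketched as a small simulation. This is only an illustrative model, not how the BranchCache service is implemented: the `Peer` class, the `fetch` function, and the use of a single SHA-256 digest as the content identifier are all assumptions made for the example.

```python
import hashlib


class Peer:
    """A branch client that keeps downloaded content in a local cache."""

    def __init__(self, name):
        self.name = name
        self.cache = {}  # content identifier -> bytes

    def has(self, content_id):
        return content_id in self.cache


def fetch(client, content_id, subnet_peers, wan_download):
    """Return (data, source), following the distributed cache steps."""
    # Steps 2 and 5: the "multicast shout" asks every other peer on the subnet.
    for peer in subnet_peers:
        if peer is not client and peer.has(content_id):
            data = peer.cache[content_id]
            source = peer.name
            break
    else:
        # Step 3: nobody answered, so fall back to the WAN link.
        data = wan_download()
        source = "WAN"
    client.cache[content_id] = data  # keep it for future peers
    return data, source


newsletter = b"Company newsletter" * 100
# Steps 1 and 4: the identifier is tiny compared with the data it describes.
content_id = hashlib.sha256(newsletter).hexdigest()

bob, alice = Peer("Bob"), Peer("Alice")
branch = [bob, alice]

_, bob_source = fetch(bob, content_id, branch, lambda: newsletter)
_, alice_source = fetch(alice, content_id, branch, lambda: newsletter)
print(bob_source, alice_source)  # WAN Bob
```

Bob pays the WAN cost once, and Alice is served from Bob's cache, which mirrors the story above.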
Hosted Cache Mode for BranchCache requires an Enterprise Edition of Windows Server 2008 R2 or Windows Server 2012 (called Hosted Cache Server) in the branch office. Any client that downloads a file from a BranchCache-capable server will share this file with the hosted cache server, so other machines in the branch can benefit from it.
So, how does it work? As seen in the figure below, Bob shows up early, poor Bob. He goes to view that company newsletter, pulls the identifiers [Step 1].
This time, instead of doing that multicast shout, he does a unicast query for the hosted cache server and asks, do you have this stuff that I want [Step 2].
The hosted cache server doesn’t have that content, as nobody has downloaded it yet. Bob goes back to that WAN link, downloads the whole file, and once he is done [Step 3], his PC advertises the fact that he got that newsletter to the hosted cache server, and pushes up the identifiers [Step 4].
At some point after that, the hosted cache server connects back to him and pulls down the data, so it can be available to other users in the branch [Step 5 and 6].
Fifteen minutes later, Alice fires up her PC, tries to download the same company newsletter, downloads the identifiers [Step 7], queries the hosted cache server, and gets the data fast from there [Step 8 and 9].
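The hosted cache flow can be modeled the same way. The sketch below is a simplification under stated assumptions: `HostedCacheServer`, `Client`, and their methods are hypothetical names, the content identifier is a single SHA-256 digest, and the server pulls the data immediately instead of at some later point.

```python
import hashlib


class HostedCacheServer:
    """Central cache in the branch office."""

    def __init__(self):
        self.store = {}  # content identifier -> bytes

    def query(self, content_id):
        # Steps 2 and 8: unicast query from a client.
        return self.store.get(content_id)

    def offer(self, content_id, client):
        # Steps 4 to 6: the client advertises the identifiers and the
        # server connects back to pull the data from that client.
        if content_id not in self.store:
            self.store[content_id] = client.local[content_id]


class Client:
    def __init__(self, name):
        self.name = name
        self.local = {}

    def fetch(self, content_id, hosted_cache, wan_download):
        data = hosted_cache.query(content_id)
        if data is not None:
            return data, "hosted cache"
        data = wan_download()  # Step 3: full download over the WAN
        self.local[content_id] = data
        hosted_cache.offer(content_id, self)
        return data, "WAN"


newsletter = b"Company newsletter" * 100
content_id = hashlib.sha256(newsletter).hexdigest()

hosted = HostedCacheServer()
bob, alice = Client("Bob"), Client("Alice")
_, bob_source = bob.fetch(content_id, hosted, lambda: newsletter)
_, alice_source = alice.fetch(content_id, hosted, lambda: newsletter)
print(bob_source, "/", alice_source)  # WAN / hosted cache
```

Because the lookup is a direct query to one known server, nothing in this flow depends on clients sharing a subnet.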
One of the advantages of this mode of operation, is that all cache content on the branch is available on the hosted cache server, and it will not consume disk space on client machines.
Another advantage is that, since this mode of operation doesn’t depend on multicast shouts but on unicast traffic, clients on different subnets can benefit from the cache server. For example, if you have a branch office that has machines on different subnets, then you may consider this mode of operation. Also, content will always be available, even if many machines are turned off, as the content is now hosted on the cache server.
The only disadvantage of this mode of operation, is that you need to have a server on the branch office that is running Windows 2008 R2/2012 Enterprise Edition.
How does BranchCache work with HTTP traffic? And how do things really work when you cache HTTP(S) traffic? This is what we will be discussing here. Let me tell you a story.
The picture shows the networking stack on the client machine, and on the server side. On the client, we have IE, which makes use of wininet, which is one of the client http stacks that ships with Windows, and below that we have the BranchCache service.
The client opens IE, you know Bob, going to view that company newsletter, and he is going to open a URL. IE makes use of the wininet component, which is BranchCache capable, and here we make our HTTP GET, which is transmitted across the wire, hits the web server and is intercepted by http.sys, which is BranchCache capable [Step 1 and 2].
What we are doing here, is making a normal http request, only marking it as BranchCache capable. So, the web server says (hey this client knows BranchCache, so I am going to send him some hashes rather than the full data). Http.sys gets the data from IIS [Step 3 and 4], uses those Peer distribution APIs, and sends the data to the BranchCache service [Step 5] where it cuts them into chunks [Step 6], and calculates hashes for those chunks. After that, those hashes are sent to http.sys [Step 7] which replies to the client with BranchCache response that contains something called (content information structure), which is a list of all hashes describing those chunks of the data [Step 8].
On the client side, wininet gets that, and remember it is BranchCache capable, so it uses those Peer Distribution APIs and goes down to the BranchCache Service [Step 9].
The BranchCache service looks for that data on a peer computer or a hosted cache server, depending on how the client is configured [Step 10]. Once we get the data, we verify it against the list of hashes we got from the web server, and we pass it up to wininet and back to IE [Step 11 and 12].
BranchCache is hooked into the Client Side Caching (CSC) component on SMB (the same component that allows us to use offline files technology), and that is the integration point for BranchCache. So, what happens when a client makes a BranchCache SMB request?
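To make the verification in the last step concrete, here is a minimal sketch of a server building a list of chunk hashes and a client checking received chunks against it. The 64 KB chunk size, the use of SHA-256, and the function names are assumptions for illustration; the real content information structure carries more than a flat list of digests.

```python
import hashlib

CHUNK_SIZE = 64 * 1024  # assumed chunk size, for illustration only


def split_chunks(data, chunk_size=CHUNK_SIZE):
    """Cut the payload into fixed-size chunks (the last one may be shorter)."""
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]


def content_information(data, chunk_size=CHUNK_SIZE):
    """Server side: one hash per chunk, sent instead of the data itself."""
    return [hashlib.sha256(c).hexdigest() for c in split_chunks(data, chunk_size)]


def verify(chunks, hashes):
    """Client side: accept cached chunks only if every hash matches."""
    return len(chunks) == len(hashes) and all(
        hashlib.sha256(c).hexdigest() == h for c, h in zip(chunks, hashes)
    )


payload = bytes(range(256)) * 1000           # 256,000 bytes of sample data
hashes = content_information(payload)        # what the web server returns
good_chunks = split_chunks(payload)          # what an honest peer returns
bad_chunks = list(good_chunks)
bad_chunks[1] = b"x" + bad_chunks[1][1:]     # a corrupted chunk

print(len(hashes), verify(good_chunks, hashes), verify(bad_chunks, hashes))
# 4 True False
```

A client that sees a mismatch can discard the chunk and fetch it from another source, since the hashes came from the content server rather than from the peer.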
The application makes a read file operation on a remote SMB server [Step 1] and that is intercepted by the CSC driver. A request is then made to the CSC service to go and pre-fetch the file that the app is looking for.
The CSC service will then go and pre-fetch the hashes of the file instead of the complete file [Step 2 and 3]. This download happens over the SMB protocol, but we are going to download hashes instead of data. The server will then try to pull those hashes from its hash cache [Step 4].
Hashes come back to the client [Step 5], to the CSC service, and then Peer distribution APIs are called so we can find the data on a peer client or on the hosted cache server. After the content is retrieved from a peer client or hosted cache server, we will supply it back to the CSC driver, then to the APP, and finally we will put it in the CSC cache [Step 6 and 7].
Subsequent access to the same file is served from the CSC cache and not the branch cache.
Now back to the point where “Hashes on the file server are stored on the hash cache”. There are different ways to fill it up. There is a service that will calculate hashes when the CPU utilization is low on the server and the second way is using a tool called HashGen that you can run against the share and it will force generation of those hashes.
The first time that a file is requested from a file share or web server, the server must generate hashes of the data as it moves through the network stack. Generating hashes is not a big deal, but it does take a little time, so we don’t hold up a client requesting a file just to generate hashes. We give the client the data right away and calculate the hashes as the data moves through the network stack. So, if you are going to play with BranchCache, remember that the third hit is when you will notice some improvement.
Now we will talk about how to deploy BranchCache. First, all workstations should be running a supported version of Windows. The content server (your IIS or file server) should be running a BranchCache-supported version of Windows Server.
After that, we need to install the optional Windows BranchCache component on the Windows web or file server on the main site (those are called Content Servers). For web servers, this can be done by going to Add Features and choosing the “Windows BranchCache” component. For file servers, go to File Services from Server Manager, choose “Select Role Services” and click “BranchCache for network files”.
For BranchCache clients to operate in Distributed Cache mode, a GPO should be applied to configure the clients to act as BranchCache clients. We can also configure the percentage of disk space that BranchCache files will consume on the local disk (by default 5%).
For BranchCache in Hosted Cache mode, we need the hosted cache server to be running a supported version of Windows. This is a big limitation, as you need to license an Enterprise Edition of Windows for the branch hosted cache server. The hosted cache server also requires a digital certificate so that clients can authenticate it before downloading cache content.
BranchCache group policy is the best way to configure clients to use this technology. A client can only be operating in either Distributed Cache mode or Hosted Cache mode but not both. The below figure shows how GPO can be used to enable BranchCache on clients and choose the mode of operation. If Hosted Cache mode is selected, then you need to type the FQDN of the Hosted Cache server existing in the branch. You can also set the percentage of the local disk that can be used for cache content in case you are using Distributed Cache mode.
Blocks, Segments and Hashes
We will dig inside BranchCache technology and describe how the content server will cut the data into blocks and generate hashes that are required for BranchCache to function.
The content servers (your Windows web server or file share) are the source of the data that need to be cached. Think of it as the SharePoint nodes or your WSUS server.
Content servers will divide data into Segments that are big in size, and then will generate hashes for each segment (S1, S2, etc.). Segments are the unit of discovery: when a BranchCache client wants to access any file, it will download those segment hashes and use them to ask neighbor peers or the hosted cache server: “does anyone have a file that is built from segments with hashes S1, S2, …, Sn?”
Segments are also divided into smaller units called Blocks, and hashes for those blocks are calculated on the content server and returned to the clients (B1, B2, etc.). Once a BranchCache client gets a reply from a source in its branch that has data with segment hash S1, for example, it will download the blocks (B1, B2, …, Bn) that make up that segment.
Think of segments as the unit of discovery and blocks as the unit of download from neighbor clients.
So why do we have segments and each segment is divided to Blocks? We need the unit of discovery to be big in size so that we don’t flood the network with discovery packets. That is why we use hashes of segments to describe data. For example, if a file is 100k in size, and the segment size is 25k, then the file is divided to 4 segments and we only need to send 4 multicast discovery packets in the local branch subnet to discover that data (in case of distributed cache mode).
Once that 100k file is discovered, and if the block size is 5k in size, then we will download the 5*4 blocks from the cache source in the branch (each segment contains 5 blocks and we have 4 segments).
The reason we use blocks rather than segments to download data is that, if a source cache machine goes down for any reason, we can still download the remaining blocks from another source without losing much data, because the block size is small.
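The arithmetic in the 100k example can be checked with a few lines of code. The function below is a hypothetical helper written for this post; the sizes are the illustrative values used above, not the real BranchCache segment and block sizes.

```python
def split_counts(file_size, segment_size, block_size):
    """Return (segments, blocks) needed to describe a file of file_size."""
    segments = 0
    blocks = 0
    offset = 0
    while offset < file_size:
        # The last segment may be shorter than segment_size.
        seg_len = min(segment_size, file_size - offset)
        segments += 1
        # Round up so that a partial block still counts as one block.
        blocks += (seg_len + block_size - 1) // block_size
        offset += seg_len
    return segments, blocks


# Sizes in "k", matching the example in the text.
print(split_counts(100, 25, 5))  # (4, 20)
print(split_counts(110, 25, 5))  # (5, 22)
```

With these numbers, discovery costs 4 queries while the download is spread across 20 small blocks, which is the trade-off the two units are meant to balance.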
After you have deployed BranchCache in your network, how can you measure or confirm that it is working, and how can you troubleshoot BranchCache issues after deployment?
There are amazing performance counters in Windows client that can tell you exactly how your BranchCache deployment is behaving.
Just go to your Start icon and type PerfMon, and the Performance Monitor snap-in will open. Click on the Performance Monitor icon, then right-click any empty area in the middle pane, and choose Add Counters. Now scroll to BranchCache, choose all the sub-counters, and then click Add and OK.
Then change the Graph Type to Report as shown in the figure below. Watch those counters:
- Bytes from cache indicates the amount of data that the client retrieved from a neighbor peer or from the Hosted Cache server.
- Bytes from server, is an indication of data moving across the WAN.
Another effective way to configure and monitor your BranchCache activity and behavior is by using the netsh BranchCache command.
Open a command window, and type the following command. This will show you a summary of the BranchCache local cache settings on that client.
netsh branchcache show localcache
You can also flush the BranchCache content cache on your Windows client machine by typing the following command in an elevated command prompt:
netsh branchcache flush
To see if the Windows machine is configured with BranchCache and to know in which mode it is operating, type:
netsh branchcache show status
As you can see from the output, the first line shows that the client is configured with BranchCache in Distributed Cache mode.
The second line shows that this machine (which happened to be a laptop), will not serve clients from its cache if it is running on battery power, so that your laptop will not run out of power while processing BranchCache requests from neighbor peers.
The last line shows that the BranchCache service is running, which does not necessarily indicate that the client is configured with BranchCache. Only the first line of the output indicates that this client is configured with BranchCache.
BranchCache file location
We will be talking about some parameters to consider when deploying BranchCache. Mainly where the cache content files are stored.
Clients will always store cache files + the hashes of those files that describe the data in the following location: C:\Windows\ServiceProfiles\NetworkService\AppData\Local\PeerDistRepub.
To change this location on your Windows BranchCache clients, use this command:
netsh branchcache set localcache directory=D:\MyCacheFiles
The size of that cache is by default 5% of the whole hard disk. To change the maximum size of the cache content, use one of these commands:
To change the cache size to a fixed value in bytes, type:
Netsh branchcache set cachesize 20971520
To change the cache size to a percentage of disk space, type:
Netsh branchcache set cachesize size=20 percent=TRUE
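To see what the two forms mean in practice, here is a small sketch that turns either setting into a byte limit. The helper name and the 500 GB disk are assumptions for illustration; this mirrors the arithmetic only and does not call netsh.

```python
MIB = 1024 * 1024


def cache_limit_bytes(size, percent, disk_bytes):
    """Translate a cachesize setting into an absolute limit in bytes."""
    if percent:
        # size is a percentage of the disk.
        return disk_bytes * size // 100
    # size is already a byte count.
    return size


disk = 500 * 10**9  # hypothetical 500 GB disk

fixed = cache_limit_bytes(20971520, percent=False, disk_bytes=disk)
relative = cache_limit_bytes(20, percent=True, disk_bytes=disk)

print(fixed // MIB)  # 20, so 20971520 bytes is exactly 20 MB
print(relative)      # 100000000000, i.e. 100 GB on this disk
```

The fixed form is predictable across machines, while the percentage form scales with whatever disk each client happens to have.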
Now, let us move to the content servers (your WSUS and file shares). Those content servers will generate hashes for their content, so that BranchCache clients can download them instead of the whole content. The location of those hashes is called (Publication Cache). By default, it is under %WINDIR%\ServiceProfiles\NetworkService\AppData\Local\PeerDistPub.
To change the location of the Publication Cache, use this command:
netsh branchcache set publicationcache directory=D:\BranchCache\
By default, the Publication Cache will consume 1% of the hard disk. To change this number, use one of the following commands:
Netsh branchcache set publicationcachesize 20971520
Netsh branchcache set publicationcachesize size=20 percent=TRUE
If a BranchCache client wants to access a remote file share, BranchCache will measure the latency to the remote file share server. If the latency is below 80 ms (the default value), then the client will not use BranchCache. This applies only to remote file shares and not to web sites (it is not applicable to HTTP or BITS traffic).
To change this value, run this command to configure the SMB latency to 20 ms for example:
Netsh branchcache smb set latency latency=20
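The decision described above boils down to a threshold check. The sketch below is an assumption about the shape of that check, with a hypothetical function name; in particular, how a latency exactly equal to the threshold is treated is a guess.

```python
DEFAULT_SMB_LATENCY_MS = 80  # default threshold mentioned in the text


def smb_uses_branchcache(round_trip_ms, threshold_ms=DEFAULT_SMB_LATENCY_MS):
    """Return True when the link is slow enough for BranchCache to be used."""
    # Below the threshold the share is treated as "close", so the file is
    # read directly over SMB without BranchCache.
    return round_trip_ms >= threshold_ms


print(smb_uses_branchcache(10))                   # False: fast local link
print(smb_uses_branchcache(120))                  # True: slow WAN link
print(smb_uses_branchcache(50, threshold_ms=20))  # True after lowering to 20 ms
```

Lowering the value, as the netsh command above does, makes BranchCache kick in on faster links too.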
Another important note is that a laptop running on battery will not serve cache content to its peers in Distributed Cache mode. This is the default behavior, to preserve power.
Content Server Hashes
As explained before, content servers are the servers holding the content that you want clients to cache. Examples are your WSUS server or file share servers.
Those content servers will generate hashes for content that is requested only. So, when the first client requests a certain file, the file is downloaded completely and the hashes start to generate on the server. You need three different clients to request the same file to start getting benefit from BranchCache.
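One plausible reading of why it takes three requests is sketched below: the first request triggers hash generation, the second gets hashes but finds nothing cached in the branch, and the third finally hits the branch cache. The class and function names are hypothetical, and the single SHA-256 digest stands in for the real set of hashes.

```python
import hashlib


class ContentServer:
    """Generates hashes lazily, only after content is first requested."""

    def __init__(self, data):
        self.data = data
        self.content_id = None  # no hashes until the first request

    def request(self):
        if self.content_id is None:
            # First request: send full data, compute hashes on the way out.
            self.content_id = hashlib.sha256(self.data).hexdigest()
            return None
        return self.content_id


def simulate(requests):
    server = ContentServer(b"WSUS update payload")
    branch_cache = {}  # content identifier -> data held somewhere in the branch
    outcomes = []
    for _ in range(requests):
        content_id = server.request()
        if content_id is None:
            outcomes.append("wan-no-hashes")
        elif content_id in branch_cache:
            outcomes.append("cache-hit")
        else:
            branch_cache[content_id] = server.data  # full WAN download
            outcomes.append("wan-miss")
    return outcomes


print(simulate(3))  # ['wan-no-hashes', 'wan-miss', 'cache-hit']
```

This is worth keeping in mind when testing: a cache hit will not show up until the third client asks for the same file.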
Also note that hashes are generated only for files bigger than 64k in size, so you will not benefit from BranchCache when dealing with files smaller than 64k.
Hashes on the content servers will be lost if the BranchCache service is restarted. The BranchCache service will then start generating hashes again when files are accessed.
Q: What are the requirements for clients to participate in BranchCache Technology?
A: Clients should be running a supported edition of Windows and have the BranchCache local service set to Automatic. After that, the clients should be under the scope of a group policy that will enable them for BranchCache and will open a couple of local Windows firewall exceptions.
Q: What are the requirements for my WSUS server or file server so that clients can cache content from?
A: Your WSUS or your file server, are called (Content Servers) and should be running supported edition of Windows. You then have to go and add the feature that is called (BranchCache) from the Add Features Wizard.
Q: Do I need to open anything on my firewalls or contact my ISP for any changes in order to deploy BranchCache?
A: Absolutely NO. The clients will use native original ports when connecting to your WSUS/IIS/File servers.
Q: In Distributed Cache mode, clients will cache content locally on their hard disk. Can you tell me more and will it fill up the client hard disk?
A: By default, clients configured for BranchCache in Distributed Cache mode will store content on the C drive of their hard disk under “C:\Windows\ServiceProfiles\NetworkService\AppData\Local\PeerDistRepub”. By default, BranchCache will only consume 5% of the total disk space. Make sure that you have enough space on your C drive to handle cache files. When that 5% is consumed, the BranchCache service will start overwriting old content (least accessed).
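The idea of a cache with a fixed budget that overwrites old content can be sketched as follows. The exact eviction policy is not described here, so this example uses least-recently-used eviction as a stand-in, and `BoundedCache` is a hypothetical class.

```python
from collections import OrderedDict


class BoundedCache:
    """Byte-limited cache that evicts the least recently used entry first."""

    def __init__(self, max_bytes):
        self.max_bytes = max_bytes
        self.items = OrderedDict()  # oldest access first, newest last
        self.used = 0

    def get(self, key):
        if key not in self.items:
            return None
        self.items.move_to_end(key)  # mark as recently used
        return self.items[key]

    def put(self, key, data):
        if key in self.items:
            self.used -= len(self.items.pop(key))
        self.items[key] = data
        self.used += len(data)
        # Evict old entries until we are back under the budget.
        while self.used > self.max_bytes and len(self.items) > 1:
            _, evicted = self.items.popitem(last=False)
            self.used -= len(evicted)


cache = BoundedCache(max_bytes=10)
cache.put("a", b"1234")
cache.put("b", b"1234")
cache.get("a")            # "a" is now the most recently used entry
cache.put("c", b"1234")   # 12 bytes > 10, so "b" is evicted
print(list(cache.items))  # ['a', 'c']
```

Content that peers keep asking for stays in the cache, and content nobody touches is the first to go.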
Q: If my machines are using BranchCache, is it possible that they may get old cached data from neighbor peers?
A: NO. You will never ever get old data. BranchCache is designed to ensure it can work perfectly even with the most dynamic web sites that have content changing very quickly. The reason is that clients will always connect to the live web site or share, get the newest hashes, and only then request the data from neighbor machines, if they have it in their cache.
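A tiny sketch shows why stale data cannot be served this way: when the content changes, its hash changes, so the old cached copy simply no longer matches. SHA-256 and the `identifier` helper are assumptions for illustration.

```python
import hashlib


def identifier(data):
    """Hash that the content server hands out to describe the data."""
    return hashlib.sha256(data).hexdigest()


old_version = b"newsletter v1"
new_version = b"newsletter v2"

# A neighbor cached the old version earlier in the day.
peer_cache = {identifier(old_version): old_version}

# The client always asks the live server first, so it gets the new hash.
fresh_id = identifier(new_version)
stale_hit = fresh_id in peer_cache

print(stale_hit)  # False: the old copy is never matched, so it is never served
```

The client then falls back to downloading the new version from the content server, exactly as in a normal cache miss.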
Q: If the main link between the branch and the main site is down, will my branch machines continue getting cached content from their neighbor peers?
A: NO. Because each machine should connect to the content server first that is located in the main site, to get those hashes that describe the data, before requesting the complete data from neighbor peers.
Q: What should I do if I want to troubleshoot a problem from my BranchCache client that cannot access a certain internal web site or file share? How can I temporarily disable BranchCache on that machine so I can troubleshoot the problem?
A: Just stop the BranchCache service, troubleshoot your problem, and then start it again.
Q: What do you recommend: Distributed Cache mode or Hosted Cache mode?
A: Well, it depends. Hosted Cache mode is excellent if the branch office has more than 50 machines (numbers are changed in Windows 8), because you don’t need to consume disk space on branch machines or introduce slight processing overhead on those machines for replying to BranchCache requests from neighbor machines. But this requires that you install a server in the branch site running a BranchCache-supported edition of Windows Server.
Finally, always remember to have your Windows client machines with good space on their C drive just in case.
Q: Is it possible for neighbor machines to request access to cache content on my machine without being authorized to do so?
A: NO. Because each BranchCache client will encrypt the data with a unique key that is shared with the content server. So neighbor machines should connect to the content server first, authenticate, get that encryption key, before asking your machine for cache content.
Q: I am concerned about security and I am not sure if I can trust such technology and have sensitive files cached everywhere.
A: Take it easy. We didn’t mention everything yet. BranchCache security is a long and complicated topic that I will be very pleased to discuss with you if you drop me an email, and I will explain to you how BranchCache uses effective cryptography to protect data. For now, just take it from me: IT IS SECURE.
Q: What will happen if the BranchCache service fails to download content from a neighbor peer or from the hosted cache server?
A: When BranchCache is unable to retrieve data from a peer or from the Hosted Cache, the upper layer protocol will return to the server for content. If a failure occurs in the Branch Caching component, the upper layer protocol should seamlessly download content from the server. No BranchCache misconfiguration or failure should prevent the display of a webpage or connection to a share.