#1
#2
> Scalability implies lots of web servers all referring back to a central SQL server, which in turn implies limited caching which in turn hurts performance opportunities.

It most certainly does not. Caching avoids the SQL bottleneck.

> I would like to use typed datasets for all the benefits they have, and I would like to use timestamps to assist in concurrent edit checking.

Timestamps won't help you with concurrency because the timestamp isn't

> I would like to cache the datasets in the asp.net application.

Nope, cache is a poor choice because it is per process. You have a multi-cpu

> I would cache data key in a client side cookie.

What happens if cookies are lost, unreadable or client turns them off?

> multiple servers and CPUs). *In my view getting into cache synchronisation across web servers will hurt the very performance gains we are trying to get via caching in the first place.*

Yes.

> As a user becomes interested in a specific set of data (house) the datakey cookie would be set, and this would drive the selection of web process that is best suited to serve the request.

Yes, but this is all driven by the client. Not a particularly good choice

> distinct url by using application pool configuration, but I haven't confirmed this yet.

That doesn't solve your cache affinity problem.

> different web process, however I believe the DOS risk is sufficiently small that this design is still widely applicable.

You are pushing a cookie to the client, the wrong client can regenerate

Hello all,

We know that designing a web application that is both scalable and high performance is difficult. Scalability implies lots of web servers all referring back to a central SQL server, which in turn implies limited caching, which in turn hurts performance opportunities. Clearly there is no right answer for all scenarios, but I have been thinking over a particular design which I would like to get your views on...

This scenario involves a collection of data which is concerned with an overall user operation. The data is persisted across multiple tables, but the primary keys are hierarchical in nature. E.g. house relates to rooms, which relate to furniture. The house primary key forms part of the room and furniture primary keys.

I would like to use typed datasets for all the benefits they have, and I would like to use timestamps to assist in concurrent edit checking. One dataset would hold the data for one house (plus related tables).

I would like to cache the datasets in the ASP.NET application. If a dataset times out, so be it; I can go and fetch a new version. I would expect data edits to be applied to the database as part of the web request operation, so dataset and database remain in sync. I would not anticipate using Session state for this application.

I would cache the data key in a client-side cookie. I require affinity to the specific cache and therefore web process (across multiple servers and CPUs). *In my view getting into cache synchronisation across web servers will hurt the very performance gains we are trying to get via caching in the first place.*

As a user becomes interested in a specific set of data (house) the datakey cookie would be set, and this would drive the selection of the web process that is best suited to serve the request. Consequently, as the user works with the site, different requests may be served by different web processes/servers. If the datakey cookie is not set, then no cache affinity is required.

I have looked for some extension to Microsoft's Network Load Balancer using a provider pattern to allow me to control the selection criteria of a specific web process, but without success. I want to take advantage of the NLB heartbeat facility.

The scenario I imagine is, say, a collection of four web processes (spread, say, across two servers each with dual processors). I *think* I can give each web process a distinct URL by using application pool configuration, but I haven't confirmed this yet. So I would expect my web process selection algorithm to be driven by the value of the cookie holding a datakey. The algorithm would distribute the requests according to the data keys. I was thinking of something simple like modulo 4 of the house ID in this scenario.

When a server goes down NLB should know this, and expose this to my provider code. My web process selection algorithm would check that the required web process is alive (referring to the NLB API), and make an alternate selection if necessary.

So far as I am aware, the piece of the picture that is missing is a provider pattern API in NLB to facilitate this. I wonder if this is something that is on the drawing board at Microsoft (or a third party supplier for that matter).

Apart from that missing piece, the main disadvantage I can see in this design is its defence against denial of service attacks. Theoretically attackers would need to select just four distinct IDs that each hit a different web process; however, I believe the DOS risk is sufficiently small that this design is still widely applicable.

Other issues that I know come into play include:

- security of the datakey
- overhead of establishing potential SSL sessions on each new web process as the datakey changes (I think this is relatively infrequent)
- authentication cookies would need to apply to the scope of the entire web farm
- authorisation to access data would need to occur in the web application, not just the database

NB this design does not preclude distributed web farm clusters on different continents (each cluster potentially caching the same data), because at the end of the day if concurrent data edits are detected, the dataset can be refreshed from the database, and the user can reconfirm their edit operation. Also in this scenario, there are likely to be multiple databases synchronised using replication. Typically the set of editable data would be configured for each database.

I would welcome a lively discussion on the viability of the design.

Thanks very much for your time.

Martin
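
For readers following the design, the selection rule described in the post (modulo 4 of the house ID, with a liveness check and an alternate choice when the preferred process is down) could be sketched roughly as below. This is only an illustration under stated assumptions: the process names are hypothetical, and `is_alive` is a stand-in callback for whatever heartbeat information the load balancer might expose. The post itself notes that no such provider API in NLB was found.

```python
from typing import Callable, Sequence


def select_process(house_id: int,
                   processes: Sequence[str],
                   is_alive: Callable[[str], bool]) -> str:
    """Pick the web process for a house ID, skipping processes that are down."""
    if not processes:
        raise ValueError("no web processes configured")
    # Preferred slot: house ID modulo the number of processes (4 in the post).
    start = house_id % len(processes)
    # Probe the preferred slot first, then each following slot in turn.
    for offset in range(len(processes)):
        candidate = processes[(start + offset) % len(processes)]
        if is_alive(candidate):
            return candidate
    raise RuntimeError("no web process is alive")


# Hypothetical farm: four processes across two dual-processor servers.
PROCESSES = ["web1-a", "web1-b", "web2-a", "web2-b"]
down = {"web1-b"}  # pretend the heartbeat reports this process as dead


def alive(name: str) -> bool:
    # Assumed stand-in for a heartbeat lookup; not a real NLB call.
    return name not in down


print(select_process(8, PROCESSES, alive))  # 8 % 4 == 0, so web1-a
print(select_process(9, PROCESSES, alive))  # 9 % 4 == 1 is down, so web2-a
```

One consequence of this particular fallback rule is that all keys belonging to a dead process move to its neighbour, so that neighbour's cache takes the whole extra load until the process returns.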
#3
> See inline.

>> Scalability implies lots of web servers all referring back to a central SQL server, which in turn implies limited caching which in turn hurts performance opportunities.

> It most certainly does not. Caching avoids the SQL bottleneck.

The point I'm making here is that in a web farm environment, the standard

>> I would like to use typed datasets for all the benefits they have, and I would like to use timestamps to assist in concurrent edit checking.

> timestamps won't help you with concurrency because the timestamp isn't guaranteed accurate since windows is not a real time OS. Even a minor lag will throw off your sync on a heavy traffic day.

Here is a quote from http://support.microsoft.com/kb/170380

>> I would like to cache the datasets in the asp.net application.

> Nope, cache is a poor choice because it is per process. You have a multi-cpu architecture on a web farm. That leads to cache duplication.

I want to address this application pool configuration, but not sure if I

>> I would cache data key in a client side cookie.

> What happens if cookies are lost, unreadable or client turns them off?

If they're turned off they could be put in the url (by an http module)

>> multiple servers and CPUs). *In my view getting into cache synchronisation across web servers will hurt the very performance gains we are trying to get via caching in the first place.*

> Yes.

>> As a user becomes interested in a specific set of data (house) the datakey cookie would be set, and this would drive the selection of web process that is best suited to serve the request.

> Yes, but this is all driven by the client. Not a particularly good choice since the client doesn't have to follow the rules you impose; that is, a client can most easily disable cookies.

>> I *think* I can give each web process a distinct url by using application pool configuration, but I haven't confirmed this yet.

> That doesn't solve your cache affinity problem.

Doesn't it? I've not tried yet.

>> different web process, however I believe the DOS risk is sufficiently small that this design is still widely applicable.

> You are pushing a cookie to the client, the wrong client can regenerate multiple cookies that in turn drive the caching mechanism in your architecture right? Then, it's easy to flood the cache architecture from the client since every request is valid.

I agree

> --
> Regards,
> Alvin Bruney
> ------------------------------------------------------
> Shameless author plug
> Excel Services for .NET is coming...
> OWC Black book on Amazon and www.lulu.com/owc
#4
> Are you disagreeing with the whole philosophy of using the cache to help serve the request as close to the client as possible?

In principle, yes because it causes more problems than it solves especially

> If you have good web references to how you would approach the overall goal of increased performance with caching in webfarms, that would be interesting.

Actually, the patterns and practice group at MS has released the

> The point I'm making here is that in a web farm environment, the standard practice is to reference back to central db server *not* to use caching.

It may be standard practice, but it is dead wrong with respect to

> What's wrong with that?

In even a moderate concurrent environment, by the time you read the data it

> What would you do?

For a web farm, that requires shared resources, you have to move the dataset

> Got any ideas then?

Come to think about it, I think the asp net cache service is a valid choice.

> What's your DOS answer?

If you go that route, you'd need to somehow flag invalid responses and only

Hello Alvin,

Are you disagreeing with the whole philosophy of using the cache to help serve the request as close to the client as possible? I appreciate this brings challenges in a webfarm environment, and that's what I'm wanting to address.

If you have good web references to how you would approach the overall goal of increased performance with caching in webfarms, that would be interesting.

I've made individual comments inline.

Thanks
Martin

"Alvin Bruney [MVP]" <some guy without an email address> wrote in message news:%23L7UHftMHHA.1912 (AT) TK2MSFTNGP02 (DOT) phx.gbl...

> See inline.

>> Scalability implies lots of web servers all referring back to a central SQL server, which in turn implies limited caching which in turn hurts performance opportunities.

> It most certainly does not. Caching avoids the SQL bottleneck.

The point I'm making here is that in a web farm environment, the standard practice is to reference back to central db server *not* to use caching. Using caching introduces new challenges which I'm trying to address.

>> I would like to use typed datasets for all the benefits they have, and I would like to use timestamps to assist in concurrent edit checking.

> timestamps won't help you with concurrency because the timestamp isn't guaranteed accurate since windows is not a real time OS. Even a minor lag will throw off your sync on a heavy traffic day.

Here is a quote from http://support.microsoft.com/kb/170380

"TimeStamp is a SQL Server data type that is automatically updated every time a row is inserted or updated. Values in TimeStamp columns are not datetime data; they are, by default, defined as binary(8) varbinary(8), indicating the sequence of Microsoft SQL Server activity on the row. A table can have only one TimeStamp column. The TimeStamp data type is simply a monotonically-increasing counter whose values will always be unique within a database."

What's wrong with that?

>> I would like to cache the datasets in the asp.net application.

> Nope, cache is a poor choice because it is per process. You have a multi-cpu architecture on a web farm. That leads to cache duplication.

I want to address this with application pool configuration, but I'm not sure if I can. What would you do?

>> I would cache data key in a client side cookie.

> What happens if cookies are lost, unreadable or client turns them off?

If they're turned off they could be put in the url (by an http module). If they are lost or unreadable, that would cause interference with the user's browsing experience.

>> multiple servers and CPUs). *In my view getting into cache synchronisation across web servers will hurt the very performance gains we are trying to get via caching in the first place.*

> Yes.

>> As a user becomes interested in a specific set of data (house) the datakey cookie would be set, and this would drive the selection of web process that is best suited to serve the request.

> Yes, but this is all driven by the client. Not a particularly good choice since the client doesn't have to follow the rules you impose; that is, a client can most easily disable cookies.

>> I *think* I can give each web process a distinct url by using application pool configuration, but I haven't confirmed this yet.

> That doesn't solve your cache affinity problem.

Doesn't it? I've not tried yet. Got any ideas then?

>> different web process, however I believe the DOS risk is sufficiently small that this design is still widely applicable.

> You are pushing a cookie to the client, the wrong client can regenerate multiple cookies that in turn drive the caching mechanism in your architecture right? Then, it's easy to flood the cache architecture from the client since every request is valid.

I agree. What's your DOS answer?

> --
> Regards,
> Alvin Bruney
> ------------------------------------------------------
> Shameless author plug
> Excel Services for .NET is coming...
> OWC Black book on Amazon and www.lulu.com/owc
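
The KB passage quoted in this exchange describes TimeStamp as a per-row counter rather than a clock reading, which is what makes it usable for concurrent edit checking: remember the value read with the row, and only apply an update if the row still carries that value. A minimal sketch of that check follows, assuming a hypothetical `house` table and emulating the counter with an integer `row_version` column in SQLite, which has no TimeStamp type; per the KB text, SQL Server would update its TimeStamp column automatically.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table; row_version emulates a TimeStamp-style counter.
conn.execute(
    "CREATE TABLE house ("
    "id INTEGER PRIMARY KEY, name TEXT, row_version INTEGER NOT NULL)"
)
conn.execute("INSERT INTO house VALUES (1, 'Old name', 1)")


def read_house(conn, house_id):
    """Return (name, row_version), as a cached dataset would hold them."""
    return conn.execute(
        "SELECT name, row_version FROM house WHERE id = ?", (house_id,)
    ).fetchone()


def update_house(conn, house_id, new_name, expected_version):
    """Apply the edit only if the row is unchanged since it was read."""
    cur = conn.execute(
        "UPDATE house SET name = ?, row_version = row_version + 1 "
        "WHERE id = ? AND row_version = ?",
        (new_name, house_id, expected_version),
    )
    conn.commit()
    # Zero rows updated means someone else edited the row first.
    return cur.rowcount == 1


_, version_a = read_house(conn, 1)  # user A's cached copy
_, version_b = read_house(conn, 1)  # user B's cached copy, same version
first = update_house(conn, 1, "Edited by A", version_a)
second = update_house(conn, 1, "Edited by B", version_b)
print(first, second)  # True False
```

When the second update is rejected, the caller would refresh the cached copy from the database and ask the user to reconfirm the edit, which matches the recovery path described in the original post.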