What is the best way to share "session" information across 4 datacenters with 40 servers?

Posted 2024-09-11 04:50:54

Currently our DNS routes the user to the correct datacenter, and then the servers are selected round-robin. We store the session information in a cookie, but it has grown too large, so we want to move it out of the browser and into a database. I'm worried that if we create a mid-tier box that all servers hit, response times will suffer. It's not feasible to store the session info on all machines because we're talking about 200M+ unique sessions a month. Any suggestions or thoughts?


Comments (2)

烦人精 2024-09-18 04:50:54

A job for memcached or, if you want to save session data to disk, MemcacheDB.

Memcached is a free & open source, high-performance,
distributed memory object caching
system, generic in nature, but
intended for use in speeding up
dynamic web applications by
alleviating database load.

Memcached is an in-memory key-value
store for small chunks of arbitrary
data (strings, objects) from results
of database calls, API calls, or page
rendering.

Memcached is simple yet powerful. Its
simple design promotes quick
deployment, ease of development, and
solves many problems facing large data
caches. Its API is available for most
popular languages.
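
To make the "no single mid-tier box" point concrete: memcached client libraries typically spread keys across several servers by hashing the key on the client side, so each web server can work out which cache node holds a session without asking a coordinator. Below is a minimal sketch of that idea using consistent hashing. It does not talk to a real memcached; the node names, the `SessionRing` class, and the dict-backed `stores` are all assumptions made for illustration.

```python
import bisect
import hashlib


class SessionRing:
    """Maps a session id to one cache node via consistent hashing."""

    def __init__(self, nodes, replicas=100):
        # Each node gets several points on the ring to even out the load.
        self._ring = sorted(
            (self._hash(f"{node}#{i}"), node)
            for node in nodes
            for i in range(replicas)
        )
        self._hashes = [h for h, _ in self._ring]

    @staticmethod
    def _hash(key):
        return int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16)

    def node_for(self, session_id):
        # First ring point clockwise from the key's hash, wrapping around.
        idx = bisect.bisect(self._hashes, self._hash(session_id)) % len(self._ring)
        return self._ring[idx][1]


# Hypothetical node names; plain dicts stand in for real cache servers.
nodes = ["cache-a:11211", "cache-b:11211", "cache-c:11211"]
ring = SessionRing(nodes)
stores = {node: {} for node in nodes}


def save_session(session_id, data):
    stores[ring.node_for(session_id)][session_id] = data


def load_session(session_id):
    return stores[ring.node_for(session_id)].get(session_id)


save_session("sess-42", {"user_id": 7, "cart": ["sku-1"]})
```

Every web server that builds the ring from the same node list resolves `sess-42` to the same node, so the round-robin in front of the web servers does not matter for session lookups.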

穿透光 2024-09-18 04:50:54

Let's understand the role of browser-based cookies

  • Cookies are stored per browser profile.
  • The same user logged on from different computers or browsers is
    treated as a different user.
  • State cookies are mixed with user cookies.

Segregate the cookies into three kinds:

  • Long-term state cookies, e.g. the currently remembered userId.
  • Session-state cookies.
  • User cookies.

Since your site is only beginning to consider server-side cookies, I assume this segregation has not been done yet. User cookies should be stored on the server as much as possible, so that preferences and shopping carts are preserved when a user logs on from another computer or browser. For some cookies, shopping carts for example, your development team has to decide whether they count as session-state cookies or as user info cookies.

User cookies
These need to be accessible across the web site, regardless of where the user logs in. Your developers have to decide how quickly a change to a preference or shopping cart should become visible when the same userId is logged in at another location.

This means you have to implement a distributed database system. You have a master db server. Let us say you have 20 web servers, each with its own local database.

Store only frequently changed cookies on the local db and leave the infrequently changing cookies on the master.

Every time a cookie is updated at a local db, an update flag is queued for the master. The cookie record in the master is not updated; it is only marked as stale, together with the location number where the fresh data resides. If that userid somehow becomes active 3000 miles away at the same time, that session discovers the stale records and triggers a request to copy them from the fresh location to its own local db and to the master db, after which the records are no longer marked as stale on the master db.
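
The stale-marking scheme can be sketched roughly as follows. This is an in-memory illustration only: the `Master` and `LocalDb` classes and the location names are hypothetical, the queue is collapsed into a direct call, and for simplicity every read consults the master's stale flags, which a real deployment would want to avoid or cache.

```python
class Master:
    def __init__(self):
        self.records = {}   # (user_id, param) -> last value copied to the master
        self.stale_at = {}  # (user_id, param) -> location holding the fresh value


class LocalDb:
    def __init__(self, location, master, peers):
        self.location = location
        self.master = master
        self.peers = peers  # shared dict: location -> LocalDb
        self.records = {}
        peers[location] = self

    def write(self, user_id, param, value):
        key = (user_id, param)
        self.records[key] = value
        # The master record is not updated; it is only flagged as stale,
        # together with the location where the fresh data resides.
        self.master.stale_at[key] = self.location

    def read(self, user_id, param):
        key = (user_id, param)
        fresh_location = self.master.stale_at.get(key)
        if fresh_location == self.location:
            return self.records.get(key)  # this location holds the fresh copy
        if fresh_location is not None:
            # Copy from the fresh location to the master and clear the flag.
            self.master.records[key] = self.peers[fresh_location].records[key]
            del self.master.stale_at[key]
        if key in self.master.records:
            self.records[key] = self.master.records[key]
        return self.records.get(key)


master, peers = Master(), {}
east = LocalDb("east", master, peers)
west = LocalDb("west", master, peers)

east.write(7, "theme", "dark")           # master now only knows "fresh copy is at east"
first_seen_in_west = west.read(7, "theme")  # pulls from east, refreshes the master
```

A version number per record would be a safer way to detect outdated local copies; the sketch leaves that out to stay close to the description above.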

Then you schedule a regular sync of the most frequently used cookies. The sync could run nightly, or at a frequency that depends on how the cookie modifications have been characterized.

First, your programmers would need to write a routine to log all cookie read/writes. You should collect a week's worth of cookie read/write activity to perform your initial component analysis.

You perform a simple statistical characterization per cookie, userid and frequency of change. Then you adjust the cut-off to your preferences, deciding which cookies are pushed to all the local dbs and which stay on the master. The decision is a balance between the size of the cookie block on the local dbs and the frequency of database sync you are willing to allow. This means that not every user has the same set of cookies propagated. Of course, your programmers would need to write routines to automate the regular recharacterization. Rather than working per user, you might wish to lighten the processing load of cookie propagation by grouping your users using cluster analysis. The grouping of users for your site may be so obvious that you need not perform cluster analysis at all.

You might be surprised to find that most of the cookies fall into the longer-than-weekly-update bucket. A worse case is daily update, and the worst case you should accept is hourly update for cookie fields that are not pushed onto the local dbs. You want to increase the chances that a cookie access occurs on the local db rather than being pulled from the master database. So when a user clicks on "preferences", which are seldom changed, you preemptively pull the preferences records from the master while distracting the user with some frills like "have you considered previewing our new service?", "would you like to answer our usability survey?", "new Gibson rant, would you comment?", etc., until the "preferences" cookies have been copied over.

The characterization of cookies could be done per userid, or per cluster of users to decide which cookie field to push around to local dbs.

It is simpler to characterize per userid because it requires barely any statistical analysis skills on the part of the programmer. The disadvantage is that the web server would have to make decisions for each of 200 million users. The database cookie table would be

Cookie[id, param, value, expectedMutationInterval].

Your web server would decide, per user, which cookies to push regularly by comparing against a threshold time.

SELECT param, value
FROM Cookie
WHERE expectedMutationInterval < $thresholdTime
  AND id = $userId

You have to perform a regular recharacterization of cookies to update expectedMutationInterval per user per cookie. A simple SQL query would be able to perform the update of expectedMutationInterval. A more complex analysis could be performed to produce the value expectedMutationInterval.

If each cookie field change is logged by time, userid and ipaddr then your Cookie log table would be

CookieLog[id, time, ipaddr, param, value].

which would help your automated recharacterization routine decide what fields to push depending on the dayofweek/month/season and location/region/ipaddr.
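
As a rough sketch of the "simple SQL query" mentioned above, the following uses Python's built-in `sqlite3` with the two tables as described. The column types, the sample rows, the choice of seconds as the time unit, and the estimate itself (mean gap between logged changes) are assumptions made for illustration; a real characterization could be more elaborate.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Cookie (id INTEGER, param TEXT, value TEXT,
                     expectedMutationInterval REAL,
                     PRIMARY KEY (id, param));
CREATE TABLE CookieLog (id INTEGER, time INTEGER, ipaddr TEXT,
                        param TEXT, value TEXT);
""")
# Made-up sample data: a cart that changes every 10 minutes,
# and a language preference that changed once in a week.
conn.executemany("INSERT INTO Cookie VALUES (?, ?, ?, NULL)", [
    (7, "cart", "sku-3"),
    (7, "language", "en"),
])
conn.executemany("INSERT INTO CookieLog VALUES (?, ?, ?, ?, ?)", [
    (7, 0, "10.0.0.1", "cart", "sku-1"),
    (7, 600, "10.0.0.1", "cart", "sku-2"),
    (7, 1200, "10.0.0.1", "cart", "sku-3"),
    (7, 0, "10.0.0.1", "language", "de"),
    (7, 604800, "10.0.0.1", "language", "en"),
])


def recharacterize(conn):
    """Set expectedMutationInterval to the mean gap between logged changes."""
    conn.execute("""
        UPDATE Cookie
        SET expectedMutationInterval = (
            SELECT (MAX(time) - MIN(time)) * 1.0 / NULLIF(COUNT(*) - 1, 0)
            FROM CookieLog
            WHERE CookieLog.id = Cookie.id
              AND CookieLog.param = Cookie.param
        )
    """)


def cookies_to_push(conn, user_id, threshold_time):
    """Cookies of one user that change more often than the threshold."""
    return conn.execute(
        "SELECT param, value FROM Cookie "
        "WHERE expectedMutationInterval < ? AND id = ? ORDER BY param",
        (threshold_time, user_id),
    ).fetchall()


recharacterize(conn)
hourly = cookies_to_push(conn, 7, 3600)
```

With a one-hour threshold only the cart qualifies for pushing to the local dbs; cookies with fewer than two logged changes keep a NULL interval and are never selected.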

Then, after removing user info cookies from the browser, if you still find your session cookies overflowing, you decide which session cookies to push to the browser and which stay on the local server. You use the same master-local db analysis technique, but now to decide between keeping a cookie on the local db and pushing it to the browser. You leave your least frequently accessed session cookies on the local server, either as session attributes or in an in-memory db. When a client finds that a cookie is missing, it makes a request to the server for the cookie, sacrificing some least recently or least frequently used cookie space on the browser to make room for that fresh cookie.
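
The browser-side behaviour described here is essentially a size-bounded cache with least-recently-used eviction. A minimal sketch, written in Python rather than browser code to keep it short: the `BrowserCookieJar` class, the byte budget, and the `fetch_from_server` callback are hypothetical names, and size is approximated as name length plus value length.

```python
from collections import OrderedDict


class BrowserCookieJar:
    """Keeps recently used session cookies within a byte budget."""

    def __init__(self, fetch_from_server, max_bytes):
        self._fetch = fetch_from_server  # called when a cookie is missing locally
        self._max_bytes = max_bytes
        self._cookies = OrderedDict()    # oldest access first

    def _size(self):
        return sum(len(name) + len(value) for name, value in self._cookies.items())

    def get(self, name):
        if name in self._cookies:
            self._cookies.move_to_end(name)  # mark as most recently used
            return self._cookies[name]
        value = self._fetch(name)            # cookie missing: ask the server
        self._cookies[name] = value
        # Sacrifice least recently used cookies to make room for the new one.
        while self._size() > self._max_bytes and len(self._cookies) > 1:
            self._cookies.popitem(last=False)
        return value

    def names(self):
        return list(self._cookies)


# A dict stands in for the session store on the local server.
server_side = {"a": "1111", "b": "2222", "c": "3333"}
jar = BrowserCookieJar(server_side.__getitem__, max_bytes=10)

jar.get("a")
jar.get("b")
jar.get("a")   # "a" becomes the most recently used
jar.get("c")   # over budget, so the least recently used cookie "b" is dropped
```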

Since these are session cookies, they need not be propagated to other locations, because if the same userid is logged on 3000 miles away, it should have its own set of session cookies.

Characterizing browser cookies is somewhat ironic because, for AJAX apps, the client accesses the cookies without letting the server know. Letting the server know might defeat the purpose of placing the cookies in the browser in the first place. So you would have to choose idle times to send the record of cookie accesses to the server for logging, purely for characterization purposes.

This level of granularity works well for cookies that are short (parameter name plus parameter value), whether they are session-based or user-based.

Therefore, if your parameter names and values of cookie fields are long, you might seek to quantize them.
However, quantization is a little more complex. Browser cookies have a lot of commonality. Just like any quantization/compression method, you look for the clusters of commonalities and assign each commonality block a signature. Then the cookies are stored in terms of the quantized signature.

How do you facilitate quantization of browser-based cookies? Using GWT as an example, use the Dictionary or Map class.

e.g., the cookie "%1"="^$Kdm3i" might translate to LastConnectedFriend=MohammadAli@jinnah.

You should not need to perform characterization for this. For example, why store your cookie as "LastConnectedFriend" when you could map it to "%1"? When a user logs in, why not map the most frequently accessed friends, etc., and place that map on the GWT/AJAX launching page? That way you could shorten your session cookies.
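
A small sketch of that mapping, again in Python for brevity rather than GWT: two lookup tables (one for names, one for values) translate between the long form and the short signatures. The table contents reuse the example above; the function names and the idea of shipping the tables with the launch page are assumptions for illustration.

```python
def invert(codebook):
    """Swap keys and values so the same table can decode."""
    return {short: long for long, short in codebook.items()}


def translate(cookies, names, values):
    """Replace every known name and value; leave unknown ones untouched."""
    return {names.get(k, k): values.get(v, v) for k, v in cookies.items()}


# Hypothetical codebooks that would be delivered with the launching page.
NAME_CODES = {"LastConnectedFriend": "%1"}
VALUE_CODES = {"MohammadAli@jinnah": "^$Kdm3i"}

full = {"LastConnectedFriend": "MohammadAli@jinnah", "theme": "dark"}

# What actually gets stored in the browser cookie.
compact = translate(full, NAME_CODES, VALUE_CODES)

# What the application sees after decoding.
restored = translate(compact, invert(NAME_CODES), invert(VALUE_CODES))
```

Entries without a signature (here `theme`) pass through unchanged, so the codebooks can be introduced gradually.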

So, is your company looking for a statistical programmer? Disclaimer: this was written off the cuff and might need some factual correction.
