Huge Django session table: normal behaviour or a bug?

Posted on 2024-10-08 02:58:01


Perhaps this is completely normal behaviour, but I feel like the django_session table is much larger than it should have to be.

First of all, I run the following cleanup command daily so the size is not caused by expired sessions:

DELETE FROM %s WHERE expire_date < NOW()
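To make the effect of that daily statement concrete, here is a minimal sketch against an in-memory SQLite database. The three columns mirror django_session, but the sample rows and the purge_expired helper are invented for this example; a real deployment would run the statement against its own database.

```python
import sqlite3
from datetime import datetime, timedelta


def purge_expired(conn, now):
    """Delete rows whose expire_date is in the past; return how many were removed."""
    cur = conn.execute(
        "DELETE FROM django_session WHERE expire_date < ?",
        (now.isoformat(sep=" "),),
    )
    conn.commit()
    return cur.rowcount


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE django_session ("
    "session_key TEXT PRIMARY KEY, session_data TEXT, expire_date TEXT)"
)

# Hypothetical sample data: one expired session, one still valid.
now = datetime(2024, 10, 8, 3, 0, 0)
rows = [
    ("expired-key", "", (now - timedelta(days=1)).isoformat(sep=" ")),
    ("live-key", "", (now + timedelta(days=13)).isoformat(sep=" ")),
]
conn.executemany("INSERT INTO django_session VALUES (?, ?, ?)", rows)

removed = purge_expired(conn, now)
remaining = [r[0] for r in conn.execute("SELECT session_key FROM django_session")]
print(removed, remaining)
```

Only the expired row goes away, so a table that stays large after this cleanup is full of sessions that have not yet expired.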

The numbers:

  • We've got about 5000 unique visitors (bots excluded) every day.
  • The SESSION_COOKIE_AGE is set to the default, 2 weeks
  • The table has a little over 1,000,000 rows
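
A quick back-of-the-envelope check of those numbers, as a sketch. It assumes the worst case in which every daily visitor is new and receives exactly one session that lives for the full cookie age; the variable names are just for illustration.

```python
SESSION_COOKIE_AGE_DAYS = 14      # Django default of 2 weeks
daily_unique_visitors = 5000      # figure from the question
observed_rows = 1_000_000         # "a little over 1,000,000"

# Worst case: every visitor is new each day and gets exactly one session.
expected_max_rows = daily_unique_visitors * SESSION_COOKIE_AGE_DAYS
excess_factor = observed_rows / expected_max_rows
print(expected_max_rows, round(excess_factor, 1))
```

Even under that generous assumption the table should hold roughly 70,000 rows, so the observed size is more than ten times what human visitors alone would explain.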

So, I'm guessing that Django also generates session keys for all bots that visit the site, and since the bots don't store cookies, it keeps generating new ones.

But... is this normal behaviour? Is there a setting so Django won't generate sessions for anonymous users, or at least no sessions for users that aren't using sessions?


Comments (4)

漆黑的白昼 2024-10-15 02:58:02


Is it possible for robots to access any page where you set anything in a user session (even for anonymous users), or any page where you use session.set_test_cookie() (for example, Django's default login view calls this method)? In both of these cases a new session object is created. Excluding such URLs in robots.txt should help.
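
One way to sanity-check such an exclusion is Python's standard urllib.robotparser, sketched below. The /accounts/login/ path and the example.com host are assumptions for illustration; substitute whichever URLs touch the session on your site. Note that this only affects crawlers that honour robots.txt.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that keeps crawlers away from the login view.
ROBOTS_TXT = """\
User-agent: *
Disallow: /accounts/login/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

login_allowed = parser.can_fetch("*", "https://example.com/accounts/login/")
home_allowed = parser.can_fetch("*", "https://example.com/")
print(login_allowed, home_allowed)
```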

月隐月明月朦胧 2024-10-15 02:58:02


In my case, I had wrongly set SESSION_SAVE_EVERY_REQUEST = True in settings.py without understanding exactly what it means.

As a result, every request to my Django service generated a session entry, including the heartbeat test requests from upstream load balancers. After several days of running, the django_session table had grown huge.
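
A rough model of why that setting hurts, as a sketch only. should_save is a simplification of the session middleware's decision (save when the session was modified or when SESSION_SAVE_EVERY_REQUEST is on), not Django's actual code, and the heartbeat interval is a made-up figure. The exact behaviour also depends on the Django version and on whether anything is stored in the session.

```python
def should_save(modified, save_every_request):
    """Simplified model of the session middleware's save decision."""
    return modified or save_every_request


def rows_created(requests, save_every_request):
    """Count session rows written for cookieless requests that never touch the session."""
    return sum(
        1
        for _ in range(requests)
        if should_save(modified=False, save_every_request=save_every_request)
    )


# Hypothetical load: one health check every 10 seconds for a day.
heartbeats_per_day = 24 * 60 * 60 // 10
rows_with_setting = rows_created(heartbeats_per_day, save_every_request=True)
rows_without_setting = rows_created(heartbeats_per_day, save_every_request=False)
print(rows_with_setting, rows_without_setting)
```

Because a health check never sends the session cookie back, each saved session in this model is a brand-new row.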

聽兲甴掵 2024-10-15 02:58:02


Django offers a management command to clean up these expired sessions: clearsessions (named cleanup in older releases).

思念绕指尖 2024-10-15 02:58:01


After a bit of debugging I've managed to trace the cause of the problem.
One of my middlewares (and most of my views) calls request.user.is_authenticated().

The django.contrib.auth middleware sets request.user to LazyUser()

Source: http://code.djangoproject.com/browser/django/trunk/django/contrib/auth/middleware.py?rev=14919#L13 (I don't see why there is a return None there, but ok...)

class AuthenticationMiddleware(object):
    def process_request(self, request):
        assert hasattr(request, 'session'), "The Django authentication middleware requires session middleware to be installed. Edit your MIDDLEWARE_CLASSES setting to insert 'django.contrib.sessions.middleware.SessionMiddleware'."
        request.__class__.user = LazyUser()
        return None

The LazyUser calls get_user(request) to get the user:

Source: http://code.djangoproject.com/browser/django/trunk/django/contrib/auth/middleware.py?rev=14919#L5

class LazyUser(object):
    def __get__(self, request, obj_type=None):
        if not hasattr(request, '_cached_user'):
            from django.contrib.auth import get_user
            request._cached_user = get_user(request)
        return request._cached_user
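
The pattern above is an ordinary Python descriptor that caches its result on the instance. Here is a standalone sketch of the same idea without Django; LazyValue, FakeRequest and load_user are illustrative names, not part of Django.

```python
class LazyValue(object):
    """Descriptor that computes a value on first access and caches it on the instance."""

    def __init__(self, loader):
        self.loader = loader

    def __get__(self, instance, owner=None):
        if instance is None:
            return self
        if not hasattr(instance, "_cached_value"):
            instance._cached_value = self.loader(instance)
        return instance._cached_value


calls = []


def load_user(request):
    """Stand-in for get_user(request); records each call so we can count them."""
    calls.append(request)
    return "AnonymousUser"


class FakeRequest(object):
    pass


# Attached to the class, just as the middleware does with request.__class__.user.
FakeRequest.user = LazyValue(load_user)

request = FakeRequest()
first = request.user
second = request.user
print(first, len(calls))
```

The loader runs only on the first attribute access, so the session is not touched at all until some code actually reads request.user.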

The get_user(request) function reads user_id = request.session[SESSION_KEY]:

Source: http://code.djangoproject.com/browser/django/trunk/django/contrib/auth/__init__.py?rev=14919#L100

def get_user(request):
    from django.contrib.auth.models import AnonymousUser
    try:
        user_id = request.session[SESSION_KEY]
        backend_path = request.session[BACKEND_SESSION_KEY]
        backend = load_backend(backend_path)
        user = backend.get_user(user_id) or AnonymousUser()
    except KeyError:
        user = AnonymousUser()
    return user

Accessing the session sets accessed to True:

Source: http://code.djangoproject.com/browser/django/trunk/django/contrib/sessions/backends/base.py?rev=14919#L183

def _get_session(self, no_load=False):
    """
    Lazily loads session from storage (unless "no_load" is True, when only
    an empty dict is stored) and stores it in the current instance.
    """
    self.accessed = True
    try:
        return self._session_cache
    except AttributeError:
        if self._session_key is None or no_load:
            self._session_cache = {}
        else:
            self._session_cache = self.load()
    return self._session_cache

And that causes the session to initialize. The actual bug was in a faulty session backend, which also generates a session whenever accessed is set to true...
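
To illustrate the difference between the two flags, here is a toy model. ToySession and rows_written are invented for this example and are not Django classes; the save_on_access policy stands in for the faulty backend described above.

```python
class ToySession(object):
    """Toy stand-in for a session object that tracks accessed/modified flags."""

    def __init__(self):
        self._data = {}
        self.accessed = False
        self.modified = False

    def __getitem__(self, key):
        self.accessed = True          # reading counts as access only
        return self._data[key]

    def __setitem__(self, key, value):
        self.accessed = True
        self.modified = True          # writing counts as modification
        self._data[key] = value


def rows_written(session, save_on_access):
    """Return 1 if a store with the given policy would persist this session, else 0."""
    if session.modified or (save_on_access and session.accessed):
        return 1
    return 0


session = ToySession()
try:
    session["_auth_user_id"]   # the kind of read get_user() does for an anonymous visitor
except KeyError:
    pass

well_behaved = rows_written(session, save_on_access=False)
faulty = rows_written(session, save_on_access=True)
print(session.accessed, session.modified, well_behaved, faulty)
```

A merely-read session is accessed but not modified, so a well-behaved store writes nothing, while a store that persists on access creates one row per anonymous request.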
