VBScript - Storing SHA1 as a numeric or binary value in SQL Server

Posted 2024-11-03 03:00:19

I'm currently storing my SHA1 value in SQL Server as char(40). I'm under the impression that I could possibly speed up my lookups by changing this field to a numeric value. However, I'm uncertain what field/data type to use to store it in SQL Server and how to convert it in VBScript. Should I use number or decimal, and how many digits do I need?

I have read somewhere that using Binary(20) is recommended. However, working with Binary values in VBScript doesn't seem to be too easy so I'm assuming that I'll be better off using a numeric value instead.

Currently this is my SHA1 function. I store the string value it returns in my char(40) field in the database and perform my lookups using the second bit of code below.

Private Function SHA1(s)
    Dim asc, enc, bytes, outstr, pos
    Set asc = CreateObject("System.Text.UTF8Encoding")
    Set enc = CreateObject("System.Security.Cryptography.SHA1CryptoServiceProvider")
    'Convert the string to a byte array and hash it
    bytes = asc.GetBytes_4(s) 'This is how you use .Net overloaded methods in VBScript
    bytes = enc.ComputeHash_2((bytes))
    outstr = ""
    'Convert the byte array to a hex string
    For pos = 1 To Lenb(bytes)
        outstr = outstr & LCase(Right("0" & Hex(Ascb(Midb(bytes, pos, 1))), 2))
    Next
    SHA1 = outstr
    Set asc = Nothing
    Set enc = Nothing
End Function

Here's my lookup function. It operates quite quickly already, but I'm looking for any way to optimize my code. If I do use binary to store the data, I'm going to have to use it when I look it up too. I suppose I could use stored procedures, which would allow me to use SQL Server functions to convert back and forth. Maybe that would be a better route. Please advise.

Function GetHTTPRefererIDBySHA1(s)
    Dim r
    Set r = Server.CreateObject("ADODB.Recordset")      
    r.open "SELECT httprefererid FROM httpreferer " & _
            "WHERE sha1 = '" & s & "'", con, adOpenForwardOnly, adLockReadOnly
    If Not (r.eof and r.bof) then
        GetHTTPRefererIDBySHA1 = r("httprefererid")
    End If
    r.close
    set r = nothing
End Function

Edit:
Thanks to ScottE and Google I was able to speed up my queries noticeably. Here's a little information on my solution.
1) I created a field called SHA1Bin. It's a field of type binary(20).
2) When I insert a new record I use a stored procedure. Because I'm not overly concerned about space, I save the raw httpreferer value and the SHA1 binary value of it in the same table and same row. My stored procedure converts the raw value to SHA1 binary using the HashBytes function (SQL Server 2008).
3) My SHA1 function in VBScript remains the same as above but I now use it when I do lookups. Here's a modified version of the GetReferer function:

Function GetHTTPRefererIDBySHA1(s)
    Dim r
    Set r = Server.CreateObject("ADODB.Recordset")      
    r.open "SELECT httprefererid FROM httpreferer WHERE " & _
            "sha1bin = CONVERT(binary(20), 0x" & SHA1(s) & ")", _
            tcon, adOpenForwardOnly, adLockReadOnly

    If Not (r.eof and r.bof) then
        GetHTTPRefererIDBySHA1 = r("httprefererid")
    Else
        '//Insert new record code intentionally omitted
    End If
    r.close
    set r = nothing
End Function
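
One thing worth checking in step 2 above: the VBScript function hashes the UTF-8 bytes of the string, while HashBytes hashes whatever bytes SQL Server holds for its argument. As far as I know that means the column's code page for varchar and UTF-16LE for nvarchar, so the two sides only agree when the byte sequences agree. The sketch below is a language-neutral illustration in Python, not the poster's code; the sample URLs are hypothetical, and cp1252 is only an assumed stand-in for a typical varchar code page.

```python
import hashlib


def sha1_hex(data: bytes) -> str:
    """Return the SHA1 digest of raw bytes as a lowercase hex string."""
    return hashlib.sha1(data).hexdigest()


# Hypothetical sample values, not taken from the thread.
ascii_url = "http://example.com/?q=cafe"
accented_url = "http://example.com/?q=caf\u00e9"

# What the VBScript side hashes: UTF-8 bytes.
vbscript_side = sha1_hex(ascii_url.encode("utf-8"))

# Assumed stand-ins for what HashBytes would see:
varchar_side = sha1_hex(ascii_url.encode("cp1252"))      # varchar, Latin code page
nvarchar_side = sha1_hex(ascii_url.encode("utf-16-le"))  # nvarchar

print(vbscript_side == varchar_side)    # pure ASCII: same bytes, same hash
print(vbscript_side == nvarchar_side)   # UTF-16LE bytes differ, so the hash differs

# With a non-ASCII character even the varchar stand-in diverges from UTF-8.
print(sha1_hex(accented_url.encode("utf-8")) == sha1_hex(accented_url.encode("cp1252")))
```

If the stored procedure receives an nvarchar parameter, or the referers can contain non-ASCII characters, a lookup hash computed in VBScript may never match the stored one, so it seems safest to compute the hash on one side only or to confirm that both sides see identical bytes.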

Comments (1)

柒夜笙歌凉 2024-11-10 03:00:19

I think that you're relatively on the right track; however, there are a couple of things that you can do to make this a tad faster.

SHA1 Background

Wherever you read that SHA1 should use binary(20) was pretty much dead on. A SHA1 hash is a 160-bit (20-byte) value that we usually handle in its raw format, as you already know, since your function converts that raw binary into a hex string.
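
To make the sizes concrete, here is a small Python sketch (a language-neutral illustration rather than the VBScript from the question; the sample referer is hypothetical). It shows that the raw digest is 20 bytes, that the hex form is 40 characters, and that the two forms carry the same information.

```python
import hashlib

referer = "http://example.com/page"  # hypothetical sample value

digest = hashlib.sha1(referer.encode("utf-8")).digest()  # raw bytes, what BINARY(20) would hold
hex_text = digest.hex()                                  # hex string, what CHAR(40) would hold

print(len(digest))    # size of the raw digest in bytes
print(len(hex_text))  # size of the hex form in characters

# Converting the hex string back yields the original bytes.
round_trip = bytes.fromhex(hex_text)
print(round_trip == digest)
```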

Converting to NUMERIC

So regardless, 20 bytes is 20 bytes. You can't convert it to something else to make it perform faster for the database. Trying to convert it to a numeric will be unsuccessful: you will get an arithmetic overflow error, because numeric tops out at 38 digits of precision (17 bytes of storage) and a 160-bit value can need 49 decimal digits.
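
The overflow can be checked with plain integer arithmetic. This Python sketch assumes SQL Server's documented maximum numeric precision of 38 digits and compares it with the largest possible 160-bit value; the variable names are mine.

```python
MAX_NUMERIC_PRECISION = 38  # assumed: maximum precision of numeric/decimal in SQL Server

largest_sha1 = 2**160 - 1                           # every bit of a 160-bit hash set
largest_numeric = 10**MAX_NUMERIC_PRECISION - 1     # largest integer a numeric(38,0) can hold

digits_needed = len(str(largest_sha1))

print(digits_needed)                   # decimal digits needed for the largest hash
print(largest_sha1 > largest_numeric)  # True means the value cannot fit in numeric(38,0)
```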

How to Make it Better

You have half the battle done. You can keep the data as a character data type if it is easier to work with in VBScript. Alternatively, you could store it as BINARY(20); this is the approach I take for my data warehouse projects. If you keep it as a hex string, it has to stay CHAR(40): each byte becomes two hex characters, so a CHAR(20) would truncate the value. That is the real cost of the string form, since every row and every index entry carries 40 bytes where BINARY(20) carries 20. The one "gotcha" with binary is that a binary literal needs a "0x" prefix when you construct your SQL statement. The prefix is technically not part of the value, so rather than storing it (which would mean CHAR(42)), just do the concatenation where necessary. By halving the width of the column, SQL Server fits more rows on each page and performs fewer reads to get at your data, which will speed things up. VARCHAR would not help here: every hash has the same length, so there is no trailing whitespace to trim, and the length prefix adds 2 bytes per value.

Aside from that, make sure the column is indexed. If you've not done so already, create an index on your SHA1 column and include httprefererid in the index. Your query will then use only the index to do your select, which is as fast as it can be, because only the necessary data elements are read. This is called a covering index (because it covers your filter plus selected columns). That index would look something like:

create index ix_httpreferer_sha1 on dbo.httpreferer (sha1) include (httprefererid);
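
The covering-index idea can be tried without a SQL Server instance. The sketch below swaps in SQLite through Python's sqlite3 module, so it is an analogy and not the statement above: SQLite has no INCLUDE clause, so httprefererid is added as a second key column instead, and the column name sha1bin, the index name, and the sample URL are assumptions for illustration. It also binds the hash as a parameter instead of concatenating it into the SQL text.

```python
import hashlib
import sqlite3


def sha1_bytes(text: str) -> bytes:
    """Return the raw 20-byte SHA1 digest of the UTF-8 encoded text."""
    return hashlib.sha1(text.encode("utf-8")).digest()


con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE httpreferer (
        httprefererid INTEGER NOT NULL,
        referer       TEXT    NOT NULL,
        sha1bin       BLOB    NOT NULL
    );
    -- Composite index standing in for SQL Server's INCLUDE (httprefererid).
    CREATE INDEX ix_httpreferer_sha1bin ON httpreferer (sha1bin, httprefererid);
""")

url = "http://example.com/"  # hypothetical sample value
con.execute(
    "INSERT INTO httpreferer (httprefererid, referer, sha1bin) VALUES (?, ?, ?)",
    (1, url, sha1_bytes(url)),
)

query = "SELECT httprefererid FROM httpreferer WHERE sha1bin = ?"

# The last column of each EXPLAIN QUERY PLAN row is the human-readable detail.
plan = con.execute("EXPLAIN QUERY PLAN " + query, (sha1_bytes(url),)).fetchall()
plan_text = " ".join(row[-1] for row in plan)
found = con.execute(query, (sha1_bytes(url),)).fetchone()

print(plan_text)  # should mention a covering index if the table itself is never read
print(found)
```

On SQL Server the equivalent check would be to look at the actual execution plan and confirm an index seek with no key lookup.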

Hope that helps!
