PreparedStatement IN 子句替代方案?
将 SQL IN
子句与 java.sql.PreparedStatement
实例结合使用的最佳解决方法是什么,由于 SQL 注入攻击安全问题,不支持多个值: 一?
占位符表示一个值,而不是值列表。
考虑以下 SQL 语句:
SELECT my_column FROM my_table where search_column IN (?)
使用 preparedStatement.setString( 1, "'A', 'B', 'C'" );
本质上是一种无效的尝试,旨在解决使用的原因?
首先。
有哪些可用的解决方法?
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(30)
SetArray 是最好的解决方案,但它不适用于许多旧驱动程序。在 java8 中可以使用以下解决方法
该解决方案比其他丑陋的 while 循环解决方案更好,其中查询字符串是通过手动迭代构建的
SetArray is the best solution but its not available for many older drivers. The following workaround can be used in java8
This solution is better than other ugly while loop solutions where the query string is built by manual iterations
我刚刚为此制定了一个特定于 PostgreSQL 的选项。它有点像 hack,有自己的优点、缺点和限制,但它似乎有效,并且不限于特定的开发语言、平台或 PG 驱动程序。
当然,技巧是找到一种方法将任意长度的值集合作为单个参数传递,并使数据库将其识别为多个值。我正在工作的解决方案是从集合中的值构造一个分隔字符串,将该字符串作为单个参数传递,并使用 string_to_array() 和 PostgreSQL 所需的转换来正确使用它。
因此,如果您想搜索“foo”、“blah”和“abc”,您可以将它们连接成一个字符串:“foo,blah,abc”。这是直接的 SQL:
显然,您可以将显式转换更改为您希望结果值数组成为的任何内容——int、text、uuid 等。并且因为该函数采用单个字符串值(或者我认为是两个字符串值,如果您也想自定义分隔符),您可以将其作为准备好的语句中的参数传递:
这甚至足够灵活,可以支持 LIKE 比较之类的内容:
同样,毫无疑问这是一个 hack,但它可以工作并允许您仍然使用预编译准备好的语句采用*ahem*离散参数,并具有随之而来的安全性和(可能)性能优势。它是可取的并且实际上有效吗?当然,这取决于您在查询运行之前就已经进行了字符串解析和可能的转换。如果您希望发送三个、五个或几十个值,当然,这可能没问题。几千?是的,也许没那么多。 YMMV,限制和排除适用,没有明示或暗示的保证。
但它有效。
I just worked out a PostgreSQL-specific option for this. It's a bit of a hack, and comes with its own pros and cons and limitations, but it seems to work and isn't limited to a specific development language, platform, or PG driver.
The trick of course is to find a way to pass an arbitrary length collection of values as a single parameter, and have the db recognize it as multiple values. The solution I have working is to construct a delimited string from the values in the collection, pass that string as a single parameter, and use string_to_array() with the requisite casting for PostgreSQL to properly make use of it.
So if you want to search for "foo", "blah", and "abc", you might concatenate them together into a single string as: 'foo,blah,abc'. Here's the straight SQL:
You would obviously change the explicit cast to whatever you wanted your resulting value array to be -- int, text, uuid, etc. And because the function is taking a single string value (or two I suppose, if you want to customize the delimiter as well), you can pass it as a parameter in a prepared statement:
This is even flexible enough to support things like LIKE comparisons:
Again, no question it's a hack, but it works and allows you to still use pre-compiled prepared statements that take *ahem* discrete parameters, with the accompanying security and (maybe) performance benefits. Is it advisable and actually performant? Naturally, it depends, as you've got string parsing and possibly casting going on before your query even runs. If you're expecting to send three, five, a few dozen values, sure, it's probably fine. A few thousand? Yeah, maybe not so much. YMMV, limitations and exclusions apply, no warranty express or implied.
But it works.
似乎没有其他人建议使用现成的查询生成器,例如 jOOQ 或 QueryDSL 甚至 管理 ="nofollow noreferrer">开箱即用的动态
IN
列表,可能包括对可能出现的所有边缘情况的管理,例如:IN
列表(无论绑定值的数量)(免责声明:我在 jOOQ 背后的公司工作)
No one else seems to have suggested using an off-the-shelf query builder yet, like jOOQ or QueryDSL or even Criteria Query that manage dynamic
IN
lists out of the box, possibly including the management of all edge cases that may arise, such as:IN
list (irrespective of the number of bind values)IN
list padding)(Disclaimer: I work for the company behind jOOQ)
只是为了完整性:只要值集不是太大,您也可以简单地字符串构造一个语句,
然后您可以将其传递给prepare(),然后在中使用setXXX()设置所有值的循环。这看起来很恶心,但许多“大型”商业系统通常会执行此类操作,直到达到特定于数据库的限制,例如 Oracle 中的语句为 32 KB(我认为是)。
当然,您需要确保该集合永远不会过大,或者在出现过大的情况下进行错误捕获。
Just for completeness: So long as the set of values is not too large, you could also simply string-construct a statement like
which you could then pass to prepare(), and then use setXXX() in a loop to set all the values. This looks yucky, but many "big" commercial systems routinely do this kind of thing until they hit DB-specific limits, such as 32 KB (I think it is) for statements in Oracle.
Of course you need to ensure that the set will never be unreasonably large, or do error trapping in the event that it is.
遵循亚当的想法。让您准备好的语句类似于 select my_column from my_table where search_column in (#)
创建一个字符串x并用多个“?,?,?”填充它取决于您的值列表
然后只需更改查询中的 # 即可填充新字符串 x
Following Adam's idea. Make your prepared statement sort of select my_column from my_table where search_column in (#)
Create a String x and fill it with a number of "?,?,?" depending on your list of values
Then just change the # in the query for your new String x an populate
我们可以对PreparedStatement 中的IN 子句使用不同的替代方法。
在PreparedStatement查询中使用NULL - 最佳性能,当您知道 IN 子句参数的限制时效果很好。如果没有限制,则可以批量执行查询。
示例代码片段是;
您可以在此处查看有关这些替代方法的更多详细信息。
There are different alternative approaches that we can use for IN clause in PreparedStatement.
Use NULL in PreparedStatement queries - Optimal performance, works great when you know the limit of IN clause arguments. If there is no limit, then you can execute queries in batch.
Sample code snippet is;
You can check more details about these alternative approaches here.
对于某些情况,正则表达式可能会有所帮助。
这是我在 Oracle 上检查过的一个示例,它有效。
但它有许多缺点:
For some situations regexp might help.
Here is an example I've checked on Oracle, and it works.
But there is a number of drawbacks with it:
在检查了不同论坛中的各种解决方案并且没有找到好的解决方案之后,我觉得我想出的以下 hack 是最容易遵循和编码的:
示例:假设您有多个参数要在“IN”子句中传递。只需在“IN”子句中放置一个虚拟字符串,例如“PARAM”确实表示将代替该虚拟字符串的参数列表。
您可以将所有参数收集到 Java 代码中的单个字符串变量中。这可以按如下方式完成:
您可以将所有用逗号分隔的参数附加到单个字符串变量“param1”中,在我们的例子中。
将所有参数收集到单个字符串中后,您可以将查询中的虚拟文本(即本例中的“PARAM”)替换为参数字符串(即 param1)。以下是您需要执行的操作:
您现在可以使用executeQuery() 方法执行查询。只需确保您的查询中任何地方都没有“PARAM”一词。您可以使用特殊字符和字母的组合来代替单词“PARAM”,以确保查询中不可能出现此类单词。希望你能找到解决方案。
注意:虽然这不是一个准备好的查询,但它完成了我希望我的代码完成的工作。
After examining various solutions in different forums and not finding a good solution, I feel the below hack I came up with, is the easiest to follow and code:
Example: Suppose you have multiple parameters to pass in the 'IN' clause. Just put a dummy String inside the 'IN' clause, say, "PARAM" do denote the list of parameters that will be coming in the place of this dummy String.
You can collect all the parameters into a single String variable in your Java code. This can be done as follows:
You can append all your parameters separated by commas into a single String variable, 'param1', in our case.
After collecting all the parameters into a single String you can just replace the dummy text in your query, i.e., "PARAM" in this case, with the parameter String, i.e., param1. Here is what you need to do:
You can now execute your query using the executeQuery() method. Just make sure that you don't have the word "PARAM" in your query anywhere. You can use a combination of special characters and alphabets instead of the word "PARAM" in order to make sure that there is no possibility of such a word coming in the query. Hope you got the solution.
Note: Though this is not a prepared query, it does the work that I wanted my code to do.
只是为了完整起见,并且因为我没有看到其他人提出这样的建议:
在实施上述任何复杂的建议之前,请考虑 SQL 注入是否确实是您的场景中的问题。
在许多情况下,提供给 IN (...) 的值是一个 id 列表,这些 ID 的生成方式可以确保不可能进行注入...(例如,之前 select some_id from some_table 的结果,其中some_condition。)
如果是这种情况,您可能只是连接此值,而不使用服务或准备好的语句,或者将它们用于此查询的其他参数。
Just for completeness and because I did not see anyone else suggest it:
Before implementing any of the complicated suggestions above consider if SQL injection is indeed a problem in your scenario.
In many cases the value provided to IN (...) is a list of ids that have been generated in a way that you can be sure that no injection is possible... (e.g. the results of a previous select some_id from some_table where some_condition.)
If that is the case you might just concatenate this value and not use the services or the prepared statement for it or use them for other parameters of this query.
PreparedStatement没有提供任何好的方法来处理SQL IN子句。每 http://www.javaranch.com/journal/200510/Journal200510.jsp #a2 “您不能替换本应成为 SQL 语句一部分的内容。这是必要的,因为如果SQL本身可以改变,驱动程序无法预编译语句,它还有一个很好的副作用,可以防止SQL注入攻击。”我最终使用了以下方法:
PreparedStatement doesn't provide any good way to deal with SQL IN clause. Per http://www.javaranch.com/journal/200510/Journal200510.jsp#a2 "You can't substitute things that are meant to become part of the SQL statement. This is necessary because if the SQL itself can change, the driver can't precompile the statement. It also has the nice side effect of preventing SQL injection attacks." I ended up using following approach:
好的,所以我记不清之前是如何(或在哪里)执行此操作的,因此我来到堆栈溢出以快速找到答案。我很惊讶我不能。
所以,很久以前我是如何解决 IN 问题的,是用这样的语句:
where myColumn in ( select regexp_substr(:myList,'[^,]+', 1, level) from Dual connect by regexp_substr (:myList, '[^,]+', 1, level) is not null)
将 myList 参数设置为逗号分隔的字符串:A,B,C,D...
注意:您必须设置参数两次!
OK, so I couldn't remember exactly how (or where) I did this before so I came to stack overflow to quickly find the answer. I was surprised I couldn't.
So, how I got around the IN problem a long time ago was with a statement like this:
where myColumn in ( select regexp_substr(:myList,'[^,]+', 1, level) from dual connect by regexp_substr(:myList, '[^,]+', 1, level) is not null)
set the myList parameter as a comma delimited string: A,B,C,D...
Note: You have to set the parameter twice!
这不是理想的做法,但它很简单并且在大多数情况下对我来说效果很好。
那你就穿越了? ID 的格式如下: |1|,|2|,|3|,...|
This is not the ideal practice, yet it's simple and works well for me most of the time.
Then you pass through ? the IDs in this way: |1|,|2|,|3|,...|
Jeanne Boyarsky 的 Batching Select statements in JDBC 条目。
建议的选项是:
SELECT my_column FROM my_table WHERE search_column = ? ;从 my_table 中选择 my_column WHERE search_column = ? ; ...
并执行它。 [或使用UNION ALL
代替这些分号。 --ed] 每个 size-of-IN-list 需要一个准备好的语句。愚蠢地慢,严格来说比WHERE search_column IN (?,?,?)
更糟糕,所以我不知道为什么博主甚至建议它。这些选项都不是理想的。
如果您使用 JDBC4 和支持
x = ANY(y)
的服务器,最好的选择是使用PreparedStatement.setArray
,如 鲍里斯的回答。不过,似乎没有任何方法可以使
setArray
与 IN 列表一起使用。有时,SQL 语句在运行时加载(例如,从属性文件),但需要数量可变的参数。在这种情况下,首先定义查询:
然后加载查询。然后在运行之前确定参数的数量。知道参数计数后,运行:
例如:
对于某些不支持通过 JDBC 4 规范传递数组的数据库,此方法可以帮助将慢速
= ?
转换为更快的IN (?)
子句条件,然后可以通过调用any
方法来扩展。An analysis of the various options available, and the pros and cons of each is available in Jeanne Boyarsky's Batching Select Statements in JDBC entry on JavaRanch Journal.
The suggested options are:
SELECT my_column FROM my_table WHERE search_column = ?
, execute it for each value and UNION the results client-side. Requires only one prepared statement. Slow and painful.SELECT my_column FROM my_table WHERE search_column IN (?,?,?)
and execute it. Requires one prepared statement per size-of-IN-list. Fast and obvious.SELECT my_column FROM my_table WHERE search_column = ? ; SELECT my_column FROM my_table WHERE search_column = ? ; ...
and execute it. [Or useUNION ALL
in place of those semicolons. --ed] Requires one prepared statement per size-of-IN-list. Stupidly slow, strictly worse thanWHERE search_column IN (?,?,?)
, so I don't know why the blogger even suggested it.SELECT my_column FROM my_table WHERE search_column IN (1,2,3,4,5,6,6,6,6,6)
. Any decent server will optimize out the duplicate values before running the query.None of these options are ideal.
The best option if you are using JDBC4 and a server that supports
x = ANY(y)
, is to usePreparedStatement.setArray
as described in Boris's anwser.There doesn't seem to be any way to make
setArray
work with IN-lists, though.Sometimes SQL statements are loaded at runtime (e.g., from a properties file) but require a variable number of parameters. In such cases, first define the query:
Next, load the query. Then determine the number of parameters prior to running it. Once the parameter count is known, run:
For example:
For certain databases where passing an array via the JDBC 4 specification is unsupported, this method can facilitate transforming the slow
= ?
into the fasterIN (?)
clause condition, which can then be expanded by calling theany
method.PostgreSQL 的解决方案:
或
Solution for PostgreSQL:
or
没有简单的方法据我所知。
如果目标是保持较高的语句缓存比率(即不为每个参数计数创建一个语句),您可以执行以下操作:
创建一个包含几个(例如 10 个)参数的语句:
... WHERE A IN (?,?,?,?,?,?,?,?,?,?) ...
绑定所有实际参数
setString(1,“foo”);
setString(2,"bar");
将其余部分绑定为 NULL
setNull(3,Types.VARCHAR)
...
setNull(10,Types.VARCHAR)
NULL 永远不会匹配任何内容,因此它会被 SQL 计划生成器优化。
当您将 List 传递到 DAO 函数时,逻辑很容易自动化:
No simple way AFAIK.
If the target is to keep statement cache ratio high (i.e to not create a statement per every parameter count), you may do the following:
create a statement with a few (e.g. 10) parameters:
... WHERE A IN (?,?,?,?,?,?,?,?,?,?) ...
Bind all actuall parameters
setString(1,"foo");
setString(2,"bar");
Bind the rest as NULL
setNull(3,Types.VARCHAR)
...
setNull(10,Types.VARCHAR)
NULL never matches anything, so it gets optimized out by the SQL plan builder.
The logic is easy to automate when you pass a List into a DAO function:
您可以使用
Collections.nCopies
生成占位符集合,并使用String.join
将它们连接起来:You can use
Collections.nCopies
to generate a collection of placeholders and join them usingString.join
:一个令人不愉快但确实可行的解决方法是使用嵌套查询。创建一个临时表 MYVALUES,其中包含一列。将值列表插入 MYVALUES 表中。然后执行
Ugly,但如果您的值列表非常大,那么这是一个可行的替代方案。
此技术的另一个优点是优化器可能会提供更好的查询计划(检查页面中的多个值、仅表扫描一次而不是每个值一次等),如果您的数据库不缓存准备好的语句,则可能会节省开销。您的“插入”需要批量完成,并且可能需要调整 MYVALUES 表以具有最小的锁定或其他高开销保护。
An unpleasant work-around, but certainly feasible is to use a nested query. Create a temporary table MYVALUES with a column in it. Insert your list of values into the MYVALUES table. Then execute
Ugly, but a viable alternative if your list of values is very large.
This technique has the added advantage of potentially better query plans from the optimizer (check a page for multiple values, tablescan only once instead once per value, etc) may save on overhead if your database doesn't cache prepared statements. Your "INSERTS" would need to be done in batch and the MYVALUES table may need to be tweaked to have minimal locking or other high-overhead protections.
in() 运算符的局限性是万恶之源。
它适用于简单的情况,您可以通过“自动生成准备好的语句”来扩展它,但它总是有其局限性。
in() 方法对于某些情况来说足够好,但不是火箭般的
:)防火箭的解决方案是在单独的调用中传递任意数量的参数(例如,通过传递参数的 clob),然后使用视图(或任何其他方式)在 SQL 中表示它们并在您的 where 条件中使用。
暴力变体在这里 http://tkyte.blogspot.hu/ 2006/06/varying-in-lists.html
但是,如果您可以使用 PL/SQL,这种混乱会变得非常整洁。
然后,您可以在参数中传递任意数量的逗号分隔的客户 ID,并且:
这里的技巧是:
该视图如下所示:
其中 aux_in_list.getpayload 指原始输入字符串。
一种可能的方法是传递 pl/sql 数组(仅 Oracle 支持),但是您不能在纯 SQL 中使用这些数组,因此始终需要转换步骤。这种转换无法在 SQL 中完成,所以毕竟,传递一个包含字符串中所有参数的 clob 并在视图中转换它是最有效的解决方案。
Limitations of the in() operator is the root of all evil.
It works for trivial cases, and you can extend it with "automatic generation of the prepared statement" however it is always having its limits.
The in() approach can be good enough for some cases, but not rocket proof :)
The rocket-proof solution is to pass the arbitrary number of parameters in a separate call (by passing a clob of params, for example), and then have a view (or any other way) to represent them in SQL and use in your where criteria.
A brute-force variant is here http://tkyte.blogspot.hu/2006/06/varying-in-lists.html
However if you can use PL/SQL, this mess can become pretty neat.
Then you can pass arbitrary number of comma separated customer ids in the parameter, and:
The trick here is:
The view looks like:
where aux_in_list.getpayload refers to the original input string.
A possible approach would be to pass pl/sql arrays (supported by Oracle only), however you can't use those in pure SQL, therefore a conversion step is always needed. The conversion can not be done in SQL, so after all, passing a clob with all parameters in string and converting it witin a view is the most efficient solution.
这是我在自己的应用程序中解决该问题的方法。理想情况下,您应该使用 StringBuilder 而不是对字符串使用 +。
如果您决定稍后更改查询,那么使用上面的 x 这样的变量而不是具体的数字会很有帮助。
Here's how I solved it in my own application. Ideally, you should use a StringBuilder instead of using + for Strings.
Using a variable like x above instead of concrete numbers helps a lot if you decide to change the query at a later time.
我从未尝试过,但是 .setArray() 会做你正在寻找的事情吗?
更新:显然不是。 setArray 似乎仅适用于来自您从先前查询中检索的 ARRAY 列或具有 ARRAY 列的子查询的 java.sql.Array。
I've never tried it, but would .setArray() do what you're looking for?
Update: Evidently not. setArray only seems to work with a java.sql.Array that comes from an ARRAY column that you've retrieved from a previous query, or a subquery with an ARRAY column.
我的解决方法是:
现在您可以使用一个变量来获取表中的一些值:
因此,准备好的语句可以是:
问候,
哈维尔·伊巴内斯
My workaround is:
Now you can use one variable to obtain some values in a table:
So, the prepared statement could be:
Regards,
Javier Ibanez
我想您可以(使用基本的字符串操作)在
PreparedStatement
中生成查询字符串,以使许多?
与列表中的项目数量相匹配。当然,如果您这样做,那么您距离在查询中生成巨大的链式
OR
仅有一步之遥,但查询中没有正确数量的?
字符串,我不知道你还能如何解决这个问题。I suppose you could (using basic string manipulation) generate the query string in the
PreparedStatement
to have a number of?
's matching the number of items in your list.Of course if you're doing that you're just a step away from generating a giant chained
OR
in your query, but without having the right number of?
in the query string, I don't see how else you can work around this.Spring允许 传递java.util.Lists 到 NamedParameterJdbcTemplate ,它根据参数数量自动生成 (?, ?, ?, ..., ?)。
对于 Oracle,此博文讨论 oracle.sql.ARRAY 的使用(Connection.createArrayOf 不适用于 Oracle)。为此,您必须修改 SQL 语句:
oracle 表函数< /a> 将传递的数组转换为可在
IN
语句中使用的类似表格的值。Spring allows passing java.util.Lists to NamedParameterJdbcTemplate , which automates the generation of (?, ?, ?, ..., ?), as appropriate for the number of arguments.
For Oracle, this blog posting discusses the use of oracle.sql.ARRAY (Connection.createArrayOf doesn't work with Oracle). For this you have to modify your SQL statement:
The oracle table function transforms the passed array into a table like value usable in the
IN
statement.您可以使用 此 javadoc:
You could use setArray method as mentioned in this javadoc:
这是一个完整的 Java 解决方案,可以为您创建准备好的语句:
Here's a complete solution in Java to create the prepared statement for you:
尝试使用 instr 函数?
那么
不可否认,这是一个有点肮脏的黑客,但它确实减少了 sql 注入的机会。无论如何,在甲骨文中工作。
try using the instr function?
then
Admittedly this is a bit of a dirty hack, but it does reduce the opportunities for sql injection. Works in oracle anyway.
Sormula 通过允许您提供 java.util.Collection 对象作为参数来支持 SQL IN 运算符。它创建一个带有 ? 的准备好的语句。对于集合中的每个元素。请参阅 示例 4(示例中的 SQL 是一条注释,用于澄清 Sormula 创建但不使用的内容) 。
Sormula supports SQL IN operator by allowing you to supply a java.util.Collection object as a parameter. It creates a prepared statement with a ? for each of the elements the collection. See Example 4 (SQL in example is a comment to clarify what is created but is not used by Sormula).
在PreparedStatement 中生成查询字符串,使多个“?”与列表中的项目数相匹配。这是一个例子:
Generate the query string in the PreparedStatement to have a number of ?'s matching the number of items in your list. Here's an example:
不要使用
Sql 语句作为
或
使用存储过程,这将是最好的解决方案,因为 sql 语句将被编译并存储在数据库服务器中
instead of using
use the Sql Statement as
and
or use a stored procedure this would be the best solution, since the sql statements will be compiled and stored in DataBase server
我遇到了与准备好的语句相关的许多限制:
在建议的解决方案中,我会选择不会降低查询性能且查询次数较少的解决方案。这将是来自 @Don 链接的#4(批处理几个查询)或为不需要的“?”指定 NULL 值由 @Vladimir Dyuzhev 提议的标记
I came across a number of limitations related to prepared statement:
Among the proposed solutions I would choose the one that doesn't decrease the query performance and makes the less number of queries. This will be the #4 (batching few queries) from the @Don link or specifying NULL values for unneeded '?' marks as proposed by @Vladimir Dyuzhev