从查询中获取表架构
根据 MSDN,SqlDataReader.GetSchemaTable
返回所执行查询的列元数据。我想知道是否有类似的方法可以为给定的查询提供表元数据?我的意思是涉及哪些表以及它有哪些别名。
在我的应用程序中,我收到查询,并且需要以编程方式附加 where
子句。使用GetSchemaTable()
,我可以获得列元数据及其所属的表。但即使表有别名,它仍然返回真实的表名。有没有办法获取该表的别名?
以下代码显示获取列元数据。
const string connectionString = "your_connection_string";
string sql = "select c.id as s,c.firstname from contact as c";
using(SqlConnection connection = new SqlConnection(connectionString))
using(SqlCommand command = new SqlCommand(sql, connection))
{
connection.Open();
SqlDataReader reader = command.ExecuteReader(CommandBehavior.KeyInfo);
DataTable schema = reader.GetSchemaTable();
foreach (DataRow row in schema.Rows)
{
foreach (DataColumn column in schema.Columns)
{
Console.WriteLine(column.ColumnName + " = " + row[column]);
}
Console.WriteLine("----------------------------------------");
}
Console.Read();
}
这将正确地为我提供列的详细信息。但是,当我看到 Id
列的 BaseTableName
时,它给出的是 contact
而不是别名 c
。有什么方法可以从上面这样的查询中获取表架构和别名吗?
任何帮助都会很棒!
编辑
虽然我可以使用 Rob 建议的执行计划,但我会很感激任何替代的简单方法。
回答tomekszpakowicz的问题
您(或您的应用程序)是来源吗 有问题的查询?在这种情况下 你应该知道别名。
我不是查询的作者。我们有一个系统,用户可以在其中输入查询。我们使用我上面解释的方法从中构建列。这些详细信息将被保留,其他用户可以使用它,例如添加新条件等。因此,我们需要根据我们拥有的信息动态构建 SQL。因此,当列有别名并且我们没有获取别名时,构造的 where 子句将无效。
谢谢
As per MSDN, SqlDataReader.GetSchemaTable
returns column metadata for the query executed. I am wondering is there a similar method that will give table metadata for the given query? I mean what tables are involved and what aliases it has got.
In my application, I get the query and I need to append the where
clause programically. Using GetSchemaTable()
, I can get the column metadata and the table it belongs to. But even though table has aliases, it still return the real table name. Is there a way to get the aliase name for that table?
Following code shows getting the column metadata.
const string connectionString = "your_connection_string";
string sql = "select c.id as s,c.firstname from contact as c";
using(SqlConnection connection = new SqlConnection(connectionString))
using(SqlCommand command = new SqlCommand(sql, connection))
{
connection.Open();
SqlDataReader reader = command.ExecuteReader(CommandBehavior.KeyInfo);
DataTable schema = reader.GetSchemaTable();
foreach (DataRow row in schema.Rows)
{
foreach (DataColumn column in schema.Columns)
{
Console.WriteLine(column.ColumnName + " = " + row[column]);
}
Console.WriteLine("----------------------------------------");
}
Console.Read();
}
This will give me details of columns correctly. But when I see BaseTableName
for column Id
, it is giving contact
rather than the alias name c
. Is there any way to get the table schema and aliases from a query like the above?
Any help would be great!
Edit
While I could use the execution plan suggested by Rob, I'd appreciate any alternative simple approaches.
Answering questions by tomekszpakowicz
Are you (or your application) source
of the query in question? In that case
you should know the aliases.
I am not the author of queries. We have a system where users can enter the query. We build columns out of it using the method I explained above. These details will be persisted and another user can use this like adding new criteria etc. So we need to build the SQL dynamically from the information we have. So when a column is aliased and we are not getting alias name, then the where clause constructed will be invalid.
Thanks
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(6)
简短回答
这行不通。根据设计,您无法从结果模式中获取表别名。并且您不能依赖能够从查询执行计划中获取它们。
长答案
当您获得 SQL 查询的结果时,该查询已经被解析、验证、优化、编译成某种内部表示并执行。别名是查询“源代码”的一部分,通常会在步骤 1 和 2 附近丢失。
执行查询后,唯一可以视为表的内容是 a) 真实的物理表和 b) 视为单个匿名表的返回数据。两者之间的一切都可以转变或完全优化。
如果 DBMS 需要保留别名,那么优化复杂查询实际上是不可能的。
可能的解决方案
我建议重述一个问题:
您(或您的应用程序)是相关查询的来源吗?在这种情况下,您应该知道别名。
如果您收到其他人提供的查询...嗯...这取决于您为什么添加原因。
在最坏的情况下,您必须自己解析查询。
在最好的情况下,您可以让他们访问视图而不是真实的表,并将 where 子句放入视图中。
简单而丑陋的解决方案
如果我正确理解您的要求:
用户A在您的程序中输入查询。
用户 B 可以运行它(但无法编辑它)并查看返回的数据。
此外,她可以使用您提供的某种小部件基于返回的列添加过滤器。
您不想在应用程序内部应用过滤器,而是将它们添加到查询中,以避免从数据库中获取不必要的数据。
在这种情况下:
当 A 编辑查询时,尝试运行它并收集返回列的元数据。
如果 ColumnName 不唯一,请向作者投诉。
使用查询存储元数据。
当 B 添加过滤器(基于查询元数据)时,存储两个列名称
和条件。
执行时:
检查过滤列是否仍然有效(A 可能已更改查询)。
如果不删除无效过滤器和/或通知 B.
执行查询,如下所示:
<前><代码>选择*
from ({A 输入的查询}) x
其中 x.Column1 op1 Value1
和 x.Column2 op2 Value2
如果您想优雅地处理数据库架构更改,您需要添加一些额外的检查以确保元数据与查询真正返回的内容一致。
安全说明
您的程序将把用户 A 编写的查询直接传递到数据库。
至关重要的是,您使用数据库连接来执行此操作的权限不超过 A 的数据库权限。
否则,您将要求基于 SQL 注入的漏洞利用。
推论
如果出于安全原因,用户 A 无法直接访问数据库,则不能使用上述解决方案。
在这种情况下,确保其安全的唯一方法是确保您的应用程序理解 100% 的查询,这意味着在您的程序中解析查询并仅允许您认为安全的操作。
Short answer
This won't work. You cannot, by design, get table aliases from result schema. And you cannot rely on being able to get them from query execution plan.
Long answer
When you get result of a SQL query, the query has already been parsed, validated, optimized, compiled into some internal representation and executed. Aliases are part of query's "source code" and are usually lost somewhere around step 1 and 2.
After query is executed the only things that can be seen as tables are a) real physical tables and b) returned data seen as single anonymous table. Everything between can be transformed or completely optimized out.
If DBMSes were required to retain aliases it would be practically impossible to optimize complex queries.
Possible solutions
I suggest restating a problem:
Are you (or your application) source of the query in question? In that case you should know the aliases.
If you get queries provided by someone else... Well... That depends on why are you adding where causes.
In the worst case, you'll have to parse queries yourself.
In the best case, you could give them access to views instead of real tables and put where clauses in the views.
Simple and ugly solution
If I understand your requirements correctly:
User A enters query into your program.
User B can run it (but cannot edit it) and sees returned data.
Additionally she can add filters based on returned columns using some kind of widget provided by you.
You don't want to apply filter inside application but instead add them to the query, in order to avoid fetching unnecessary data from database.
In that case:
When A edits query try to run it and gather metadata for returned columns.
If
ColumnName
s are not unique, complain to the author.Store metadata with query.
When B adds filter (based on query metadata), store both columns names
and conditions.
At execution:
Check if filter columns are still valid (A might have changed query).
If not remove invalid filters and/or inform B.
Execute query as something like:
If you want to gracefully handle database schema changes you need to add some additional checks to make sure metadata is consistent with what query really returns.
Security note
Your program is going to pass a query written by user A straight to database.
It is crucial that you do it using database connection with permissions which do not exceed A's database permissions.
Otherwise you are asking for SQL injection based exploits.
Corollary
If user A doesn't have direct access to the database out of security reasons, you cannot use above solution.
In that case the only way to make it safe is to make sure your application understands 100% of the query which means parsing it in your program and allowing only operations you consider safe.
您可以获得查询的执行计划,然后分析返回的 XML。这类似于使用 Management Studio 中的“显示估计计划”选项。
You could get the execution plan for the query, and then analyse the XML that's returned. This is like using the "Show Estimated Plan" option in Management Studio.
这几乎就像您需要一个解析器来解析 SQL,然后从解析的查询中创建一个别名的符号表及其引用的表。然后将其与 GetSchemaTable() 的结果结合起来,以便您可以将列映射到适当的别名。
无论如何,请参阅问题 Parsing SQL code in C# 来了解一些解析器。我没有详细研究过它们,但也许其中之一就是您所需要的。如果您只执行 select 语句,请查看 ANTLR 链接和语法 http:// www.antlr.org/grammar/1062280680642/MS_SQL_SELECT.html。
如果您的查询很简单,您可能可以使用正则表达式或您自己的自定义语法从查询中解析出别名和表名。这可能是最简单的解决方案。
最可靠的解决方案可能是购买其他人的解析器来处理完整的 SQL 并将其分解为解析树或其他可以查询的东西。我不确定每一种的优点以及价格/稳健性比。但其中一些非常昂贵......我想说,如果您自己无法做到这一点,请探索 ANTLR 语法(因为它是免费的),假设您只需要 select 语句。否则你可能需要付费......
实际上假设你的用户不是疯狂的 SQL 天才并使用子查询/等。我不明白为什么你不能使用模式视图中的表名,你说你必须在查询中找到它们,然后找到别名作为表名别名或表名作为别名。这可能适用于许多情况......但对于完整的一般情况,你需要一个完整的解析器......
It's almost like you need a parser to parse the SQL and then from the parsed query make a symbol table of aliases and the tables they refer to. Then combine that with the results of GetSchemaTable() so that you can map columns to the appropriate alias.
Anyway see the question Parsing SQL code in C# for some parsers. I haven't looked at them in detail but maybe one of them is what you need. If you are only doing select statements check out the ANTLR link and the grammar for http://www.antlr.org/grammar/1062280680642/MS_SQL_SELECT.html.
If your queries are simple, you could probably use regular expressions or your own custom grammar to parse out the aliases and table names from the query. This would probably be the easiest solution.
The most robust solution is probably to pay for someone else's parser that handles full SQL and breaks it up into a parse tree or something else where you can query it. I'm not sure the merits of each one and the price/robustness ratio. But some of them are ultra expensive.... I would say if you can't do it yourself explore the ANTLR grammar (because it is free) assuming you just need select statements. Otherwise you may have to pay....
Actually assuming your users aren't crazy SQL geniuses and using subqueries/etc. I don't see why you can't use the table names from the schema view that you said you got to find them in the query and then find the alias as either tablename alias or tablename as alias. That might work for many of the cases.... But for the full general case you'd need a full parser.....
我认为 Rob Farley 的 showplan xml 适合您(假设您正在运行一个足够晚的具有此功能的 SQL Server)。
每列似乎都有
对于每个选定的列。假设每个表中至少有一列,那么在别名和表之间建立映射应该很简单。I think Rob Farley's showplan xml will work for you (assuming you are running a late enough SQL Server that has this feature).
Each column seems to have
<ColumnReference Server="" Database="" Schema="" Table="" Alias="" Column=""/>
for each of the selected columns. Assuming you have at least one column from each table, it should be trivial to make a mapping between alias and table.事实上,你可以。在这里查看我的答案: https://stackoverflow.com/a/19537164/88409
你需要做的是使用
set showplan_xml on
运行所有静态查询,解析返回的 XML,您找到的第一个
将是顶级输出列。只要在第一次引用查询中的表时为它们分配一个别名,这些别名就会传递到输出列。更进一步,我不得不推测此类别名无法被优化,因为引擎必须使用它们来区分同一个表中同一列的不同实例。
事实上,如果您运行这样的查询:
select * from Lessons, Lessons
,引擎基本上会告诉您以下消息:例如,如果您运行类似 'set showplan_xml on; 的内容;
select * from Lessons a, Lessons b, Lessons c, (select * from Lessons d) subquery_aliases_wont_stick_like_table_aliases`
您将得到如下输出:
Actually, you can. See my answer here: https://stackoverflow.com/a/19537164/88409
What you'll have to do is run all your static queries once with
set showplan_xml on
, parse the returned XML, and the very first<OutputList>
you find will be the top-level output columns. As long as you assign an alias to the tables in your queries when they are first referenced, those aliases will carry through to the output column.To go even further, I would have to surmise that such aliases CANNOT be optimized out, because the engine would have to use them to differentiate between different instances of the same column from the same table.
In fact, if you run a query like this:
select * from Lessons, Lessons
, the engine basically tells you as much with the message:For example, if you run something like 'set showplan_xml on;
select * from Lessons a, Lessons b, Lessons c, (select * from Lessons d) subquery_aliases_wont_stick_like_table_aliases`
You'll get output like this:
您可以使用SqlParser“Carbunql”获取架构名称、表名称和别名。
https://github.com/mk3008/Carbunql
结果
注意
物理表并不是唯一可以实现的东西在 FROM 子句中指定。还可以指定子查询。
在上面的示例代码中,如果指定了物理表名称以外的任何内容,则类型转换为“PhysicalTable”类将失败,因此实际使用时需要改进。
由于是解析SQL语句,如果select查询中省略了schema名,则无法获取schema名。
You can get schema name, table name and alias name by using SqlParser "Carbunql".
https://github.com/mk3008/Carbunql
Result
Attention
Physical tables are not the only things that can be specified in the FROM clause. A subquery may also be specified.
In the above sample code, if anything other than the physical table name is specified, the type conversion to the "PhysicalTable" class will fail, so improvement is required to actually use it.
Since the SQL statement is parsed, if the schema name is omitted in the select query, the schema name cannot be obtained.