如何通过 DBI 重用 WHERE 子句逻辑?
免责声明:我第一次使用 DBI。
我有一个 MySQL 表,其中包含许多索引字段(f1、f2、f3 等),这些字段用于通过长时间运行的进程生成 WHERE 子句,这些进程迭代数据库块,执行各种清理和测试操作。
该代码的当前版本的工作原理如下:
sub get_list_of_ids() {
my ($value1, $value2, $value3...) = @_;
my $stmt = 'SELECT * FROM files WHERE 1';
my @args;
if (defined($value1)) {
$stmt .= ' AND f1 = ?';
push(@args, $value1);
}
# Repeat for all the different fields and values
my $select_sth = $dbh->prepare($stmt) or die $dbh->errstr;
$select_sth->execute(@args) or die $select_sth->errstr;
my @result;
while (my $array = $select_sth->fetch) {
push(@result, $$array[0]);
}
return \@result;
}
sub function_A() {
my ($value1, $value2, $value3...) = @_;
my $id_aref = get_list_of_ids($value1, $value2, $value3...);
foreach my $id (@$id_aref) {
# Do something with $id
# And something else with $id
}
}
sub function_B() {
my ($value1, $value2, $value3...) = @_;
my $id_aref = get_list_of_ids($value1, $value2, $value3...);
foreach my $id (@$id_aref) {
# Do something different with $id
# Maybe even delete the row
}
}
无论如何,我即将在数据库中转储更多的行,并且我很清楚上面的代码不会扩展。我可以根据其他语言想出几种修复方法。在 Perl 中处理它的最佳方法是什么?
需要注意的要点是 get_list_of_ids()
中的逻辑太长,无法在每个函数中复制;并且对所选行的操作多种多样。
提前致谢。
Disclaimer: first time I've used DBI.
I have a MySQL table with a lot of indexed fields (f1, f2, f3, etc) that are used to generate WHERE clauses by long-running processes that iterate over chunks of the database performing various cleaning and testing operations.
The current version of this code works something like this:
sub get_list_of_ids() {
my ($value1, $value2, $value3...) = @_;
my $stmt = 'SELECT * FROM files WHERE 1';
my @args;
if (defined($value1)) {
$stmt .= ' AND f1 = ?';
push(@args, $value1);
}
# Repeat for all the different fields and values
my $select_sth = $dbh->prepare($stmt) or die $dbh->errstr;
$select_sth->execute(@args) or die $select_sth->errstr;
my @result;
while (my $array = $select_sth->fetch) {
push(@result, $array[0]);
}
return \@result;
}
sub function_A() {
my ($value1, $value2, $value3...) = @_;
my $id_aref = get_list_of_ids($value1, $value2, $value3...);
foreach my $id (@$id_aref) {
# Do something with $id
# And something else with $id
}
}
sub function_B() {
my ($value1, $value2, $value3...) = @_;
my $id_aref = get_list_of_ids($value1, $value2, $value3...);
foreach my $id (@$id_aref) {
# Do something different with $id
# Maybe even delete the row
}
}
Anyway, I'm about to dump an awful lot more rows in the database, and am well aware that the code above wont scale up. I can think of several ways to fix it based on other languages. What is the best way to handle it in Perl?
Key points to note are that the logic in get_list_of_ids()
is too long to replicate in each function; and that the operations on the selected rows are very varied.
Thanks in advance.
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(1)
我认为“扩大规模”是指维护方面而不是性能方面。
对代码的关键更改是将参数作为列/值对传递,而不是带有假定的列集的值列表。这将允许您的代码处理您可能添加的任何新列。
DBI->selectcol_arrayref
使用 C 语言编写,既方便又更快。如果您在
connect
调用中打开RaiseError
,DBI会在错误时抛出异常,而不必一直编写or die ...
。你应该这样做。最后,由于我们是根据可能不受信任的用户输入编写 SQL,因此我小心地转义了列名。
其余的在 这个 Etherpad 中进行了解释,您可以观看代码转换步骤一步一步。
如果您希望 get_ids 返回一个巨大的列表,太多而无法保留在内存中,那么您可以返回语句句柄并对其进行迭代,而不是取出整个数组并将其存储在内存中。
您可以通过在数组上下文中返回 ID 列表或在标量中返回语句句柄来组合这两种方法。
最终,您将希望停止手动编码 SQL 并使用对象关系映射器 (ORM),例如 DBIx: :类。 ORM 的主要优点之一是它非常灵活,可以为您执行上述操作。 DBIx::Class 可以返回简单的结果列表,或者非常强大的迭代器。迭代器是惰性的,在您开始获取行之前它不会执行查询,允许您根据需要更改查询,而不必使获取例程复杂化。
I presume by "scale up" you mean in maintenance terms rather than performance.
The key change to your code is to pass in your arguments as column/value pairs rather than a list of values with an assumed set of columns. This will allow your code to handle any new columns you might add.
DBI->selectcol_arrayref
is both convenient and a bit faster, being written in C.If you turn on
RaiseError
in yourconnect
call, DBI will throw an exception on errors rather than having to writeor die ...
all the time. You should do that.Finally, since we're writing SQL from possibly untrusted user input, I've taken care to escape the column name.
The rest is explained in this Etherpad, you can watch your code be transformed step by step.
If you expect
get_ids
to return a huge list, too much to keep in memory, then instead of pulling out the whole array and storing it in memory you can return the statement handle and iterate with that.You can combine both approaches by returning a list of IDs in array context, or a statement handle in scalar.
Eventually you will want to stop hand coding SQL and use an Object Relation Mapper (ORM) such as DBIx::Class. One of the major advantages of an ORM is it is very flexible and can do the above for you. DBIx::Class can return a simple list of results, or very powerful iterator. The iterator is lazy, it will not perform the query until you start fetching rows, allowing you to change the query as needed without having to complicate your fetch routine.