Updating 10,000 CCK fields daily in Drupal

Posted on 2024-09-28 02:45:16

One of our sites has around 10,000 nodes. In each node there is a simple CCK text/integer field. This integer changes daily, so the fields need to be updated every day. The integer ranges from 1 to 20,000,000. The CCK field is shared across all content types, so it has its own table in the database. We don't use revisions. I chose to have the script read a CSV file because this table is very simple, with 3 fields, all integers. I didn't need all the flexibility of doing a PHP array-type import.

I created a cron job to execute a PHP script every day, which contains something similar to:


LOAD DATA LOCAL INFILE 'file.csv'
REPLACE INTO TABLE content_field_mycckfield
FIELDS TERMINATED BY ','
LINES TERMINATED BY '\n'
(vid, nid, field_mycckfield_value);

At the end of the script, it counts how many records were imported and reports successes and errors.
The file sits below the public directory, and all that jazz.
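
For illustration, here is a minimal sketch of roughly how such a cron-run script could be structured with plain mysqli; the connection details, file path, and admin email address are placeholder assumptions, not taken from the actual setup:

<?php
// Hypothetical daily importer: load file.csv into the CCK table,
// count the affected rows, and report the outcome.
$db = mysqli_init();
// LOCAL INFILE must be enabled client-side (and permitted by the server).
mysqli_options($db, MYSQLI_OPT_LOCAL_INFILE, true);
mysqli_real_connect($db, 'localhost', 'dbuser', 'dbpass', 'drupal')
    or die('Connect failed: ' . mysqli_connect_error());

$sql = "LOAD DATA LOCAL INFILE '/path/to/file.csv'
        REPLACE INTO TABLE content_field_mycckfield
        FIELDS TERMINATED BY ','
        LINES TERMINATED BY '\\n'
        (vid, nid, field_mycckfield_value)";

if (mysqli_query($db, $sql)) {
    // For LOAD DATA ... REPLACE, affected_rows counts inserted plus replaced rows.
    error_log('mycckfield import OK: ' . mysqli_affected_rows($db) . ' rows affected');
} else {
    error_log('mycckfield import FAILED: ' . mysqli_error($db));
    // e.g. mail('admin@example.com', 'CCK import failed', mysqli_error($db));
}

mysqli_close($db);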

Are there any other steps I am missing? Anything I should be aware of or be cautious of?

Should I have it optimize or defragment this table after every run? Or every (x) runs?

Should I have it first import into a temp_ table to normalize the data, and then copy/move the rows into content_field_mycckfield?
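
If that staging route sounds appealing, here is a rough sketch of what the extra step might look like; the temp_mycckfield name and the range check are illustrative assumptions, and $db is an open mysqli handle with LOCAL INFILE enabled, as in the sketch above. Nothing touches content_field_mycckfield until the scratch table has passed whatever checks you care about.

$steps = array(
    // Recreate the scratch table with the same structure as the live one.
    "DROP TABLE IF EXISTS temp_mycckfield",
    "CREATE TABLE temp_mycckfield LIKE content_field_mycckfield",

    // Load the CSV into the scratch table only.
    "LOAD DATA LOCAL INFILE '/path/to/file.csv'
     REPLACE INTO TABLE temp_mycckfield
     FIELDS TERMINATED BY ','
     LINES TERMINATED BY '\\n'
     (vid, nid, field_mycckfield_value)",

    // Example sanity check: drop rows outside the expected 1..20,000,000 range.
    "DELETE FROM temp_mycckfield
     WHERE field_mycckfield_value IS NULL
        OR field_mycckfield_value NOT BETWEEN 1 AND 20000000",

    // Copy the cleaned rows over, overwriting existing rows by primary key.
    "REPLACE INTO content_field_mycckfield (vid, nid, field_mycckfield_value)
     SELECT vid, nid, field_mycckfield_value FROM temp_mycckfield",
);

foreach ($steps as $sql) {
    if (!mysqli_query($db, $sql)) {
        error_log('Staged import failed at: ' . $sql . ' -- ' . mysqli_error($db));
        break;
    }
}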

Comments (1)

離人涙 2024-10-05 02:45:16

10,000 records is big but not massive in MySQL terms and the table is simple enough that I don't think you need any optimisation. If the data in the table is reliable and your .csv is always well formed then there's not a lot to go wrong.

The separate issue is whether your import process throws errors. If there is even the remotest chance that the .csv could contain incorrect column references, missing commas, etc., then your idea to test everything in a temp table is certainly a good one.

The only other things I can think of are (in order of neuroticism):

  • Perform this operation overnight or whenever your site is unused
  • Have the PHP script catch errors and email you the results of each run
  • Have the script back up the table, run the .csv import, check for errors, and if there are errors, email you and restore the backup (a rough sketch of this follows the list)
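
A crude sketch of that last idea, again with assumed table names and the same kind of mysqli handle ($db, LOCAL INFILE enabled) as in the question's sketches; a real script would want tighter error handling:

// Hypothetical backup, import, restore-on-failure wrapper.
mysqli_query($db, "DROP TABLE IF EXISTS content_field_mycckfield_backup");
mysqli_query($db, "CREATE TABLE content_field_mycckfield_backup LIKE content_field_mycckfield");
mysqli_query($db, "INSERT INTO content_field_mycckfield_backup SELECT * FROM content_field_mycckfield");

$ok = mysqli_query($db, "LOAD DATA LOCAL INFILE '/path/to/file.csv'
                         REPLACE INTO TABLE content_field_mycckfield
                         FIELDS TERMINATED BY ','
                         LINES TERMINATED BY '\\n'
                         (vid, nid, field_mycckfield_value)");

if (!$ok) {
    // Put yesterday's data back and complain loudly.
    mysqli_query($db, "TRUNCATE TABLE content_field_mycckfield");
    mysqli_query($db, "INSERT INTO content_field_mycckfield SELECT * FROM content_field_mycckfield_backup");
    mail('admin@example.com', 'CCK import failed, backup restored', mysqli_error($db));
}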

Hope any of that helps!
