PHP or C# script to parse CSV table values to populate a one-to-many table

Posted 2024-09-05 17:56:57 · 1,115 words · 3 views · 0 comments


I'm looking for an example of how to split out comma-delimited data in a field of one table and fill a second table with those individual elements, in order to create a one-to-many relational database schema. This is probably really simple, but let me give an example:

I'll start with everything in one table, Widgets, which has a "state" field to contain states that have that widget:

Table: WIDGET

===============================
| id | unit | states          |
===============================
|1   | abc  | AL,AK,CA        |
-------------------------------
|2   | lmn  | VA,NC,SC,GA,FL  |
-------------------------------
|3   | xyz  | KY              |
===============================

Now, what I'd like to create via code is a second table to be joined to WIDGET called Widget_ST that has widget id, widget state id, and widget state name fields, for example

Table: WIDGET_ST

==============================
| w_id | w_st_id | w_st_name |
------------------------------
|1     | 1       | AL        |
|1     | 2       | AK        |
|1     | 3       | CA        |
|2     | 1       | VA        |
|2     | 2       | NC        |
|2     | 3       | SC        |
|2     | 4       | GA        |
|2     | 5       | FL        |
|3     | 1       | KY        |
==============================

I am learning C# and PHP, so responses in either language would be great.

Thanks.


Comments (1)

混浊又暗下来 2024-09-12 17:56:57


I wrote some scripts to import the Stack Overflow data dump into an SQL database. I split the tags list to populate a many-to-many table as you describe. I use a technique similar to the following:

  1. Read a row from WIDGET

    while ($row = $pdoStmt->fetch()) {
    
  2. Use explode() to split on a comma

    $states = explode(",", $row["states"]);
    
  3. Loop over elements, writing to a new CSV file

    // open the files and the id map once, before the row loop in step 1
    $stateid = array();
    $stfile = fopen("states.csv", "w+");
    $mmfile = fopen("manytomany.csv", "w+");
    $i = 0;
    foreach ($states as $st) {
        if (!array_key_exists($st, $stateid)) {
            $stateid[$st] = ++$i;
            fprintf($stfile, "%d,%s\n", $i, $st);
        }
        fprintf($mmfile, "%s,%s\n", $row["id"], $stateid[$st]);
    }
    fclose($stfile);
    fclose($mmfile);
    
  4. When you're done, load the CSV files into the database. You can do this in the mysql client:

    mysql> LOAD DATA INFILE 'states.csv' INTO TABLE STATES FIELDS TERMINATED BY ',';
    mysql> LOAD DATA INFILE 'manytomany.csv' INTO TABLE WIDGET_ST FIELDS TERMINATED BY ',';
    

It may seem like a lot of work, but using the LOAD DATA command runs 20x faster than inserting one row at a time, so it's worthwhile if your data set is large.
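The four steps above can be combined into one self-contained sketch. The logic ports directly to PHP/PDO; here it is in Python, with an in-memory SQLite table standing in for the real WIDGET table (the table, column, and file names are simply the ones from the example, not anything your schema must use):

```python
import csv
import sqlite3

# Stand-in for the real database: a WIDGET table with the sample rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE widget (id INTEGER, unit TEXT, states TEXT)")
conn.executemany("INSERT INTO widget VALUES (?, ?, ?)", [
    (1, "abc", "AL,AK,CA"),
    (2, "lmn", "VA,NC,SC,GA,FL"),
    (3, "xyz", "KY"),
])

state_id = {}  # state name -> generated id (plays the role of $stateid)
with open("states.csv", "w", newline="") as stfile, \
     open("manytomany.csv", "w", newline="") as mmfile:
    st_writer = csv.writer(stfile)
    mm_writer = csv.writer(mmfile)
    # Step 1: read rows from WIDGET.
    for w_id, _unit, states in conn.execute("SELECT id, unit, states FROM widget"):
        # Step 2: split on the comma (the explode() equivalent).
        for st in states.split(","):
            # Step 3: assign an id the first time a state appears...
            if st not in state_id:
                state_id[st] = len(state_id) + 1
                st_writer.writerow([state_id[st], st])
            # ...and always write the widget-to-state link.
            mm_writer.writerow([w_id, state_id[st]])
```

Step 4 is then the same `LOAD DATA` commands as above, run against the two generated files.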


Re your comment:

Right, I also have data in a database already. It turns out that the solution I show above, dumping to CSV files and re-importing in normalized format, is many times faster than doing INSERT statements inside the loop that splits the data.
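For contrast, the slower alternative being compared here is running an INSERT inside the splitting loop. A minimal Python/SQLite sketch of that approach (same sample data and illustrative table names as above) looks like this:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE widget (id INTEGER, unit TEXT, states TEXT);
CREATE TABLE widget_st (w_id INTEGER, w_st_id INTEGER, w_st_name TEXT);
INSERT INTO widget VALUES
    (1, 'abc', 'AL,AK,CA'),
    (2, 'lmn', 'VA,NC,SC,GA,FL'),
    (3, 'xyz', 'KY');
""")

state_id = {}
# Fetch all source rows first, then insert one link row per state:
for w_id, states in conn.execute("SELECT id, states FROM widget").fetchall():
    for st in states.split(","):
        sid = state_id.setdefault(st, len(state_id) + 1)
        conn.execute("INSERT INTO widget_st VALUES (?, ?, ?)", (w_id, sid, st))
conn.commit()
```

This produces the same WIDGET_ST contents, but issues one statement per split element, which is why the dump-and-bulk-load route wins on large data sets.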

Each brand of database has its own tool for importing bulk data. See my answer to Optimizing big import in PHP for a list of bulk import solutions per database.

You should use the tools provided by each database. Trying to remain cross platform only makes your code Jack of all trades, master of none. Besides, in 90% of cases when people bend over backwards to make their code database-independent, it turns out they never use more than one database. And you can't achieve complete database independence anyway.
