Need SQL statements to combine tables while every entry always keeps a unique ID

Posted on 2024-09-05 02:20:31


I need SQL code to solve the table-combination problem described below.

Old data (table old):

    name     version    status    lastupdate      ID
    A        0.1        on        6/8/2010        1
    B        0.1        on        6/8/2010        2
    C        0.1        on        6/8/2010        3
    D        0.1        on        6/8/2010        4
    E        0.1        on        6/8/2010        5
    F        0.1        on        6/8/2010        6
    G        0.1        on        6/8/2010        7

New data (table new):

    name     version    status    lastupdate     ID         
    A        0.1        on        6/18/2010                
                                                           #B entry deleted
    C        0.3        on        6/18/2010                #version_updated
    C1       0.1        on        6/18/2010                #new_added
    D        0.1        on        6/18/2010                
    E        0.1        off       6/18/2010                #status_updated
    F        0.1        on        6/18/2010                
    G        0.1        on        6/18/2010                
    H        0.1        on        6/18/2010                #new_added
    H1       0.1        on        6/18/2010                #new_added

Differences between the new data and the old data:

B entry deleted

C entry version updated

E entry status updated

C1, H, and H1 entries newly added

What I want is to always keep the ID-to-name mapping from the old data table, no matter how the data changes later; that is, each name always has one unique ID number bound to it.

If an entry has been updated, update its data. If an entry is newly added, insert it into the table and assign it a new unique ID. If the entry was deleted, delete the entry and do not reuse that ID later.

However, I can only use SQL with simple SELECT or UPDATE statements, and writing such code may be too hard for me. I hope someone with expertise can point me in the right direction. There is no need to go into the differences between SQL variants; sample code in standard SQL is enough.

Thanks in advance!

Regards

========
Here is my draft SQL. I am not sure whether it works; could someone with expertise please comment? Thanks!

1. Duplicate the old table as tmp to store the updates

create table tmp as
select * from old

2. Update the rows in tmp whose "name" appears in both the old and new tables

update tmp
where name in (select name from new)

3. Insert the names that differ (old vs. new) into tmp and assign new IDs

insert into tmp (name version status lastupdate ID)
set idvar = max(select max(id) from tmp) + 1
select * from
(select new.name new.version new.status new.lastupdate new.ID
from old, new
where old.name <> new.name)

4. Delete the removed entries (such as B) from the tmp table

delete from tmp
where
(select ???)


孤凫 2024-09-12 02:22:03

Why not use a UUID for this? Generate it once for each plug-in and keep it in the plug-in itself, not in the DB. Since you mention Python, here is how to generate one:

import uuid
UID = str(uuid.uuid4()) # this will yield new UUID string

Granted, it does not guarantee global uniqueness, but the chance of getting the same string twice in your project is very low.


避讳 2024-09-12 02:21:54

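If I understood your need correctly from the comments in the two tables, I think you can simplify the problem a lot by not merging into or updating the old table. What you need is the new table carrying the IDs from the old table where they exist, and new IDs where they do not, right?

New records: already in the new table - OK (but they need a new ID)
Deleted records: not in the new table - OK
Updated records: already updated in the new table - OK (need the ID copied from the old table)
Unmodified records: already in the new table - OK (need the ID copied from the old table)

So the only things you need to do are:
(a) copy the ID from the old table into the new table where it exists
(b) create a new ID in the new table where no ID exists in the old table
(c) copy the new table over the old table.

(a) UPDATE new SET ID = IFNULL((SELECT ID FROM old WHERE new.name = old.name),0);

(b) UPDATE new SET ID = FUNCTION_TO_GENERATE_ID(new.name) WHERE ID = 0;

As I don't know which SQL database you use, in (b) you can use an SQL function that generates a unique ID, depending on the database. For SQL Server there is newid(); for PostgreSQL (not too old versions), now() seems a good choice since its precision looks sufficient (but not in other databases such as MySQL, where I think the precision is limited to seconds). Keep in mind that in PostgreSQL now() returns the transaction start time, so every row touched by one statement gets the same value.

Edit: Sorry, I hadn't seen that you use SQLite and Python. In that case you can use the str(uuid.uuid4()) function (uuid module) in Python to generate a UUID and fill in the ID in the new table wherever ID = 0 in step (b). That way you will be able to join two independent databases later if needed, without ID conflicts.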
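
Steps (a) to (c) can be sketched with Python's sqlite3 module roughly as follows. This is only an illustration under assumptions: the table names old_data and new_data, the TEXT id column, and the reduced sample rows are all made up for the example, and NULL is used instead of 0 to mark rows that still need an ID.

```python
import sqlite3
import uuid

conn = sqlite3.connect(":memory:")
# Hypothetical schema and sample rows; ids are stored as text so that
# existing ids and freshly generated UUID strings can share one column.
conn.executescript("""
CREATE TABLE old_data (name TEXT PRIMARY KEY, version TEXT, status TEXT, id TEXT);
CREATE TABLE new_data (name TEXT PRIMARY KEY, version TEXT, status TEXT, id TEXT);
INSERT INTO old_data VALUES ('A', '0.1', 'on', '1'), ('B', '0.1', 'on', '2'), ('C', '0.1', 'on', '3');
INSERT INTO new_data (name, version, status)
VALUES ('A', '0.1', 'on'), ('C', '0.3', 'on'), ('C1', '0.1', 'on');
""")

# (a) Copy the existing id into new_data wherever the name is already known.
#     Names without a match keep id = NULL.
conn.execute("""
    UPDATE new_data
    SET id = (SELECT o.id FROM old_data AS o WHERE o.name = new_data.name)
""")

# (b) Give every remaining row a freshly generated UUID string.
missing = [row[0] for row in conn.execute("SELECT name FROM new_data WHERE id IS NULL")]
conn.executemany(
    "UPDATE new_data SET id = ? WHERE name = ?",
    [(str(uuid.uuid4()), name) for name in missing],
)

# (c) Replace the contents of old_data with new_data.
conn.execute("DELETE FROM old_data")
conn.execute("INSERT INTO old_data SELECT * FROM new_data")
conn.commit()

ids = dict(conn.execute("SELECT name, id FROM old_data"))
```

After this runs, ids maps A and C to their original IDs, C1 to a new UUID string, and B is gone because it never appeared in new_data.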


浪荡不羁 2024-09-12 02:21:48

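A draft approach; I don't know whether it works properly...

CREATE TRIGGER auto_next_id
AFTER INSERT ON my_table FOR EACH ROW
BEGIN
    -- my_table stands in for the real table name (TABLE itself is a reserved word).
    -- Number only the row just inserted; without the WHERE clause every row is overwritten.
    -- rowid is SQLite-specific. Note that MAX(uid) + 1 can reuse an ID once the highest row is deleted.
    UPDATE my_table
    SET uid = (SELECT COALESCE(MAX(uid), 0) + 1 FROM my_table)
    WHERE rowid = NEW.rowid;
END;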


葮薆情 2024-09-12 02:21:37

This works in Informix and gives exactly the display you require. The same or something similar should work in MySQL, one would think. The trick is to collect the union of all names in a temp table and left join the other two tables onto it, so that their values can be compared.

SELECT DISTINCT name FROM old
UNION
SELECT DISTINCT name FROM new
INTO TEMP _tmp;

SELECT 
  CASE WHEN b.name IS NULL THEN ''
       ELSE aa.name
       END AS name, 
  CASE WHEN b.version IS NULL THEN ''
       WHEN a.version = b.version THEN a.version 
       ELSE b.version
       END AS version,
  CASE WHEN a.status = b.status THEN a.status 
       WHEN b.status IS NULL THEN ''
       ELSE b.status
       END AS status,
  CASE WHEN a.lastupdate = b.lastupdate THEN a.lastupdate 
       WHEN b.lastupdate IS NULL THEN null
       ELSE b.lastupdate
       END AS lastupdate,
  CASE WHEN a.name IS NULL THEN '#new_added'
       WHEN b.name IS NULL THEN '#' || aa.name || ' entry deleted'
       WHEN a.version <> b.version THEN '#version_updated'
       WHEN a.status <> b.status THEN '#status_updated'
       ELSE ''
  END AS change
  FROM _tmp aa
  LEFT JOIN old a
         ON a.name = aa.name
  LEFT JOIN new b
         ON b.name = aa.name;
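
For readers on SQLite rather than Informix, the same union-then-left-join idea might look like the sketch below, using Python's sqlite3 module. The table names old_data and new_data, the temp table all_names, the shortened sample rows, and the plain change labels are assumptions made for this example; only the change classification is reproduced, not the full display.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical tables with a reduced version of the sample data.
conn.executescript("""
CREATE TABLE old_data (name TEXT, version TEXT, status TEXT);
CREATE TABLE new_data (name TEXT, version TEXT, status TEXT);
INSERT INTO old_data VALUES ('A', '0.1', 'on'), ('B', '0.1', 'on'),
                            ('C', '0.1', 'on'), ('E', '0.1', 'on');
INSERT INTO new_data VALUES ('A', '0.1', 'on'), ('C', '0.3', 'on'),
                            ('C1', '0.1', 'on'), ('E', '0.1', 'off');

-- Union of all names from both tables (UNION already removes duplicates).
CREATE TEMP TABLE all_names AS
SELECT name FROM old_data
UNION
SELECT name FROM new_data;
""")

# Left join both tables onto the full name list and classify each name.
rows = conn.execute("""
    SELECT n.name,
           CASE WHEN o.name IS NULL THEN 'new_added'
                WHEN w.name IS NULL THEN 'deleted'
                WHEN o.version <> w.version THEN 'version_updated'
                WHEN o.status <> w.status THEN 'status_updated'
                ELSE 'unchanged'
           END AS change
    FROM all_names AS n
    LEFT JOIN old_data AS o ON o.name = n.name
    LEFT JOIN new_data AS w ON w.name = n.name
    ORDER BY n.name
""").fetchall()

changes = dict(rows)
```

The order of the CASE branches matters: the NULL checks come first so that the comparisons only run when a name exists on both sides.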


够钟 2024-09-12 02:21:23

Note - if you are concerned about performance you can skip this whole answer :-)

If you can redesign, have two tables: one with the data and the other with the name-to-ID linkage. Something like

table_original

name     version    status    lastupdate
A        0.1        on        6/8/2010
B        0.1        on        6/8/2010
C        0.1        on        6/8/2010
D        0.1        on        6/8/2010
E        0.1        on        6/8/2010
F        0.1        on        6/8/2010
G        0.1        on        6/8/2010

and name_id

name     ID 
A        1 
B        2 
C        3 
D        4 
E        5 
F        6 
G        7

When you get table_new with the new set of data:

1. TRUNCATE table_original
2. INSERT INTO name_id (names from table_new not in name_id)
3. Copy table_new to table_original

Note: I think there is some ambiguity about the deletion here:

"If the entry was deleted, delete the entry and do not reuse that ID later."

If name A gets deleted and turns up again in a later set of updates, do you want to (a) reuse the original ID tagged to A, or (b) generate a new ID?

If it's (b), you need a column Deleted? in name_id and a last step:

4. SET Deleted? = Y where name not in table_original

and step 2 would exclude the Deleted? = Y records.

You could also do the same thing without the name_id table, since the only thing you need from table_old is the name-to-ID links. Everything else you need is in table_new.
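
A minimal sketch of the two-table design in SQLite, via Python's sqlite3 module, could look like this. It takes option (a), so a name that disappears keeps its row in name_id and would get the same ID back if it returned. The refresh function name, the AUTOINCREMENT key, and the use of DELETE in place of TRUNCATE (which SQLite does not have) are choices made for this example, not part of the answer above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE name_id (
    id   INTEGER PRIMARY KEY AUTOINCREMENT,  -- never hands out a used id again
    name TEXT NOT NULL UNIQUE
);
CREATE TABLE table_original (name TEXT, version TEXT, status TEXT, lastupdate TEXT);
CREATE TABLE table_new      (name TEXT, version TEXT, status TEXT, lastupdate TEXT);
""")

# Load the sample data from the question: A..G get ids 1..7.
old_names = ["A", "B", "C", "D", "E", "F", "G"]
conn.executemany("INSERT INTO name_id (name) VALUES (?)", [(n,) for n in old_names])
conn.executemany(
    "INSERT INTO table_original VALUES (?, '0.1', 'on', '6/8/2010')",
    [(n,) for n in old_names],
)
new_rows = [
    ("A", "0.1", "on"), ("C", "0.3", "on"), ("C1", "0.1", "on"),
    ("D", "0.1", "on"), ("E", "0.1", "off"), ("F", "0.1", "on"),
    ("G", "0.1", "on"), ("H", "0.1", "on"), ("H1", "0.1", "on"),
]
conn.executemany("INSERT INTO table_new VALUES (?, ?, ?, '6/18/2010')", new_rows)


def refresh(conn):
    """Apply the three steps: empty, register new names, copy."""
    # 1. Empty the data table (SQLite has no TRUNCATE).
    conn.execute("DELETE FROM table_original")
    # 2. Register names that have never been seen before.
    conn.execute("""
        INSERT INTO name_id (name)
        SELECT name FROM table_new
        WHERE name NOT IN (SELECT name FROM name_id)
        ORDER BY name
    """)
    # 3. Copy the new data over.
    conn.execute("INSERT INTO table_original SELECT * FROM table_new")
    conn.commit()


refresh(conn)

# Current data joined with its permanent ids.
current = dict(conn.execute("""
    SELECT t.name, n.id
    FROM table_original AS t
    JOIN name_id AS n ON n.name = t.name
"""))
```

B no longer shows up in current, but its row in name_id still reserves ID 2, so that number cannot be handed to another name.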


梦魇绽荼蘼 2024-09-12 02:21:18

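Let me start from the end:

In #4 you would delete all rows in tmp; what you meant to say is WHERE tmp.name NOT IN (SELECT name FROM new);. Similarly, #3 is not valid syntax, but if it were, it would try to insert all rows.

Regarding #2, why not use auto increment on ID?

Regarding #1, if your tmp table is the same as new, queries #2-#4 make no sense unless you change (update, insert, delete) the new table in some way.

But (!), if you do update table new, it has an auto increment field on ID, and you update the table properly (using ID) from the application, then your whole procedure is unnecessary (!).

So the important point is that you should not design the system to work like the above.

To get the concept of updating data in the database from the application side, take a look at the examples here (php/mysql).

Also, to get the syntax of your queries right, go through the basic versions of the SET, INSERT, DELETE, and SELECT commands (there is no way around this).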
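
To make these corrections concrete, here is one possible shape for the update, insert, and delete statements, run against SQLite through Python's sqlite3 module. It is a sketch under assumptions: the table names old_data and new_data and the sync function are invented for the example, and the auto increment suggestion is realised with SQLite's AUTOINCREMENT keyword.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE old_data (
    id         INTEGER PRIMARY KEY AUTOINCREMENT,  -- ids are never reused
    name       TEXT NOT NULL UNIQUE,
    version    TEXT,
    status     TEXT,
    lastupdate TEXT
);
CREATE TABLE new_data (name TEXT, version TEXT, status TEXT, lastupdate TEXT);
""")
# Sample data from the question: A..G receive ids 1..7.
conn.executemany(
    "INSERT INTO old_data (name, version, status, lastupdate) VALUES (?, '0.1', 'on', '6/8/2010')",
    [(n,) for n in "ABCDEFG"],
)
new_rows = [
    ("A", "0.1", "on"), ("C", "0.3", "on"), ("C1", "0.1", "on"),
    ("D", "0.1", "on"), ("E", "0.1", "off"), ("F", "0.1", "on"),
    ("G", "0.1", "on"), ("H", "0.1", "on"), ("H1", "0.1", "on"),
]
conn.executemany("INSERT INTO new_data VALUES (?, ?, ?, '6/18/2010')", new_rows)


def sync(conn):
    """Bring old_data in line with new_data while keeping existing ids."""
    # Update names present in both tables.
    conn.execute("""
        UPDATE old_data SET
            version    = (SELECT n.version    FROM new_data AS n WHERE n.name = old_data.name),
            status     = (SELECT n.status     FROM new_data AS n WHERE n.name = old_data.name),
            lastupdate = (SELECT n.lastupdate FROM new_data AS n WHERE n.name = old_data.name)
        WHERE name IN (SELECT name FROM new_data)
    """)
    # Insert only names that are not in old_data yet; id is assigned automatically.
    conn.execute("""
        INSERT INTO old_data (name, version, status, lastupdate)
        SELECT name, version, status, lastupdate FROM new_data
        WHERE name NOT IN (SELECT name FROM old_data)
        ORDER BY name
    """)
    # Delete names that no longer appear in new_data.
    conn.execute("DELETE FROM old_data WHERE name NOT IN (SELECT name FROM new_data)")
    conn.commit()


sync(conn)
rows = {
    name: (id_, version, status)
    for id_, name, version, status in conn.execute(
        "SELECT id, name, version, status FROM old_data"
    )
}
```

Because the key is declared AUTOINCREMENT, SQLite remembers the largest ID it has ever issued for the table, so deleting B, or even the newest row, does not free its number for a later insert.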


掩于岁月 2024-09-12 02:21:11

You never mentioned what DBMS you are using, but if it is SQL Server, a really good option is the SQL MERGE statement. See: http://www.mssqltips.com/tip.asp?tip=1704

The MERGE statement basically works as separate insert, update, and delete statements all within the same statement. You specify a "Source" record set and a "Target" table, and the join between the two. You then specify the type of data modification that is to occur when the records between the two data are matched or are not matched. MERGE is very useful, especially when it comes to loading data warehouse tables, which can be very large and require specific actions to be taken when rows are or are not present.

Example:

MERGE Products AS TARGET
USING UpdatedProducts AS SOURCE 
ON (TARGET.ProductID = SOURCE.ProductID) 
--When records are matched, update 
--the records if there is any change
WHEN MATCHED AND TARGET.ProductName <> SOURCE.ProductName 
OR TARGET.Rate <> SOURCE.Rate THEN 
UPDATE SET TARGET.ProductName = SOURCE.ProductName, 
TARGET.Rate = SOURCE.Rate 
--When no records are matched, insert
--the incoming records from source
--table to target table
WHEN NOT MATCHED BY TARGET THEN 
INSERT (ProductID, ProductName, Rate) 
VALUES (SOURCE.ProductID, SOURCE.ProductName, SOURCE.Rate)
--When there is a row that exists in target table and
--same record does not exist in source table
--then delete this record from target table
WHEN NOT MATCHED BY SOURCE THEN 
DELETE
--$action specifies a column of type nvarchar(10) 
--in the OUTPUT clause that returns one of three 
--values for each row: 'INSERT', 'UPDATE', or 'DELETE', 
--according to the action that was performed on that row
OUTPUT $action, 
DELETED.ProductID AS TargetProductID, 
DELETED.ProductName AS TargetProductName, 
DELETED.Rate AS TargetRate, 
INSERTED.ProductID AS SourceProductID, 
INSERTED.ProductName AS SourceProductName, 
INSERTED.Rate AS SourceRate; 
SELECT @@ROWCOUNT;
GO
