为一对多表实现查找或插入
我有 2 个表,tracklist
和 track
,其中 tracklist
有许多 track
。在某些时候,我会收到引用曲目列表的用户输入,并且我需要创建该曲目列表,或者返回现有的曲目列表(这是因为曲目列表对用户完全透明)。
我的天真的解决方案是找到所有包含 n
曲目的曲目列表,然后加入 track
与 tracklist
n
次,根据用户输入数据检查每个连接。例如,有 2 个曲目:
SELECT tracklist.id FROM tracklist
JOIN track t1 ON tracklist.id = t1.tracklist
JOIN track_name tn1 ON t1.name = tn1.id
JOIN track t2 ON tracklist.id = t2.tracklist
JOIN track_name tn2 ON t2.name = tn2.id
WHERE tracklist.track_count = '20'
AND (t1.position = 1 AND tn1.name = 'Pancakes' AND t1.artist_credit = '42' AND t1.recording = 1)
AND (t2.position = 2 AND tn2.name = 'Waffles' AND t2.artist_credit = '9001' AND t2.recording = 2)
但是,这确实不能很好地扩展到大型曲目列表。我的非常基本的计时表明,对于 10 个曲目的曲目列表,这可能需要 >500 毫秒,对于具有 100 个曲目的曲目列表,大约需要 7 秒。虽然后者是一种边缘情况,但我使用的任何算法都需要能够至少扩展到这一点。
然而我仍然坚持其他解决方案。我唯一能想到的另一件事是选择所有具有 n
曲目的曲目列表及其所有曲目,然后在应用程序代码中进行比较。但是,如果可以的话,我真的很想将其保留在数据库服务器上。
这是我正在使用的架构:
CREATE TABLE track
(
id SERIAL,
recording INTEGER NOT NULL, -- references recording.id
tracklist INTEGER NOT NULL, -- references tracklist.id
position INTEGER NOT NULL,
name INTEGER NOT NULL, -- references track_name.id
artist_credit INTEGER NOT NULL, -- references artist_credit.id
length INTEGER CHECK (length IS NULL OR length > 0),
edits_pending INTEGER NOT NULL DEFAULT 0,
last_updated TIMESTAMP WITH TIME ZONE DEFAULT NOW()
);
CREATE TABLE track_name (
id SERIAL,
name VARCHAR NOT NULL
);
CREATE TABLE tracklist
(
id SERIAL,
track_count INTEGER NOT NULL DEFAULT 0,
last_updated TIMESTAMP WITH TIME ZONE DEFAULT NOW()
);
有什么建议吗?
I have 2 tables, tracklist
and track
, where tracklist
has many tracks
. At some points, I will receive user input which refers to a list of tracks, and I need to either create that tracklist, or return an existing tracklist (this is because tracklists are meant to be entirely transparent to users).
My naive solution to this was to find all tracklists with n
tracks, and join track
against tracklist
n
times, checking each join against the user input data. For example, with 2 tracks:
SELECT tracklist.id FROM tracklist
JOIN track t1 ON tracklist.id = t1.tracklist
JOIN track_name tn1 ON t1.name = tn1.id
JOIN track t2 ON tracklist.id = t2.tracklist
JOIN track_name tn2 ON t2.name = tn2.id
WHERE tracklist.track_count = '20'
AND (t1.position = 1 AND tn1.name = 'Pancakes' AND t1.artist_credit = '42' AND t1.recording = 1)
AND (t2.position = 2 AND tn2.name = 'Waffles' AND t2.artist_credit = '9001' AND t2.recording = 2)
However, this really doesn't scale well to large tracklists. My very rudimentary timing shows this can take >500ms for 10 track tracklists, and ~7s for tracklists with 100 tracks. While the latter is an edge case, whatever algorithm I use needs to be able to scale at least up to this.
I'm stuck on other solutions however. The only other thing I can think of is to select all tracklists with n
tracks, and all their tracks, and then do the comparison in application code. However, I'd really like to keep this on the database server if I can.
Here is the schema I am working with:
CREATE TABLE track
(
id SERIAL,
recording INTEGER NOT NULL, -- references recording.id
tracklist INTEGER NOT NULL, -- references tracklist.id
position INTEGER NOT NULL,
name INTEGER NOT NULL, -- references track_name.id
artist_credit INTEGER NOT NULL, -- references artist_credit.id
length INTEGER CHECK (length IS NULL OR length > 0),
edits_pending INTEGER NOT NULL DEFAULT 0,
last_updated TIMESTAMP WITH TIME ZONE DEFAULT NOW()
);
CREATE TABLE track_name (
id SERIAL,
name VARCHAR NOT NULL
);
CREATE TABLE tracklist
(
id SERIAL,
track_count INTEGER NOT NULL DEFAULT 0,
last_updated TIMESTAMP WITH TIME ZONE DEFAULT NOW()
);
Any suggestions?
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(1)
由于我的语法可能不完全正确,并且没有机会对其进行测试,因此接下来是我要实现的目标的书面描述:
我建立一个查询,返回您拥有的曲目列表。一旦我构建了这个查询,我就会检查这些曲目的曲目列表是否都相同。如果是,即查询中只有一个曲目列表,那么这就是您需要的曲目列表。如果查询中没有曲目列表,或者有多个曲目列表,则您拥有的曲目集不对应于任何单个现有曲目列表,因此您需要创建一个新的曲目列表。如果有必要,此查询不会处理实际的创建。我不确定它将如何处理退化情况 - 查询中根本没有轨道;或者没有列出任何曲目的曲目列表。
As I may not have the syntax exactly correct, and have had no opportunity to test it, a written description of what I am trying to achieve is next:
I build up a query returning the list of tracks that you have. Once I have built this query I am checking whether the tracklists for these tracks are all the same. If they are, ie there is only one tracklist in the query, then this is the tracklist you require. If there are no tracklists in the query, or there is more than one, then the set of tracks you have do not correspond to any single existing tracklist, so you need to create a new tracklist. This query does not deal with the actual creation, if it proves necessary. I am not sure how it will deal with degenerate cases - there are no tracks at all in the query; or there are no tracklists listed for any of the tracks.