SQLite3 optimization: store external file names in the database, or just have a huge number of rows?

Posted on 2024-11-09 07:26:18

I am a newbie with no comp sci background. So please forgive me for whatever dumb stuff I may say. I am working on a solar power monitoring project to monitor the power output of the solar power systems my company installs. I am writing a client that will query the inverter (for power output, voltage output, current output, system errors/faults, etc--which constitutes one "reading") of each of our monitoring customers every 15 minutes for as long as they have their system--which means roughly 35k readings per year per customer. So I was thinking of organizing my sqlite3 database in one of the two following ways.

(1) Have the database be two tables, one table with regular customer information (name, email, etc) and another much bigger table where each row represents one reading and includes the customer id and timestamp of the reading as identifiers. That means roughly 35k rows will be added to this bigger table per customer per year. (Data more than two years old will be pared down and archived.)

OR

(2) Store all readings in a CSV file (one CSV file per customer) and store the CSV file name in my table with the regular customer information.

This database will be serving a website (built on Rails, if that makes any difference between the options) where customers will be able to view their power output data. I want to minimize the amount of time it takes to load their output data on login. I basically don't have a clear idea of how long it would take my computer to open and read lines from a text file versus open a huge sqlite3 table, look up a customer id, and read in the data--and therefore am having trouble judging between the two options above. I'm also having trouble gauging the limits within which sqlite3 functions optimally, despite having done some reading about it (I don't think I have the background to understand what I read, because it seems to say hundreds of millions of rows are just fine, while other people's comments seem to say just the opposite). I am also open to a completely different option, as I'm not married to anything right now--whatever makes things load faster. Thanks so much in advance!
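For concreteness, option (1) might look roughly like the sketch below, using Python's built-in sqlite3 module. All table and column names here are made up for illustration; the composite primary key on (customer_id, taken_at) doubles as the index a per-customer lookup would need:

```python
import sqlite3

# Hypothetical schema for option (1): one customers table plus one much
# bigger readings table. All names here are invented for illustration.
conn = sqlite3.connect("solar_monitoring.db")
conn.executescript("""
CREATE TABLE IF NOT EXISTS customers (
    id    INTEGER PRIMARY KEY,
    name  TEXT NOT NULL,
    email TEXT
);

CREATE TABLE IF NOT EXISTS readings (
    customer_id INTEGER NOT NULL REFERENCES customers(id),
    taken_at    TEXT    NOT NULL,  -- ISO-8601 timestamp of the reading
    power_w     REAL,              -- power output
    voltage_v   REAL,              -- voltage output
    current_a   REAL,              -- current output
    fault_code  TEXT,              -- system errors/faults, if any
    PRIMARY KEY (customer_id, taken_at)
);
""")
conn.commit()
conn.close()
```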

Comments (3)

无人接听 2024-11-16 07:26:18

Storing the parsed data in sqlite would definitely be a timesaver if you're doing any kind of repeated data mining on it. CSV parsing overhead would almost instantly eat up any space/time savings you'd gain by skipping the database.

As for efficiency, you'd have to test it. There's no hard and fast rule that says "use this database" or "use that database". It's ALWAYS "depends on the scenario". SQLite may be perfect for you in this case, but useless for someone else with a slightly different workload.
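If you want to test it, a quick timing harness along these lines could work. It assumes the hypothetical schema sketched in the question and a per-customer file named readings_<id>.csv; both names are placeholders, and the point is only to measure both paths on your own data:

```python
import csv
import sqlite3
import time

CUSTOMER_ID = 42  # hypothetical customer id to benchmark

# Path 1: fetch one customer's readings from the SQLite table.
t0 = time.perf_counter()
conn = sqlite3.connect("solar_monitoring.db")
rows = conn.execute(
    "SELECT taken_at, power_w FROM readings WHERE customer_id = ?",
    (CUSTOMER_ID,),
).fetchall()
print(f"sqlite: {len(rows)} rows in {time.perf_counter() - t0:.4f}s")

# Path 2: parse the same data out of a per-customer CSV file.
t0 = time.perf_counter()
with open(f"readings_{CUSTOMER_ID}.csv", newline="") as f:
    rows = [(r["taken_at"], float(r["power_w"])) for r in csv.DictReader(f)]
print(f"csv:    {len(rows)} rows in {time.perf_counter() - t0:.4f}s")
```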

ゝ偶尔ゞ 2024-11-16 07:26:18

SQL applications in general do very well with large data sets, as long as the columns being queried are indexed. You should keep them in the same database. It will take far less time to fetch the data from the database than to parse CSV files. Databases are built for storing and retrieving data; CSV files are not.

I use MySQL databases with tens of millions of rows per table and queries return results in fractions of a second. SQLite might be faster.

Just make sure you create indexes for what you will be searching.
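For example, here is a minimal sketch of creating such an index and then checking that queries actually use it (the column names assume the hypothetical schema from the question; if customer_id and taken_at already form the primary key, the extra index is redundant):

```python
import sqlite3

conn = sqlite3.connect("solar_monitoring.db")

# Index the columns you filter and sort on.
conn.execute(
    "CREATE INDEX IF NOT EXISTS idx_readings_customer_time "
    "ON readings (customer_id, taken_at)"
)

# EXPLAIN QUERY PLAN reports whether SQLite will use an index: you want
# to see a 'SEARCH ... USING INDEX' line rather than a full table 'SCAN'.
for row in conn.execute(
    "EXPLAIN QUERY PLAN "
    "SELECT taken_at, power_w FROM readings "
    "WHERE customer_id = ? ORDER BY taken_at",
    (42,),
):
    print(row)
```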

始终不够爱げ你 2024-11-16 07:26:18

I would do option 1, but use a database server such as PostgreSQL instead of SQLite.

SQLite locks the whole database file on update, so you may run into locking issues if you read from and write to the table a lot. SQLite is better suited for single-user applications on the desktop or on a smartphone.

You can easily have millions of rows without it causing any problems.
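If you do stay on SQLite anyway, enabling write-ahead logging can soften the locking issue, since WAL mode lets readers proceed while one writer is active. A minimal sketch, assuming the same hypothetical database file as above:

```python
import sqlite3

# WAL mode is a standard SQLite feature; the setting persists in the
# database file, so it only needs to be enabled once.
conn = sqlite3.connect("solar_monitoring.db", timeout=10.0)
conn.execute("PRAGMA journal_mode=WAL")
conn.execute("PRAGMA synchronous=NORMAL")  # common, safe pairing with WAL
```

Note that WAL still allows only one writer at a time, so for a site with genuinely concurrent write loads the PostgreSQL suggestion above stands.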
