What is the difference between a dataset and a database?

Posted on 2024-12-10 07:38:11

What is the difference between a dataset and a database? If they are different, how do they differ?

Why is huge data difficult to manage using databases today?

Please answer independently of any programming language.


Comments (4)

习ぎ惯性依靠 2024-12-17 07:38:11

In American English, "database" usually means "an organized collection of data". A database is usually under the control of a database management system, which is software that, among other things, manages multi-user access to the database. (Usually, but not necessarily. Some simple databases are just text files processed with interpreted languages like awk and Python.)

In the SQL world, which is what I'm most familiar with, a database includes things like tables, views, stored procedures, triggers, permissions, and data.

Again, in American English, "dataset" usually refers to data selected and arranged in rows and columns for processing by statistical software. The data might have come from a database, but it might not.
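The question asks for a language-independent answer, but a small sketch may make the distinction above more concrete. It assumes Python's built-in sqlite3 module as the database system; the table, view, and trigger names (patients, adults, audit_log, log_insert) and the sample rows are invented for illustration only. The database holds a table, a view, and a trigger together with the data; the dataset is just the rows and columns selected from it.

```python
import sqlite3


def build_database() -> sqlite3.Connection:
    """Create a tiny in-memory database: tables, a view, a trigger, and data."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE patients (
            id   INTEGER PRIMARY KEY,
            name TEXT    NOT NULL,
            age  INTEGER NOT NULL
        );
        CREATE TABLE audit_log (patient_id INTEGER, note TEXT);

        -- A view is part of the database, not of any one dataset.
        CREATE VIEW adults AS
            SELECT id, name, age FROM patients WHERE age >= 18;

        -- A trigger adds behavior: every insert is recorded.
        CREATE TRIGGER log_insert AFTER INSERT ON patients
        BEGIN
            INSERT INTO audit_log (patient_id, note) VALUES (NEW.id, 'inserted');
        END;
        """
    )
    conn.executemany(
        "INSERT INTO patients (name, age) VALUES (?, ?)",
        [("Ann", 34), ("Bob", 12), ("Cy", 51)],
    )
    conn.commit()
    return conn


def extract_dataset(conn: sqlite3.Connection):
    """Select data and arrange it as column names plus rows (a dataset)."""
    cur = conn.execute("SELECT name, age FROM adults ORDER BY age")
    columns = [d[0] for d in cur.description]
    rows = cur.fetchall()
    return columns, rows


if __name__ == "__main__":
    conn = build_database()
    columns, rows = extract_dataset(conn)
    print(columns)
    for row in rows:
        print(row)
```

The returned columns and rows could be handed to statistical software as they are; the view, the trigger, and the audit log stay behind in the database.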

东风软 2024-12-17 07:38:11

Database

The definitions of the two terms are not always clear. In general, a database is a set of data organized and accessed using a database management system (DBMS). Databases are usually, but not always, composed of several tables linked together, and they are often accessed, modified, and updated by various users, often simultaneously.

Cambridge dictionary:

A structured set of data held in a computer, especially one that is
accessible in various ways.

Merriam-Webster:

a usually large collection of data organized especially for rapid
search and retrieval (as by a computer)

Data set (or dataset)

A data set sometimes refers to the contents of a single database table, but this is quite a restrictive definition. In general, as the name suggests, it is a set (or collection) of data; hence there are datasets of images, such as the Caltech-256 Object Category Dataset, and of videos, e.g. A Large-Scale Benchmark Dataset for Event Recognition in Surveillance Video. A data set is usually designed for analysis rather than for continual updates from different users, and hence represents the end of a data collection or a snapshot at a specific time.

Oxford dictionary:

A collection of related sets of information that is composed of
separate elements but can be manipulated as a unit by a computer.

‘all hospitals must provide a standard data set of each patient's
details’

Cambridge dictionary:

a collection of separate sets of information that is treated as a
single unit by a computer
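The "snapshot at a specific time" point can be sketched as follows. This is only an illustration under assumptions: it uses Python's built-in sqlite3, csv, and io modules, and the readings table, its columns, and the values are hypothetical. The database keeps changing as it is updated, while a dataset exported earlier stays exactly as it was.

```python
import csv
import io
import sqlite3


def snapshot_as_csv(conn: sqlite3.Connection) -> str:
    """Export the current contents of the readings table as CSV text."""
    cur = conn.execute("SELECT sensor, reading FROM readings ORDER BY sensor")
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow([d[0] for d in cur.description])  # header row
    writer.writerows(cur)                             # data rows
    return buf.getvalue()


def demo():
    """Return (dataset taken before the updates, export taken after them)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE readings (sensor TEXT PRIMARY KEY, reading REAL)")
    conn.executemany(
        "INSERT INTO readings VALUES (?, ?)", [("a", 1.5), ("b", 2.0)]
    )

    # The dataset: a frozen copy of the data at this moment.
    dataset = snapshot_as_csv(conn)

    # The database keeps being modified by its users.
    conn.execute("UPDATE readings SET reading = 9.9 WHERE sensor = 'a'")
    conn.execute("INSERT INTO readings VALUES ('c', 3.25)")

    return dataset, snapshot_as_csv(conn)


if __name__ == "__main__":
    dataset, live = demo()
    print(dataset)
    print(live)
```

The first export is unaffected by the later UPDATE and INSERT, which is why such a file is suited to repeatable analysis, whereas the database itself is suited to ongoing, shared modification.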

两人的回忆 2024-12-17 07:38:11

A dataset is the data... usually in a table, though it can be XML or some other type of data. Either way it's only data... it doesn't really do anything.

And, as you know, a database is a container for the dataset, usually with built-in infrastructure around it for interacting with it.

Huge data isn't hard to manage for what I do. I guess you're asking a study-related question?

转身以后 2024-12-17 07:38:11

A dataset is just a set of data (it may be relevant to one person and not to others), whereas a database is a software/hardware component that organizes and stores data or datasets. In practice, the two are different things.

Huge data needs more infrastructure and components (hardware and software), or more computing power and storage, for data to be stored and retrieved efficiently. More data means more components, hence more difficulty. Modern databases provide good infrastructure for handling the processing of huge data (both reads and writes); see, for example, data lake management by Microsoft, which manages relational data or datasets extensively.
