What is the difference between a dataset and a database?
What is the difference between a dataset and a database? If they are different, how do they differ?
Why is huge data difficult to manage using databases today?
Please answer independently of any programming language.
Comments (4)
In American English, database usually means "an organized collection of data". A database is usually under the control of a database management system, which is software that, among other things, manages multi-user access to the database. (Usually, but not necessarily. Some simple databases are just text files processed with interpreted languages like awk and Python.)
In the SQL world, which is what I'm most familiar with, a database includes things like tables, views, stored procedures, triggers, permissions, and data.
Again, in American English, dataset usually refers to data selected and arranged in rows and columns for processing by statistical software. The data might have come from a database, but it might not.
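To make the distinction concrete, here is a minimal sketch using Python's built-in `sqlite3` module. The table names, view name, and sample rows are all hypothetical, chosen only for illustration. The database holds tables and a view, while the dataset is just the rows and columns selected out of it for analysis.

```python
import sqlite3


def build_database():
    """Create a tiny in-memory database: two tables plus a view (all hypothetical)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE customers (
            id INTEGER PRIMARY KEY,
            name TEXT,
            country TEXT
        );
        CREATE TABLE orders (
            id INTEGER PRIMARY KEY,
            customer_id INTEGER REFERENCES customers(id),
            amount REAL
        );
        CREATE VIEW order_totals AS
            SELECT c.name AS name, c.country AS country, SUM(o.amount) AS total
            FROM customers c
            JOIN orders o ON o.customer_id = c.id
            GROUP BY c.id;
        INSERT INTO customers VALUES (1, 'Ada', 'UK'), (2, 'Grace', 'US');
        INSERT INTO orders VALUES (1, 1, 10.0), (2, 1, 5.5), (3, 2, 20.0);
        """
    )
    return conn


def extract_dataset(conn):
    """Select rows and columns from the database; the result is the dataset."""
    cursor = conn.execute(
        "SELECT name, country, total FROM order_totals ORDER BY name"
    )
    return cursor.fetchall()


conn = build_database()
dataset = extract_dataset(conn)
print(dataset)
```

The `conn` object stands for the database (structure, relationships, and the engine that manages access), while `dataset` is a plain list of rows that could be handed to statistical software.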
Database
The definitions of the two terms are not always clear. In general, a database is a set of data organized and accessed using a database management system (DBMS). Databases are usually, but not always, composed of several linked tables that are accessed, modified, and updated by various users, often simultaneously.
Cambridge dictionary:
Merriam-Webster
Data set (or dataset)
A data set sometimes refers to the contents of a single database table, but this is quite a restrictive definition. In general, as the name suggests, it is a set (or collection) of data; hence there are datasets of images, like the Caltech-256 Object Category Dataset, or of videos, e.g. A large-scale benchmark dataset for event recognition in surveillance video. A data set is usually designed for analysis rather than for continual updates by different users, and hence represents the end result of a data collection or a snapshot at a specific time.
Oxford dictionary:
Cambridge dictionary
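The "snapshot" idea can be illustrated with a small sketch, again using Python's built-in `sqlite3` module. The `readings` table and its values are hypothetical. The dataset is extracted once and stays fixed, while the database keeps changing underneath it.

```python
import sqlite3

# A live database that users keep updating (hypothetical table and values).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE readings (sensor TEXT, value REAL)")
conn.executemany(
    "INSERT INTO readings VALUES (?, ?)",
    [("a", 1.0), ("b", 2.0)],
)

query = "SELECT sensor, value FROM readings ORDER BY sensor"

# The dataset: a snapshot of the table at this moment, taken for analysis.
snapshot = conn.execute(query).fetchall()

# The database continues to be modified after the snapshot was taken.
conn.execute("UPDATE readings SET value = 9.0 WHERE sensor = 'a'")
conn.execute("INSERT INTO readings VALUES ('c', 3.0)")

# Querying again reflects the updates; the snapshot does not.
live = conn.execute(query).fetchall()

print("snapshot:", snapshot)
print("live:    ", live)
```

After the update, `live` contains the changed and added rows, while `snapshot` still holds the two original rows, which is the sense in which a dataset is frozen at a specific time.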
A dataset is the data itself, usually in a table, though it can also be XML or another format. It's only data; it doesn't really do anything.
And as you know, a database is a container for datasets, usually with built-in infrastructure around it for interacting with the data.
Huge data isn't hard to manage for what I do. I guess you're asking a study-related question?
A dataset is just a set of data (which may be relevant to some people and not to others), whereas a database is a software/hardware component that organizes and stores data or datasets. In practice, they are different things.
Huge data needs more infrastructure and components (hardware and software), that is, more computing power and storage, to store and retrieve data efficiently. More data means more components, and hence more difficulty. Modern databases provide good infrastructure for processing huge volumes of data (both reads and writes); see, for example, Microsoft's data lake management, which handles relational data and datasets extensively.