Converting a SQLite database to a triple store

Posted on 2024-10-14 02:25:18

Can somebody please describe the steps necessary to convert a SQLite database to a triple store?

Is there a tool that can accomplish the task?

Comments (2)

Oo萌小芽oO 2024-10-21 02:25:18

This is a more complicated question than it seemed when I asked it, but the simple answer is that you normalize your database completely. Once it is fully normalized, each table stands for a predicate: one column's values represent the subject and another column's values represent the object. On this basis you can convert an arbitrary SQL database to a triple store.
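
As a rough illustration of that idea, the sketch below reads a fully normalized two-column table from SQLite and emits one (subject, predicate, object) triple per row, using the table name as the predicate. The table works_for, its columns, and the helper binary_table_to_triples are hypothetical names made up for this example; they do not come from the answer above.

```python
import sqlite3


def binary_table_to_triples(conn, table, subject_col, object_col):
    """Yield (subject, predicate, object) triples from a two-column table.

    The table name is used as the predicate (an assumption of this sketch).
    """
    def quote(name):
        # Identifiers cannot be bound as SQL parameters, so quote them by hand.
        return '"' + name.replace('"', '""') + '"'

    sql = "SELECT {s}, {o} FROM {t}".format(
        s=quote(subject_col), o=quote(object_col), t=quote(table)
    )
    for subject, obj in conn.execute(sql):
        yield (subject, table, obj)


# Hypothetical example data: a normalized relation "person works_for company".
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE works_for (person TEXT, company TEXT)")
conn.executemany(
    "INSERT INTO works_for VALUES (?, ?)",
    [("alice", "acme"), ("bob", "initech")],
)
triples = list(binary_table_to_triples(conn, "works_for", "person", "company"))
```

A table with more than two columns would first have to be split into one such binary table per non-key column, which is what "normalize completely" amounts to here.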

层林尽染 2024-10-21 02:25:18

The function transform_to_triple converts the rows of a relational table into a triple format:

import logging

logger = logging.getLogger(__name__)


def transform_to_triple(source, db_name, table, result):
    """Serialize result (a list of row tuples read from table) as triples."""
    # Tuple ids are computed as table_id * max_records + row_index + 1, so they
    # stay unique only while each table has at most max_records rows.
    max_records = 100
    response = []

    def x_print(*parts):
        # Append one "(subject,(predicate:type),object)" line to the output.
        response.append("(%s)\n" % "".join(str(p) for p in parts))

    db_id = 1  # the database itself is always node 1

    x_print(db_id, ',(db_name:string),', db_name)
    logger.info("(%s,(db_name,string), %s)", db_id, db_name)

    tables = []
    table_list = [table]
    for i, _table in enumerate(table_list):
        _table_id = db_id + i + 1
        x_print(db_id, ',(rel:id),', _table_id)
        logger.info("(%s,(rel, id), %s)", db_id, _table_id)

        # Each schema entry is expected to start with (column_name, column_type).
        _schema = get_column_list(source, db_name, _table)
        tables.append((_table_id, _table, _schema))

    for _table_id, _table_name, _schema in tables:
        x_print(_table_id, ',(rel_name:string),', _table_name)
        # First link every row (tuple) to its table ...
        for j, row in enumerate(result):
            _tuple_id = _table_id * max_records + j + 1
            x_print(_table_id, ',(tuple:id),', _tuple_id)
            logger.info("(%s,(tuple, id), %s)", _table_id, _tuple_id)
        # ... then emit one triple per column value of each row.
        for j, row in enumerate(result):
            _tuple_id = _table_id * max_records + j + 1
            for k, value in enumerate(row):
                x_print(_tuple_id, ",(%s : %s)," % (_schema[k][0], _schema[k][1]), value)
    return "".join(response)

The get_column_list function returns the list of columns of a database table:

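def get_column_list(src_name, db_name, table_name):
    """Return the column list of table_name as rows from the source database."""
    # get_connect and get_module are helpers of the surrounding tool (not shown).
    cur = get_connect()  # cursor on the tool's own metadata database
    # Look up the connection details of the data source. Values are bound as
    # parameters rather than interpolated into the SQL; the %s placeholder style
    # is an assumption about the driver (sqlite3, for example, uses ? instead).
    query = '''select db_name, host, user_name, password from "DataSource" where src_name = %s and db_name = %s'''
    cur.execute(query, (src_name, db_name))
    data = cur.fetchall()
    (db, host, username, password) = data[0]
    _module = get_module(src_name)
    cursor = _module.get_connection(db, host, username, password)
    # COLUMN_LIST_QUERY may take two, one, or no format arguments depending on
    # the backend, so try each form in turn.
    try:
        _column_query = _module.COLUMN_LIST_QUERY % (db_name, table_name)
    except TypeError:
        try:
            _column_query = _module.COLUMN_LIST_QUERY % (table_name,)
        except TypeError:
            _column_query = _module.COLUMN_LIST_QUERY

    cursor.execute(_column_query)
    column_list = cursor.fetchall()
    return column_list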
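
For SQLite specifically, the same row-by-row idea can be sketched with only the standard sqlite3 module, since the schema is available through sqlite_master and PRAGMA table_info. This is not the code above, just a minimal self-contained variant under stated assumptions: the subject format "table/rowid", the predicate format "table.column", the helper name sqlite_to_triples, and the sample person table are all made up for illustration, NULL values are skipped, and tables declared WITHOUT ROWID are not handled.

```python
import sqlite3


def sqlite_to_triples(conn):
    """Return (subject, predicate, object) triples for every user table."""
    triples = []
    tables = [
        r[0]
        for r in conn.execute(
            "SELECT name FROM sqlite_master "
            "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' ORDER BY name"
        )
    ]
    for table in tables:
        quoted = '"' + table.replace('"', '""') + '"'
        # PRAGMA table_info rows are (cid, name, type, notnull, dflt_value, pk).
        columns = [r[1] for r in conn.execute("PRAGMA table_info(%s)" % quoted)]
        for row in conn.execute("SELECT rowid, * FROM %s ORDER BY rowid" % quoted):
            subject = "%s/%s" % (table, row[0])  # hypothetical subject naming
            for column, value in zip(columns, row[1:]):
                if value is not None:  # a NULL produces no triple
                    triples.append((subject, "%s.%s" % (table, column), value))
    return triples


# Hypothetical example data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (name TEXT, age INTEGER)")
conn.executemany("INSERT INTO person VALUES (?, ?)", [("alice", 30), ("bob", None)])
triples = sqlite_to_triples(conn)
```

Foreign keys are treated as plain values here; turning them into links between subjects would need the extra information from PRAGMA foreign_key_list.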