Data manipulation (gap fill, min, max) of CSV data in JavaScript?

Posted on 2024-11-29 07:42:04

I'm loading different indicator CSV files into JavaScript, for example:

CSV for population:

id,year,value
AF,1800,3280000
AF,1820,3280000
AF,1870,4207000
AG,1800,37000
AG,1851,37000
AG,1861,37000

For each indicator file I need to:

  • Gap fill missing years for each entity (id)
  • Find the time span for each entity
  • Find the min and max for each entity
  • Find the time span for the indicator
  • Find the min and max for the indicator

What is an inexpensive way of performing these operations? Alternatively, is there a good JavaScript library for performing these kinds of common data operations and storing the data efficiently in various object representations?

I'd like the final representation of the above file to look something like:

data = {
    population : {
        entities : {
            AF : {
                data : {
                    1800 : 3280000,
                    1801 : 3280000,
                },
                entity_meta : {
                    start : 1800,
                    end : 
                    min : 
                    max :
                }
            },
            [...]
        },
        indicator_meta : {
            start : 1700,
            end : 
            min : 
            max :
        }
    },
    [...]
}

Thanks!

Comments (4)

﹎☆浅夏丿初晴 2024-12-06 07:42:04

Let's assume that you have the CSV data in a 2D array:

var data = [["AF",1800,3280000],
["AF",1820,3280000],
["AF",1870,4207000],
["AG",1800,37000],
["AG",1851,37000],
["AG",1861,37000]];
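
If the data arrives as raw CSV text rather than as an array, something like the following could produce that 2D array. This is only a sketch: `csvToRows` is a made-up helper name, and it assumes every file has an `id,year,value` header, no quoted fields, and integer years and values, as in the sample.

```javascript
// Hypothetical helper (not from any library): turn CSV text with an
// "id,year,value" header into rows of [id, year, value].
function csvToRows(text) {
    var lines = text.split(/\r?\n/); // accept both \n and \r\n line endings
    var rows = [];
    // start at 1 to skip the header row
    for (var i = 1; i < lines.length; i++) {
        var line = lines[i].trim();
        if (line === "") continue; // ignore blank lines, e.g. after a trailing newline
        var parts = line.split(",");
        // assumption: year and value are always integers
        rows.push([parts[0], parseInt(parts[1], 10), parseInt(parts[2], 10)]);
    }
    return rows;
}

var sample = "id,year,value\nAF,1800,3280000\nAF,1820,3280000\nAG,1800,37000\n";
var rows = csvToRows(sample);
console.log(rows.length); // 3
console.log(rows[2]);     // the AG row: id "AG", year 1800, value 37000
```

Converting the numbers once at parse time means later comparisons for min and max work on numbers, not strings.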

For this example I will use jQuery's utility functions, as they make the job a lot easier without any real overhead.

// Loop through all the rows.
// If the id is not yet in entities, add it as a new property.
// If the property already exists, update its values.

var entities = {};
$.each(data, function (i, n) {
    var id = n[0], year = n[1], value = n[2];

    // set property
    if (!entities[id]) {
        entities[id] = {
            data : {},
            entity_meta : {
                start : year,
                end : year,
                min : value,
                max : value
            }
        };
        // a key taken from a variable has to be assigned separately;
        // n[1]: n[2] is not valid inside an object literal
        entities[id].data[year] = value;

    // update property
    } else {
        var meta = entities[id].entity_meta;

        // add new data property
        entities[id].data[year] = value;

        // if the property should change then update it
        if (meta.min > value) {
            meta.min = value;
        }
        // max, start and end are updated the same way
    }
});

That obviously isn't all the code, but it should clearly explain the approach to take.

Also note that your intended final object structure is overcomplicated; you should really use arrays where appropriate, especially for entities and data.
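
To illustrate what an array-based layout for the data might look like, here is a rough sketch. `toSeries` is a made-up name, the 1802 value is invented for the example, and it assumes the years have already been gap filled so that every year between the first and last is present.

```javascript
// Hypothetical helper: convert a year-keyed object into a dense array plus
// its first year, so that values[year - start] is the value for that year.
function toSeries(byYear) {
    var years = Object.keys(byYear).map(Number);
    var start = Math.min.apply(null, years);
    var end = Math.max.apply(null, years);
    var values = [];
    for (var y = start; y <= end; y++) {
        values.push(byYear[y]); // undefined if that year was never filled
    }
    return { start: start, end: end, values: values };
}

// 1800 and 1801 come from the question; the 1802 value is made up
var af = toSeries({ 1800: 3280000, 1801: 3280000, 1802: 3300000 });
console.log(af.values[1802 - af.start]);      // 3300000
console.log(Math.min.apply(null, af.values)); // 3280000
console.log(Math.max.apply(null, af.values)); // 3300000
```

With this layout the time span is just `start` and `values.length`, and min and max can be read off the array with `Math.min` and `Math.max` instead of being tracked by hand.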

甜妞爱困 2024-12-06 07:42:04

Use jQuery AJAX to get the CSV file.

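$.get("test_csv.csv", function(result){
    var data = csvParseAndCalc(result);
    // use data here; the request is asynchronous, so it is only available inside this callback
});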

Use plain JavaScript to parse the CSV and perform the calculations:

// Assumes all of your data looks like your sample.
// Proper (to spec) CSV parsing is skipped in favor of speed.
function csvParseAndCalc(result) {
    var entities = {};
    var indicator_meta = {"start":null, "end":null, "min":null, "max":null};
    var rows = result.split(/\r?\n/); // your data doesn't need proper (to spec) CSV parsing
    // run calculations, ignoring the header row
    for(var i=1; i<rows.length; i++) {
        if(rows[i] === "") continue; // skip blank lines, e.g. after a trailing newline
        var r = rows[i].split(',');
        var id = r[0];
        var yr = parseInt(r[1], 10);
        var val = parseInt(r[2], 10);
        var entity = entities[id];
        var edata;
        var emeta;
        // create entity if it doesn't exist
        if(entity == null) {
            entities[id] = { "data": {}, "entity_meta": {"start":null, "end":null, "min":null, "max":null} };
            entity = entities[id];
        }
        // entity data
        edata = entity.data;
        edata[yr] = val;
        // entity meta
        emeta = entity.entity_meta;
        if(emeta.start == null || emeta.start > yr) emeta.start = yr;
        if(emeta.end == null || emeta.end < yr) emeta.end = yr;
        if(emeta.min == null || emeta.min > val) emeta.min = val;
        if(emeta.max == null || emeta.max < val) emeta.max = val;
        // calc indicator_meta
        if(indicator_meta.start==null || indicator_meta.start > yr)
            indicator_meta.start = yr;
        if(indicator_meta.end==null || indicator_meta.end < yr)
            indicator_meta.end = yr;
        if(indicator_meta.min==null || indicator_meta.min > val)
            indicator_meta.min = val;
        if(indicator_meta.max==null || indicator_meta.max < val)
            indicator_meta.max = val;
    }
    // fill gaps on entity data by carrying the previous year's value forward
    for(var id in entities) {
        var entity = entities[id];
        var emeta = entity.entity_meta;
        var edata = entity.data;
        for(var i=emeta.start + 1; i<emeta.end; i++) {
            if(edata[i] == null) edata[i] = edata[i-1];
        }
    }
    return {"population": {"entities":entities, "indicator_meta":indicator_meta} };
}
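
To see what the gap-fill loop does on its own, here it is pulled out into a small standalone function and run on the AF rows from the question. `fillGaps` is just an illustrative name, and carrying the last known value forward is one possible reading of "gap fill"; the question doesn't say how missing years should be filled.

```javascript
// Hypothetical standalone version of the gap-fill step: every missing year
// between start and end receives the value of the year before it.
function fillGaps(byYear, start, end) {
    for (var y = start + 1; y < end; y++) {
        if (byYear[y] == null) byYear[y] = byYear[y - 1];
    }
    return byYear;
}

// AF rows from the sample CSV
var af = fillGaps({ 1800: 3280000, 1820: 3280000, 1870: 4207000 }, 1800, 1870);
console.log(Object.keys(af).length); // 71 (every year from 1800 to 1870)
console.log(af[1869]);               // 3280000, carried forward from 1820
console.log(af[1870]);               // 4207000, the original last value
```

Because the loop walks the years in ascending order, a freshly filled year can itself serve as the source for the next missing one.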

坏尐絯℡ 2024-12-06 07:42:04

Maybe, YUI would be helpful for some bulk operations. http://yuilibrary.com/yui/docs/dataschema/dataschema-text.html

旧伤慢歌 2024-12-06 07:42:04

There are JavaScript libraries that work like SQL databases. TaffyDB comes to mind.
