Adding conditionals and functions to a math parser

Posted on 2024-09-10 14:33:27

I have built a binary-tree-based mathematical expression parser, which works great for 'normal' math, like: (3.5 * 2) ^ 1 / (1 << 6). However, I would like to expand it a little to add a ternary selection operator, mirroring the one from C: {expr} ? {true-expr} : {false-expr}. I would also like to add functions, like sin(x) or ave(...).

However, I have no clue how to handle this (due to the way the evaluation works), nor can I find anything on the web that covers it, at least in a non-grammar-based way (I'd like to avoid grammar parser generators for this, if possible).

My parser currently works by reading an infix expression and immediately converting it to a tree; the tree can then be evaluated, i.e. it's your bog-standard expression tree.

Currently my evaluator looks like this:

struct Node
{
    int nType;
    union
    {
        unsigned long dwOperator;
        BOOL bValue;
        int nValue; //for indices, args & functions
        number_t fValue;
        char* szValue; //for string literals to pass to functions
    };

    Node* pLeft;
    Node* pRight;
};

number_t EvaluateTree(Node* pNode)
{
    if(pNode == NULL)
        return 0.0f;

    int nType = pNode->nType;
    if(nType == TOKEN_OPERATOR)
    {
        number_t fLeft = EvaluateTree(pNode->pLeft);
        number_t fRight = EvaluateTree(pNode->pRight);
        switch(pNode->dwOperator)
        {
            case '+': return fLeft + fRight;
            case '-': return fLeft - fRight;
            case '*': return fLeft * fRight;
            case '/': return fLeft / fRight;
            case '^': return pow(fLeft,fRight);
            case '_': return pow(fLeft,1.0f/fRight); 
            case '%': return fmod(fLeft,fRight);

            //case '?': return bSelect = ?;
            //case ':': return (bSelect) ? fLeft : fRight;

            //case '>': return fLeft > fRight;
            //case '<': return fLeft < fRight;
            //case '>=': return fLeft >= fRight;
            //case '<=': return fLeft <= fRight;
            //case '==': return fLeft == fRight;
            //case '!=': return fLeft != fRight;
            //case '||': return fLeft || fRight;
            //case '&&': return fLeft && fRight;

            case '&': return static_cast<number_t>(static_cast<unsigned long>(fLeft) & static_cast<unsigned long>(fRight));
            case '|': return static_cast<number_t>(static_cast<unsigned long>(fLeft) | static_cast<unsigned long>(fRight));
            case '~': return static_cast<number_t>(~static_cast<unsigned long>(fRight));
            case '>>': return static_cast<number_t>(static_cast<unsigned long>(fLeft) >> static_cast<unsigned long>(fRight));
            case '<<': return static_cast<number_t>(static_cast<unsigned long>(fLeft) << static_cast<unsigned long>(fRight));

            default:  
                {
                    printf("ERROR: Invalid Operator Found\n");
                    return 0.0f;
                }
        }
    }
    else if(nType == TOKEN_NUMBER)
        return pNode->fValue;
    else if(nType == TOKEN_CALL)
        return CreateCall(pNode); //not implemented
    else if(nType == TOKEN_GLOBAL)
        return GetGlobal(pNode);
    else if(nType == TOKEN_ARGUMENT)
        return GetArgument(pNode);
    else if(nType == TOKEN_STRING)
        return 0.0f;

    return 0.0f;
}

Any tips/pointers/advice or useful links on how I can accomplish this?


A small set of examples (as requested):

What I already have working

Input: 2 * (3 ^ 1.5) - 4 / (1 << 3)

Output: In-Order: 2.0 * 3.0 ^ 1.5 - 4.0 / 1.0 << 3.0

Pre-Order: - * 2.0 ^ 3.0 1.5 / 4.0 << 1.0 3.0

Post-Order: 2.0 3.0 1.5 ^ * 4.0 1.0 3.0 << / -

Result: 9.892304

What I want to add

Input: (GetDay() == 31) ? -15.5 : 8.4

Output: 8.4

Output on the 31st: -15.5

Input: max([0],20) (where [0] denotes argument 0, and [0] = 35)

Output: 35

Input: (GetField('employees','years_of_service',[0]) >= 10) ? 0.15 : 0.07 (where [0] is argument 0, and [0] is set to a valid index)

Output (if years_of_service for the employee is at least 10): 0.15

Otherwise, output: 0.07

It's basically math with some C-inspired additions, except that arguments aren't passed by name but by index, and strings are delimited by single quotes instead of double quotes.

Once I have the final bit done, I'm hoping to either bytecode-compile or JIT it, as I'm planning to use this for things like games or math-reliant programs, where the input set data is constant, but the input set can change. It will be used frequently, so it needs to be 'fast', and it needs to be usable by non-programmers.

Comments (2)

此生挚爱伱 2024-09-17 14:33:27

The correct thing to do for ? and : depends on the tree produced by the parser. I will pretend the parser generates a tree like this:

      ?
  b       :
        t   f

First, you must not evaluate the subtrees before the switch, because a ternary should only evaluate the branch it selects. In most places you change something like

fLeft + fRight;

into

EvaluateTree(pNode->pLeft) + EvaluateTree(pNode->pRight);

with + replaced by each of the various operators.

For ?: you do this:

case ':': return 0.0f; /* this is an error in the parse tree */
case '?': if (!(pNode && pNode->pLeft && pNode->pRight &&
                pNode->pRight->pLeft && pNode->pRight->pRight))
             /* another error in the parse tree */
             return 0.0f;
          return EvaluateBool(pNode->pLeft) ?
                   EvaluateTree(pNode->pRight->pLeft) :
                   EvaluateTree(pNode->pRight->pRight) ;
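To see the two-node layout working end to end, here is a minimal sketch. It assumes a simplified Node (no union) and made-up token values, and it inlines the truth test instead of calling EvaluateBool; only the shape of the '?' case is meant to carry over to the real evaluator.

```cpp
#include <cstddef>

// Simplified stand-ins for the question's types; the token values are assumptions.
typedef double number_t;
enum { TOKEN_NUMBER, TOKEN_OPERATOR };

struct Node
{
    int nType;
    unsigned long dwOperator; // meaningful when nType == TOKEN_OPERATOR
    number_t fValue;          // meaningful when nType == TOKEN_NUMBER
    Node* pLeft;
    Node* pRight;
};

number_t EvaluateTree(Node* pNode)
{
    if (pNode == NULL)
        return 0.0;
    if (pNode->nType == TOKEN_NUMBER)
        return pNode->fValue;

    switch (pNode->dwOperator)
    {
        // Operands are evaluated inside each case, not before the switch.
        case '+': return EvaluateTree(pNode->pLeft) + EvaluateTree(pNode->pRight);
        case '/': return EvaluateTree(pNode->pLeft) / EvaluateTree(pNode->pRight);

        case '?':
        {
            Node* pChoice = pNode->pRight; // expected to be the ':' node
            if (pChoice == NULL || pChoice->nType != TOKEN_OPERATOR ||
                pChoice->dwOperator != ':')
                return 0.0; // malformed tree
            // Only the selected branch is evaluated.
            return (EvaluateTree(pNode->pLeft) != 0.0)
                 ? EvaluateTree(pChoice->pLeft)
                 : EvaluateTree(pChoice->pRight);
        }

        default:
            return 0.0; // includes a stray ':' with no '?' above it
    }
}
```

With a condition node holding 1.0 and branches -15.5 and 8.4, the '?' node evaluates to -15.5; with a condition of 0.0 it evaluates to 8.4.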

For a definition of EvaluateBool you have a couple of choices. The C way is more or less

BOOL EvaluateBool(Node* pNode)
{
    return (EvaluateTree(pNode) == 0.0) ? FALSE : TRUE;
}

Then you need definitions for '<' and friends that return 0.0 for false, and anything else for true. The value -1 is a very popular true value, though generally for storing bools in ints.
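A small sketch of such comparison definitions follows. The Op enum and ApplyComparison are hypothetical names, and 1.0 is chosen as the true value purely for illustration. An enum is used for the operator codes because multi-character literals such as '>=' have implementation-defined values in C++, which makes them fragile as case labels.

```cpp
typedef double number_t;

// Hypothetical operator codes for the two-character comparison operators.
enum Op { OP_LT, OP_GT, OP_LE, OP_GE, OP_EQ, OP_NE };

// Returns 1.0 for true and 0.0 for false, so the result can be fed
// straight into EvaluateBool-style "!= 0.0" tests.
number_t ApplyComparison(Op op, number_t fLeft, number_t fRight)
{
    bool bResult = false;
    switch (op)
    {
        case OP_LT: bResult = fLeft <  fRight; break;
        case OP_GT: bResult = fLeft >  fRight; break;
        case OP_LE: bResult = fLeft <= fRight; break;
        case OP_GE: bResult = fLeft >= fRight; break;
        case OP_EQ: bResult = fLeft == fRight; break;
        case OP_NE: bResult = fLeft != fRight; break;
    }
    return bResult ? 1.0 : 0.0;
}
```

Inside the evaluator this would be called with the two evaluated operands, for example ApplyComparison(OP_GE, EvaluateTree(pNode->pLeft), EvaluateTree(pNode->pRight)).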

The more structured way is to move all the operators like '<' that return booleans into the body of EvaluateBool, and make it work more-or-less like EvaluateTree does.
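One possible shape for that structured variant is sketched below, again with a simplified Node and hypothetical operator codes (OP_LT, OP_GE, OP_AND, OP_OR). EvaluateBool owns the boolean operators and falls back to a non-zero test for everything else, while EvaluateTree delegates boolean operators back to it.

```cpp
#include <cstddef>

typedef double number_t;
enum { TOKEN_NUMBER, TOKEN_OPERATOR };
// Hypothetical codes, placed above the char range so they cannot collide
// with single-character operator codes such as '+'.
enum { OP_LT = 256, OP_GE, OP_AND, OP_OR };

struct Node
{
    int nType;
    unsigned long dwOperator;
    number_t fValue;
    Node* pLeft;
    Node* pRight;
};

number_t EvaluateTree(Node* pNode); // defined below

bool EvaluateBool(Node* pNode)
{
    if (pNode == NULL)
        return false;
    if (pNode->nType == TOKEN_OPERATOR)
    {
        switch (pNode->dwOperator)
        {
            case OP_LT:  return EvaluateTree(pNode->pLeft) <  EvaluateTree(pNode->pRight);
            case OP_GE:  return EvaluateTree(pNode->pLeft) >= EvaluateTree(pNode->pRight);
            // && and || short-circuit, just as they do in C.
            case OP_AND: return EvaluateBool(pNode->pLeft) && EvaluateBool(pNode->pRight);
            case OP_OR:  return EvaluateBool(pNode->pLeft) || EvaluateBool(pNode->pRight);
            default:     break; // arithmetic operator: fall through to the numeric test
        }
    }
    return EvaluateTree(pNode) != 0.0;
}

number_t EvaluateTree(Node* pNode)
{
    if (pNode == NULL)
        return 0.0;
    if (pNode->nType == TOKEN_NUMBER)
        return pNode->fValue;
    switch (pNode->dwOperator)
    {
        case '+': return EvaluateTree(pNode->pLeft) + EvaluateTree(pNode->pRight);
        // Boolean operators used in a numeric context become 1.0 or 0.0.
        case OP_LT: case OP_GE: case OP_AND: case OP_OR:
            return EvaluateBool(pNode) ? 1.0 : 0.0;
        default:  return 0.0;
    }
}
```

The two functions call each other, but every boolean operator is resolved inside EvaluateBool and every arithmetic operator inside EvaluateTree, so the recursion always moves down the tree.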

Finally, instead of making the ternary operator ?: use two nodes, you could also change the definition of the node (and the parser) to have up to three sub trees, then most operators would have two trees, but ?: would have three. Maybe something like

case '?': return EvaluateBool(pNode->pLeft) ?
                   EvaluateTree(pNode->pMiddle) : 
                   EvaluateTree(pNode->pRight) ;

But then you'll have to rewrite your pre-order, in-order, post-order tree traversals.
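As a rough illustration of what that change involves, the sketch below adds a pMiddle pointer to a simplified Node and shows one traversal (post-order) extended to visit it. The PostOrder function and the visited vector are assumptions made for the example, not part of the original code.

```cpp
#include <cstddef>
#include <vector>

typedef double number_t;
enum { TOKEN_NUMBER, TOKEN_OPERATOR };

struct Node
{
    int nType;
    unsigned long dwOperator;
    number_t fValue;
    Node* pLeft;
    Node* pMiddle; // used only by '?'; NULL for unary and binary operators
    Node* pRight;
};

number_t EvaluateTree(const Node* pNode)
{
    if (pNode == NULL)
        return 0.0;
    if (pNode->nType == TOKEN_NUMBER)
        return pNode->fValue;
    switch (pNode->dwOperator)
    {
        case '-': return EvaluateTree(pNode->pLeft) - EvaluateTree(pNode->pRight);
        case '?': // condition in pLeft, true branch in pMiddle, false branch in pRight
            return (EvaluateTree(pNode->pLeft) != 0.0)
                 ? EvaluateTree(pNode->pMiddle)
                 : EvaluateTree(pNode->pRight);
        default:  return 0.0;
    }
}

// Post-order traversal that also descends into the middle child.
void PostOrder(const Node* pNode, std::vector<const Node*>& visited)
{
    if (pNode == NULL)
        return;
    PostOrder(pNode->pLeft, visited);
    PostOrder(pNode->pMiddle, visited);
    PostOrder(pNode->pRight, visited);
    visited.push_back(pNode);
}
```

Pre-order and in-order need the same extra step; for in-order you also have to decide where the operator is emitted relative to three children.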

Second part, functions. One way you could do it is to store the name of the function in szValue. Another is to have a bunch of different values for nType depending on the function. You will have to pick some rule in the parser, and use it here in the interpreter. You could do something like this:

else if(nType == TOKEN_CALL)
    return EvaluateFunc(pNode);

Then EvaluateFunc could look something like

#include <string.h> /* strcmp */

number_t EvaluateFunc(Node* pNode)
{
    /* a call node must carry the function name in szValue */
    if ((pNode == NULL) || (pNode->szValue == NULL))
        return 0.0f;
    /* compare the stored name against each known function */
    if (0 == strcmp("cos", pNode->szValue))
        return my_cos(EvaluateTree(pNode->pLeft));
    else if (0 == strcmp("gcd", pNode->szValue))
        return my_gcd(EvaluateTree(pNode->pLeft),
                      EvaluateTree(pNode->pRight));
    /* etc */
    else /* unknown function */ return 0.0f;
}
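As the list of functions grows, the chain of strcmp calls can be swapped for a lookup table. This is a different technique from the one shown above, sketched under a few assumptions: FuncEntry, g_Functions, FnCos, FnMax and FnMin are hypothetical names, the Node is simplified, and every function takes at most two arguments held in pLeft and pRight.

```cpp
#include <cmath>
#include <cstddef>
#include <cstring>

typedef double number_t;
enum { TOKEN_NUMBER, TOKEN_CALL };

struct Node
{
    int nType;
    number_t fValue;
    const char* szValue; // function name when nType == TOKEN_CALL
    Node* pLeft;         // first argument (or NULL)
    Node* pRight;        // second argument (or NULL)
};

typedef number_t (*Func2)(number_t, number_t);

// One-argument functions simply ignore the second parameter.
static number_t FnCos(number_t a, number_t)   { return std::cos(a); }
static number_t FnMax(number_t a, number_t b) { return a > b ? a : b; }
static number_t FnMin(number_t a, number_t b) { return a < b ? a : b; }

struct FuncEntry { const char* szName; Func2 pfn; };

static const FuncEntry g_Functions[] =
{
    { "cos", FnCos },
    { "max", FnMax },
    { "min", FnMin },
};

number_t EvaluateTree(Node* pNode); // defined below

number_t EvaluateFunc(Node* pNode)
{
    if (pNode == NULL || pNode->szValue == NULL)
        return 0.0;
    const std::size_t nCount = sizeof(g_Functions) / sizeof(g_Functions[0]);
    for (std::size_t i = 0; i < nCount; ++i)
    {
        if (std::strcmp(g_Functions[i].szName, pNode->szValue) == 0)
            return g_Functions[i].pfn(EvaluateTree(pNode->pLeft),
                                      EvaluateTree(pNode->pRight));
    }
    return 0.0; // unknown function
}

number_t EvaluateTree(Node* pNode)
{
    if (pNode == NULL)
        return 0.0;
    if (pNode->nType == TOKEN_NUMBER)
        return pNode->fValue;
    if (pNode->nType == TOKEN_CALL)
        return EvaluateFunc(pNode);
    return 0.0;
}
```

Adding a function then means adding one table row. Since speed matters here, the parser could also resolve the name to a table index once and store it in nValue, so evaluation skips the string comparisons entirely.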

Looks like a fun project, enjoy!

简单气质女生网名 2024-09-17 14:33:27

I think you should change your "Node" struct to have an array of children, instead of "pLeft" and "pRight". A function like sin() has one argument/child. The conditional (ternary) operator has three arguments/children.
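A minimal sketch of that idea, assuming a NodeKind enum and a std::vector of non-owning child pointers (both made up for the example; the real struct would keep its union and token types):

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

typedef double number_t;

// Hypothetical node kinds, one per arity shown here.
enum NodeKind { KIND_NUMBER, KIND_SIN, KIND_ADD, KIND_SELECT };

struct Node
{
    NodeKind kind;
    number_t fValue;             // used by KIND_NUMBER
    std::vector<Node*> children; // non-owning; 0, 1, 2 or 3 entries
};

number_t EvaluateTree(const Node* pNode)
{
    if (pNode == NULL)
        return 0.0;
    const std::vector<Node*>& c = pNode->children;
    switch (pNode->kind)
    {
        case KIND_NUMBER:
            return pNode->fValue;
        case KIND_SIN:    // one child
            return c.size() == 1 ? std::sin(EvaluateTree(c[0])) : 0.0;
        case KIND_ADD:    // two children
            return c.size() == 2 ? EvaluateTree(c[0]) + EvaluateTree(c[1]) : 0.0;
        case KIND_SELECT: // three children: condition, true branch, false branch
            if (c.size() != 3)
                return 0.0;
            return EvaluateTree(c[0]) != 0.0 ? EvaluateTree(c[1])
                                             : EvaluateTree(c[2]);
    }
    return 0.0;
}
```

The same layout also covers variadic functions such as ave(...), since the evaluator can loop over however many children the call node has.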
