
Easily Binding C++ Functions to a Dynamic Runtime

  • Writer: Eldan Ben Haim
  • Sep 13, 2021
  • 6 min read

Updated: Sep 16, 2021

Bracez is a little JSON editor for Mac that I've been working on for the past couple of years (most of the time I've been taking breaks from it). One day both it and I will feel we're complete enough to publish it to the huge crowd that sits there, anticipating -- finally -- a JSON editor. Until that day comes, it serves as a nice petri dish to experiment on and to practice my C++ and Objective-C -- both of which I don't get to write often at work these days.


Now, I have a healthy appetite for creating mini programming languages, runtime environments and parsers. So imagine how happy I was when the Bracez Corp. head of PM told me, the Resident Programmer, that the market has spoken and clearly stated that not only are we going to redefine the entire development market by introducing a JSON editor, we're also going to be the first team that writes yet another JSON Path implementation.


Since I know better than to hand-code a parser, I searched for a smallish C++ parser combinator library, and soon enough found Keean Schupke's aptly named Parser-Combinators library. It took a couple of evenings before I had a working JSON path parser and an executable AST model. The library is really fun to work with, doing exactly what you expect it to do. The only thing I left out was support for invoking functions in JSON path expressions -- for example $.items[(sin(3.14))] (you gotta have support for trigonometric functions in a JSON path!)


An executable AST abstraction will typically be declared like so (pardon my const-ness):


class JsonPathExpressionNode {
public:
    virtual ~JsonPathExpressionNode() {}

    virtual JsonPathExpressionNodeEvalResult evaluate(JsonPathExpressionNodeEvalContext &context) = 0;
};

Where evaluate() is the execution workhorse. Function invocations would be implemented as a node class that holds the sub-expressions for arguments, and some representation of the function itself. This representation can be, for example, another abstract class with an invoke() function -- receiving the results of the sub-expressions as the set of parameters and returning the evaluation result. For our JSON path thingie, this looks something along these lines:



class JsonPathExpressionFunction {
public:
    // Receives the already-evaluated arguments and computes the function result.
    virtual JsonPathExpressionNodeEvalResult invoke(const std::list<JsonPathExpressionNodeEvalResult> &args) = 0;

    // I felt really smart naming this function.
    virtual int arity() = 0;

    // Registry the parser uses to look a function up by its name.
    static std::map<std::string, JsonPathExpressionFunction*> functions;
};

class JsonPathExpressionNodeFunctionInvoke: public JsonPathExpressionNode {
public:
    JsonPathExpressionNodeFunctionInvoke(JsonPathExpressionFunction *fn, std::list<std::unique_ptr<JsonPathExpressionNode>> &&args);

    virtual JsonPathExpressionNodeEvalResult evaluate(JsonPathExpressionNodeEvalContext &context);

private:
    JsonPathExpressionFunction *_fn;
    std::list<std::unique_ptr<JsonPathExpressionNode>> _args;
};

JsonPathExpressionNodeEvalResult JsonPathExpressionNodeFunctionInvoke::evaluate(JsonPathExpressionNodeEvalContext &context) {
    std::list<JsonPathExpressionNodeEvalResult> evaledArgs;

    // Evaluate every argument sub-expression, in order, before invoking the function.
    std::transform(_args.begin(), _args.end(), std::back_inserter(evaledArgs), [&context](std::unique_ptr<JsonPathExpressionNode>& arg) {
        return arg->evaluate(context);
    });

    return _fn->invoke(evaledArgs);
}

The code above is pretty much self-explanatory, I guess. Now we only need to implement a bunch of JsonPathExpressionFunction derivatives, write the logic to compute the function result, and register each instance in the static JsonPathExpressionFunction::functions map, so that when we actually parse a function invocation we can look up the function by its name.
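The lookup step isn't shown in this post, so here is a minimal sketch of what it could look like. Function, CosFunction, functionRegistry and resolveFunction are all hypothetical stand-ins made up for this illustration (the real code uses JsonPathExpressionFunction and its static functions map), and the arity check at parse time is an assumption about where such a check might live.

```cpp
#include <map>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for JsonPathExpressionFunction; only arity() is
// kept so the sketch stays short.
struct Function {
    virtual ~Function() = default;
    virtual int arity() const = 0;
};

struct CosFunction : Function {
    int arity() const override { return 1; }
};

// Stand-in for the static JsonPathExpressionFunction::functions map.
inline std::map<std::string, Function *> &functionRegistry() {
    static CosFunction cosFunction;
    static std::map<std::string, Function *> registry{{"cos", &cosFunction}};
    return registry;
}

// Hypothetical parser helper: resolve a function by name and check the
// argument count before an invocation node gets built.
inline Function *resolveFunction(const std::string &name, int argCount) {
    auto it = functionRegistry().find(name);
    if (it == functionRegistry().end()) {
        throw std::runtime_error("Unknown function: " + name);
    }
    if (it->second->arity() != argCount) {
        throw std::runtime_error("Wrong number of arguments for: " + name);
    }
    return it->second;
}
```

Failing at parse time means a typo in a function name surfaces before any JSON is traversed.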


For example, consider the ever-useful cos(x) function. An implementation for that could look like the following:


class JsonPathCosFunction : public JsonPathExpressionFunction {
public:
    virtual JsonPathExpressionNodeEvalResult invoke(const std::list<JsonPathExpressionNodeEvalResult> &args) {
        double arg = args.front().getNumericValue();
        return JsonPathExpressionNodeEvalResult::doubleResult(cos(arg));
    }

    virtual int arity() { return 1; }
} cosFunction;


That's quite a low signal-to-noise ratio we got there, isn't it? Wouldn't it be nice if we didn't need all this boilerplate code? Specifically, it would be cool if we could create a JsonPathExpressionFunction-derived object that receives a function (e.g., a std::function<>) as a parameter, and auto-magically implements invoke() to (a) extract the arguments from the sequence of sub-expression evaluation results, (b) convert them to the types of the corresponding parameters in the target function, and (c) invoke the target function.


Turns out that with a little bit of modern C++ and template-fu we can do just that. First, let's start with defining a set of template functions that take a target data type and try to convert a JsonPathExpressionNodeEvalResult to that type:


template <class T>
inline T convertToArgType(const JsonPathExpressionNodeEvalResult &r) {
    throw JsonPathEvalError("Cannot convert parameter");
}


template<>
inline const JsonPathExpressionNodeEvalResult &convertToArgType<const JsonPathExpressionNodeEvalResult&>(const JsonPathExpressionNodeEvalResult &r) {
    return r;
}


template<>
inline double convertToArgType<double>(const JsonPathExpressionNodeEvalResult &r) {
    return r.getNumericValue();
}

And so on and so forth. By default, if we didn't provide a conversion for a type, trying to convert to that type throws an exception. Otherwise the matching template specialization gets invoked. So much for the easy part.
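To see the dispatch in action, here is a small self-contained sketch. EvalResult and EvalError are hypothetical, stripped-down stand-ins for JsonPathExpressionNodeEvalResult and JsonPathEvalError, and conversionFails is a helper invented only for this example.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical stand-ins for the article's types, reduced to a numeric value.
struct EvalResult {
    double number;
    double getNumericValue() const { return number; }
};

struct EvalError : std::runtime_error {
    using std::runtime_error::runtime_error;
};

// Primary template: no conversion is known for T, so fail at run time.
template <class T>
inline T convertToArgType(const EvalResult &) {
    throw EvalError("Cannot convert parameter");
}

// Explicit specialization: a double parameter receives the numeric value.
template <>
inline double convertToArgType<double>(const EvalResult &r) {
    return r.getNumericValue();
}

// Returns true when converting to T throws, i.e. T has no specialization.
template <class T>
bool conversionFails(const EvalResult &r) {
    try {
        (void)convertToArgType<T>(r);
        return false;
    } catch (const EvalError &) {
        return true;
    }
}
```

With these definitions, convertToArgType<double> hands back the stored number, while conversionFails<std::string> reports true because no std::string specialization exists. Note that a missing conversion is only noticed when the function is actually called, not when the code is compiled.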


Now let's look at the code that actually takes the list of arguments and transforms it into a sequence of parameters. We're doing this using C++ parameter packs. A parameter pack represents a variadic set of template types (or of function arguments), and the language defines a surprisingly friendly set of constructs to deal with them.
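If you haven't met these constructs before, the following toy sketch shows the two we'll lean on: sizeof... to count the elements of a pack, and pattern expansion with a trailing "...". countTypes and describeAll are names made up for this illustration and are not part of Bracez.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// sizeof...(Ts) yields the number of types in the pack at compile time.
template <class... Ts>
constexpr std::size_t countTypes() {
    return sizeof...(Ts);
}

// Pattern expansion on types: each T in the pack becomes "const T &".
// Pattern expansion on values: each arg becomes "std::to_string(arg)".
template <class... Ts>
std::vector<std::string> describeAll(const Ts &... args) {
    return { std::to_string(args)... };
}
```

A call such as describeAll(7, 42) expands to { std::to_string(7), std::to_string(42) }, so the pattern is applied once per element of the pack.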


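We start by defining a utility class template, call_builder, nested inside the adapter class. It comes in two flavors: the primary template (labeled "A" in the code) and a partial specialization for AC = 0 (labeled "B"):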



template<class ... FuncArgs>
class JsonPathExpressionFunctionAdapter: public JsonPathExpressionFunction {

// ... There's some code here we'll discuss later

// Specialization "A" (strictly speaking, the primary template)
template<size_t AC, class Iter, class Func, class ... ArgTypes>
class call_builder {
public:
    inline static typename Func::result_type call(const Func &f, Iter &s, const Iter &e, const ArgTypes& ... args) {
        return call_builder<AC-1, Iter, Func, ArgTypes..., typename Iter::value_type>::call(f, s, e, args..., *(s++));
    }
};

// Specialization "B"
template<class Iter, class Func, class ... ArgTypes>
class call_builder<0, Iter, Func, ArgTypes...> {
public:
    inline static typename Func::result_type call(const Func &f, Iter &s, const Iter &e, const ArgTypes& ... args) {
        if(s != e) {
            throw JsonPathEvalError("Improper number of arguments passed to function.");
        }
        
        return f(convertToArgType<FuncArgs>(args)...);
    }
};
    
}; // end of JsonPathExpressionFunctionAdapter

Template class call_builder defined above implements a single function, call(), that receives:

  • A function to invoke

  • An iterator into a collection of arguments to pass to the function

  • The end iterator of the collection of arguments to pass to the function

  • The arguments extracted so far, which are eventually passed to the function when invoking it

Note the ... ArgTypes template parameter above: this is a declaration of a parameter pack. The function signature makes reference to this pack, defining a pattern that transforms each parameter type T to a const reference to T (const T&).


Also note that the templates are parameterized by an integer AC. This parameter is going to represent the number of arguments that are yet to be extracted from the arguments sequence. This should start with the number of arguments the function expects (since no argument was extracted yet).


Specialization "A" of the class does the following:

  • Extracts the argument pointed to by the current argument iterator and advances the iterator to the next arg (*s++).

  • Invokes call_builder::call with the parameters passed to it, appending the extracted argument. The call_builder::call version that's invoked has the AC parameter decremented by 1, to indicate there's one fewer argument left to extract.


Let's follow this through: A call to call_builder<2, ...>::call(f, s, e) will invoke call_builder<1, ...>::call(f, (s+1), e, *s), which will invoke call_builder<0, ...>::call(f, (s+2), e, *s, *(s+1)). Note how the list of arguments passed in the first call was used to generate a sequence of parameters passed to the function in the last one.


When call_builder<0...>::call is invoked, we're looking at specialization "B" of the class. This specialization basically invokes f, our target function, passing it the set of arguments extracted so far. Note how we're again using a parameter pack expansion -- transforming each argument arg_i of the variadic args passed to the function into an expression of the form convertToArgType<T_i>(arg_i), where T_i is the corresponding type in FuncArgs. FuncArgs is a template parameter representing the set of parameter types the target function expects to receive. Note how call_builder<0...>::call also makes sure that the correct number of arguments was passed in the list (actually, Specialization "A" should also include some guard to make sure s doesn't go past e).
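To make the recursion and the missing guard concrete, here is a standalone sketch of the same idea. It is a simplification under stated assumptions: call_builder lives at namespace scope rather than inside the adapter, the list holds plain doubles so there's no convertToArgType step, std::invalid_argument stands in for JsonPathEvalError, and subtractFromList is a hypothetical driver added just for the demo. The step that reads *s and then advances s is split into two statements, which is one way to make room for the guard.

```cpp
#include <cstddef>
#include <functional>
#include <list>
#include <stdexcept>

// Recursive step: AC arguments are still to be pulled from the sequence.
template <std::size_t AC, class Iter, class Func, class... ArgTypes>
struct call_builder {
    static typename Func::result_type call(const Func &f, Iter &s, const Iter &e,
                                           const ArgTypes &... args) {
        // Guard: refuse to read past the end when too few arguments were given.
        if (s == e) {
            throw std::invalid_argument("Too few arguments passed to function.");
        }
        const typename Iter::value_type &current = *s;
        ++s;
        return call_builder<AC - 1, Iter, Func, ArgTypes...,
                            typename Iter::value_type>::call(f, s, e, args..., current);
    }
};

// Terminal step (AC == 0): every expected argument has been extracted.
template <class Iter, class Func, class... ArgTypes>
struct call_builder<0, Iter, Func, ArgTypes...> {
    static typename Func::result_type call(const Func &f, Iter &s, const Iter &e,
                                           const ArgTypes &... args) {
        if (s != e) {
            throw std::invalid_argument("Too many arguments passed to function.");
        }
        return f(args...);
    }
};

// Hypothetical driver: feeds a list to a two-parameter function. Subtraction
// is used because it makes the argument order visible in the result.
inline double subtractFromList(const std::list<double> &values) {
    std::function<double(double, double)> subtract =
        [](double a, double b) { return a - b; };
    auto iter = values.begin();
    return call_builder<2, decltype(iter), decltype(subtract)>::call(
        subtract, iter, values.end());
}
```

Calling subtractFromList with {5.0, 2.0} goes through call_builder<2>, call_builder<1> and call_builder<0> in turn and ends up evaluating subtract(5.0, 2.0); a list that is too short or too long throws instead.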


With these utilities in place, we can now move to implement JsonPathExpressionFunctionAdapter:


// Members of JsonPathExpressionFunctionAdapter<FuncArgs...> (the part elided earlier)

virtual JsonPathExpressionNodeEvalResult invoke(const std::list<JsonPathExpressionNodeEvalResult> &args) {
  auto iter = args.begin();
  // Start the recursion with AC = number of parameters and no extracted arguments yet.
  return call_builder<sizeof...(FuncArgs),
                      decltype(iter),
                      FunctionType>::call(_function, iter, args.end());
}

virtual int arity() {
  return sizeof...(FuncArgs);
}

typedef std::function<JsonPathExpressionNodeEvalResult (FuncArgs...)> FunctionType;
FunctionType _function;


invoke() is now really simple... It just bootstraps the sequence of call_builder::call calls, initializing AC to the number of arguments the target function expects. And arity() just returns that argument count.


Finally, we can turn to our important uber-feature and implement a few functions invocable from JSON paths:



namespace JsonPathExpressionNodeFunctions {
  JsonPathExpressionNodeEvalResult cos(double arg) {
      return JsonPathExpressionNodeEvalResult::doubleResult(::cos(arg));
  }

  JsonPathExpressionNodeEvalResult sin(double arg) {
      return JsonPathExpressionNodeEvalResult::doubleResult(::sin(arg));
  }

  JsonPathExpressionNodeEvalResult tan(double arg) {
      return JsonPathExpressionNodeEvalResult::doubleResult(::tan(arg));
  }
}

std::map<std::string, JsonPathExpressionFunction*> JsonPathExpressionFunction::functions {
  { "cos", AdaptToJsonPathExpressionFunction(JsonPathExpressionNodeFunctions::cos) },
  { "sin", AdaptToJsonPathExpressionFunction(JsonPathExpressionNodeFunctions::sin) },
  { "tan", AdaptToJsonPathExpressionFunction(JsonPathExpressionNodeFunctions::tan) }
};
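The listing above uses AdaptToJsonPathExpressionFunction without showing it. Below is one possible shape for such a helper, as a sketch only: it deduces FuncArgs from a plain function pointer and allocates the matching adapter. Every name here (EvalResult, Function, FunctionAdapter, adaptFunction, cosine, add, registry) is a hypothetical stand-in. To stay short, invoke() copies the arguments into a vector and uses std::index_sequence instead of the call_builder chain, so this is a different technique from the one described above.

```cpp
#include <cmath>
#include <cstddef>
#include <functional>
#include <list>
#include <map>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-ins for the article's types.
struct EvalResult {
    double number;
    double getNumericValue() const { return number; }
    static EvalResult doubleResult(double v) { return EvalResult{v}; }
};

struct Function {
    virtual ~Function() = default;
    virtual EvalResult invoke(const std::list<EvalResult> &args) = 0;
    virtual int arity() = 0;
};

template <class T>
inline T convertToArgType(const EvalResult &) {
    throw std::runtime_error("Cannot convert parameter");
}

template <>
inline double convertToArgType<double>(const EvalResult &r) {
    return r.getNumericValue();
}

template <class... FuncArgs>
class FunctionAdapter : public Function {
public:
    using FunctionType = std::function<EvalResult(FuncArgs...)>;

    explicit FunctionAdapter(FunctionType fn) : _function(std::move(fn)) {}

    EvalResult invoke(const std::list<EvalResult> &args) override {
        if (args.size() != sizeof...(FuncArgs)) {
            throw std::runtime_error("Improper number of arguments passed to function.");
        }
        // Shortcut for this sketch: index into a copy instead of using call_builder.
        std::vector<EvalResult> indexed(args.begin(), args.end());
        return invokeIndexed(indexed, std::index_sequence_for<FuncArgs...>{});
    }

    int arity() override { return static_cast<int>(sizeof...(FuncArgs)); }

private:
    template <std::size_t... I>
    EvalResult invokeIndexed(const std::vector<EvalResult> &args, std::index_sequence<I...>) {
        return _function(convertToArgType<FuncArgs>(args[I])...);
    }

    FunctionType _function;
};

// Hypothetical helper: FuncArgs is deduced from the function pointer's type.
// The adapter is never freed, mirroring the raw-pointer registry in the post.
template <class... FuncArgs>
Function *adaptFunction(EvalResult (*fn)(FuncArgs...)) {
    return new FunctionAdapter<FuncArgs...>(fn);
}

inline EvalResult cosine(double arg) { return EvalResult::doubleResult(std::cos(arg)); }
inline EvalResult add(double a, double b) { return EvalResult::doubleResult(a + b); }

inline std::map<std::string, Function *> &registry() {
    static std::map<std::string, Function *> functions{
        {"cos", adaptFunction(cosine)},
        {"add", adaptFunction(add)},
    };
    return functions;
}
```

Because the parameter types are deduced, registering a two-argument function like add needs no extra code beyond the function itself.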

Of course we could enhance this with automatic conversion of the return value as well. But I guess we've proven our point by now ;)
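For the curious, here is a rough sketch of how return value conversion might look. None of it exists in Bracez: convertFromReturnType, wrapReturn, half and alreadyWrapped are hypothetical names, and EvalResult is again a cut-down stand-in for JsonPathExpressionNodeEvalResult. The idea is to mirror convertToArgType in the opposite direction using plain overloads.

```cpp
#include <functional>

// Hypothetical stand-in for JsonPathExpressionNodeEvalResult.
struct EvalResult {
    double number;
    double getNumericValue() const { return number; }
    static EvalResult doubleResult(double v) { return EvalResult{v}; }
};

// One overload per supported return type. A function that already returns
// an EvalResult is passed through unchanged.
inline EvalResult convertFromReturnType(const EvalResult &r) { return r; }
inline EvalResult convertFromReturnType(double v) { return EvalResult::doubleResult(v); }

// Wraps a plain function so that its result always comes back as an EvalResult.
template <class R, class... FuncArgs>
std::function<EvalResult(FuncArgs...)> wrapReturn(R (*fn)(FuncArgs...)) {
    return [fn](FuncArgs... args) { return convertFromReturnType(fn(args...)); };
}

// Two sample targets: one returns a raw double, the other an EvalResult.
inline double half(double x) { return x / 2.0; }
inline EvalResult alreadyWrapped(double x) { return EvalResult::doubleResult(x); }
```

Unlike the throwing default of convertToArgType, an unsupported return type here fails at compile time, since no overload of convertFromReturnType matches.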


Finally, you may be wondering what performance price we pay for the cleaner code, given this contrived chain of function calls. The good news: nothing significant. Note how the call_builder::call functions are declared inline. In my experiment, compiling this code with Xcode/clang/LLVM, the compiler generated assembly that's more-or-less equivalent to what the manually-coded boilerplate above would produce. The entire chain of call_builder::call calls is inlined away.




6 Comments


Avner Gideoni
Sep 16, 2021

ye, what he said


frishrash
Sep 15, 2021

Very nice. Your appetite to do all this in C++ is nothing less than romantic!


Now, a killer feature for Bracez Corp. head of PM - multi-select primitives (for multi-cursor editing) for all values of keys (not) matching an expression. Something like Jayway's predicates, but that can filter based on (1) key name and (2) not only array elements. Why would anyone want that? for example, so you can quickly edit and redact everything under $.personalInfo except keys matching "*name*".

Eldan Ben Haim
Sep 16, 2021

Nice! Bracez already supports these predicates, and indeed not only for array elements. Adding filtering by key-name shouldn't be too hard and I added it to the backlog. Multi-select cursors are not always intuitive, definitely when there are plenty of them, but I guess replacing values for matched JSON nodes makes sense. Thanks!


Doron Ben Ari
Sep 15, 2021

Great write up. I just wonder what is the REAL reason you need to call Cos(x) from JSON path. What are you up to ?!

PS: Dude, you have proven your point loooong ago ;-)

Eldan Ben Haim
Sep 16, 2021

How do you mean? Arrays in JSON are cos(0.5*pi)-based; so the indices are cos(0.5*pi), cos(0), ...


Avner Gideoni
Sep 14, 2021

OMG. BTW, I understood the most of it.


© 2021 by Eldan Ben-Haim. All rights reserved.
