Saturday, February 25, 2017

Node js C++ Addon Overload Resolution - v1.0.0

The module compiles and all tests pass cross-platform; Windows, Linux, and ARM (Raspberry Pi) were tested!




Lessons learned:
- gcc is a lot less forgiving than Visual C++, and it gives cryptic error messages when there are circular references between header files.
- gcc 6 is missing some minor C++17 implementations (std::size, for example).
- C++11 (and up): on Windows, the headers include some other headers by default; with gcc you have to include them explicitly, cmath for example.
- node-gyp adds a C++11 compilation flag, which kills gcc's ability to compile C++14. It can be removed, but why was it even there...?

edit 2017-02-26:
- gcc's multi-line comment warning makes sense once you realize that a backslash ( \ ) at the end of a comment line might comment out the line following the comment.
- gcc doesn't like an extra (and useless) typename keyword.
- gcc doesn't like it when you leave out typename, even where it should be clear that it's a type definition.
- gcc doesn't like mixing definitions from headers and code files.

and many more... 
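Two of the notes above (the required typename and the backslash at the end of a comment) can be sketched in a few lines. This is only an illustration: first_or_default and comment_continuation_demo are made-up names, not code from the module.

```cpp
#include <cassert>
#include <vector>

// 'typename' is required in both places below: Container::value_type depends
// on the template parameter, so the compiler cannot assume it names a type.
template <typename Container>
typename Container::value_type first_or_default(const Container& items) {
    typedef typename Container::value_type value_type;
    return items.empty() ? value_type() : items.front();
}

// A '//' comment that ends with a backslash continues onto the next line,
// so the assignment below becomes part of the comment and never runs.
// gcc reports this as "multi-line comment" when -Wall (or -Wcomment) is on.
inline int comment_continuation_demo() {
    int value = 1;
    // the next line is swallowed by this comment \
    value = 2;
    assert(value == 1);  // the assignment above was never compiled
    return value;
}
```

Visual C++ has historically been lenient about a missing typename on dependent names, which is probably why this kind of code only starts failing once gcc gets involved.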

I guess gcc makes you write better code :-)

OpenCV Node js Bindings - HOG Demo

This is a simple demo showing that the addon is starting to work. Many of the basic objects are implemented and working, including Rect, Point, Matrix, and more.

The HOG demo is actually a port of the pedestrian detection demo from opencv_extras. Since the CPU version of VideoCapture only plays MPEG files, I've modified utility.ts (which is ported from utility.h/cpp) to use my own ffmpeg bindings instead.

The port attempts to include the GPU APIs as well, but until I implement CUDA/GPU work in node-alvision, that part is only a stub. The CPU version works as far as I could see. The code runs pretty slowly, since I've tested it in debug mode while developing, but it should run faster in release. These APIs could also run in the node js threadloop, so in theory it could be a lot faster, but it remains to be seen if it will be.

The ported TypeScript file is pedestrian_detection.ts


Saturday, February 18, 2017

Node js C++ Addon Overload Resolution - v0.7.0

Another minor release but with a major new type!

- AsyncCallback - this is a new type the module exposes. It allows a standalone thread to call back into v8 by queuing the call data and signalling libuv to execute it the next time the event loop runs. By far the most useful feature of this release!

Note that there is an option to change the callback from weak to strong, which makes node js wait until the callback is destroyed before exiting. This makes it useful in many situations.

- expose get_type to classes outside the node-overload resolution - while it's very beneficial to hide most of the type information from classes using this module, some code, such as property accessors, doesn't pass through the module but still needs to make sure the data passed to it is of the expected type. get_type exposes this functionality.

- fix array convertible checks - when using arrays, the old method wasn't checking that a conversion from one array type to another is actually possible. While it is possible to convert an array to a number (in JavaScript anything is possible ;-) ), that should not be considered a valid conversion. It is now also allowed to convert an array of derived to an array of base.
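The queuing idea behind AsyncCallback can be sketched roughly as follows. This is a simplified stand-in, not the module's code: CallQueue, post, and drain are hypothetical names, and the libuv part is only mentioned in a comment so the sketch stays self-contained. In a real addon the wake-up would typically be done with uv_async_send, which is safe to call from any thread.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <mutex>
#include <queue>
#include <string>
#include <utility>

// Hypothetical stand-in for the queue behind an AsyncCallback-like type.
class CallQueue {
public:
    // Called from any worker thread: store the call data for later.
    void post(std::string payload) {
        std::lock_guard<std::mutex> lock(mutex_);
        pending_.push(std::move(payload));
        // Assumption: a real addon would now call uv_async_send() so that
        // libuv runs drain() on the event-loop thread.
    }

    // Called on the event-loop thread: run the handler for each queued call.
    // Returns how many calls were delivered.
    std::size_t drain(const std::function<void(const std::string&)>& handler) {
        assert(static_cast<bool>(handler));
        std::queue<std::string> batch;
        {
            // Take the whole batch under the lock, run handlers outside it.
            std::lock_guard<std::mutex> lock(mutex_);
            std::swap(batch, pending_);
        }
        std::size_t delivered = 0;
        for (; !batch.empty(); batch.pop(), ++delivered) {
            handler(batch.front());
        }
        return delivered;
    }

private:
    std::mutex mutex_;
    std::queue<std::string> pending_;
};
```

Swapping the queue out under the lock keeps the worker threads from blocking while the handlers run, which matters because the handlers are the part that touches v8.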

Saturday, February 4, 2017

Node js C++ Addon Overload Resolution - v0.5.0

After being stuck with a few tests for node-alvision, I've decided to see why they take so much time.

From what I could gather while doing performance analysis, the major bottlenecks were the functions that determine the appropriate overload in general and the ones that do the actual type analysis and convertible checks.

Doing type analysis in v8 is not as straightforward as it seems: numbers are convertible to strings and back, almost anything can be converted to a boolean, and on top of that, checking whether a certain object belongs to a C++ class also needs some work.

What the POC did was go over all the registered types to determine which type the class belongs to, but that can be shortened to checking only the names in the prototype chain. So if we have a type system of 230 types, give or take, it will go over 3-5 at most after the change.
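A rough sketch of the prototype-chain idea, assuming each registered class records the name of its base class. TypeRegistry and is_instance_of are hypothetical names used only for illustration; the real module works on v8 objects rather than strings.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>

// Hypothetical registry: class name -> base class name ("" means no base).
struct TypeRegistry {
    std::unordered_map<std::string, std::string> base_of;

    // Walks from the object's class up through its bases, so the cost is
    // the depth of the chain rather than the number of registered types.
    bool is_instance_of(std::string name, const std::string& wanted) const {
        for (std::size_t depth = 0; !name.empty(); ++depth) {
            assert(depth <= base_of.size());  // a longer walk would mean a cycle
            if (name == wanted) return true;
            auto it = base_of.find(name);
            if (it == base_of.end()) return false;  // unregistered class
            name = it->second;
        }
        return false;
    }
};
```

With a few hundred registered types and chains only a handful of levels deep, this is the difference between hundreds of comparisons per check and three to five.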

Now, if we have a function with 10 parameters and 10 overloads, it would do up to 100 type checks. So I took the argument checks out of the overload matching function and created another object called function_arguments, which queries each argument's type only once instead of multiple times and returns a cached result.
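The caching described here might look something like the sketch below. CachedArguments is a hypothetical stand-in for function_arguments, and the classifier callback stands in for the real (expensive) v8 type analysis.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for function_arguments: each argument is
// classified lazily, and at most once, however many overloads ask.
class CachedArguments {
public:
    using Classifier = std::function<std::string(std::size_t)>;

    CachedArguments(std::size_t count, Classifier classify)
        : types_(count), known_(count, false), classify_(std::move(classify)) {}

    const std::string& type_of(std::size_t index) {
        assert(index < types_.size());
        if (!known_[index]) {
            types_[index] = classify_(index);  // the expensive part
            known_[index] = true;
        }
        return types_[index];
    }

private:
    std::vector<std::string> types_;
    std::vector<bool> known_;
    Classifier classify_;
};
```

With 10 parameters and 10 overloads, the classifier then runs at most 10 times per call instead of up to 100.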

Another performance improvement is in the array type checks, which used to go over the entire array and check each element's type. Instead, I've decided to make them a little less reliable for the sake of performance, so now they check 10 elements at most: if we have 1000 items in the array, it will step 100 items at a time and check only the types of the items it lands on.
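The sampling can be sketched as a strided loop. sampled_all_of is a made-up helper name, and the rounding of the stride is my own assumption about how to keep the number of checks bounded.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Checks at most max_checks evenly spaced elements instead of all of them.
// Faster on large arrays, but a mismatching element between two sampled
// positions goes unnoticed.
template <typename T, typename Pred>
bool sampled_all_of(const std::vector<T>& items, Pred matches,
                    std::size_t max_checks = 10) {
    assert(max_checks > 0);
    // Round up so the number of visited elements never exceeds max_checks.
    const std::size_t stride = std::max<std::size_t>(
        1, (items.size() + max_checks - 1) / max_checks);
    for (std::size_t i = 0; i < items.size(); i += stride) {
        if (!matches(items[i])) return false;
    }
    return true;
}
```

For 1000 items the stride works out to 100, so only indices 0, 100, ..., 900 are inspected; arrays of 10 items or fewer are still checked in full.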

All of these together are actually a sort of type system, so I refactored them into a new type_system class.

While it did boost the performance, I wasn't pleased with the boost: instead of waiting about a minute for results, it took 20 seconds.

I went ahead and analyzed the bottlenecks further and found that while the type checks themselves are no longer an issue, going over 10-20 overloads for certain functions is still time consuming. After doing it once, if I know the argument types, there's no reason to do it again for the same function/class.

This is where the function_rank_cache class comes into play: it caches the correct function as long as the same conditions apply, meaning the same class/function and argument types.
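A minimal sketch of such a cache, keyed by the function name and the argument type names. RankCache, Key, and Resolver are hypothetical names; the real function_rank_cache may well key and store things differently.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for function_rank_cache: remembers which overload
// won for a given function name and list of argument types.
class RankCache {
public:
    using Key = std::pair<std::string, std::vector<std::string>>;
    // Runs the full overload ranking and returns the winning overload index.
    using Resolver = std::function<int(const Key&)>;

    explicit RankCache(Resolver resolve) : resolve_(std::move(resolve)) {
        assert(static_cast<bool>(resolve_));
    }

    int best_overload(const std::string& function,
                      const std::vector<std::string>& arg_types) {
        Key key(function, arg_types);
        auto it = cache_.find(key);
        if (it == cache_.end()) {
            // First call with this signature: do the expensive ranking once.
            it = cache_.emplace(key, resolve_(key)).first;
        }
        return it->second;
    }

private:
    Resolver resolve_;
    std::map<Key, int> cache_;
};
```

The trade-off is memory for time: every distinct combination of function and argument types seen at runtime gets its own entry, which is fine when the same few call shapes repeat many times, as they do in a test suite.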

This improved the performance greatly, and now I'm left with about 5 seconds of actual testing time for that test.

All of these improvements were made in Debug; compiling as Release improved things beyond my expectations for the moment.

In the end, what helped the most is doing only the work necessary :-)