Following from my previous articles on setting up an Alchemy development environment in Flex Builder 3 and passing/returning objects to/from C++, I wanted to test some of the claims of the speed increase possible with the use of this tool.
With a particular interest in 3D Flash applications, I wanted to test specific mathematical operations using vectors and matrices, namely cross products, normalisation, rotation matrix calculations and vector transformations.
I had initially aimed to create a mathematical library where these functions are calculated on native ActionScript objects, for example a function to calculate the cross product of two vectors. However, one aspect of Alchemy that became immediately evident is that marshalling data through the AS3-C++ API is horrendously expensive. This is quite normal, so I guess I was naive to expect good results. To give you an example of how expensive: for a simple iterative calculation of the cross product of two vectors followed by a normalisation, calling an Alchemy-compiled C++ function on every iteration is about 1000 times slower than performing the calculations natively in AS3!
So, my first advice is: limit the number of Alchemy calls!!
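One way to follow this advice is to hand a whole batch of work to C++ in a single call rather than calling once per vector pair. The sketch below is only an illustration in plain C++: the function name `crossNormaliseBatch` and the flat x, y, z array layout are my own assumptions, and the code that would move the data across the Alchemy bridge is deliberately left out.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical batched entry point (not part of the project files):
// one call processes `count` vector pairs instead of one call per pair.
// `a` and `b` each hold count * 3 doubles laid out as x, y, z, x, y, z, ...
// Returns count * 3 doubles: the normalised cross product of each pair.
std::vector<double> crossNormaliseBatch(const std::vector<double>& a,
                                        const std::vector<double>& b,
                                        std::size_t count)
{
    std::vector<double> out(count * 3, 0.0);
    for (std::size_t i = 0; i < count; ++i) {
        const double* p = a.data() + i * 3;
        const double* q = b.data() + i * 3;

        // Cross product p x q
        const double x = p[1] * q[2] - p[2] * q[1];
        const double y = p[2] * q[0] - p[0] * q[2];
        const double z = p[0] * q[1] - p[1] * q[0];

        // Normalise; parallel inputs give a zero cross product,
        // which is left as the zero vector rather than divided by zero.
        const double mod = std::sqrt(x * x + y * y + z * z);
        if (mod > 0.0) {
            out[i * 3]     = x / mod;
            out[i * 3 + 1] = y / mod;
            out[i * 3 + 2] = z / mod;
        }
    }
    return out;
}
```

With this shape the bridge is crossed once for the whole batch, so the marshalling cost is paid once instead of `count` times.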
Anyway, in this article I’ll concentrate on performing pure C++ speed tests (called from AS3) – so computationally intensive calculations performed in a single Alchemy call – compared to the equivalent pure AS3 speed tests.
The tests performed here concentrate on vector and matrix operations. I’ve therefore created very simple Vector and Matrix classes in C++. The Vector class is used to perform dot product, normalisation and cross product operations as shown below.
Vector3D.h :
#ifndef VECTOR3D_H_
#define VECTOR3D_H_

#include "AS3.h"

class Vector3D {
public:
    Vector3D();
    Vector3D(double x, double y, double z);
    Vector3D(const AS3_Val& as3Vector);
    virtual ~Vector3D();

    double dot(const Vector3D& v) const;
    Vector3D cross(const Vector3D& v) const;
    double modulus() const;
    Vector3D normalise() const;

    void setX(double x);
    void setY(double y);
    void setZ(double z);

    double getX() const;
    double getY() const;
    double getZ() const;

private:
    double _x;
    double _y;
    double _z;
};

#endif /*VECTOR3D_H_*/
Vector3D.cpp :
#include "Vector3D.h"
#include <cmath>

Vector3D::Vector3D() :
    _x(0), _y(0), _z(0) {
}

Vector3D::Vector3D(const AS3_Val& as3Vector) {
    AS3_ObjectValue(as3Vector, "x:DoubleType, y:DoubleType, z:DoubleType", &_x, &_y, &_z);
}

Vector3D::Vector3D(double x, double y, double z) :
    _x(x), _y(y), _z(z) {
}

Vector3D::~Vector3D() {
}

double Vector3D::dot(const Vector3D& v) const {
    return v._x*_x + v._y*_y + v._z*_z;
}

Vector3D Vector3D::cross(const Vector3D& v) const {
    Vector3D result;
    result._x = _y*v._z - _z*v._y;
    result._y = _z*v._x - _x*v._z;
    result._z = _x*v._y - _y*v._x;
    return result;
}

double Vector3D::modulus() const {
    return std::sqrt(_x*_x + _y*_y + _z*_z);
}

Vector3D Vector3D::normalise() const {
    double mod = modulus();
    return Vector3D(_x/mod, _y/mod, _z/mod);
}

void Vector3D::setX(double x) {
    _x = x;
}

void Vector3D::setY(double y) {
    _y = y;
}

void Vector3D::setZ(double z) {
    _z = z;
}

double Vector3D::getX() const {
    return _x;
}

double Vector3D::getY() const {
    return _y;
}

double Vector3D::getZ() const {
    return _z;
}
One point, specific to Alchemy, is in one of the constructors for the Vector3D: the properties are extracted from the passed AS3 Vector3D object, as discussed in my previous article.
The Matrix3D C++ class is as follows.
Matrix3D.h :
#ifndef MATRIX3D_H_
#define MATRIX3D_H_

#include "Vector3D.h"

class Matrix3D {
public:
    Matrix3D();
    virtual ~Matrix3D();

    void setRotationX(double degrees);
    void setRotationY(double degrees);
    void setRotationZ(double degrees);
    void setIdentity();

    Vector3D transformVector(const Vector3D& vector) const;

private:
    double _M00;
    double _M01;
    double _M02;
    double _M10;
    double _M11;
    double _M12;
    double _M20;
    double _M21;
    double _M22;
};

#endif /*MATRIX3D_H_*/
Matrix3D.cpp :
#include "Matrix3D.h"
#include <cmath>

Matrix3D::Matrix3D() :
    _M00(1), _M01(0), _M02(0),
    _M10(0), _M11(1), _M12(0),
    _M20(0), _M21(0), _M22(1) {
}

Matrix3D::~Matrix3D() {
}

void Matrix3D::setIdentity() {
    _M00 = 1; _M01 = 0; _M02 = 0;
    _M10 = 0; _M11 = 1; _M12 = 0;
    _M20 = 0; _M21 = 0; _M22 = 1;
}

void Matrix3D::setRotationX(double degrees) {
    setIdentity();
    double radians = degrees / 180 * M_PI;
    _M11 = cos(radians);
    _M12 = -sin(radians);
    _M21 = sin(radians);
    _M22 = cos(radians);
}

void Matrix3D::setRotationY(double degrees) {
    setIdentity();
    double radians = degrees / 180 * M_PI;
    _M00 = cos(radians);
    _M02 = sin(radians);
    _M20 = -sin(radians);
    _M22 = cos(radians);
}

void Matrix3D::setRotationZ(double degrees) {
    setIdentity();
    double radians = degrees / 180 * M_PI;
    _M00 = cos(radians);
    _M01 = -sin(radians);
    _M10 = sin(radians);
    _M11 = cos(radians);
}

Vector3D Matrix3D::transformVector(const Vector3D& vector) const {
    Vector3D result;
    result.setX(_M00*vector.getX() + _M01*vector.getY() + _M02*vector.getZ());
    result.setY(_M10*vector.getX() + _M11*vector.getY() + _M12*vector.getZ());
    result.setZ(_M20*vector.getX() + _M21*vector.getY() + _M22*vector.getZ());
    return result;
}
One of the objectives of using the Matrix3D class is to test the performance of the trigonometric functions. A common source of intensive calculations in 3D graphics is the rotation of vectors so this provides a useful test directly aimed at this field.
Two tests are to be examined: one for cross product calculations and another for matrix transformations. These are defined in the main.cpp file.
#include "AS3.h"
#include "Vector3D.h"
#include "Matrix3D.h"

AS3_Val speedTest1(void* self, AS3_Val args) {
    // Declare AS3 variables
    AS3_Val as3Vector1;
    AS3_Val as3Vector2;

    // Extract variables from arguments array
    AS3_ArrayValue(args, "AS3ValType, AS3ValType", &as3Vector1, &as3Vector2);

    // Create native C++ objects with AS3 parameters
    Vector3D vector1(as3Vector1);
    Vector3D vector2(as3Vector2);
    Vector3D vector3;

    // Speed test : calculate cross products and normalise
    for (int i = 0; i < 1000000; i++) {
        vector3 = vector1.cross(vector2);
        vector3 = vector3.normalise();
        vector1 = vector2;
        vector2 = vector3;
    }

    // Obtain a class descriptor for the AS3 Vector3D class
    AS3_Val vector3DClass = AS3_NSGet(AS3_String("flash.geom"), AS3_String("Vector3D"));
    AS3_Val params = AS3_Array("");

    // Construct a new AS3 Vector3D object with empty parameters
    AS3_Val result = AS3_New(vector3DClass, params);

    // Set the x, y and z properties of the AS3 Vector3D object, casting as appropriate
    AS3_Set(result, AS3_String("x"), AS3_Number(vector3.getX()));
    AS3_Set(result, AS3_String("y"), AS3_Number(vector3.getY()));
    AS3_Set(result, AS3_String("z"), AS3_Number(vector3.getZ()));

    // Release what's no longer needed
    AS3_Release(params);
    AS3_Release(vector3DClass);

    // return the AS3 Vector
    return result;
}

AS3_Val speedTest2(void* self, AS3_Val args) {
    // Declare AS3 variable
    AS3_Val as3Vector;

    // Extract variables from arguments array
    AS3_ArrayValue(args, "AS3ValType", &as3Vector);

    // Create native C++ object with AS3 parameters
    Vector3D vector(as3Vector);
    Vector3D copy = vector;
    Matrix3D rotationX;
    Matrix3D rotationY;
    Matrix3D rotationZ;

    // Speed test : calculate rotation matrices and transform vector
    for (int i = 0; i < 1000; i++) {
        vector = copy;
        for (double ang = 0; ang < 180; ang++) {
            rotationX.setRotationX(ang);
            rotationY.setRotationY(ang);
            rotationZ.setRotationZ(ang);
            vector = rotationX.transformVector(vector);
            vector = rotationY.transformVector(vector);
            vector = rotationZ.transformVector(vector);
        }
    }

    // Obtain a class descriptor for the AS3 Vector3D class
    AS3_Val vector3DClass = AS3_NSGet(AS3_String("flash.geom"), AS3_String("Vector3D"));
    AS3_Val params = AS3_Array("");

    // Construct a new AS3 Vector3D object with empty parameters
    AS3_Val result = AS3_New(vector3DClass, params);

    // Set the x, y and z properties of the AS3 Vector3D object, casting as appropriate
    AS3_Set(result, AS3_String("x"), AS3_Number(vector.getX()));
    AS3_Set(result, AS3_String("y"), AS3_Number(vector.getY()));
    AS3_Set(result, AS3_String("z"), AS3_Number(vector.getZ()));

    // Release what's no longer needed
    AS3_Release(params);
    AS3_Release(vector3DClass);

    // return the AS3 Vector
    return result;
}

/**
 * Main entry point for Alchemy compiler. Declares all functions available
 * through the Alchemy bridge.
 */
int main() {
    // Declare all methods exposed to AS3 typed as Function instances
    AS3_Val speedTest1Method = AS3_Function(NULL, speedTest1);
    AS3_Val speedTest2Method = AS3_Function(NULL, speedTest2);

    // Construct an object that contains references to all the functions
    AS3_Val result = AS3_Object("speedTest1:AS3ValType, speedTest2:AS3ValType", speedTest1Method, speedTest2Method);

    // Release what's no longer needed
    AS3_Release(speedTest1Method);
    AS3_Release(speedTest2Method);

    // Notify the bridge of what has been created -- THIS DOES NOT RETURN!
    AS3_LibInit(result);

    // Should never get here!
    return 0;
}
For an explanation of the code and the C++ API of Alchemy, I’ll refer you to my previous article on passing and returning objects to and from C++ using Alchemy.
The first test, speedTest1, performs 1,000,000 times the cross product of two vectors (initially passed by AS3) followed by a normalisation. The resulting vector is used in the following iteration. At the end of all the iterations, the final vector is returned to AS3.
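For reference, the same loop can be run as ordinary, natively compiled C++ without the AS3 bridge. This is a minimal sketch and not one of the measurements in this article: `Vec3`, `crossNormaliseLoop` and `timeLoopMs` are names I have made up for the illustration, and timings from a native binary say nothing about what the AVM2 does with Alchemy output.

```cpp
#include <chrono>
#include <cmath>

// Plain stand-in for the article's Vector3D class, without any AS3 dependency.
struct Vec3 { double x, y, z; };

Vec3 cross(const Vec3& a, const Vec3& b) {
    return { a.y * b.z - a.z * b.y,
             a.z * b.x - a.x * b.z,
             a.x * b.y - a.y * b.x };
}

// As in the article's normalise(), a zero-length input is not guarded against.
Vec3 normalise(const Vec3& v) {
    const double mod = std::sqrt(v.x * v.x + v.y * v.y + v.z * v.z);
    return { v.x / mod, v.y / mod, v.z / mod };
}

// Same iteration as speedTest1: cross, normalise, then shift the operands along.
Vec3 crossNormaliseLoop(Vec3 v1, Vec3 v2, int iterations) {
    Vec3 v3 = { 0.0, 0.0, 0.0 };
    for (int i = 0; i < iterations; ++i) {
        v3 = normalise(cross(v1, v2));
        v1 = v2;
        v2 = v3;
    }
    return v3;
}

// Times one run in milliseconds with a monotonic clock.
// If resultOut is non-null, the final vector is written to it.
double timeLoopMs(int iterations, Vec3* resultOut) {
    const auto start = std::chrono::steady_clock::now();
    const Vec3 r = crossNormaliseLoop({ 0.123, 0.456, 0.789 },
                                      { 0.987, 0.654, 0.321 },
                                      iterations);
    const auto end = std::chrono::steady_clock::now();
    if (resultOut) {
        *resultOut = r;
    }
    return std::chrono::duration<double, std::milli>(end - start).count();
}
```

Calling `timeLoopMs(1000000, &result)` mirrors the iteration count used above. Returning the final vector also keeps the compiler from discarding the loop as dead code.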
The second test, speedTest2, calculates rotation matrices around the x, y and z axes. A vector (initially passed by AS3) is then rotated by each matrix in turn. This is repeated for 180 steps, increasing the angle of rotation by 1 degree at a time. The whole sweep is in turn repeated for a total of 1,000 iterations. The final vector is returned to AS3.
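To make the rotation convention concrete, here is a small standalone sketch of the three rotations written as free functions. The names `rotateX`, `rotateY` and `rotateZ` are illustrative only and are not part of the project; the signs follow the `setRotationX/Y/Z` methods shown earlier, and pi is computed with `std::acos` rather than `M_PI`, which not every compiler defines by default.

```cpp
#include <cmath>

struct Vec3 { double x, y, z; };

// Converts degrees to radians.
double toRadians(double degrees) {
    const double pi = std::acos(-1.0);
    return degrees / 180.0 * pi;
}

// Rotation about the x axis: y and z mix, x is unchanged.
Vec3 rotateX(const Vec3& v, double degrees) {
    const double c = std::cos(toRadians(degrees));
    const double s = std::sin(toRadians(degrees));
    return { v.x, c * v.y - s * v.z, s * v.y + c * v.z };
}

// Rotation about the y axis: x and z mix, y is unchanged.
Vec3 rotateY(const Vec3& v, double degrees) {
    const double c = std::cos(toRadians(degrees));
    const double s = std::sin(toRadians(degrees));
    return { c * v.x + s * v.z, v.y, -s * v.x + c * v.z };
}

// Rotation about the z axis: x and y mix, z is unchanged.
Vec3 rotateZ(const Vec3& v, double degrees) {
    const double c = std::cos(toRadians(degrees));
    const double s = std::sin(toRadians(degrees));
    return { c * v.x - s * v.y, s * v.x + c * v.y, v.z };
}
```

With this convention a 90 degree rotation about z takes the x axis onto the y axis, about x it takes y onto z, and about y it takes z onto x, which is a quick sanity check when comparing against another matrix library.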
Let’s have a look now at the ActionScript class that calls these tests, and the equivalent pure AS3 tests.
package {
    import cmodule.vector.CLibInit;

    import flash.display.Sprite;
    import flash.display.StageAlign;
    import flash.display.StageScaleMode;
    import flash.geom.Matrix3D;
    import flash.geom.Vector3D;
    import flash.text.TextField;
    import flash.text.TextFieldAutoSize;
    import flash.utils.getTimer;

    public class AlchemySpeedTest extends Sprite {

        private var vectorUtils:Object;

        public function AlchemySpeedTest() {
            // Set up the stage
            stage.align = StageAlign.TOP_LEFT;
            stage.scaleMode = StageScaleMode.NO_SCALE;

            // Create the Alchemy bridge to C++ methods
            var loader:CLibInit = new CLibInit();
            vectorUtils = loader.init();

            // Create a text field
            var timerText:TextField = new TextField();
            timerText.autoSize = TextFieldAutoSize.LEFT;
            addChild(timerText);

            // Initialise a timer
            var time0:int = getTimer();

            // Perform the speed test
            var vector:Vector3D = speedTest1();
            //var vector:Vector3D = speedTest2();
            //var vector:Vector3D = nativeSpeedTest1();
            //var vector:Vector3D = nativeSpeedTest2();

            // Calculate the elapsed time
            var time1:int = getTimer();
            var totalTime:int = time1 - time0;

            // Display elapsed time and final vector
            timerText.text = "Time taken = " + totalTime + " vector = (" + vector.x + ", " + vector.y + ", " + vector.z + ")";
        }

        /**
         * Speed test using C++ to iteratively calculate the cross products of two vectors
         */
        private function speedTest1():Vector3D {
            var vector1:Vector3D = new Vector3D(0.123, 0.456, 0.789);
            var vector2:Vector3D = new Vector3D(0.987, 0.654, 0.321);

            return vectorUtils.speedTest1(vector1, vector2);
        }

        /**
         * Speed test using C++ to iteratively calculate rotation matrices and apply these to a vector
         */
        private function speedTest2():Vector3D {
            var vector:Vector3D = new Vector3D(0.123, 0.456, 0.789);

            return vectorUtils.speedTest2(vector);
        }

        /**
         * Speed test using AS3 to iteratively calculate the cross products of two vectors
         */
        private function nativeSpeedTest1():Vector3D {
            var vector1:Vector3D = new Vector3D(0.123, 0.456, 0.789);
            var vector2:Vector3D = new Vector3D(0.987, 0.654, 0.321);
            var vector3:Vector3D;

            for (var i:int = 0; i < 1000000; i++) {
                vector3 = vector1.crossProduct(vector2);
                vector3.normalize();
                vector1 = vector2;
                vector2 = vector3;
            }

            return vector3;
        }

        /**
         * Speed test using AS3 to iteratively calculate rotation matrices and apply these to a vector
         */
        private function nativeSpeedTest2():Vector3D {
            var vector:Vector3D = new Vector3D(0.123, 0.456, 0.789);
            var copy:Vector3D = vector.clone();
            var rotationX:Matrix3D = new Matrix3D();
            var rotationY:Matrix3D = new Matrix3D();
            var rotationZ:Matrix3D = new Matrix3D();

            for (var i:int = 0; i < 1000; i++) {
                vector = copy.clone();
                for (var ang:Number = 0; ang < 180; ang++) {
                    rotationX.identity();
                    rotationX.appendRotation(ang, Vector3D.X_AXIS);
                    rotationY.identity();
                    rotationY.appendRotation(ang, Vector3D.Y_AXIS);
                    rotationZ.identity();
                    rotationZ.appendRotation(ang, Vector3D.Z_AXIS);
                    vector = rotationX.transformVector(vector);
                    vector = rotationY.transformVector(vector);
                    vector = rotationZ.transformVector(vector);
                }
            }

            return vector;
        }
    }
}
Without going into too many details of the code, you’ll see that in the constructor we can choose one of four tests: speedTest1 and speedTest2, as discussed above, and nativeSpeedTest1 and nativeSpeedTest2, which perform the same calculations using pure ActionScript classes. The time taken to perform the calculations is then displayed in a TextField along with the final vector, so that we can check that both versions produce the same result.
To make reasonable comparisons I’ve tried to keep the amount of object creation in ActionScript and C++ roughly equal: creating objects takes time and can obscure the timing results. If you find any glaring differences between the C++ and ActionScript versions then please let me know and I’ll modify this post.
If you’d like to take a look at the whole project (set up using automake and ant as shown in my previous article on setting up a development environment for Alchemy in Flex Builder 3) then you can find all the files here.
Results
The native and Alchemy speed tests were compared initially to ensure that they both produce the same results. One surprising result was that, for the matrix rotation test, the resulting vectors diverged progressively as the number of iterations increased. This is presumably because rounding errors differ between C++ (which uses double-precision floating-point values) and ActionScript. To limit this, you’ll notice that the vector is reset before each inner iteration over the 180 angles.
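The effect of precision on a chain of rotations can be illustrated by running the same 180-step sweep once in single and once in double precision. This is only a sketch of the general phenomenon: whether Flash's Matrix3D actually works in single precision internally is an assumption I have not verified, and `Vec3T`, `sweep` and `maxAbsDiff` are names invented for the example.

```cpp
#include <cmath>

// Vector type parameterised on its floating-point type.
template <typename T>
struct Vec3T { T x, y, z; };

// One pass of the inner loop of speedTest2: for each angle from 0 to 179
// degrees, rotate about x, then y, then z, all in precision T.
template <typename T>
Vec3T<T> sweep(Vec3T<T> v) {
    const T pi = std::acos(T(-1));
    for (int ang = 0; ang < 180; ++ang) {
        const T r = T(ang) / T(180) * pi;
        const T c = std::cos(r);
        const T s = std::sin(r);
        v = { v.x, c * v.y - s * v.z, s * v.y + c * v.z };   // about x
        v = { c * v.x + s * v.z, v.y, -s * v.x + c * v.z };  // about y
        v = { c * v.x - s * v.y, s * v.x + c * v.y, v.z };   // about z
    }
    return v;
}

// Largest component-wise difference between the two results.
double maxAbsDiff(const Vec3T<float>& f, const Vec3T<double>& d) {
    const double dx = std::fabs(static_cast<double>(f.x) - d.x);
    const double dy = std::fabs(static_cast<double>(f.y) - d.y);
    const double dz = std::fabs(static_cast<double>(f.z) - d.z);
    const double m = dx > dy ? dx : dy;
    return m > dz ? m : dz;
}
```

Comparing `sweep<float>` with `sweep<double>` on the same starting vector gives a feel for how much drift a single sweep can accumulate. Chaining sweeps without resetting the vector lets that drift build up, which is why the tests restart from the copy on every outer iteration.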
More important are the timing results… and the winner is… !
For speedTest1 (vector cross product and normalisation) I obtained the following:
Alchemy : 1309ms (averaged from 4 runs: 1346, 1285, 1284, 1322)
Native : 1192ms (averaged from 4 runs: 1232, 1147, 1176, 1213)
For speedTest2 (rotation matrix creation and vector transformation) the following times were obtained:
Alchemy : 814ms (averaged from 4 runs: 803, 826, 814, 813)
Native : 792ms (averaged from 4 runs: 774, 787, 789, 816)
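The averages and the relative gap can be reproduced with a few lines of arithmetic. The sketch below uses helper names of my own (`mean`, `slowdownPercent`) and takes the fourth native run of speedTest1 as 1213 ms, the value consistent with the stated 1192 ms average.

```cpp
#include <numeric>
#include <vector>

// Arithmetic mean of a set of timings in milliseconds (0 for an empty set).
double mean(const std::vector<double>& runs) {
    if (runs.empty()) {
        return 0.0;
    }
    return std::accumulate(runs.begin(), runs.end(), 0.0) / runs.size();
}

// Percentage by which the Alchemy time exceeds the native time.
double slowdownPercent(double alchemyMs, double nativeMs) {
    return (alchemyMs - nativeMs) / nativeMs * 100.0;
}

// Timings quoted above.
const std::vector<double> alchemy1 = { 1346, 1285, 1284, 1322 };
const std::vector<double> native1  = { 1232, 1147, 1176, 1213 };
const std::vector<double> alchemy2 = { 803, 826, 814, 813 };
const std::vector<double> native2  = { 774, 787, 789, 816 };
```

Here `slowdownPercent(mean(alchemy1), mean(native1))` comes out just under 10%, and the same calculation for speedTest2 gives just under 3%. With only four runs each, and individual runs varying by several percent, the second gap in particular is within the run-to-run noise.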
Conclusion
As you can see, even with computationally intensive calculations, native ActionScript beats Alchemy compiled C++. Shame – I was expecting huge improvements! And don’t forget that calling Alchemy code is very expensive – these tests have minimised this cost.
But is it really surprising? After all, we’re not executing natively compiled C++ code: we’re executing C++ that has been compiled to bytecode for the ActionScript virtual machine. Plus the native ActionScript functions have already been optimised.
This article at Automata Studios on Understanding Adobe Alchemy (by the people who used Alchemy to port OggVorbis to ActionScript 3) goes some way towards explaining this and makes very interesting reading. As they say in the article:
“… Knowing that Alchemy is just spitting out the same AVM2 bytecode that Flash and Flex spit out it is pretty confusing how Alchemy code could be faster than standard ActionScript. In fact, it is not faster across the board – just in specific types of operations and when the length of a task can be used to overcome Alchemy’s intrinsic overhead. …”
And also:
“… Now, what are these operations that Alchemy does so well? Memory access and function calls. Alchemy compiled code utilizes new bytecodes added to FP10 for working with ByteArrays – which as you’ll remember are what make up the “RAM” in Alchemy. …”
So the result seems somewhat less attractive than that claimed by Adobe (“… Ideally suited for computation-intensive use cases (…) performance can be considerably faster than ActionScript 3.0 …”) and much more specific to the type of operations being performed.
The tests shown here are of course very limited in their scope: the idea is to provoke some discussion about where Alchemy can be beneficial rather than just stating that Alchemy will produce pure gold in all situations.
One area which may be of interest is that of green threads, as mentioned in the above article. However, these threads are platform independent and are executed in the virtual machine rather than on the native OS. This limitation means that the benefits of multi-core processors cannot be tapped into… so can they really produce reasonable results when calculations are performed in parallel?
Anyway, I hope this has been of interest and of some use – as always comments, suggestions and questions are welcome!
